Generate AI videos
Last updated: May 19, 2026
Endpoint: POST /api/v1/creative-hub/generate/video (verified in apps/backend/src/routes/api/creative-hub-generate.route.ts). Five providers are verified in apps/backend/src/providers/creative/types.ts: luma_ray3 (cinematic motion), runway_gen4 (text-to-video and image-to-video), kling_2_6 and kling_3_0 (Chinese-developed, strong physics), and veo_3_1 (Google, high quality). Params: prompt, image_url (optional, for image-to-video), duration, aspect_ratio. Generation runs asynchronously via generate-video.worker.ts, and the output mp4 is stored in your Drive folder.
Who is this for
Media buyers who need short-form video for Reels / TikTok / Stories / in-stream, and teams that want a UGC alternative without filming.
The 5 providers
luma_ray3
Best for: cinematic motion, dramatic camera movements, dreamy / atmospheric scenes.
Strengths: smooth motion, cinematic look. Weaknesses: less suited for fast cuts.
runway_gen4
Best for: text-to-video starting points + image-to-video animation of existing assets.
Strengths: most flexible input (text OR image), good general-purpose quality. Weaknesses: occasional motion artifacts.
kling_2_6
Best for: physics-heavy scenes (water, fabric, fire), realistic body motion.
Strengths: strong physics simulation. Weaknesses: less stylized control.
kling_3_0
Newer Kling model. Generally improved over kling_2_6, with better motion coherence and support for longer durations.
veo_3_1
Best for: highest quality output, complex scenes, professional ad polish.
Strengths: Google's premium model — best quality. Weaknesses: most expensive.
How to generate
Step 1: Open the generator
/creative-hub → AI Generate → Video tab.
Step 2: Pick a provider
Choose one of the 5 providers from the dropdown. The default may be luma_ray3 or a workspace-configured provider.
Step 3: Pick input mode
Text-to-video: prompt only
Image-to-video: upload / select an image + prompt to describe animation
For consistent product / brand: image-to-video keeps your reference image as the starting frame.
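As a rough illustration of the two input modes, the sketch below builds the input fields for each. It assumes the only difference between the modes is whether image_url is present, which is how the parameter list describes it; the helper name build_inputs and the example URL are hypothetical.

```python
from typing import Optional


def build_inputs(prompt: str, image_url: Optional[str] = None) -> dict:
    """Return the input fields for a text-to-video or image-to-video request."""
    if not prompt.strip():
        raise ValueError("prompt is required in both modes")
    inputs = {"prompt": prompt}
    if image_url:
        # Image-to-video: the reference image is used as the starting frame.
        inputs["image_url"] = image_url
    return inputs


# Text-to-video: prompt only.
text_mode = build_inputs("A coffee cup being filled with steaming coffee")

# Image-to-video: the prompt describes how to animate the image (hypothetical URL).
image_mode = build_inputs(
    "Steam rises as the cup is slowly filled",
    image_url="https://example.com/product-cup.png",
)
```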
Step 4: Write the prompt
Specify:
Subject + action ("A coffee cup being filled with steaming coffee")
Camera movement ("slow zoom in", "pan from left to right")
Style ("cinematic", "vlog handheld", "studio")
Setting / mood
Step 5: Set duration
Typically 4-10 seconds per generation. Longer = costlier + slower. Most ad placements: 5-6 sec ideal.
Step 6: Set aspect_ratio
9:16 for Reels / Stories / TikTok
1:1 for feed
16:9 for in-stream / YouTube
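The placement-to-ratio pairs above can be kept in a small lookup so a wrong ratio is caught before a job is submitted. This is a minimal sketch: the placement keys are hypothetical labels chosen for illustration, not identifiers used by the product.

```python
# Placement labels are illustrative; only the ratio values come from this guide.
ASPECT_RATIO_BY_PLACEMENT = {
    "reels": "9:16",
    "stories": "9:16",
    "tiktok": "9:16",
    "feed": "1:1",
    "in_stream": "16:9",
    "youtube": "16:9",
}


def aspect_ratio_for(placement: str) -> str:
    """Return the aspect_ratio value recommended for a placement label."""
    key = placement.strip().lower().replace("-", "_").replace(" ", "_")
    if key not in ASPECT_RATIO_BY_PLACEMENT:
        raise ValueError(f"unknown placement: {placement!r}")
    return ASPECT_RATIO_BY_PLACEMENT[key]


reels_ratio = aspect_ratio_for("Reels")
in_stream_ratio = aspect_ratio_for("in-stream")
```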
Step 7: Submit
Click Generate. Returns 202 Accepted + job_id.
Step 8: Track + download
Jobs panel shows progress. Video generation is slower than image: 1-5 min typical depending on provider + duration.
Once completed: download mp4 or use directly in Campaign Creator (see ch-106).
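If you track jobs from a script rather than the Jobs panel, a polling loop along these lines may help. It is a sketch under several assumptions: this guide does not document a job-status endpoint, so fetch_status is a caller-supplied function, and the terminal status names "completed" and "failed" are assumed (only "pending" appears in the documented response). The 10-minute default mirrors the timeout mentioned under Common issues.

```python
import time
from typing import Callable


def wait_for_job(
    fetch_status: Callable[[str], dict],
    job_id: str,
    poll_seconds: float = 15.0,
    timeout_seconds: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the job reaches a terminal status or the timeout elapses.

    fetch_status is hypothetical: it should return the job record as a dict.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        job = fetch_status(job_id)
        # ASSUMPTION: terminal statuses are named "completed" and "failed".
        if job.get("status") in ("completed", "failed"):
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} still pending after {timeout_seconds}s")
        sleep(poll_seconds)
```

On a completed job, result_urls should hold the single mp4 and cost_cents the charge; on a failed job, error_message is the place to look.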
Endpoint
POST /api/v1/creative-hub/generate/video (verified).
Body:
prompt (required)
provider (one of 5 verified)
image_url (optional, image-to-video)
duration (in seconds)
aspect_ratio
folder_id (optional)
Returns:
202 Accepted + {job_id, status: "pending"}
Worker writes result_urls (single mp4) + cost_cents on completion.
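The request described above could be assembled roughly as follows. This is a sketch, not the product's client: the base URL, the bearer-token Authorization header, and the JSON content type are assumptions (this guide does not document authentication), and the sketch treats provider as required even though the UI may apply a default. The request is only built here, not sent.

```python
import json
import urllib.request
from typing import Optional

ENDPOINT_PATH = "/api/v1/creative-hub/generate/video"
PROVIDERS = {"luma_ray3", "runway_gen4", "kling_2_6", "kling_3_0", "veo_3_1"}


def build_payload(
    prompt: str,
    provider: str,
    duration: int,
    aspect_ratio: str,
    image_url: Optional[str] = None,
    folder_id: Optional[str] = None,
) -> dict:
    """Validate the documented fields and return the request body."""
    if not prompt.strip():
        raise ValueError("prompt is required")
    if provider not in PROVIDERS:
        raise ValueError(f"unknown provider: {provider!r}")
    if duration <= 0:
        raise ValueError("duration must be a positive number of seconds")
    payload = {
        "prompt": prompt,
        "provider": provider,
        "duration": duration,
        "aspect_ratio": aspect_ratio,
    }
    # Optional fields are omitted entirely when not set.
    if image_url:
        payload["image_url"] = image_url
    if folder_id:
        payload["folder_id"] = folder_id
    return payload


def build_request(base_url: str, token: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request."""
    # ASSUMPTION: bearer-token auth and a JSON body; neither is documented here.
    return urllib.request.Request(
        base_url.rstrip("/") + ENDPOINT_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "Authorization": f"Bearer {token}"},
        method="POST",
    )


payload = build_payload(
    prompt="A coffee cup being filled with steaming coffee, slow zoom in",
    provider="runway_gen4",
    duration=6,
    aspect_ratio="9:16",
)
request = build_request("https://app.example.com", "YOUR_TOKEN", payload)
```

Passing the request to urllib.request.urlopen would send it; a successful submission should come back as 202 Accepted with a job_id.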
Cost
Higher than image generation. Cost scales with: provider tier (veo_3_1 most expensive), duration, resolution.
Failed jobs are free. See ch-112.
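To total spend across a batch, you can sum cost_cents over finished jobs. The sketch below assumes job records are dicts with status and cost_cents fields and that the terminal status names are "completed" and "failed"; the sample values are made up for illustration and are not real prices.

```python
from typing import Iterable


def total_cost_cents(jobs: Iterable[dict]) -> int:
    """Sum cost_cents over completed jobs; failed and pending jobs count as 0."""
    return sum(
        (job.get("cost_cents") or 0)
        for job in jobs
        if job.get("status") == "completed"  # ASSUMPTION: status name
    )


# Illustrative records only; the amounts are invented.
sample_jobs = [
    {"job_id": "a", "status": "completed", "cost_cents": 120},
    {"job_id": "b", "status": "failed", "cost_cents": None},
    {"job_id": "c", "status": "completed", "cost_cents": 340},
    {"job_id": "d", "status": "pending"},
]
spent = total_cost_cents(sample_jobs)
```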
Provider selection guide
| Use case | Pick |
|---|---|
| Product hero / cinematic launch | |
| Vlog / casual UGC vibe | |
| Animate an existing product photo | |
| Physics-heavy (fluid, fabric) | |
| First exploration / budget-conscious | |
Prompt patterns that work
Anchor with subject + setting
"A {subject} in {setting}, {action}, {camera movement}, {mood}"
Example: "A jogger on a misty mountain trail, running uphill, slow tracking shot from behind, golden hour light, inspirational mood."
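The template above is easy to fill programmatically when you produce many variants. A minimal sketch follows; the function name build_prompt is hypothetical, and it follows the template literally (a leading "A" and "in {setting}"), so adjust the wording if your subject or setting needs a different article or preposition.

```python
def build_prompt(subject: str, setting: str, action: str, camera: str, mood: str) -> str:
    """Fill the pattern: A {subject} in {setting}, {action}, {camera movement}, {mood}."""
    parts = [f"A {subject.strip()} in {setting.strip()}", action, camera, mood]
    # Skip empty parts so an omitted field does not leave a dangling comma.
    return ", ".join(p.strip() for p in parts if p and p.strip()) + "."


prompt = build_prompt(
    subject="coffee cup",
    setting="a sunlit kitchen",
    action="being filled with steaming coffee",
    camera="slow zoom in",
    mood="cozy mood",
)
```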
Specify motion vs static
If the AI generates a near-static frame: explicitly say "dynamic camera movement, subject in motion".
Match the placement format
For 9:16 Reels: prompts that work in vertical framing. Mention "vertical composition" if needed.
Best practices
Generate target duration directly
Don't generate 10s + trim to 6s. Generate 6s — cheaper + better composition.
Plan around generation latency
Video generation takes 1-5 min per attempt. Don't wait between iterations — submit 3 prompt variants, work on something else, come back when done.
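Because submission returns immediately with a job_id, several variants can be queued in one pass and checked later. The sketch below assumes a caller-supplied submit function that posts one request body and returns the parsed response; that function and the stand-in used in the demo are hypothetical.

```python
from typing import Callable, Dict, List


def submit_variants(
    submit: Callable[[dict], dict],
    base_payload: dict,
    prompts: List[str],
) -> Dict[str, str]:
    """Submit one job per prompt and return {job_id: prompt} without waiting."""
    jobs: Dict[str, str] = {}
    for prompt in prompts:
        # Each variant shares provider, duration, and aspect_ratio; only the prompt changes.
        response = submit({**base_payload, "prompt": prompt})
        jobs[response["job_id"]] = prompt
    return jobs


def demo_submit(payload: dict) -> dict:
    """Stand-in for a real submission; returns a fake job_id derived from the prompt."""
    return {"job_id": f"job_{abs(hash(payload['prompt'])) % 10_000}", "status": "pending"}


queued = submit_variants(
    demo_submit,
    {"provider": "luma_ray3", "duration": 6, "aspect_ratio": "9:16"},
    [
        "A jogger on a misty mountain trail, slow tracking shot from behind",
        "A jogger on a misty mountain trail, low-angle shot, golden hour light",
        "A jogger on a misty mountain trail, aerial shot, inspirational mood",
    ],
)
```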
Use image-to-video for product consistency
Brand product shot → image-to-video → consistent product on every variant. Reduces re-generation due to product inaccuracies.
Pair with TTS + compositing
Video alone is silent; TTS narration (ch-116) + compositing (ch-117) gives final ad-ready output.
Common mistakes
Generating 10s when ad is 6s: wasted credits + worse pacing
Wrong aspect_ratio for placement: 16:9 video on 9:16 Reels = letterboxed; doesn't perform
Single variant on an expensive provider: generate 3 variants on a cheaper provider first, then 1 on veo_3_1 for the winning concept
Skipping image-to-video for product ads: text-to-video may misrepresent your product; image-to-video locks it
Common issues
Motion artifacts (warping, weird limbs): typical for difficult scenes (multiple people, complex hand motion); try different provider or simplify prompt
Provider rejected prompt: content moderation. Reword.
Generation timed out (> 10 min): rare; check the job's error_message; resubmitting usually succeeds
Output silent: expected — video providers don't generate audio. Pair with TTS via compositing.
Related
Generate images — image generation
Create avatars — spokesperson video alternative
Video compositing — combine video + TTS + text