Generate AI videos

Last updated: May 19, 2026

Video generation is exposed at POST /api/v1/creative-hub/generate/video (verified in apps/backend/src/routes/api/creative-hub-generate.route.ts). Five providers are verified in apps/backend/src/providers/creative/types.ts: luma_ray3 (cinematic motion), runway_gen4 (text-to-video and image-to-video), kling_2_6 and kling_3_0 (Chinese-developed, strong physics), and veo_3_1 (Google, high quality). Parameters: prompt, image_url (optional, for image-to-video), duration, aspect_ratio. Generation runs asynchronously via generate-video.worker.ts; the output mp4 is stored in your Drive folder.

Who is this for

Media buyers who need short-form video for Reels, TikTok, Stories, or in-stream placements, and teams that want a UGC-style alternative without filming.

The 5 providers

luma_ray3

Best for: cinematic motion, dramatic camera movements, dreamy / atmospheric scenes.

Strengths: smooth motion, cinematic look. Weaknesses: less suited for fast cuts.

runway_gen4

Best for: text-to-video starting points + image-to-video animation of existing assets.

Strengths: most flexible input (text OR image), good general-purpose quality. Weaknesses: occasional motion artifacts.

kling_2_6

Best for: physics-heavy scenes (water, fabric, fire), realistic body motion.

Strengths: strong physics simulation. Weaknesses: less stylized control.

kling_3_0

The newer Kling model. Generally an improvement over kling_2_6, with better motion coherence and support for longer durations.

veo_3_1

Best for: highest quality output, complex scenes, professional ad polish.

Strengths: Google's premium model, with the best output quality of the five. Weaknesses: most expensive.

How to generate

Step 1: Open the generator

/creative-hub → AI Generate → Video tab.

Step 2: Pick a provider

Choose one of the five providers from the dropdown. The default may be luma_ray3 or a workspace-configured provider.

Step 3: Pick input mode

Text-to-video: prompt only
Image-to-video: upload / select an image + prompt to describe animation

For consistent product or brand visuals, use image-to-video: it keeps your reference image as the starting frame.

Step 4: Write the prompt

Specify:

Subject + action ("A coffee cup being filled with steaming coffee")
Camera movement ("slow zoom in", "pan from left to right")
Style ("cinematic", "vlog handheld", "studio")
Setting / mood

Step 5: Set duration

Typically 4-10 seconds per generation. Longer clips cost more and take longer to generate. For most ad placements, 5-6 seconds is ideal.

Step 6: Set aspect_ratio

9:16 for Reels / Stories / TikTok
1:1 for feed
16:9 for in-stream / YouTube
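
The mapping above can be kept as a small lookup so that the placement, not a hand-typed ratio, decides aspect_ratio. This is a minimal Python sketch; the placement labels and the helper name aspect_ratio_for are illustrative assumptions, and only the ratio strings come from the list above.

```python
# Placement-to-aspect-ratio lookup taken from the list above.
# The placement keys are illustrative labels, not API values.
ASPECT_RATIO_BY_PLACEMENT = {
    "reels": "9:16",
    "stories": "9:16",
    "tiktok": "9:16",
    "feed": "1:1",
    "in_stream": "16:9",
    "youtube": "16:9",
}


def aspect_ratio_for(placement: str) -> str:
    """Return the aspect_ratio string for a placement label."""
    # Normalize "In-Stream", "in stream", etc. to the dictionary key form.
    key = placement.strip().lower().replace("-", "_").replace(" ", "_")
    if key not in ASPECT_RATIO_BY_PLACEMENT:
        raise ValueError(f"unknown placement: {placement!r}")
    return ASPECT_RATIO_BY_PLACEMENT[key]
```

Failing loudly on an unknown placement is deliberate: a wrong ratio only shows up after a paid generation.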

Step 7: Submit

Click Generate. The request returns 202 Accepted with a job_id.

Step 8: Track + download

The Jobs panel shows progress. Video generation is slower than image generation: typically 1-5 minutes, depending on provider and duration.

Once completed: download mp4 or use directly in Campaign Creator (see ch-106).
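
For scripted workflows, the same tracking can be done by polling the job. This article does not document a job-status endpoint, so the sketch below takes a hypothetical fetch_job callable that returns the job as a dict. The status values "completed" and "failed" are assumptions (only "pending" is documented here); error_message and result_urls are the job fields named in this article.

```python
import time
from typing import Callable, Dict


def wait_for_job(
    fetch_job: Callable[[str], Dict],
    job_id: str,
    poll_seconds: float = 15.0,
    timeout_seconds: float = 600.0,
) -> Dict:
    """Poll fetch_job(job_id) until the job finishes or the timeout passes."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        job = fetch_job(job_id)
        status = job.get("status")
        # Assumed terminal states; adjust to the values your workspace returns.
        if status == "completed":
            return job
        if status == "failed":
            raise RuntimeError(job.get("error_message") or "video generation failed")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} still {status!r} after {timeout_seconds}s")
        time.sleep(poll_seconds)
```

The 600-second default mirrors the 10-minute timeout threshold listed under Common issues; a 15-second interval is an arbitrary choice for jobs that typically take 1-5 minutes.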

Endpoint

POST /api/v1/creative-hub/generate/video (verified).

Body:

prompt (required)
provider (one of 5 verified)
image_url (optional, image-to-video)
duration (in seconds)
aspect_ratio
folder_id (optional)

Returns:

202 Accepted + {job_id, status: "pending"}

On completion, the worker writes result_urls (a single mp4) and cost_cents to the job.
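
A request against this endpoint might be assembled as follows. This is a sketch using only the Python standard library: the body fields and provider names are the ones listed above, while the base URL, the Bearer-token Authorization header, and the helper name build_video_request are assumptions, since authentication is not covered here.

```python
import json
import urllib.request
from typing import Optional

# The five provider identifiers listed in this article.
PROVIDERS = {"luma_ray3", "runway_gen4", "kling_2_6", "kling_3_0", "veo_3_1"}


def build_video_request(
    base_url: str,
    token: str,
    prompt: str,
    provider: str,
    duration: int,
    aspect_ratio: str,
    image_url: Optional[str] = None,
    folder_id: Optional[str] = None,
) -> urllib.request.Request:
    """Build (but do not send) the POST request for a video generation job."""
    if not prompt.strip():
        raise ValueError("prompt is required")
    if provider not in PROVIDERS:
        raise ValueError(f"unknown provider: {provider!r}")
    body = {
        "prompt": prompt,
        "provider": provider,
        "duration": duration,
        "aspect_ratio": aspect_ratio,
    }
    # Optional fields are sent only when set; image_url selects image-to-video.
    if image_url:
        body["image_url"] = image_url
    if folder_id:
        body["folder_id"] = folder_id
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/v1/creative-hub/generate/video",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: bearer-token auth; replace with your workspace's scheme.
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )
```

Sending the request with urllib.request.urlopen(req) and parsing the response with json.load should yield the 202 payload containing job_id and status.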

Cost

Video generation costs more than image generation. Cost scales with provider tier (veo_3_1 is the most expensive), duration, and resolution.

Failed jobs are not charged. See ch-112.

Provider selection guide

Use case	Pick
Product hero / cinematic launch	`veo_3_1` (premium) or `luma_ray3`
Vlog / casual UGC vibe	`runway_gen4`
Animate an existing product photo	`runway_gen4` (image-to-video)
Physics-heavy (fluid, fabric)	`kling_2_6` or `kling_3_0`
First exploration / budget-conscious	`runway_gen4` typically
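
For batch scripts, the guide above can be encoded as a lookup. In this sketch the use-case keys and the helper name suggest_providers are illustrative labels invented for the example; the provider identifiers and their order follow the table.

```python
from typing import List

# Use case -> providers, in the order the guide lists them.
# Keys are illustrative labels, not values the API understands.
PROVIDER_GUIDE = {
    "product_hero": ["veo_3_1", "luma_ray3"],
    "casual_ugc": ["runway_gen4"],
    "animate_product_photo": ["runway_gen4"],  # requires image_url (image-to-video)
    "physics_heavy": ["kling_2_6", "kling_3_0"],
    "budget_exploration": ["runway_gen4"],
}


def suggest_providers(use_case: str) -> List[str]:
    """Return the providers the guide suggests for a use case."""
    if use_case not in PROVIDER_GUIDE:
        raise ValueError(f"unknown use case: {use_case!r}")
    # Return a copy so callers cannot mutate the guide.
    return list(PROVIDER_GUIDE[use_case])
```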

Prompt patterns that work

Anchor with subject + setting

"A {subject} in {setting}, {action}, {camera movement}, {mood}"

Example: "A jogger on a misty mountain trail, running uphill, slow tracking shot from behind, golden hour light, inspirational mood."
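
The pattern can be filled programmatically when producing many variants. This is a minimal sketch: the helper name build_prompt is hypothetical, and the caller supplies the article and preposition ("A jogger", "on a misty mountain trail") so the result reads naturally.

```python
def build_prompt(subject: str, setting: str, action: str, camera: str, mood: str) -> str:
    """Join the pattern's parts into a single comma-separated prompt."""
    parts = [f"{subject.strip()} {setting.strip()}", action, camera, mood]
    # Skip empty parts so an omitted camera or mood does not leave a stray comma.
    return ", ".join(p.strip() for p in parts if p and p.strip()) + "."


print(build_prompt(
    "A jogger",
    "on a misty mountain trail",
    "running uphill",
    "slow tracking shot from behind",
    "golden hour light, inspirational mood",
))
```

Varying one argument at a time (for example, only the camera movement) keeps the variants comparable when you review the results.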

Specify motion vs static

If the model generates a near-static clip, explicitly ask for "dynamic camera movement, subject in motion".

Match the placement format

For 9:16 Reels, write prompts that work in vertical framing, and mention "vertical composition" if needed.

Best practices

Generate target duration directly

Don't generate a 10-second clip and trim it to 6. Generate 6 seconds directly: it is cheaper and gives better composition.
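
As a rough illustration of the saving, the sketch below assumes cost is linear in duration for a fixed provider and resolution. That is an assumption: this article only says that cost scales with duration, and actual pricing is covered in ch-112. The helper name relative_cost is hypothetical.

```python
def relative_cost(target_seconds: float, generated_seconds: float) -> float:
    """Cost of a target-length clip as a fraction of a longer clip's cost.

    Assumes cost is proportional to duration (same provider and resolution).
    """
    if target_seconds <= 0 or generated_seconds <= 0:
        raise ValueError("durations must be positive")
    return target_seconds / generated_seconds


# Generating 6s directly instead of generating 10s and trimming.
saving = 1 - relative_cost(6, 10)
print(f"{saving:.0%} saved per generation")  # 40% under the linear assumption
```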

Plan around generation latency

Video generation takes 1-5 minutes per attempt. Rather than waiting between iterations, submit 3 prompt variants, work on something else, and come back when they are done.
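
Because the endpoint returns 202 immediately, a script can submit every variant up front and collect results later. In this sketch, submit is a hypothetical callable that posts one prompt and returns its job_id; it is not part of the documented API.

```python
from typing import Callable, Dict, List


def submit_variants(submit: Callable[[str], str], prompts: List[str]) -> Dict[str, str]:
    """Submit every prompt before waiting on any job; return {job_id: prompt}."""
    jobs: Dict[str, str] = {}
    for prompt in prompts:
        # Each call should return as soon as the job is accepted (202),
        # so the loop does not wait for any video to finish rendering.
        jobs[submit(prompt)] = prompt
    return jobs
```

Keeping the job_id-to-prompt mapping makes it easy to tell which variant produced which mp4 when the jobs complete.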

Use image-to-video for product consistency

Start from a brand product shot and run it through image-to-video to get a consistent product in every variant. This reduces re-generation caused by product inaccuracies.

Pair with TTS + compositing

The video alone is silent; adding TTS narration (ch-116) and compositing (ch-117) produces the final ad-ready output.

Common mistakes

Generating 10s when the ad is 6s: wasted credits and worse pacing
Wrong aspect_ratio for the placement: a 16:9 video in a 9:16 Reels placement is letterboxed and performs poorly
Single variant from an expensive provider: generate 3 variants on a cheaper provider first, then 1 on veo_3_1 for the winning concept
Skipping image-to-video for product ads: text-to-video may misrepresent your product; image-to-video locks it to your reference image

Common issues

Motion artifacts (warping, weird limbs): typical for difficult scenes (multiple people, complex hand motion); try a different provider or simplify the prompt
Provider rejected prompt: the prompt was blocked by content moderation; reword it
Generation timed out (> 10 min): rare; check the job's error_message; re-submitting usually succeeds
Output silent: expected, because video providers don't generate audio; pair with TTS via compositing

Related

Generate images: image generation
Create avatars: spokesperson video alternative
Video compositing: combine video + TTS + text
