Generate AI videos

Last updated: May 19, 2026


POST /api/v1/creative-hub/generate/video (verified in apps/backend/src/routes/api/creative-hub-generate.route.ts). Five providers are verified in apps/backend/src/providers/creative/types.ts: luma_ray3 (cinematic motion), runway_gen4 (text-to-video and image-to-video), kling_2_6 and kling_3_0 (Chinese-developed, strong physics), and veo_3_1 (Google, high quality). Params: prompt, provider, image_url (optional, for image-to-video), duration, aspect_ratio. Generation runs asynchronously via generate-video.worker.ts; the output mp4 is stored in your Drive folder.

Who is this for

Media buyers who need short-form video for Reels / TikTok / Stories / in-stream placements, and teams that want a UGC-style alternative without filming.

The 5 providers

luma_ray3

Best for: cinematic motion, dramatic camera movements, dreamy / atmospheric scenes.

Strengths: smooth motion, cinematic look. Weaknesses: less suited for fast cuts.

runway_gen4

Best for: text-to-video starting points + image-to-video animation of existing assets.

Strengths: most flexible input (text OR image), good general-purpose quality. Weaknesses: occasional motion artifacts.

kling_2_6

Best for: physics-heavy scenes (water, fabric, fire), realistic body motion.

Strengths: strong physics simulation. Weaknesses: less stylized control.

kling_3_0

Newer Kling model. Generally improved over 2_6 with better motion coherence + longer durations supported.

veo_3_1

Best for: highest quality output, complex scenes, professional ad polish.

Strengths: Google's premium model — best quality. Weaknesses: most expensive.

How to generate

Step 1: Open the generator

/creative-hub → AI Generate → Video tab.

Step 2: Pick a provider

The dropdown lists all 5 providers. The default may be luma_ray3 or a workspace-configured provider.

Step 3: Pick input mode

  • Text-to-video: prompt only

  • Image-to-video: upload / select an image + prompt to describe animation

For consistent product / brand: image-to-video keeps your reference image as the starting frame.

Step 4: Write the prompt

Specify:

  • Subject + action ("A coffee cup being filled with steaming coffee")

  • Camera movement ("slow zoom in", "pan from left to right")

  • Style ("cinematic", "vlog handheld", "studio")

  • Setting / mood

Step 5: Set duration

Typically 4-10 seconds per generation. Longer = costlier + slower. Most ad placements: 5-6 sec ideal.

Step 6: Set aspect_ratio

  • 9:16 for Reels / Stories / TikTok

  • 1:1 for feed

  • 16:9 for in-stream / YouTube
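The placement-to-ratio choices above can be captured in a small lookup so the right aspect_ratio is picked before submitting. This is a minimal sketch: the placement keys and the helper name `aspect_ratio_for` are illustrative assumptions, not part of the product API.

```python
# Placement keys are hypothetical labels; the ratios come from the list above.
ASPECT_RATIO_BY_PLACEMENT = {
    "reels": "9:16",
    "stories": "9:16",
    "tiktok": "9:16",
    "feed": "1:1",
    "in_stream": "16:9",
    "youtube": "16:9",
}


def aspect_ratio_for(placement: str) -> str:
    """Return the aspect_ratio string for a placement name (case-insensitive)."""
    key = placement.strip().lower().replace("-", "_").replace(" ", "_")
    if key not in ASPECT_RATIO_BY_PLACEMENT:
        raise ValueError(f"unknown placement: {placement!r}")
    return ASPECT_RATIO_BY_PLACEMENT[key]


print(aspect_ratio_for("Reels"))
print(aspect_ratio_for("in-stream"))
```

Failing loudly on an unknown placement avoids silently submitting a ratio that would be letterboxed.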

Step 7: Submit

Click Generate. Returns 202 Accepted + job_id.

Step 8: Track + download

Jobs panel shows progress. Video generation is slower than image: 1-5 min typical depending on provider + duration.

Once completed: download mp4 or use directly in Campaign Creator (see ch-106).
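For scripted workflows, tracking a job amounts to polling until it finishes. The sketch below assumes a job-status lookup exists and is passed in as `fetch_job`; this article does not document such an endpoint, so that callable is hypothetical. The status strings "completed" and "failed" are also assumptions (only "pending" is confirmed here). The 10-minute default timeout mirrors the timeout noted under Common issues.

```python
import time


def wait_for_job(fetch_job, job_id, interval_s=10.0, timeout_s=600.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll fetch_job(job_id) until the job completes, fails, or times out.

    fetch_job is a caller-supplied function returning the job as a dict.
    ASSUMPTION: status values "completed" and "failed" are illustrative.
    """
    deadline = clock() + timeout_s
    while True:
        job = fetch_job(job_id)
        status = job.get("status")
        if status == "completed":
            return job  # expected to carry result_urls and cost_cents
        if status == "failed":
            raise RuntimeError(job.get("error_message") or "job failed")
        if clock() >= deadline:
            raise TimeoutError(f"job {job_id} still {status!r} after {timeout_s}s")
        sleep(interval_s)


# Simulated run: two pending responses, then a completed job.
_fake = iter([
    {"status": "pending"},
    {"status": "pending"},
    {"status": "completed", "result_urls": ["https://example.invalid/out.mp4"]},
])
done = wait_for_job(lambda _id: next(_fake), "job_123", sleep=lambda s: None)
print(done["result_urls"][0])
```

Injecting `sleep` and `clock` keeps the loop testable without real waiting.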

Endpoint

POST /api/v1/creative-hub/generate/video (verified).

Body:

  • prompt (required)

  • provider (one of 5 verified)

  • image_url (optional, image-to-video)

  • duration (in seconds)

  • aspect_ratio

  • folder_id (optional)

Returns:

  • 202 Accepted + {job_id, status: "pending"}

Worker writes result_urls (single mp4) + cost_cents on completion.
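As an illustration of the body described above, here is a minimal sketch that assembles the JSON payload and builds the POST request using only the standard library. The base URL, the bearer-token header, and the helper names are assumptions for the example; this article does not specify how the endpoint is authenticated.

```python
import json
import urllib.request

# Provider identifiers listed in this article.
PROVIDERS = {"luma_ray3", "runway_gen4", "kling_2_6", "kling_3_0", "veo_3_1"}


def build_video_body(prompt, provider, duration, aspect_ratio,
                     image_url=None, folder_id=None):
    """Assemble the request body, leaving out optional fields that are unset."""
    if not prompt:
        raise ValueError("prompt is required")
    if provider not in PROVIDERS:
        raise ValueError(f"unknown provider: {provider}")
    body = {
        "prompt": prompt,
        "provider": provider,
        "duration": duration,          # seconds
        "aspect_ratio": aspect_ratio,  # e.g. "9:16"
    }
    if image_url is not None:
        body["image_url"] = image_url  # used for image-to-video
    if folder_id is not None:
        body["folder_id"] = folder_id
    return body


def build_request(base_url, token, body):
    """Build (but do not send) the POST request.

    ASSUMPTION: base_url and bearer-token auth are placeholders.
    """
    return urllib.request.Request(
        url=f"{base_url}/api/v1/creative-hub/generate/video",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


body = build_video_body(
    "A coffee cup being filled with steaming coffee, slow zoom in",
    provider="runway_gen4", duration=6, aspect_ratio="9:16",
)
request = build_request("https://example.invalid", "YOUR_TOKEN", body)
print(request.get_method(), request.full_url)
```

Sending the request with `urllib.request.urlopen(request)` should yield the 202 response with job_id and status "pending" described above.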

Cost

Higher than image generation. Cost scales with: provider tier (veo_3_1 most expensive), duration, resolution.

Failed jobs free. See ch-112.

Provider selection guide

Use case | Pick
Product hero / cinematic launch | veo_3_1 (premium) or luma_ray3
Vlog / casual UGC vibe | runway_gen4
Animate an existing product photo | runway_gen4 (image-to-video)
Physics-heavy (fluid, fabric) | kling_2_6 or kling_3_0
First exploration / budget-conscious | runway_gen4 typically
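If provider choice is automated, the guide above can be encoded as a lookup. This is only a sketch: the use-case keys and the function name are hypothetical labels, and falling back to runway_gen4 for unknown cases is an assumption based on the "first exploration" row.

```python
# Keys are illustrative labels for the rows of the selection guide.
PROVIDER_GUIDE = {
    "cinematic_launch": ["veo_3_1", "luma_ray3"],
    "casual_ugc": ["runway_gen4"],
    "animate_product_photo": ["runway_gen4"],  # image-to-video
    "physics_heavy": ["kling_2_6", "kling_3_0"],
    "budget_exploration": ["runway_gen4"],
}


def suggest_providers(use_case: str) -> list:
    """Return candidate providers for a use case, in the guide's order."""
    # ASSUMPTION: unknown use cases fall back to the exploration pick.
    return list(PROVIDER_GUIDE.get(use_case, ["runway_gen4"]))


print(suggest_providers("physics_heavy"))
```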

Prompt patterns that work

Anchor with subject + setting

"A {subject} in {setting}, {action}, {camera movement}, {mood}"

Example: "A jogger on a misty mountain trail, running uphill, slow tracking shot from behind, golden hour light, inspirational mood."

Specify motion vs static

If the AI generates a near-static frame: explicitly say "dynamic camera movement, subject in motion".

Match the placement format

For 9:16 Reels: prompts that work in vertical framing. Mention "vertical composition" if needed.
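The three patterns above can be combined in a small prompt builder. This is a sketch with a hypothetical helper name; the template and the two optional hints are taken from this section.

```python
def build_prompt(subject, setting, action, camera, mood,
                 force_motion=False, vertical=False):
    """Fill the "A {subject} in {setting}, ..." template from this section."""
    parts = [f"A {subject} in {setting}", action, camera, mood]
    if force_motion:
        # Counter near-static output by asking for motion explicitly.
        parts.append("dynamic camera movement, subject in motion")
    if vertical:
        # Framing hint for 9:16 placements.
        parts.append("vertical composition")
    return ", ".join(p.strip() for p in parts if p.strip()) + "."


print(build_prompt(
    subject="coffee cup",
    setting="a bright studio kitchen",
    action="being filled with steaming coffee",
    camera="slow zoom in",
    mood="warm morning mood",
    vertical=True,
))
```

Keeping the parts in a fixed order makes it easier to compare variants that differ in only one element, such as the camera movement.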

Best practices

Generate target duration directly

Don't generate 10s + trim to 6s. Generate 6s — cheaper + better composition.

Plan around generation latency

Video generation takes 1-5 min per attempt. Don't wait between iterations — submit 3 prompt variants, work on something else, come back when done.

Use image-to-video for product consistency

Brand product shot → image-to-video → consistent product on every variant. Reduces re-generation due to product inaccuracies.

Pair with TTS + compositing

Video alone is silent; TTS narration (ch-116) + compositing (ch-117) gives final ad-ready output.

Common mistakes

  • Generating 10s when ad is 6s: wasted credits + worse pacing

  • Wrong aspect_ratio for placement: a 16:9 video on 9:16 Reels is letterboxed and tends to perform poorly

  • Single variant on an expensive provider: generate 3 variants on a cheaper provider first, then 1 on veo_3_1 for the winning concept

  • Skipping image-to-video for product ads: text-to-video may misrepresent your product; image-to-video locks it

Common issues

  • Motion artifacts (warping, weird limbs): typical for difficult scenes (multiple people, complex hand motion); try a different provider or simplify the prompt

  • Provider rejected prompt: content moderation. Reword.

  • Generation timed out (> 10 min): rare; check the job's error_message; resubmitting usually succeeds

  • Output silent: expected — video providers don't generate audio. Pair with TTS via compositing.
