Generate AI videos
POST /api/v1/creative-hub/generate/video accepts luma_ray3 / runway_gen4. Broader catalog (kling_2_6/3_0, veo_3_1, seedance_pro) lives in the studio suite. Async, mp4 output in Drive.
Written By Salvatore Sinigaglia
Last updated About 5 hours ago
POST /api/v1/creative-hub/generate/video (verified in apps/backend/src/routes/api/creative-hub-generate.route.ts). This legacy route accepts two providers: luma_ray3 (cinematic motion) and runway_gen4 (text-to-video + image-to-video). The broader video catalog in apps/backend/src/providers/creative/types.ts, which covers kling_2_6, kling_3_0 (strong physics), veo_3_1 (Google, high quality), and seedance_pro, is exposed through the newer studio suite (see ch-101). Params include prompt, provider, aspect_ratio, and drive_folder_id; the full body is listed under Endpoint. Generation runs asynchronously via generate-video.worker.ts, and the output mp4 is stored in your Drive folder.
Who is this for
Media buyers who need short-form video for Reels / TikTok / Stories / in-stream, and teams that want a UGC alternative without filming.
Providers
On the /generate/video route you can pick luma_ray3 or runway_gen4. The models below (kling_2_6, kling_3_0, veo_3_1, seedance_pro) are part of the broader catalog exposed through the studio suite, not this route.
luma_ray3
Best for: cinematic motion, dramatic camera movements, dreamy / atmospheric scenes.
Strengths: smooth motion, cinematic look. Weaknesses: less suited for fast cuts.
runway_gen4
Best for: text-to-video starting points + image-to-video animation of existing assets.
Strengths: most flexible input (text OR image), good general-purpose quality. Weaknesses: occasional motion artifacts.
kling_2_6
Best for: physics-heavy scenes (water, fabric, fire), realistic body motion.
Strengths: strong physics simulation. Weaknesses: less stylized control.
kling_3_0
Newer Kling model. Generally improved over 2_6 with better motion coherence + longer durations supported.
veo_3_1
Best for: highest quality output, complex scenes, professional ad polish.
Strengths: Google's premium model — best quality. Weaknesses: most expensive.
How to generate
Step 1: Open the generator
/creative-hub → AI Generate → Video tab.
Step 2: Pick a provider
On this route: luma_ray3 or runway_gen4. (The studio suite exposes the fuller catalog.)
Step 3: Pick input mode
- Text-to-video: prompt only
- Image-to-video: upload / select an image + prompt to describe animation
For consistent product / brand: image-to-video keeps your reference image as the starting frame.
Step 4: Write the prompt
Specify:
- Subject + action ("A coffee cup being filled with steaming coffee")
- Camera movement ("slow zoom in", "pan from left to right")
- Style ("cinematic", "vlog handheld", "studio")
- Setting / mood
Step 5: Set duration
Typically 4-10 seconds per generation (the endpoint accepts 2-10). Longer clips cost more and take longer to generate. For most ad placements, 5-6 seconds is ideal.
Step 6: Set aspect_ratio
- 9:16 for Reels / Stories / TikTok
- 1:1 for feed
- 16:9 for in-stream / YouTube
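When scripting against the API, the placement-to-ratio mapping in this step can be kept in a small lookup. A minimal Python sketch; the placement keys (`reels`, `feed`, `in_stream`, and so on) are hypothetical labels chosen for illustration, not identifiers defined by Wevion. Only the ratio strings come from this article.

```python
# Placement label -> aspect_ratio value, following Step 6.
# The keys are illustrative labels, not API values.
ASPECT_RATIO_BY_PLACEMENT = {
    "reels": "9:16",
    "stories": "9:16",
    "tiktok": "9:16",
    "feed": "1:1",
    "in_stream": "16:9",
    "youtube": "16:9",
}


def aspect_ratio_for(placement: str) -> str:
    """Return the aspect_ratio string for a placement label."""
    # Normalize "In-Stream", "in stream", etc. to "in_stream".
    key = placement.strip().lower().replace("-", "_").replace(" ", "_")
    try:
        return ASPECT_RATIO_BY_PLACEMENT[key]
    except KeyError:
        raise ValueError(f"unknown placement: {placement!r}") from None
```

Failing loudly on an unknown placement avoids silently submitting a ratio that would be letterboxed on the target surface.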
Step 7: Submit
Click Generate. The request returns 202 Accepted with a job_id.
Step 8: Track + download
Jobs panel shows progress. Video generation is slower than image: 1-5 min typical depending on provider + duration.
Once completed: download mp4 or use directly in Campaign Creator (see ch-106).
Endpoint
POST /api/v1/creative-hub/generate/video (verified).
Body:
- prompt (required, 1-2000 chars)
- provider (optional, luma_ray3 | runway_gen4)
- image_url (optional, image-to-video)
- duration (optional, 2-10 seconds)
- aspect_ratio (optional, 16:9 | 9:16 | 1:1)
- drive_folder_id (optional)
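The field limits above can be checked client-side before submitting, which saves a round trip on obviously invalid requests. The following is a sketch only: `build_video_payload` is a hypothetical helper, and the server may validate differently (for example, whether `duration` must be a whole number is not stated here).

```python
from typing import Optional

# Values accepted by this route, per the body description above.
ALLOWED_PROVIDERS = {"luma_ray3", "runway_gen4"}
ALLOWED_ASPECT_RATIOS = {"16:9", "9:16", "1:1"}


def build_video_payload(
    prompt: str,
    provider: Optional[str] = None,
    image_url: Optional[str] = None,
    duration: Optional[float] = None,
    aspect_ratio: Optional[str] = None,
    drive_folder_id: Optional[str] = None,
) -> dict:
    """Check fields against the documented limits and drop unset optionals."""
    if not 1 <= len(prompt) <= 2000:
        raise ValueError("prompt must be 1-2000 characters")
    if provider is not None and provider not in ALLOWED_PROVIDERS:
        raise ValueError(f"provider must be one of {sorted(ALLOWED_PROVIDERS)}")
    if duration is not None and not 2 <= duration <= 10:
        raise ValueError("duration must be 2-10 seconds")
    if aspect_ratio is not None and aspect_ratio not in ALLOWED_ASPECT_RATIOS:
        raise ValueError(f"aspect_ratio must be one of {sorted(ALLOWED_ASPECT_RATIOS)}")
    payload = {
        "prompt": prompt,
        "provider": provider,
        "image_url": image_url,  # presence selects image-to-video
        "duration": duration,
        "aspect_ratio": aspect_ratio,
        "drive_folder_id": drive_folder_id,
    }
    # Optional fields left as None are omitted from the request body.
    return {k: v for k, v in payload.items() if v is not None}
```

Passing a studio-suite model such as `veo_3_1` raises here, which mirrors the restriction that this route accepts only `luma_ray3` and `runway_gen4`.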
Returns:
202 Accepted + {job_id, status: "pending"}
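As a rough illustration of the request and its 202 response, the sketch below builds the POST with Python's standard library and extracts `job_id` from the response body. The base URL and the Bearer-token `Authorization` header are assumptions; this article does not describe how the API is authenticated, so adapt both to your setup.

```python
import json
import urllib.request

ROUTE = "/api/v1/creative-hub/generate/video"


def build_generate_request(base_url: str, token: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request for a video generation job."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + ROUTE,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumed auth scheme; not confirmed by this article.
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def parse_accepted(status_code: int, body: bytes) -> str:
    """Return the job_id from a 202 Accepted response body."""
    if status_code != 202:
        raise RuntimeError(f"expected 202 Accepted, got {status_code}")
    return json.loads(body)["job_id"]
```

The request can then be sent with `urllib.request.urlopen(req)`, passing the response's `status` and `read()` result to `parse_accepted`.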
On completion, the worker writes result_urls (a single mp4) and cost_cents to the job. Poll GET /api/v1/creative-hub/generate/jobs/:id to check progress.
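A polling loop for the jobs endpoint might look like the sketch below. It assumes the job JSON carries `result_urls` once finished and `error_message` on failure (both field names appear in this article, but the exact response shape is not documented here), so it avoids depending on any status string other than the fields themselves. `fetch_job` is a hypothetical callable you supply that performs the GET and returns the parsed JSON.

```python
import time
from typing import Callable


def wait_for_job(
    fetch_job: Callable[[str], dict],
    job_id: str,
    interval_s: float = 10.0,
    timeout_s: float = 600.0,  # 10 min, the point at which a job counts as timed out
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the job has result_urls, reports an error, or times out."""
    waited = 0.0
    while True:
        job = fetch_job(job_id)
        if job.get("error_message"):
            raise RuntimeError(f"job {job_id} failed: {job['error_message']}")
        if job.get("result_urls"):
            return job  # assumed to also carry cost_cents
        if waited >= timeout_s:
            raise TimeoutError(f"job {job_id} not finished after {timeout_s:.0f}s")
        sleep(interval_s)
        waited += interval_s
```

Injecting `fetch_job` and `sleep` keeps the loop independent of any HTTP client and makes it easy to exercise without waiting in real time.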
Cost
Higher than image generation. Cost scales with provider tier (veo_3_1 is the most expensive), duration, and resolution.
Failed jobs are free. See ch-112.
Provider selection guide
Prompt patterns that work
Anchor with subject + setting
"A {subject} in {setting}, {action}, {camera movement}, {mood}"
Example: "A jogger on a misty mountain trail, running uphill, slow tracking shot from behind, golden hour light, inspirational mood."
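If you generate many variants, the template can be filled programmatically. A small sketch, with `build_prompt` as a hypothetical helper; it follows the template literally (including the fixed "A ... in ..." opening) and enforces the 2000-character prompt limit from the endpoint description.

```python
def build_prompt(subject: str, setting: str, action: str, camera: str, mood: str) -> str:
    """Fill the "A {subject} in {setting}, {action}, {camera movement}, {mood}" template."""
    parts = [f"A {subject.strip()} in {setting.strip()}", action, camera, mood]
    # Skip empty slots so the prompt has no dangling commas.
    prompt = ", ".join(p.strip() for p in parts if p and p.strip()) + "."
    if len(prompt) > 2000:
        raise ValueError("prompt exceeds the 2000-character limit")
    return prompt


print(build_prompt(
    "coffee cup",
    "a sunlit kitchen",
    "being filled with steaming coffee",
    "slow zoom in",
    "cozy mood",
))
```

Varying one slot at a time (for example only the camera movement) makes it easier to see which change improved a result.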
Specify motion vs static
If the AI generates a near-static frame: explicitly say "dynamic camera movement, subject in motion".
Match the placement format
For 9:16 Reels: prompts that work in vertical framing. Mention "vertical composition" if needed.
Best practices
Generate target duration directly
Don't generate 10s and trim to 6s. Generate 6s directly: it is cheaper and gives better composition.
Plan around generation latency
Video generation takes 1-5 min per attempt. Don't wait between iterations — submit 3 prompt variants, work on something else, come back when done.
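The submit-first, collect-later pattern described here can be sketched as two small functions. `submit` and `fetch_job` are hypothetical callables you provide (one wrapping the POST and returning a `job_id`, one wrapping the jobs GET); the check for `result_urls` and `error_message` assumes the job JSON exposes those fields.

```python
from typing import Callable, Dict, Iterable, List, Tuple


def submit_variants(
    submit: Callable[[dict], str],
    base_payload: dict,
    prompts: List[str],
) -> Dict[str, str]:
    """Submit one job per prompt up front; return a job_id -> prompt map."""
    jobs = {}
    for prompt in prompts:
        job_id = submit({**base_payload, "prompt": prompt})
        jobs[job_id] = prompt
    return jobs


def collect_finished(
    fetch_job: Callable[[str], dict],
    job_ids: Iterable[str],
) -> Tuple[Dict[str, dict], List[str]]:
    """One pass over outstanding jobs; returns (finished, still_pending)."""
    finished, pending = {}, []
    for job_id in job_ids:
        job = fetch_job(job_id)
        if job.get("result_urls") or job.get("error_message"):
            finished[job_id] = job
        else:
            pending.append(job_id)
    return finished, pending
```

Calling `collect_finished` again later with the returned `still_pending` list picks up the remaining jobs without blocking on any single one.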
Use image-to-video for product consistency
Brand product shot → image-to-video → consistent product on every variant. Reduces re-generation due to product inaccuracies.
Pair with TTS + compositing
Video alone is silent; TTS narration (ch-116) + compositing (ch-117) gives final ad-ready output.
Common mistakes
- Generating 10s when ad is 6s: wasted credits + worse pacing
- Wrong aspect_ratio for placement: 16:9 video on 9:16 Reels = letterboxed; doesn't perform
- Single variant of expensive provider: generate 3 cheap then 1 of veo_3_1 for the winner concept
- Skipping image-to-video for product ads: text-to-video may misrepresent your product; image-to-video locks it
Common issues
- Motion artifacts (warping, weird limbs): typical for difficult scenes (multiple people, complex hand motion); try different provider or simplify prompt
- Provider rejected prompt: content moderation. Reword.
- Generation timed out (> 10 min): rare; check job error_message; usually re-submit succeeds
- Output silent: expected — video providers don't generate audio. Pair with TTS via compositing.
FAQ
Which video provider should I choose in Wevion?
On the /generate/video route you choose between luma_ray3 (cinematic motion) and runway_gen4 (the most flexible, with both text-to-video and image-to-video). The broader video catalog — kling_2_6 and kling_3_0 (physics-heavy scenes like water and fabric), veo_3_1 (Google's premium, highest quality), and seedance_pro — is available through the newer studio suite rather than this route.
What is the difference between text-to-video and image-to-video?
Text-to-video generates a clip from a prompt alone, while image-to-video starts from an image you upload or select plus a prompt describing the animation. In Wevion, image-to-video keeps your reference image as the starting frame, which is the recommended choice for product ads because it locks the product's appearance and avoids misrepresentation across variants.
Do AI-generated videos include sound?
No. Video providers in Wevion generate silent output, which is expected behavior. To add audio, pair the video with text-to-speech narration and combine them using video compositing. This lets you produce a finished, ad-ready clip with voiceover from otherwise silent generated footage.
How long can generated videos be and how long does generation take?
Each generation is typically 4-10 seconds, and most ad placements work best at 5-6 seconds, since longer durations cost more and run slower. Generate your target length directly rather than trimming later. Video generation is slower than image generation, usually taking 1-5 minutes depending on the provider and duration, so submit several variants and return when they finish.