Video compositing — assembling multi-element videos

Last updated: May 19, 2026


Endpoint: POST /api/v1/creative-hub/generate/composite (verified in apps/backend/src/routes/api/creative-hub-generate.route.ts). Provider: creatomate, the only one verified in apps/backend/src/providers/creative/types.ts. Compositing combines video clips, images, text overlays, audio (TTS-generated or uploaded), and transitions into a single output mp4. It is template-based: Creatomate defines the layout and you populate the slots. Jobs run asynchronously via composite-video.worker.ts. Typical use case: assemble TTS narration, B-roll video, a product image, and CTA text in one pipeline.

Who is this for

Media buyers building polished video ads from multiple AI-generated or uploaded elements, especially anyone producing localized variants (same template, different audio per locale).

What compositing does

Stitches multiple media elements + text + audio into a final video, per a defined template:

Final mp4 = template(
  video clip slot,
  image overlay slot,
  text title slot,
  text CTA slot,
  audio track slot (TTS or uploaded),
  transitions per Creatomate spec
)

Templates live in Creatomate's library. Each template defines:

  • Layout (positions of elements over time)

  • Animation / transitions

  • Slot definitions (variable inputs)

  • Total duration

The single provider

Wevion uses a single compositing provider: creatomate. Creatomate's API supports template-driven video composition; Wevion calls it with the template ID and the slot values.

How to composite

Step 1: Pick or create a template

Templates are managed in Creatomate's own UI (Wevion references them by template ID). Common templates:

  • Product hero: title text + product video + CTA + outro logo

  • Testimonial: avatar video + caption overlay + brand bumper

  • Multi-product carousel: 3-4 product slots cycled with transitions

  • Animated text: bold typography over background video

Each template exposes named slots (e.g. headline_text, product_video, cta_text).

Step 2: Open the generator in Wevion

Go to /creative-hub → AI Generate → Composite tab.

Step 3: Pick template + populate slots

Per the template's slot definitions, provide values:

  • Text slots: type the copy

  • Video slots: pick from Creative Hub (URL reference)

  • Image slots: pick from Creative Hub

  • Audio slot: pick a TTS output (from ch-116) or uploaded audio

Wevion's UI surfaces the slot list dynamically based on the template selected.
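Because a typo in a slot name can fail a job or produce unexpected output, it can help to compare your slot values against the template's slot names before submitting. The following is a minimal sketch, assuming you keep your own list of a template's slot names; the validate_slots helper is hypothetical and not part of Wevion's or Creatomate's API.

```python
def validate_slots(expected_slots, modifications):
    """Compare provided slot names with the template's slot names.

    `expected_slots` is a list you maintain yourself (assumption); it is
    not fetched from Wevion or Creatomate here.
    """
    expected = set(expected_slots)
    provided = set(modifications)
    return {
        "unknown": sorted(provided - expected),  # likely typos
        "missing": sorted(expected - provided),  # slots left unpopulated
    }


# Slot names taken from the example slots named in Step 1.
template_slots = ["headline_text", "product_video", "cta_text"]
report = validate_slots(
    template_slots,
    {"headline_text": "Introducing the X-2000", "cta_txt": "Shop now"},
)
print(report)
```

Here the misspelled cta_txt is reported as unknown, while cta_text and product_video are reported as missing.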

Step 4: Submit

Click Generate. The request returns 202 Accepted with a job_id.

Step 5: Track + download

Compositing time: 30 sec - 5 min depending on template complexity + element count + duration.

Once the job completes, the final mp4 appears in your Drive folder. Use it in Campaign Creator or download it for external use.
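If you track jobs from a script rather than the UI, a simple polling loop is one way to wait for the result. This article does not document a status endpoint, so the sketch below takes a caller-supplied fetch_status function; that function, the status names "completed" and "failed", and the 10-minute timeout (chosen to leave margin over the 5-minute upper bound) are all assumptions.

```python
import time


def wait_for_job(fetch_status, job_id, timeout_s=600.0, interval_s=5.0, sleep=time.sleep):
    """Poll until the job reaches a terminal state or the timeout expires.

    `fetch_status` is a caller-supplied function (hypothetical) returning a
    dict such as {"status": "completed"}; the status names are assumptions.
    """
    waited = 0.0
    while True:
        job = fetch_status(job_id)
        if job.get("status") in ("completed", "failed"):
            return job
        if waited >= timeout_s:
            raise TimeoutError(f"job {job_id} not finished after {timeout_s} s")
        sleep(interval_s)
        waited += interval_s


# Demo with a fake status source instead of a real HTTP call.
_responses = iter([
    {"status": "processing"},
    {"status": "processing"},
    {"status": "completed"},
])
result = wait_for_job(lambda _id: next(_responses), "job-123", sleep=lambda s: None)
print(result["status"])
```

Passing sleep as a parameter keeps the loop easy to test; in real use, leave it at the default time.sleep.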

Endpoint

POST /api/v1/creative-hub/generate/composite (verified).

Body:

  • template_id (Creatomate template ID)

  • modifications (JSON object of slot name → value)

  • folder_id (optional, target output folder)

Returns 202 with a job_id. The worker then calls the Creatomate API, polls for completion, downloads the mp4, and stores it.
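To make the body fields concrete, here is a minimal sketch that prepares (but does not send) a request to the endpoint using only the Python standard library. The host name, the bearer-token header, and the build_composite_request helper are assumptions for illustration; only the path and the three body fields come from this article.

```python
import json
import urllib.request

BASE_URL = "https://wevion.example"  # placeholder host (assumption)


def build_composite_request(template_id, modifications, folder_id=None, token="YOUR_TOKEN"):
    """Prepare, without sending, a POST to the composite endpoint."""
    body = {"template_id": template_id, "modifications": modifications}
    if folder_id is not None:  # folder_id is optional
        body["folder_id"] = folder_id
    return urllib.request.Request(
        BASE_URL + "/api/v1/creative-hub/generate/composite",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # auth scheme is an assumption
        },
        method="POST",
    )


req = build_composite_request(
    "product-hero-template-id",
    {"headline_text": "Introducing the X-2000", "cta_text": "Shop now"},
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Sending the prepared request with urllib.request.urlopen(req) should, per the description above, yield a 202 response carrying a job_id; the exact response shape is not documented here.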

Cost

Compositing cost depends on duration + complexity (more elements = higher rendering cost). Generally lower than video generation since no AI inference is happening — just rendering.

Failed jobs are free. See ch-112.

Typical compositing recipes

Product hero (15-30 sec)

template: product-hero-template-id
modifications:
  headline_text: "Introducing the X-2000"
  product_video: <Creative Hub URL to AI-generated product video>
  cta_text: "Shop now →"
  audio: <Creative Hub URL to TTS narration>

Output: branded video ad ready for Reels / feed.

Testimonial (20-45 sec)

template: testimonial-template-id
modifications:
  avatar_video: <Creative Hub URL to HeyGen avatar video>
  caption_text: "From a verified customer"
  brand_logo: <Creative Hub URL to logo image>

Output: testimonial-format ad with avatar + captions + branding.

Multi-locale variants

For N languages: same template + same video slots + different audio slot per language (each pre-generated via TTS in target language). N compositing jobs = N localized ad variants.

This is faster than re-generating videos per language.
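The fan-out can be sketched as a small helper that builds one request body per locale and changes only the audio slot. The build_locale_jobs helper, the slot name audio, and the example URLs are illustrative assumptions; use the slot names your template actually defines.

```python
def build_locale_jobs(template_id, shared_modifications, audio_by_locale, audio_slot="audio"):
    """Return one request body per locale, varying only the audio slot."""
    jobs = {}
    for locale, audio_url in audio_by_locale.items():
        modifications = dict(shared_modifications)  # copy so locales stay independent
        modifications[audio_slot] = audio_url
        jobs[locale] = {"template_id": template_id, "modifications": modifications}
    return jobs


# Placeholder URLs; in practice these are Creative Hub URLs.
shared = {"product_video": "https://example.com/product.mp4"}
audio = {
    "en": "https://example.com/tts-en.mp3",
    "de": "https://example.com/tts-de.mp3",
    "es": "https://example.com/tts-es.mp3",
}
jobs = build_locale_jobs("product-hero-template-id", shared, audio)
print(len(jobs), "jobs:", sorted(jobs))
```

Each resulting body can then be submitted as its own compositing job.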

The composition pipeline

Typical full ad creation pipeline using Wevion's tools:

  1. Generate image (ch-113) — product hero shot

  2. Generate video (ch-114) — product in motion (image-to-video using the hero)

  3. Generate avatar (ch-115) — spokesperson saying the script

  4. Generate TTS (ch-116) — additional narration per locale

  5. Composite (this article) — assemble everything into final ad mp4

Each step adds polish; full pipeline takes 10-30 min plus generation latencies.

Best practices

Build templates once, reuse forever

Template setup in Creatomate is one-time work. Once set: just populate slots. For agencies: build per-client templates that match brand guidelines, then populate per-campaign.

Match template duration to placement

  • Reels / Stories: 6-15 sec

  • Feed: 6-30 sec

  • In-stream: 15-30 sec

Choose templates whose intrinsic duration matches placement. Trimming a 30-sec template down to 6 sec usually looks rushed.
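As a quick check, the ranges above can be encoded in a lookup table. This is only a sketch; the placement keys and helper names are made up for illustration.

```python
# Duration ranges in seconds, copied from the placement list above.
PLACEMENT_RANGES = {
    "reels_stories": (6, 15),
    "feed": (6, 30),
    "in_stream": (15, 30),
}


def fits_placement(duration_s, placement):
    """True if a template of this duration falls inside the placement's range."""
    low, high = PLACEMENT_RANGES[placement]
    return low <= duration_s <= high


def placements_for(duration_s):
    """List every placement whose range contains the given duration."""
    return [name for name in PLACEMENT_RANGES if fits_placement(duration_s, name)]


print(placements_for(10))
print(placements_for(20))
```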

Use compositing for localization

Generate localization-ready assets:

  • Audio variants per language (cheap)

  • Same video + image slots (no re-generation)

  • Composite per language = many localized ad variants from one template

Preview before generating final variants

For new templates: composite once with placeholder content to validate slot mapping. Then commit to real content.

Common mistakes

  • Template doesn't match brand: spend time on template setup; everything cascades from it

  • Skipping slot validation: typos in slot names = generation fails or produces unexpected output

  • Generating 10 variants without preview: validate template behavior first

  • Forgetting that audio length must match video length: TTS too long = audio cut off; too short = silent gap. Plan duration up front.

  • Using compositing for simple cuts: if you just need to splice 2 clips, a video editor is faster
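The audio-length mistake is easy to catch before submitting if you know both durations. A minimal sketch, assuming you already have the audio and template durations in seconds; the check_audio_fit helper and its 0.5-second tolerance are arbitrary illustrative choices.

```python
def check_audio_fit(audio_s, template_s, tolerance_s=0.5):
    """Classify how an audio track fits the template duration.

    The 0.5 s default tolerance is an arbitrary illustrative value.
    """
    diff = audio_s - template_s
    if diff > tolerance_s:
        return f"audio too long by {diff:.1f} s: expect it to be cut off"
    if diff < -tolerance_s:
        return f"audio too short by {-diff:.1f} s: expect a silent gap"
    return "ok"


print(check_audio_fit(17.2, 15.0))
print(check_audio_fit(12.0, 15.0))
print(check_audio_fit(15.3, 15.0))
```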

Common issues

  • Generation failed (slot mismatch): check error_message; common: typo in slot name vs template definition

  • Output missing element: the template expected a slot you didn't populate; populate it with a placeholder or update the template to make the slot optional

  • Audio out of sync: total audio duration doesn't match template's expected duration; adjust TTS length or pick different template

  • Output quality lower than inputs: compositing re-encodes; some loss expected; verify template's output resolution settings
