Video compositing — assembling multi-element videos

POST /api/v1/creative-hub/generate/composite. Provider creatomate. Template-based: video clips + images + text + audio + transitions → final mp4.

Written By Salvatore Sinigaglia

Last updated About 5 hours ago

Endpoint: POST /api/v1/creative-hub/generate/composite (verified in apps/backend/src/routes/api/creative-hub-generate.route.ts). Provider: creatomate (the only one verified in apps/backend/src/providers/creative/types.ts). Compositing combines video clips, images, text overlays, audio (TTS-generated or uploaded), and transitions into a single output mp4. It is template-based: Creatomate defines the layout and you populate the slots. Jobs run asynchronously via composite-video.worker.ts. Typical use case: assemble TTS narration, B-roll video, a product image, and CTA text in one pipeline.

Who is this for

Media buyers building polished video ads from multiple AI-generated or uploaded elements, and especially anyone producing localized variants (same template, different audio per locale).

What compositing does

Stitches multiple media elements + text + audio into a final video, per a defined template:

Final mp4 = template(
  video clip slot,
  image overlay slot,
  text title slot,
  text CTA slot,
  audio track slot (TTS or uploaded),
  transitions per Creatomate spec
)

Templates live in Creatomate's library. Each template defines:

  • Layout (positions of elements over time)
  • Animation / transitions
  • Slot definitions (variable inputs)
  • Total duration
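As a rough mental model, the parts listed above can be pictured as a typed record. The sketch below is illustrative only: the type names, field names, the narration_audio slot, and the 20-second duration are assumptions made for this example, not Creatomate's actual template schema.

```typescript
// Illustrative model of a template. All names here are hypothetical and do
// not reflect Creatomate's real template format.
type TemplateSlotKind = "video" | "image" | "text" | "audio";

interface TemplateSlot {
  slotName: string; // e.g. "headline_text"
  kind: TemplateSlotKind;
  required: boolean;
}

interface TemplateModel {
  templateId: string;
  totalDurationSec: number;
  slots: TemplateSlot[];
  // Layout and animation/transitions are authored inside Creatomate and are
  // intentionally not modeled in this sketch.
}

const productHeroModel: TemplateModel = {
  templateId: "product-hero-template-id",
  totalDurationSec: 20, // placeholder value
  slots: [
    { slotName: "headline_text", kind: "text", required: true },
    { slotName: "product_video", kind: "video", required: true },
    { slotName: "cta_text", kind: "text", required: true },
    { slotName: "narration_audio", kind: "audio", required: false },
  ],
};

// Names of the slots a caller must fill before submitting a job.
function requiredSlots(model: TemplateModel): string[] {
  return model.slots.filter((s) => s.required).map((s) => s.slotName);
}
```

Thinking of a template this way makes it clear that only the slot values change between jobs; layout, animation, and duration stay fixed in the template.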

The single provider

Wevion uses creatomate only for compositing. Creatomate's API supports template-driven video composition; Wevion calls it with the template + slot values.

How to composite

Step 1: Pick or create a template

Templates are managed in Creatomate's own UI (Wevion references them by template ID). Common templates:

  • Product hero: title text + product video + CTA + outro logo
  • Testimonial: avatar video + caption overlay + brand bumper
  • Multi-product carousel: 3-4 product slots cycled with transitions
  • Animated text: bold typography over background video

Each template exposes named slots (e.g. headline_text, product_video, cta_text).

Step 2: Open the generator in Wevion

/creative-hub → AI Generate → Composite tab.

Step 3: Pick template + populate slots

Per the template's slot definitions, provide values:

  • Text slots: type the copy
  • Video slots: pick from Creative Hub (URL reference)
  • Image slots: pick from Creative Hub
  • Audio slot: pick a TTS output (from ch-116) or uploaded audio

Wevion's UI surfaces the slot list dynamically based on the template selected.
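If you populate slots outside the UI, a small pre-flight check can catch missing or misspelled slot names before a job is submitted. This is a minimal sketch under stated assumptions: the SlotSpec shape and the checkSlots helper are hypothetical and not part of Wevion's API.

```typescript
// Hypothetical slot metadata; in practice it comes from the selected template.
interface SlotSpec {
  slotName: string;
  required: boolean;
}

interface SlotReport {
  missing: string[]; // required slots with no value
  unknown: string[]; // provided names the template does not define
}

function checkSlots(specs: SlotSpec[], provided: { [slot: string]: string }): SlotReport {
  const definedNames = specs.map((s) => s.slotName);
  const providedNames = Object.keys(provided);
  return {
    missing: specs
      .filter((s) => s.required && providedNames.indexOf(s.slotName) === -1)
      .map((s) => s.slotName),
    unknown: providedNames.filter((n) => definedNames.indexOf(n) === -1),
  };
}

const heroSpecs: SlotSpec[] = [
  { slotName: "headline_text", required: true },
  { slotName: "product_video", required: true },
  { slotName: "cta_text", required: true },
];

// "product_vdeo" is a deliberate typo to show how it gets flagged.
const slotReport = checkSlots(heroSpecs, {
  headline_text: "Introducing the X-2000",
  product_vdeo: "https://example.com/product.mp4",
  cta_text: "Shop now",
});
// slotReport.missing -> ["product_video"], slotReport.unknown -> ["product_vdeo"]
```

A single typo shows up twice here, once as a missing required slot and once as an unknown name, which makes it easy to spot.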

Step 4: Submit

Click Generate. Returns 202 Accepted + job_id.

Step 5: Track + download

Compositing time: 30 sec - 5 min depending on template complexity + element count + duration.

Once the job completes, the final mp4 is saved to your Drive folder. Use it in Campaign Creator or download it for external use.
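For programmatic tracking, a polling loop along these lines may help. It is a sketch, not Wevion's client code: only job_id and error_message appear in this article, so the status field, its values, the 5-second interval, and the attempt cap are all assumptions. The fetchJob function is injected so the loop stays independent of any HTTP client; in real use it would wrap GET /api/v1/creative-hub/generate/jobs/:id.

```typescript
// Assumption: job_id and error_message appear in this article; the `status`
// field and its values are hypothetical.
interface JobSnapshot {
  job_id: string;
  status: string; // assumed: "completed", "failed", anything else = in progress
  error_message?: string;
}

type PollDecision = "done" | "failed" | "wait";

// Pure decision step, kept separate so it is easy to test.
function decide(job: JobSnapshot): PollDecision {
  if (job.status === "completed") return "done";
  if (job.status === "failed") return "failed";
  return "wait";
}

function pause(ms: number): Promise<void> {
  return new Promise<void>((resolve) => setTimeout(resolve, ms));
}

// Polls until the job finishes, fails, or the attempt cap is reached.
// 120 attempts at 5 s each allows roughly 10 minutes of waiting.
async function waitForJob(
  jobId: string,
  fetchJob: (id: string) => Promise<JobSnapshot>,
  intervalMs: number = 5000,
  maxAttempts: number = 120,
): Promise<JobSnapshot> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const job = await fetchJob(jobId);
    const decision = decide(job);
    if (decision === "done") return job;
    if (decision === "failed") {
      throw new Error(job.error_message || "composite job failed");
    }
    await pause(intervalMs);
  }
  throw new Error("gave up waiting for job " + jobId);
}
```

The default cap leaves headroom above the 5-minute upper bound quoted above; tune both values to your own templates.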

Endpoint

POST /api/v1/creative-hub/generate/composite (verified).

Body:

  • template_id (optional, Creatomate template ID)
  • elements (array of media elements, each { type: video | image | audio | text, ... })
  • drive_folder_id (optional, target output folder)

Note the body uses an elements[] array, not a modifications object. Returns 202 + job_id. Worker calls the Creatomate API, polls for completion, downloads the mp4, and stores it. Poll GET /api/v1/creative-hub/generate/jobs/:id.
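The request body described above can be sketched as a typed builder. Only the type field of an element is documented here; the slot and value fields below are hypothetical stand-ins for the elided part of the element shape, and the non-empty check is an assumed sanity check rather than a documented API rule.

```typescript
// Only `type` is documented for an element; `slot` and `value` are
// hypothetical stand-ins for the fields elided in the body description.
type CompositeElementType = "video" | "image" | "audio" | "text";

interface CompositeElementSketch {
  type: CompositeElementType;
  slot: string;
  value: string;
}

interface CompositeBody {
  template_id?: string;
  elements: CompositeElementSketch[];
  drive_folder_id?: string;
}

function buildCompositeBody(
  elements: CompositeElementSketch[],
  templateId?: string,
  driveFolderId?: string,
): CompositeBody {
  // Assumed sanity check, not a documented API rule.
  if (elements.length === 0) {
    throw new Error("elements must contain at least one entry");
  }
  const body: CompositeBody = { elements: elements };
  // Optional fields are omitted entirely when not provided.
  if (templateId !== undefined) body.template_id = templateId;
  if (driveFolderId !== undefined) body.drive_folder_id = driveFolderId;
  return body;
}

const heroBody = buildCompositeBody(
  [
    { type: "text", slot: "headline_text", value: "Introducing the X-2000" },
    { type: "video", slot: "product_video", value: "https://example.com/product.mp4" },
  ],
  "product-hero-template-id",
);

// JSON payload to send as the body of POST /api/v1/creative-hub/generate/composite.
const heroPayload = JSON.stringify(heroBody);
```

Authentication and the HTTP call itself are left out because this article does not describe them.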

Cost

Compositing cost depends on duration + complexity (more elements = higher rendering cost). Generally lower than video generation since no AI inference is happening — just rendering.

Failed jobs are free. See ch-112.

Typical compositing recipes

Product hero (15-30 sec)

template_id: product-hero-template-id
elements:
  - { type: text,  ... "Introducing the X-2000" }
  - { type: video, ... <Creative Hub product video> }
  - { type: text,  ... "Shop now →" }
  - { type: audio, ... <Creative Hub TTS narration> }

Output: branded video ad ready for Reels / feed.

Testimonial (20-45 sec)

template_id: testimonial-template-id
elements:
  - { type: video, ... <Creative Hub avatar video> }
  - { type: text,  ... "From a verified customer" }
  - { type: image, ... <Creative Hub logo image> }

Output: testimonial-format ad with avatar + captions + branding.

Multi-locale variants

For N languages: same template + same video slots + different audio slot per language (each pre-generated via TTS in target language). N compositing jobs = N localized ad variants.

Faster than re-generating videos per language.
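The fan-out from one base job to N localized jobs can be sketched as follows. The element fields slot and value are hypothetical (only type is documented in this article), the example assumes a single audio element per job, and the URLs are placeholders.

```typescript
// Hypothetical element fields (`slot`, `value`); only `type` is documented.
interface VariantElement {
  type: "video" | "image" | "audio" | "text";
  slot: string;
  value: string;
}

interface VariantBody {
  template_id: string;
  elements: VariantElement[];
}

// Returns one request body per locale, replacing only the audio element.
// Assumes the base body contains a single audio element.
function buildLocaleBodies(
  base: VariantBody,
  audioByLocale: { [locale: string]: string },
): { [locale: string]: VariantBody } {
  const result: { [locale: string]: VariantBody } = {};
  Object.keys(audioByLocale).forEach((locale) => {
    result[locale] = {
      template_id: base.template_id,
      elements: base.elements.map((el) =>
        el.type === "audio" ? { type: el.type, slot: el.slot, value: audioByLocale[locale] } : el,
      ),
    };
  });
  return result;
}

const baseBody: VariantBody = {
  template_id: "product-hero-template-id",
  elements: [
    { type: "video", slot: "product_video", value: "https://example.com/product.mp4" },
    { type: "audio", slot: "narration_audio", value: "https://example.com/narration-en.mp3" },
  ],
};

const localeBodies = buildLocaleBodies(baseBody, {
  it: "https://example.com/narration-it.mp3",
  de: "https://example.com/narration-de.mp3",
});
```

Each entry in localeBodies would then be submitted as its own compositing job, so the video element is reused unchanged across all locales.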

The composition pipeline

Typical full ad creation pipeline using Wevion's tools:

  1. Generate image (ch-113) — product hero shot
  2. Generate video (ch-114) — product in motion (image-to-video using the hero)
  3. Generate avatar (ch-115) — spokesperson saying the script
  4. Generate TTS (ch-116) — additional narration per locale
  5. Composite (this article) — assemble everything into final ad mp4

Each step adds polish; full pipeline takes 10-30 min plus generation latencies.

Best practices

Build templates once, reuse forever

Template setup in Creatomate is one-time work. Once set: just populate slots. For agencies: build per-client templates that match brand guidelines, then populate per-campaign.

Match template duration to placement

  • Reels / Stories: 6-15 sec
  • Feed: 6-30 sec
  • In-stream: 15-30 sec

Choose templates whose intrinsic duration matches placement. Trimming a 30-sec template down to 6 sec usually looks rushed.
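The duration ranges above can be encoded as a small lookup to check which placements a template suits. The placement keys and function names are made up for this sketch; the ranges are the ones listed above.

```typescript
// Placement keys are hypothetical identifiers; the ranges come from the list above.
type Placement = "reels_stories" | "feed" | "in_stream";

const PLACEMENT_RANGE_SEC: { [P in Placement]: [number, number] } = {
  reels_stories: [6, 15],
  feed: [6, 30],
  in_stream: [15, 30],
};

// True when the template's intrinsic duration falls inside the placement's range.
function fitsPlacement(templateDurationSec: number, placement: Placement): boolean {
  const range = PLACEMENT_RANGE_SEC[placement];
  return templateDurationSec >= range[0] && templateDurationSec <= range[1];
}

// All placements a template of the given duration is suited for.
function placementsFor(templateDurationSec: number): Placement[] {
  const all: Placement[] = ["reels_stories", "feed", "in_stream"];
  return all.filter((p) => fitsPlacement(templateDurationSec, p));
}
```

For example, a 10-second template fits Reels / Stories and Feed, while a 20-second template fits Feed and In-stream.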

Use compositing for localization

Generate localization-ready assets:

  • Audio variants per language (cheap)
  • Same video + image slots (no re-generation)
  • Composite per language = many localized ad variants from one template

Preview before generating final variants

For new templates: composite once with placeholder content to validate slot mapping. Then commit to real content.
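One way to prepare such a validation run is to generate a placeholder value for every slot. This is a sketch under assumptions: the PreviewSlot shape is hypothetical, and the example.com URLs stand in for placeholder assets you would supply yourself.

```typescript
type PreviewSlotKind = "video" | "image" | "text" | "audio";

interface PreviewSlot {
  slotName: string;
  kind: PreviewSlotKind;
}

// Hypothetical placeholder assets; substitute files you own.
const MEDIA_PLACEHOLDERS: { video: string; image: string; audio: string } = {
  video: "https://example.com/placeholder.mp4",
  image: "https://example.com/placeholder.png",
  audio: "https://example.com/placeholder.mp3",
};

function placeholderValues(slots: PreviewSlot[]): { [slot: string]: string } {
  const values: { [slot: string]: string } = {};
  slots.forEach((s) => {
    // Text placeholders echo the slot name so each slot is identifiable in the render.
    values[s.slotName] = s.kind === "text" ? "[" + s.slotName + "]" : MEDIA_PLACEHOLDERS[s.kind];
  });
  return values;
}

const previewValues = placeholderValues([
  { slotName: "headline_text", kind: "text" },
  { slotName: "product_video", kind: "video" },
  { slotName: "cta_text", kind: "text" },
]);
```

Because each text slot renders its own name, a swapped or misplaced slot is visible at a glance in the preview output.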

Common mistakes

  • Template doesn't match brand: spend time on template setup; everything cascades from it
  • Skipping slot validation: typos in slot names = generation fails or produces unexpected output
  • Generating 10 variants without preview: validate template behavior first
  • Forgetting that audio length must match video length: TTS too long = audio cut off; too short = silent gap. Plan duration up front.
  • Using compositing for simple cuts: if you just need to splice 2 clips, a video editor is faster
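The audio-length mistake in particular is easy to check before submitting. The helper below is a minimal sketch; the function name and the half-second tolerance are assumptions, not values taken from Wevion or Creatomate.

```typescript
type DurationVerdict = "ok" | "audio_too_long" | "audio_too_short";

// Compares narration length to the template's duration.
// The 0.5 s default tolerance is an assumed value; adjust to taste.
function compareAudioToTemplate(
  audioSec: number,
  templateSec: number,
  toleranceSec: number = 0.5,
): DurationVerdict {
  const diff = audioSec - templateSec;
  if (diff > toleranceSec) return "audio_too_long"; // audio would be cut off
  if (diff < -toleranceSec) return "audio_too_short"; // silent gap at the end
  return "ok";
}
```

Running this against each TTS output before queuing jobs avoids paying render time for variants that would need to be redone.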

Common issues

  • Generation failed (slot mismatch): check error_message; a common cause is a typo in a slot name vs the template definition
  • Output missing element: the template expected a slot you didn't populate; populate it with a placeholder or update the template to make the slot optional
  • Audio out of sync: total audio duration doesn't match template's expected duration; adjust TTS length or pick different template
  • Output quality lower than inputs: compositing re-encodes; some loss expected; verify template's output resolution settings

FAQ

What is video compositing in Wevion?

Video compositing in Wevion's Creative Hub uses the creatomate provider to assemble multiple elements — video clips, images, text overlays, audio, and transitions — into a single final mp4. It is template-based: Creatomate defines the layout and you populate the slots. A typical use is combining TTS narration, B-roll video, a product image, and CTA text into one finished ad in a single pipeline.

Where do compositing templates come from?

Templates are managed in Creatomate's own UI, and Wevion references them by template ID. Each template defines the layout, animation and transitions, slot definitions, and total duration, and exposes named slots such as headline_text, product_video, or cta_text. Wevion's UI surfaces the slot list dynamically based on the template you select, so you just fill in the values.

How do I create localized video variants with compositing?

Keep the same template and the same video and image slots, then swap only the audio slot per language, using TTS pre-generated in each target language. In Wevion, this means N compositing jobs produce N localized ad variants from one template. It is faster than re-generating videos per language, since only the audio changes.

Does compositing cost as much as video generation?

No. Compositing cost depends on duration and complexity, with more elements raising the rendering cost, but it is generally lower than video generation because no AI inference happens — it is just rendering. Failed compositing jobs in Wevion are free. Rendering time ranges from about 30 seconds to 5 minutes depending on template complexity, element count, and duration.

Last updated: 2026-05-17