Create AI avatars (UGC videos)

Last updated: May 19, 2026

Endpoint: POST /api/v1/creative-hub/generate/avatar (verified in apps/backend/src/routes/api/creative-hub-generate.route.ts). The only provider is heygen (the only one verified in apps/backend/src/providers/creative/types.ts). Parameters: script (the text the avatar speaks), avatar_id (from the HeyGen avatar library), voice_id (the paired voice), and language. The output is a lip-synced video, generated asynchronously by generate-avatar.worker.ts. Typical use case: UGC-style testimonial ads without filming a real person.

Who is this for

Mediabuyers running UGC-style ads, testimonial campaigns, or spokesperson explainers without filming real talent.

What an avatar generation produces

The result is a short video of a chosen avatar (stock or custom) speaking your script. Lip-sync is automatic. You can use the voice paired with the avatar (the HeyGen avatar library includes matched voices) or substitute another voice for a different language or vocal style.

Typical outputs: 5-30 second clips of an avatar saying scripted lines, often used as testimonial cuts or product explainer hooks.

The single provider

Wevion uses heygen only; no alternative providers are wired in. HeyGen's avatar library is large: it includes stock avatars, and you can upload custom avatars in the HeyGen UI.

How to generate

Step 1: Pick or upload an avatar in HeyGen

Avatar selection is done in HeyGen's own UI (Wevion references HeyGen avatars by avatar_id):

  • Browse HeyGen's stock library — many ethnicities, ages, styles

  • Or upload a custom avatar in the HeyGen UI (you or your spokesperson)

  • Copy the avatar_id from HeyGen

Wevion may surface a picker that loads HeyGen's library, but ultimately the ID lookup happens against HeyGen.

Step 2: Open the generator in Wevion

Go to /creative-hub → AI Generate → Avatar tab.

Step 3: Configure

Field | What
--- | ---
script | What the avatar should say (text). Max length typically 1000-3000 chars (~30-90 sec).
avatar_id | HeyGen avatar ID
voice_id | Voice (HeyGen-paired or custom from voice library)
language | Language code (en, it, es, fr, de, etc.)
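The length guidance for script (1000-3000 chars, roughly 30-90 sec) implies about 30 seconds of speech per 1000 characters. The sketch below applies that arithmetic to check whether a draft fits a target clip length. The constant is derived from that range and is only an approximation (real pacing depends on voice and language), and estimate_seconds is a hypothetical helper, not part of Wevion or HeyGen.

```python
# Rough speech pace implied by "1000-3000 chars (~30-90 sec)".
# Assumption: pacing is linear in character count, which is only approximate.
SECONDS_PER_1000_CHARS = 30


def estimate_seconds(script: str) -> float:
    """Estimate spoken duration of a script from its character count."""
    return len(script) * SECONDS_PER_1000_CHARS / 1000


if __name__ == "__main__":
    draft = "Look at how easy this is. Just click, pick, done."
    print(f"{len(draft)} chars, about {estimate_seconds(draft):.1f} sec")
```

By this estimate, a 10-second test script is roughly 330 characters.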

Step 4: Write the script

Write conversationally, in short sentences. Avatars perform better with natural speech patterns than with corporate marketing language. Examples that work:

  • "Honestly, I tried five different headphones before this one — and the difference is night and day."

  • "Look at how easy this is — just click, pick, done."

Patterns that don't work:

  • Long compound sentences with subordinate clauses

  • Marketing buzzwords ("synergize", "leverage", "ecosystem")

  • Technical jargon without context
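The script guidance above can be turned into a quick pre-submit check. This is a minimal sketch, assuming a 20-word sentence limit (an arbitrary threshold chosen for illustration) and using only the three buzzwords listed above. lint_script is a hypothetical helper, not a Wevion feature.

```python
import re

# Buzzwords taken from the list above; extend as needed.
BUZZWORDS = ("synergize", "leverage", "ecosystem")


def lint_script(script: str, max_words_per_sentence: int = 20) -> list[str]:
    """Return warnings for overly long sentences and marketing buzzwords."""
    warnings = []
    # Split after sentence-ending punctuation followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", script.strip()):
        words = re.findall(r"[\w'-]+", sentence.lower())
        if len(words) > max_words_per_sentence:
            warnings.append(f"long sentence ({len(words)} words): {sentence[:40]}")
        for word in words:
            if word in BUZZWORDS:
                warnings.append(f"buzzword: {word}")
    return warnings


if __name__ == "__main__":
    for warning in lint_script("We leverage our ecosystem."):
        print(warning)
```

An empty result does not mean the script sounds natural; it only rules out the two patterns checked here.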

Step 5: Submit

Click Generate. The request returns 202 Accepted with a job_id.

Step 6: Track + download

Avatar generation is slow: typically 2-10 minutes. Longer scripts mean a longer wait.

Once the job is completed, download the mp4 or use it in Campaign Creator.
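If you track the job programmatically, a simple polling loop with a timeout is usually enough for a 2-10 minute job. This article does not document a job-status endpoint, so the sketch below takes the status lookup as a caller-supplied function (fetch_status, hypothetical). The "completed" value comes from creative_job.status; the "failed" value, the 15-second interval, and the 15-minute timeout are assumptions.

```python
import time
from typing import Callable


def wait_for_job(
    fetch_status: Callable[[str], str],
    job_id: str,
    interval_s: float = 15.0,
    timeout_s: float = 900.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll fetch_status(job_id) until the job completes, fails, or times out."""
    waited = 0.0
    while True:
        status = fetch_status(job_id)
        if status == "completed":
            return status
        if status == "failed":  # assumed failure value, not confirmed by the docs
            raise RuntimeError(f"job {job_id} failed")
        if waited >= timeout_s:
            raise TimeoutError(f"job {job_id} still '{status}' after {waited:.0f}s")
        sleep(interval_s)
        waited += interval_s


if __name__ == "__main__":
    # Simulated status sequence standing in for a real status lookup.
    statuses = iter(["processing", "processing", "completed"])
    print(wait_for_job(lambda _job_id: next(statuses), "job_123", sleep=lambda _s: None))
```

Passing sleep as a parameter keeps the loop testable without real waiting.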

Endpoint

POST /api/v1/creative-hub/generate/avatar (verified).

Body:

  • script (required)

  • avatar_id (required)

  • voice_id (required)

  • language (e.g. en, it)

  • folder_id (optional)

The endpoint returns 202 with a job_id. The worker then calls the HeyGen API, polls until the video is done, downloads the mp4, stores it in the Drive folder, and sets creative_job.status to completed.
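As an illustration, the request could be built and sent as follows. This is a sketch, not a confirmed client: the base URL, Bearer-token authentication, and the JSON response shape (a top-level job_id key) are assumptions. Only the path and the body fields come from this article.

```python
import json
import urllib.request
from typing import Optional

AVATAR_ENDPOINT = "/api/v1/creative-hub/generate/avatar"


def build_avatar_request(
    base_url: str,
    token: str,
    script: str,
    avatar_id: str,
    voice_id: str,
    language: str = "en",
    folder_id: Optional[str] = None,
) -> urllib.request.Request:
    """Build (but do not send) the POST request for an avatar generation job."""
    body = {
        "script": script,
        "avatar_id": avatar_id,
        "voice_id": voice_id,
        "language": language,
    }
    if folder_id is not None:  # folder_id is optional, so omit it when unset
        body["folder_id"] = folder_id
    return urllib.request.Request(
        base_url.rstrip("/") + AVATAR_ENDPOINT,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # assumed auth scheme
        },
        method="POST",
    )


def submit_avatar_job(request: urllib.request.Request) -> str:
    """Send the request and return the job_id from the 202 response."""
    with urllib.request.urlopen(request) as response:
        if response.status != 202:
            raise RuntimeError(f"expected 202 Accepted, got {response.status}")
        return json.loads(response.read())["job_id"]  # assumed response shape
```

Keeping request construction separate from sending makes the payload easy to inspect before any credits are spent.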

Cost

HeyGen generation costs more than image or TTS generation. Cost varies by:

  • Script duration (longer = more)

  • Resolution

  • Custom avatar vs stock (sometimes different pricing)

See ch-112 AI credits.

Custom avatars

For a custom avatar (you, your spokesperson, or an actor):

  1. Upload video footage in the HeyGen UI (their consent + capture flow)

  2. HeyGen creates a custom avatar

  3. Use that avatar_id in Wevion

Custom avatars often perform best (audience sees a real person, not stock).

Multi-language strategy

Same script → same avatar → different voice_id + language per locale → N video variants for N languages.

Use cases:

  • One spokesperson, 5 languages for international campaigns

  • Cheaper than re-recording with native actors per locale

Watch for: lip-sync accuracy may degrade in some languages; preview each output.
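The multi-language fan-out can be sketched as one request body per locale, keeping avatar_id fixed. This sketch assumes each locale supplies its own translated script text alongside its voice_id, so the script language matches the voice. build_locale_variants and the example IDs are hypothetical.

```python
def build_locale_variants(avatar_id: str, locales: dict[str, dict[str, str]]) -> list[dict[str, str]]:
    """Build one avatar request body per locale, reusing the same avatar.

    locales maps a language code to {"voice_id": ..., "script": ...}.
    """
    return [
        {
            "script": spec["script"],
            "avatar_id": avatar_id,
            "voice_id": spec["voice_id"],
            "language": language,
        }
        for language, spec in locales.items()
    ]


if __name__ == "__main__":
    variants = build_locale_variants(
        "avatar_abc",  # placeholder ID, copy the real one from HeyGen
        {
            "en": {"voice_id": "voice_en_1", "script": "Look at how easy this is."},
            "it": {"voice_id": "voice_it_1", "script": "Guarda quanto è facile."},
        },
    )
    for body in variants:
        print(body["language"], body["voice_id"])
```

Each body is then submitted as its own generation job, so N locales produce N jobs and N videos to preview.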

Best practices

Test with short script first

Run a 10-second script first to validate avatar + voice fit. Then commit to longer scripts.

Match avatar to audience

Choose an avatar your target demographic finds relatable (age, ethnicity, style). The stock library helps; a custom avatar is best.

Avoid uncanny scenarios

  • Don't make medical or authority claims with a stock avatar (audiences instinctively distrust them)

  • Don't make the avatar look at extreme angles (HeyGen may not handle them gracefully)

  • Don't over-script (long monologues feel off; cut them into multiple short clips)

Pair with B-roll via compositing

An avatar talking head intercut with B-roll footage via compositing (ch-117) feels more produced than a talking head alone.

Common mistakes

  • Long monologues: cut into 10-15 sec clips for ad pacing

  • Wrong language for voice / script: a mismatch tanks performance; match the voice and language to the script

  • Stock avatar for high-credibility claim: audience can spot stock; use custom or different ad format

  • Skipping preview test: generate short test first, then commit

Common issues

  • "avatar_id not found": verify the ID in the HeyGen UI; copy-paste errors are common

  • Lip-sync looks off: language mismatch OR HeyGen model limitation; try different voice

  • Script too long error: HeyGen has script length limits; split into multiple shorter generations

  • Custom avatar not ready: HeyGen processes custom avatar uploads asynchronously; wait until HeyGen marks ready before referencing in Wevion
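For the "script too long" case, one way to split a script into shorter generations is to break at sentence boundaries so no clip cuts off mid-sentence. This is a minimal sketch: the 1000-character default is an assumption taken from the low end of the typical limit, not HeyGen's actual cap, and split_script is a hypothetical helper.

```python
import re


def split_script(script: str, max_chars: int = 1000) -> list[str]:
    """Split a script into chunks of at most max_chars, breaking between sentences.

    A single sentence longer than max_chars is kept whole as its own chunk.
    """
    sentences = re.split(r"(?<=[.!?])\s+", script.strip())
    chunks = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would exceed the limit: close the chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    for chunk in split_script("One two. Three four! Five six? Seven.", max_chars=20):
        print(chunk)
```

Each chunk becomes its own generation, which also matches the advice to cut long monologues into short clips.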
