Turn a Photo Into a Talking Video
Upload a still portrait, add a voice clip or a script, and gVideo renders a fully animated talking video: lip-sync, natural motion, broadcast-quality audio. Launching with HeyGen V3 (TTS) and Omnihuman (bring-your-own-audio) inside the gVideo Avatar Studio.
Avatar Studio
Generate your avatar video in Studio
Avatar generation uses HeyGen V3 (prompt → talking video) or Omnihuman (photo + audio → full-body avatar). Both need inputs larger than an inline box can handle, so open Studio's avatar mode to upload and generate from the same credit pool you already use for Kling, Veo, Sora and the rest.
Video Examples
See it in action
Why gVideo
Built for results
Any decent portrait works
Phone snapshot, historical scan, corporate headshot, concept-art character: if it's front-facing, sharp, and at least 512px on the short edge, it can be animated.
Two audio paths
Use HeyGen V3's built-in TTS (30+ languages, natural narration voices) or bring your own audio file into Omnihuman for maximum voice control.
Tight lip-sync, believable motion
Not old-school 'animate the mouth-rectangle' avatar work. Both models learn from real speech data — lip shapes, head nods, micro-expression timing all match the audio.
Iterate without re-shooting
Change the script, re-render. Swap the photo, re-render. Pay per render, not per seat — one Pro subscription covers 20+ HeyGen V3 renders per month.
Model Recommendation
Best model for this use case
For most 'photo to talking video' work (explainers, testimonials, greetings, memories), HeyGen V3 with built-in TTS is the fastest path. Bringing your own audio? Use Omnihuman instead for richer full-body motion.
“I uploaded a 30-year-old family photo and rendered grandpa saying 'happy birthday' in his own recorded voice. My mom cried. Shipped the same day.”
Common questions
What kind of photo works best?
Front-facing or 3/4-facing portrait, clearly lit face, one person only, 512px+ on the short edge. Group photos, side profiles, and heavily stylized art (cartoons, paintings) produce weaker results on HeyGen V3; Omnihuman handles stylized inputs slightly better but still prefers photoreal portraits.
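If you'd like to sanity-check a photo before uploading, here is a minimal pre-flight sketch in Python using Pillow. The function name and filename are placeholders, not part of gVideo; the 512px floor is simply the short-edge requirement stated above.

```python
# Minimal pre-upload sanity check for the portrait rules above.
# Requires Pillow (pip install Pillow); the 512px floor matches
# the short-edge requirement stated in this answer.
from PIL import Image

MIN_SHORT_EDGE = 512  # px, per the guidance above

def check_portrait(path: str) -> list[str]:
    """Return a list of problems; an empty list means the photo passes."""
    problems = []
    with Image.open(path) as img:
        if min(img.size) < MIN_SHORT_EDGE:
            problems.append(
                f"short edge is {min(img.size)}px; needs {MIN_SHORT_EDGE}px+"
            )
        if img.mode not in ("RGB", "RGBA", "L"):
            problems.append(f"unusual color mode {img.mode!r}; convert to RGB")
    return problems

for issue in check_portrait("headshot.jpg"):  # placeholder filename
    print("fix before upload:", issue)
```

This only catches resolution and color-mode issues; framing (one face, front-facing) is still a judgment call you make by eye.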
Can I animate old / historical photos?
Yes. Both models are trained on diverse face data and can animate vintage photos, scanned family portraits, or upscaled old images. Best results come from photos that have been gently upscaled (4K is overkill; around 1080px on the short edge is plenty) and color-corrected first.
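A gentle upscale is easy to script. Here's a sketch using Pillow's LANCZOS resampling and the ~1080px short-edge target from the answer above; a dedicated AI upscaler will do better on badly degraded scans, and the filenames are placeholders.

```python
# Gently upscale a scanned photo so its short edge reaches ~1080px,
# per the guidance above. Uses Pillow's LANCZOS resampling; a
# dedicated AI upscaler will do better on heavily degraded scans.
from PIL import Image

TARGET_SHORT_EDGE = 1080  # px; more than this is overkill per this FAQ

def gentle_upscale(src: str, dst: str) -> None:
    with Image.open(src) as img:
        short_edge = min(img.size)
        if short_edge >= TARGET_SHORT_EDGE:
            img.save(dst)  # already large enough; don't upscale further
            return
        scale = TARGET_SHORT_EDGE / short_edge
        new_size = (round(img.width * scale), round(img.height * scale))
        img.resize(new_size, Image.LANCZOS).save(dst)

gentle_upscale("scan_1992.jpg", "scan_1992_upscaled.jpg")  # placeholders
```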
Do I have to write a script, or can I just speak?
Either. HeyGen V3 takes a written script and renders with built-in TTS. Omnihuman takes an audio file (your voice, a pro VO, or any licensed track) and drives motion from it. Recording a 20-second voice memo is often the fastest path when you want a specific voice.
How realistic does the output look?
On a 1080p portrait with clean lighting and a natural-voice audio track, the output holds up under casual viewing. At close inspection you'll notice subtle stiffness around the cheeks and a slightly off-rhythm blink cadence: state-of-the-art as of April 2026, but not indistinguishable from real footage.
Can I use this output commercially?
Yes on paid plans. Commercial rights are included with every paid tier on gVideo. But if the photo is of a real person, you still need their permission to publish a talking video of them: rights-of-publicity laws apply to AI-animated portraits just as they do to actual video.
How long can the talking video be?
Both HeyGen V3 and Omnihuman render 30 seconds per invocation on gVideo. For longer content, render multiple 30-second clips and stitch them in an editor — the models are frame-consistent so cuts feel natural.
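If you'd rather script the stitching than open an editor, ffmpeg's concat demuxer joins clips without re-encoding. A minimal sketch, assuming ffmpeg is installed, the clips came from the same render settings (so their codecs match), and the filenames are placeholders:

```python
# Stitch several 30-second renders into one video with ffmpeg's
# concat demuxer. Stream copy ("-c copy") joins without re-encoding,
# which is lossless and fast but requires matching codecs across clips.
import os
import subprocess
import tempfile

def stitch(clips: list[str], output: str) -> None:
    # The concat demuxer reads a text file listing the inputs in order.
    # Absolute paths keep the list valid regardless of where it lives.
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
        for clip in clips:
            f.write(f"file '{os.path.abspath(clip)}'\n")
        list_path = f.name
    try:
        subprocess.run(
            ["ffmpeg", "-f", "concat", "-safe", "0",
             "-i", list_path, "-c", "copy", output],
            check=True,
        )
    finally:
        os.unlink(list_path)

stitch(["part1.mp4", "part2.mp4", "part3.mp4"], "full_message.mp4")
```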
Ready to generate?
Start free — 100 credits on signup, no credit card required.