Talking Head

Talking head generation is the fastest way to turn a character into a speaking presenter. It combines a source image, script text, and the character's configured voice pipeline to produce a video that you can either preview right away or queue as a full job.

Inputs

Talking head requests accept the following fields:

content_item_id
model_id
source_image_url
text

You can use the preview endpoint for quick validation or the generation endpoint when you are ready to create a full queued job.
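As a rough illustration, the four fields above could be assembled into a request body like this. This is a minimal sketch, not the official client: the helper name `build_talking_head_request` is hypothetical, the IDs are assumed to be strings, and treating `text` and `source_image_url` as required is an assumption, since this page does not say which fields are mandatory.

```python
from typing import Optional


def build_talking_head_request(
    text: str,
    source_image_url: str,
    model_id: Optional[str] = None,
    content_item_id: Optional[str] = None,
) -> dict:
    """Assemble a request body from the four documented fields."""
    # Assumption: text and source_image_url are required. The docs list
    # the fields but do not say which ones are mandatory.
    if not text.strip():
        raise ValueError("text must not be empty")
    if not source_image_url.startswith(("http://", "https://")):
        raise ValueError("source_image_url must be an http(s) URL")

    payload = {"text": text.strip(), "source_image_url": source_image_url}
    # Optional fields are included only when set, so the body stays minimal.
    if model_id is not None:
        payload["model_id"] = model_id
    if content_item_id is not None:
        payload["content_item_id"] = content_item_id
    return payload
```

The same body could then be sent to either the preview or the generation endpoint, which keeps a preview and its later full run directly comparable.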

Preview vs. full generation

Preview

Use preview when you want to test:

whether the script sounds right
whether the source image works
whether the overall motion feels natural

Preview returns a generation object immediately without sending it through the full queue flow.

Full generation

Use full generation when you want the result to enter the normal content pipeline. The system creates a talking-head generation and a linked job, then enqueues the job.
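The difference between the two modes can be pictured with a small in-memory model. This is only a sketch of the behavior described above, not Influgen's implementation: the `Generation` and `Pipeline` classes and their field names are hypothetical.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import List, Optional

_ids = count(1)  # simple ID source for this sketch


@dataclass
class Generation:
    id: int
    text: str
    is_preview: bool
    job_id: Optional[int] = None  # set only for full generations


@dataclass
class Pipeline:
    queue: List[int] = field(default_factory=list)  # job IDs waiting to run

    def preview(self, text: str) -> Generation:
        # Preview: return a generation right away and never touch the queue.
        return Generation(id=next(_ids), text=text, is_preview=True)

    def generate(self, text: str) -> Generation:
        # Full generation: create the generation and a linked job,
        # then enqueue the job.
        generation = Generation(id=next(_ids), text=text, is_preview=False)
        generation.job_id = next(_ids)
        self.queue.append(generation.job_id)
        return generation
```

In this model a preview never has a `job_id`, which is one way to tell the two kinds of generation apart later.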

[screenshot: Talking head generator showing text script, source image picker, and preview player]

Good scripts

Talking head videos work best with script text that sounds spoken, not written:

short sentences
natural rhythm
one clear message
direct audience framing

Example:

Three things I do before every morning Pilates session, because how you start changes everything.
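If you want a quick automated check for the "short sentences" guideline, a rough heuristic like the one below can flag lines worth rewording. It is a sketch only: the function name is hypothetical, the 20-word limit is an arbitrary assumption rather than a product rule, and the sentence splitting is deliberately naive.

```python
import re


def long_sentences(script: str, max_words: int = 20) -> list:
    """Return the sentences in `script` that have more than `max_words` words."""
    # Split after '.', '!' or '?' followed by whitespace. This is a rough
    # heuristic, not a real sentence tokenizer.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", script) if s.strip()]
    return [s for s in sentences if len(s.split()) > max_words]
```

The example script above is a single 15-word sentence, so it passes with the default limit.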

Linking to content items

Attach a content_item_id when the talking head is part of a larger content plan. This keeps the speech asset tied to the queue, approvals, and scheduling logic you already use elsewhere.

Reviewing history

Influgen stores talking-head generations and lets you:

list them per character
filter by content_item_id
include or exclude previews
fetch a specific generation by ID

This is especially useful when you are iterating on the same script with multiple source images or voices.
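The filters listed above map naturally onto query parameters. The sketch below builds such a query under stated assumptions: the URL path, the `include_previews` parameter name, and the helper `history_query` are all hypothetical, so check the API reference for the real names.

```python
from typing import Optional
from urllib.parse import quote, urlencode


def history_query(
    character_id: str,
    content_item_id: Optional[str] = None,
    include_previews: bool = False,
) -> str:
    """Build a list path for one character's talking-head generations."""
    # Hypothetical path and parameter names, used for illustration only.
    params = {"include_previews": "true" if include_previews else "false"}
    if content_item_id is not None:
        params["content_item_id"] = content_item_id
    character = quote(character_id, safe="")  # escape the ID for use in a path
    return f"/characters/{character}/talking-head/generations?{urlencode(params)}"
```

For example, `history_query("char_1", content_item_id="ci_9", include_previews=True)` narrows the list to one content item while keeping previews, which suits comparing several takes of the same script.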
