A prompt guide to get the most out of HeyGen's new Video Agent

A guide to getting the most out of HeyGen's Video Agent through prompting.

Written by David Z
Updated this week

The HeyGen Video Agent is built for Knowledge-Based Creators to create the best explainer videos.

This prompt guide shows you how to prompt Video Agent for the best results. It covers the basic information every prompt should include, the more advanced controls you can add through prompting, and a few example sessions you might find useful.

The basics: what Video Agent actually needs

Before you type a single word, understand the three controls at your fingertips:

Avatar Selection: Pick a specific avatar or let Auto mode find one that fits your content. Pro tip: You can also go avatar-free with voice-over only, but you must explicitly say "no avatar" in your prompt.

Duration: Let Auto decide based on your content, or choose 30s, 1min, 2min, etc. Note that the agent still follows your prompt or script when it comes to length, so this setting is a preference, not a hard limit.

Aspect Ratio: Portrait or landscape, or leave it on Auto.

These are your baseline. The magic happens in the prompt itself.

The prompt: more context = better videos

For Video Agent to build a high-quality video, at a minimum use the prompt box to describe the content you want to deliver. Here's what a basic prompt looks like:

The more context and intent you provide, the better the Video Agent can structure scenes, pacing, and visuals.

The pro move: use your script directly

This is the single biggest upgrade most people miss. You can paste a full video script into the prompt. The Video Agent will largely follow it scene‑by‑scene, while improving flow, timing, and visuals.

Here's a script-driven prompt in action:

Intro (A-roll, motion graphics overlay)

VO: "If your work is mostly explaining things (updates, ideas, decisions), video usually helps, but making it takes too much time."

What is HeyGen (motion-graphics B-roll)

VO: "HeyGen helps introverts turn ideas into production-ready videos, without cameras, editing, or studios."

Talking Avatars (A-roll + demo cut)

VO: "Our talking avatar models offer realistic and natural delivery using your own digital identity."

Use-cases (Motion Graphics list)

VO: "Teams use it for internal training, online education, product explanations, and knowledge sharing."

Introducing Video Agent (end beat)

VO: "And with our new Video Agent, one prompt becomes a structured, animated video, end to end."

End card: HeyGen · Empower Knowledge-Based Creators

Note: The agent may make small edits (grammar, pacing) to improve clarity and video flow.

Attachments: give Video Agent reference material

You can upload files to help Video Agent understand your content:

Images & Videos: Product screenshots, existing assets, diagrams, or any media you want included.

Pro tip: Upload your own photo and ask the agent to use it as your talking avatar.

PDFs & Documents: Training materials, research papers, or product docs. The agent will extract key information.

Pro tip: When uploading references, add context about how you want them used. For example:

  • "Use the attached product screenshots as B-roll when discussing features"

  • "Reference the attached PDF for accurate technical specifications"

Advanced prompting

Pro tip: Here's my personal favorite prompt addition. I add this to almost everything:


Try adding these to all your prompts and see if you like the results!

But why does this work? Let's break it down!

Define your visual style & colors

Our Video Agent is capable of executing your style requirements consistently. Use style descriptors to guide the visual direction of your entire video.

Example style descriptors:

[Table: visual styles, their best uses, and prompt additions]

Defining colors:

You can specify exact color codes and font families for consistent branding:


Why this matters: Defining a visual style lets the agent produce a consistently styled video end to end. Without it, the look can drift from scene to scene.
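If you reuse the same branding across many prompts, it can help to generate the style sentence from one place. The Python sketch below is only an illustration: the function name `brand_style_clause`, the wording of the sentence it produces, and the sample colors and fonts are all assumptions made for this example, not HeyGen syntax. It checks that each color is a 6-digit hex code and returns plain text you can paste into the prompt box.

```python
import re

# Matches a 6-digit hex color such as #0B1F3A (illustrative check only).
HEX_COLOR = re.compile(r"#[0-9A-Fa-f]{6}")


def brand_style_clause(colors: dict[str, str], heading_font: str, body_font: str) -> str:
    """Build one sentence describing brand colors and fonts for a prompt.

    Hypothetical helper: it only formats text and does not call any HeyGen API.
    """
    for role, code in colors.items():
        if not HEX_COLOR.fullmatch(code):
            raise ValueError(f"{role!r} is not a 6-digit hex color: {code!r}")
    color_part = ", ".join(f"{code.upper()} for {role}" for role, code in colors.items())
    return (
        f"Use these brand colors throughout: {color_part}. "
        f"Use {heading_font} for headings and {body_font} for body text."
    )


# Sample values, chosen only for the example.
clause = brand_style_clause(
    {"background": "#0b1f3a", "accent": "#ff6b35"},
    heading_font="Inter",
    body_font="Roboto",
)
print(clause)
```

Appending the same clause to every prompt is one way to keep colors and fonts from varying between videos.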

Motion Graphics, AI Images & Videos, and Stock Media

Motion Graphics

Animated graphic elements: text animations, icons, charts, shapes, transitions.

Best for:

  • A-roll overlays: Lower thirds, bullet points alongside avatar, animated callouts

  • B-roll scenes: Full-screen animated explanations, data visualizations

  • Chapter cards: Section breaks, intros, outros

  • Information display: Statistics, comparisons, timelines

Example: "Use motion graphics to display the 5 key benefits as animated bullet points appearing one by one while the avatar speaks."

AI-Generated Images & Videos

Created by generative AI based on your descriptions.

Best for:

  • Conceptual illustrations

  • Custom scenarios that stock footage won't cover

  • Stylized visuals in a particular artistic style

  • Product mockups in various contexts

Example: "Generate an AI image showing a futuristic office where humans and AI work together. Use this as B-roll for the 'future of work' section."

Stock Media

Real-world footage from stock libraries.

Best for:

  • Authentic scenes (real offices, cities, people)

  • Industry-specific content (medical, manufacturing, retail)

  • Emotional moments

  • Establishing shots

Example: "Use stock footage of a busy corporate office for B-roll when discussing workplace productivity."

Quick Reference: The Media Type Matrix

[Table: how well Motion Graphics, AI-Generated media, and Stock Media suit various content types]

Scene-by-Scene Prompting: Maximum Control

When you need precise output, prompt each scene individually.
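If you write many scene-by-scene prompts, you may prefer to keep scenes as structured data and render the prompt text from them. The Python sketch below assumes a hypothetical `Scene` record and a `Scene N: title (layout)` text layout loosely modeled on the script example earlier in this guide. The field names and layout are illustrative assumptions, not a format Video Agent requires.

```python
from dataclasses import dataclass


@dataclass
class Scene:
    """One scene of a prompt (hypothetical structure for illustration)."""
    title: str
    layout: str      # e.g. "A-roll" or "B-roll", the terms used in this guide
    visuals: str     # which media type to use and what it should show
    voiceover: str   # the line the avatar or narrator speaks


def build_scene_prompt(scenes: list[Scene]) -> str:
    """Render scenes as numbered text blocks separated by blank lines."""
    blocks = []
    for number, scene in enumerate(scenes, start=1):
        blocks.append(
            f"Scene {number}: {scene.title} ({scene.layout})\n"
            f"Visuals: {scene.visuals}\n"
            f'VO: "{scene.voiceover}"'
        )
    return "\n\n".join(blocks)


prompt = build_scene_prompt([
    Scene(
        title="Talking Avatars",
        layout="A-roll",
        visuals="Avatar on screen with a motion graphics lower third",
        voiceover="Our talking avatar models offer realistic and natural "
                  "delivery using your own digital identity.",
    ),
    Scene(
        title="Use cases",
        layout="B-roll",
        visuals="Motion graphics list, items appearing one by one",
        voiceover="Teams use it for internal training, online education, "
                  "product explanations, and knowledge sharing.",
    ),
])
print(prompt)
```

The result is plain text for the prompt box. Keeping each scene's layout, visuals, and voice-over in separate fields makes it harder to forget one of them when precision matters.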

Basic structure:

Here's a detailed product launch video example:

Real Prompts You Can Steal

Compliance Training:

Educational Explainer (Voice-Over Only):

Brand Story (Animated):

The Bottom Line

Video Agent isn't magic; it's a production partner that executes your creative direction.

The more specific you are about content, style, media types, and scene structure, the closer you'll get to exactly what you envision.

Start with a script. Define your visual style. Match media types to content types. Prompt scene-by-scene when precision matters.

You had the message. Now you own the production.

Try these prompts with your next video. Then tell us what worked.
