The HeyGen Video Agent is built for Knowledge-Based Creators to create the best explainer videos.
This Prompt Guide shows you how to prompt Video Agent for the best results. It covers the basic information every prompt should include, the more advanced controls you can set through prompting, and a few example sessions you may find useful.
The basics: what Video Agent actually needs
Before you type a single word, understand the three controls at your fingertips:
Avatar Selection: Pick a specific avatar or let Auto mode find one that fits your content. Pro tip: You can also go avatar-free with voice-over only, but you must explicitly say "no avatar" in your prompt.
Duration: Let Auto decide based on your content, or choose 30s, 1min, 2min, etc. Note that the agent still follows your prompt or script when it comes to length, so this setting is not strictly enforced.
Aspect Ratio: Portrait or landscape, or leave it on Auto.
These are your baseline. The magic happens in the prompt itself.
The prompt: more context = better videos
For Video Agent to build high-quality videos, at a minimum use the prompt box to describe the content you want to deliver. Here's what a basic prompt looks like:
The more context and intent you provide, the better the Video Agent can structure scenes, pacing, and visuals.
The pro move: use your script directly
This is the single biggest upgrade most people miss. You can paste a full video script into the prompt. Video Agent will largely follow it scene-by-scene, while improving flow, timing, and visuals automatically.
Here's a script-driven prompt in action:
Intro (A-roll, motion graphics overlay)
VO: "If your work is mostly explaining things (updates, ideas, decisions), video usually helps, but making it takes too much time."
What is HeyGen (motion-graphics B-roll)
VO: "HeyGen helps introverts turn ideas into production-ready videos, without cameras, editing, or studios."
Talking Avatars (A-roll + demo cut)
VO: "Our talking avatar models offer realistic and natural delivery using your own digital identity."
Use-cases (Motion Graphics list)
VO: "Teams use it for internal training, online education, product explanations, and knowledge sharing."
Introducing Video Agent (end beat)
VO: "And with our new Video Agent, one prompt becomes a structured, animated video, end to end."
End card: HeyGen · Empower Knowledge-Based Creators
Note: The agent may make small edits (grammar, pacing) to improve clarity and video flow.
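If you keep your scripts in a structured form (a spreadsheet, a doc outline), you can generate the scene layout shown above instead of typing it by hand. The following is a minimal Python sketch under stated assumptions: the names Scene and format_script are hypothetical helpers made up for illustration, not part of any HeyGen API, and the code only builds text that you then paste into the prompt box.

```python
from dataclasses import dataclass


@dataclass
class Scene:
    """One scene of a script (hypothetical helper type, not a HeyGen API)."""
    title: str      # scene heading, e.g. "Intro"
    media: str      # media note, e.g. "A-roll, motion graphics overlay"
    voiceover: str  # the line the avatar or narrator speaks


def format_script(scenes: list[Scene], end_card: str = "") -> str:
    """Render scenes as a heading line followed by a VO line."""
    lines = []
    for scene in scenes:
        lines.append(f"{scene.title} ({scene.media})")
        lines.append(f'VO: "{scene.voiceover}"')
    if end_card:
        lines.append(f"End card: {end_card}")
    return "\n".join(lines)


script = format_script(
    [
        Scene("Intro", "A-roll, motion graphics overlay",
              "If your work is mostly explaining things, video usually helps."),
        Scene("Use-cases", "Motion Graphics list",
              "Teams use it for internal training and online education."),
    ],
    end_card="HeyGen · Empower Knowledge-Based Creators",
)
print(script)
```

Keeping the heading and the VO line on separate lines makes it easier to see where each scene starts, both for you and for anyone reviewing the prompt.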
Attachments: give Video Agent reference material
You can upload files to help Video Agent understand your content:
Images & Videos: Product screenshots, existing assets, diagrams, or any media you want included.
Pro tip: Upload your own photo and ask the agent to use it as your talking avatar.
PDFs & Documents: Training materials, research papers, or product docs. The agent will extract key information.
Pro tip: When uploading references, add context about how you want them used. For example:
"Use the attached product screenshots as B-roll when discussing features"
"Reference the attached PDF for accurate technical specifications"
Advanced prompting:
Pro tip: Here's my personal favorite prompt addition. I add this to almost everything:
Try adding these to all your prompts and see if you like the results!
But why does this work? Let's break it down!
Define your visual style & colors
Video Agent can execute your style requirements consistently. Use style descriptors to guide the visual direction of your entire video.
Example style descriptors:
Defining colors:
You can specify exact color codes and font families for consistent branding:
Why this matters: Defining a visual style lets the agent produce consistently styled video end-to-end. Without it, the visuals can drift from scene to scene.
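If you reuse the same brand style across many prompts, it can help to generate the style paragraph once and check the color codes before pasting. The sketch below is only an illustration: style_block is a hypothetical helper, the sentence wording is an assumption rather than a format Video Agent requires, and the colors and font are placeholder values.

```python
import re

# Six-digit hex color such as #1A73E8 (the leading "#" is required).
HEX_COLOR = re.compile(r"#[0-9A-Fa-f]{6}")


def style_block(style: str, colors: dict[str, str], font: str) -> str:
    """Build a reusable style paragraph; the wording is illustrative only."""
    for role, code in colors.items():
        if not HEX_COLOR.fullmatch(code):
            raise ValueError(f"{role} color {code!r} is not a 6-digit hex code")
    palette = ", ".join(f"{role} {code.upper()}" for role, code in colors.items())
    return (
        f"Visual style: {style}. "
        f"Colors: {palette}. "
        f"Font: {font}. "
        "Keep this style consistent across every scene."
    )


# Placeholder brand values, chosen only for the example.
block = style_block(
    "clean, minimal motion graphics",
    {"primary": "#1a73e8", "accent": "#FBBC04"},
    "Inter",
)
print(block)
```

Catching a mistyped color code before you submit is cheaper than regenerating a video whose palette came out wrong.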
Motion Graphics, AI Images & Videos, and Stock Media
Motion Graphics
Animated graphic elements: text animations, icons, charts, shapes, transitions.
Best for:
A-roll overlays: Lower thirds, bullet points alongside avatar, animated callouts
B-roll scenes: Full-screen animated explanations, data visualizations
Chapter cards: Section breaks, intros, outros
Information display: Statistics, comparisons, timelines
Example: "Use motion graphics to display the 5 key benefits as animated bullet points appearing one by one while the avatar speaks."
AI-Generated Images & Videos
Created by generative AI based on your descriptions.
Best for:
Conceptual illustrations
Custom scenarios that stock footage won't cover
Stylized visuals in a particular artistic style
Product mockups in various contexts
Example: "Generate an AI image showing a futuristic office where humans and AI work together. Use this as B-roll for the 'future of work' section."
Stock Media
Real-world footage from stock libraries.
Best for:
Authentic scenes (real offices, cities, people)
Industry-specific content (medical, manufacturing, retail)
Emotional moments
Establishing shots
Example: "Use stock footage of a busy corporate office for B-roll when discussing workplace productivity."
Quick Reference: The Media Type Matrix
Scene-by-Scene Prompting: Maximum Control
When you need precise output, prompt each scene individually.
Basic structure:
Here's a detailed product launch video example:
Real Prompts You Can Steal
Compliance Training:
Educational Explainer (Voice-Over Only):
Brand Story (Animated):
The Bottom Line
Video Agent isn't magic; it's a production partner that executes your creative direction.
The more specific you are about content, style, media types, and scene structure, the closer you'll get to exactly what you envision.
Start with a script. Define your visual style. Match media types to content types. Prompt scene-by-scene when precision matters.
You had the message. Now you own the production.
Try these prompts with your next video. Then tell us what worked.












