Guidelines for Video Shooting High-Quality Studio Avatars
Written by Yuumi
Updated over a week ago

The Studio Avatar and generative outfit features can be customized only by enterprise users.

To create a studio avatar, you'll need to record two videos: the consent statement and the official footage. The entire recording process takes approximately 5 to 10 minutes, provided that you follow the guidelines. The quality of the avatar depends on the quality of the source material, so we are providing the following guidelines to help ensure a successful video shoot.

These instructions will help you capture the best possible footage for your studio avatar. Please review and follow these guidelines carefully.

Consent Video

Before proceeding with the official production, we need the person appearing as the avatar to record a video giving consent that authorizes HeyGen to use their video to create the avatar. If there is no video of that person on camera providing their authorization, we will not proceed with the studio avatar production process.

When recording the consent declaration, please pay attention to the following:

  1. The consent declaration must be recorded by the person for whom the avatar is being created.

  2. The consent declaration should not have any editing or audio-visual synchronization issues.

  • An example is given as follows:

    • I, John Doe, allow HeyGen to use the footage of me to build a HeyGen Avatar for use on the HeyGen platform.

Official Video Recording

Equipment and Techniques

To create a high-quality Avatar, high-quality footage is necessary. We created this guide to share the equipment typically used to shoot great footage.

Recommended Equipment:

Camera: Professional Full-frame camera

(Reference: SONY FX6, Red KOMODO, etc.)

Microphone: Wireless Microphone

  • Shooting Equipment:

    • Professional studio environment:

      • You need to stand 6.5 feet (2 meters) in front of the green screen.

      • The green screen needs to be flat and free of wrinkles.

    • Professional Camera:

      • Always make sure to use a tripod to hold the camera in place.

      • M4/3 or APS-C format is acceptable; full-frame is better.

      • Your camera needs to support 4K (3840×2160) resolution.

      • Your footage should be at least 25 fps; 50 fps or 60 fps would be even better.

      • We recommend that you use a prime lens with a 50mm focal length on a full-frame camera. If you use an APS-C format camera, the focal length should be 35mm. If you use an M4/3 format camera, the focal length should be 25mm.

      • We recommend that you set your aperture to about F/8 when shooting. Ensure that the subject is fully in focus, with no blurred edges.

      • The footage should have a bit depth of at least 8-bit; 10-bit is better.

      • The chroma sampling of the video footage should be YCbCr 4:2:2.

      • The color gamut of the video footage should be Rec.709 or sRGB.

      • Keep the ISO between 400 and 800 where possible.

      • You can shoot video footage in Raw format or log format. If you do, please send the footage together with the camera model information and the LUT file.

Please do not shoot HDR video or footage with a color gamut of BT2020, AdobeRGB, DCI-P3, or Display P3.
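
If you want to double-check a clip against the camera settings above before uploading, the checks can be sketched in Python. This is only an illustration, not a HeyGen tool: `Footage`, `check_footage`, and `matching_focal_length` are hypothetical names, the metadata is assumed to be entered by hand from your camera or editing software, the crop factors are typical approximations, and treating 3840×2160 as a minimum is an assumption of this sketch. Raw and log footage is not covered.

```python
from dataclasses import dataclass
from typing import List

# Approximate crop factors relative to full-frame (assumed typical values;
# some APS-C cameras are closer to 1.6).
CROP_FACTOR = {"full-frame": 1.0, "aps-c": 1.5, "m4/3": 2.0}

# Color gamuts the guidelines accept for non-Raw, non-log footage.
ALLOWED_GAMUTS = {"rec.709", "srgb"}


def matching_focal_length(sensor: str, full_frame_mm: float = 50.0) -> float:
    """Focal length with roughly the same field of view as full_frame_mm on full-frame."""
    return full_frame_mm / CROP_FACTOR[sensor]


@dataclass
class Footage:
    """Hypothetical container for clip metadata, filled in by hand."""
    width: int
    height: int
    fps: float
    bit_depth: int
    color_gamut: str
    hdr: bool = False


def check_footage(clip: Footage) -> List[str]:
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    if clip.width < 3840 or clip.height < 2160:
        problems.append("resolution is below 4K (3840x2160)")
    if clip.fps < 25:
        problems.append("frame rate is below 25 fps")
    if clip.bit_depth < 8:
        problems.append("bit depth is below 8-bit")
    if clip.color_gamut.lower() not in ALLOWED_GAMUTS:
        problems.append("color gamut is not Rec.709 or sRGB")
    if clip.hdr:
        problems.append("footage is HDR")
    return problems
```

For example, `matching_focal_length("aps-c")` gives about 33 mm, for which a 35mm prime is the closest common lens, and `matching_focal_length("m4/3")` gives 25 mm.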

  • Sound:

    • Ensure clear audio with a quiet shooting environment.

    • We recommend that you use a wireless microphone mounted on your clothing or a boom microphone to collect sound.

  • Lighting: Ensure even lighting with the three-point lighting technique. Avoid excessive brightness or darkness.

    • Three-point lighting technique:

    • To achieve a good shooting effect, you need to ensure that the shooting site is well-lit. You'll need at least three lights. Place two lights on either side of the green screen background, and use at least one light for your subject.

    • We recommend that the subject stand 6.5 ft (2 m) in front of the green screen. This prevents green screen reflections or shadows on the background.

Video Shooting Requirements

To obtain usable, high-quality source material, follow these four points during the shooting process:

  • Position the actor in the center of the frame.

  • Keep the camera stable throughout the shooting process, avoiding any camera shake.

  • Shoot a continuous 2-minute video without splicing or editing. If possible, we recommend shooting at least two 2-minute videos as your avatar footage.

  • We recommend using footage with a file size of around 5GB, in the MOV format.
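
As a rough sanity check on the recommended file size, a 5GB file for a single 2-minute clip works out to an average bitrate of roughly 333 Mbit/s, assuming 1 GB = 1,000,000,000 bytes. A small Python sketch of that arithmetic (the function name is hypothetical):

```python
def average_bitrate_mbps(file_size_gb: float, duration_s: float) -> float:
    """Average bitrate in Mbit/s, assuming decimal gigabytes (1 GB = 10**9 bytes)."""
    bits = file_size_gb * 1_000_000_000 * 8
    return bits / duration_s / 1_000_000


# A 5 GB file holding one continuous 2-minute (120 s) clip.
print(round(average_bitrate_mbps(5, 120)))
```

If your camera's recording mode produces a much lower bitrate than this, a 2-minute clip will be correspondingly smaller than 5GB.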

Actor's Appearance and Clothing

The actor's appearance plays a significant role in the overall performance of the avatar and the effectiveness of post-production editing. Before recording, we recommend preparing according to the following four points:

  • Wear clothing that contrasts noticeably with the green screen, avoiding stripes and colors close to green to facilitate background removal in post-production.

  • Avoid wearing earrings and necklaces, as they can affect the training effectiveness of the avatar.

  • Keep hair neat to improve background removal quality. Please try to avoid stray hairs or large wavy curls.

  • Maintain a clear and visible face without any obstructions such as hair, gestures, props, or accessories. It's best to avoid having a thick and heavy beard.

Actor Shooting Guidelines

To capture high-quality source material, it is important for the actor to pay attention to six aspects during the recording process: speaking habits, lip movements, body gestures, eye contact, facial expressions, and overall mannerisms. When the actor is mindful of these aspects, the algorithm can capture sufficient data for generating a high-quality avatar.

A little tip:

When recording, you can imagine yourself conducting a new product launch in front of the audience.

Speaking patterns:

  • Choose a topic for a 2-minute impromptu speech, avoiding repetition of phrases and numbers.

  • Maintain a normal speaking pace, neither too fast nor too slow.

  • Close the mouth naturally for 2-3 seconds after each speech segment.

  • It's okay if you make mistakes or mispronounce words. The purpose of speaking is to capture the data related to lip movements.

Lip Movements:

  • Use full, varied lip movements during filming, and speak clearly, accurately, and loudly.

  • Avoid pouting, pursing lips, or sticking out the tongue.

Body Movements:

  • Keep the head stable. Slight hand and head movements are fine, but avoid directional hand gestures such as the V sign ("scissor hands"), finger numbers, and OK signs.

  • Avoid walking or making exaggerated head movements.

  • Avoid excessive forward or backward movements of the head.

  • Don't place your hands above the chest area.

Eye Movements:

  • Maintain direct eye contact with the camera, and avoid looking away or rolling your eyes.

  • Blink naturally without excessive blinking.

Facial Expressions:

  • Maintain a slightly pleasant smile during the shooting process. If you record the source material with a smiling expression, you will receive an avatar that appears friendly and approachable. Conversely, if you record with a serious expression, you will receive an avatar that appears more serious in nature.

  • Avoid yawning, touching the face, or bursts of laughter.

These guidelines are crucial for achieving the desired results. We appreciate your cooperation in adhering to these instructions.

When you have finished shooting, please upload your footage to Google Drive and then share it with your dedicated HeyGen Support Manager.

Please do not hesitate to reach out if you have any questions or require further assistance.
