Media Content Tips

Last updated May 15, 2026

The quality of your input media has a direct impact on the final lipsync output. For optimal results, please follow these tips for preparing your video and audio content.

Tips for Video Content

Avoid profile views and obstructions

For best performance, avoid full profile (side-view) shots and obstructions covering the face.

Ensure Natural Talking Motion

The model performs best when the character in the video appears to be talking naturally. It will preserve the speaker’s style during lipsync.

Tip for AI-Generated Video: When creating videos with third-party AI video generation models, include this instruction in the text prompt: "the character should be speaking naturally". The generated video will then contain some random mouth movements, which our lipsync model needs to produce its best results.

Tips for Audio Content

Use clear audio

For best performance, avoid audio with music, background noise, or multiple simultaneous speakers.

Sync Labs Mode Options

When your video and audio have different durations, you can choose how to handle the mismatch using the sync_mode parameter. Here’s a brief overview of each option:

bounce

When video is shorter than audio, the video will reverse playback at the end to match audio duration. Otherwise, video is cropped to match audio.

loop

When video is shorter than audio, the video will loop from the beginning to match audio duration. Otherwise, video is cropped to match audio.
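The difference between bounce and loop can be illustrated by mapping each output frame back to a source frame. This is only a sketch of the idea, not the service's implementation: the function name source_frame is hypothetical, and whether the real bounce repeats the first and last frames at each turnaround is an assumption (here they are not repeated).

```python
def source_frame(out_index: int, n_frames: int, mode: str) -> int:
    """Map an output frame index to a source frame index (illustrative only)."""
    if n_frames <= 0:
        raise ValueError("n_frames must be positive")
    if mode == "loop":
        # Restart from the first frame every time the clip ends.
        return out_index % n_frames
    if mode == "bounce":
        if n_frames == 1:
            return 0
        # Play forward, then backward, without repeating the end frames
        # (an assumption; the actual behavior may differ).
        period = 2 * (n_frames - 1)
        pos = out_index % period
        return pos if pos < n_frames else period - pos
    raise ValueError(f"unsupported mode for this sketch: {mode}")


# A 4-frame clip stretched to 10 output frames.
print([source_frame(i, 4, "loop") for i in range(10)])
print([source_frame(i, 4, "bounce") for i in range(10)])
```

With loop the clip jumps from its last frame back to its first, while bounce walks back through the frames it just played, which tends to hide the seam.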

cut_off

When audio is longer than video, the audio will be cut off to match video duration. Otherwise, video is cropped to match audio.

silence

When video is longer than audio, silence will be added to the audio to match video duration. Otherwise, video is cropped to match audio.

remap

The video playback speed will be adjusted (sped up or slowed down) to exactly match the audio duration, preserving all content from both.
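To get a feel for how much remap changes the video, the required speed factor can be estimated from the two durations. The helper below is a hypothetical sketch that assumes a single uniform speed change across the whole clip; the text above does not say how the adjustment is actually distributed.

```python
def remap_speed_factor(video_s: float, audio_s: float) -> float:
    """Return the playback speed needed for the video to last exactly audio_s.

    A value above 1.0 means the video is sped up, below 1.0 slowed down.
    Assumes one uniform speed change (an assumption, not documented behavior).
    """
    if video_s <= 0 or audio_s <= 0:
        raise ValueError("durations must be positive")
    return video_s / audio_s


print(remap_speed_factor(10.0, 8.0))  # 1.25: a 10 s video is sped up to fit 8 s of audio
print(remap_speed_factor(6.0, 8.0))   # 0.75: a 6 s video is slowed down to fit 8 s of audio
```

A factor far from 1.0 suggests the speed change will be noticeable, in which case another mode may be a better fit.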

Default Sync Labs Mode: The default depends on your generation type and video/audio durations:

  • Non-segmented generations:
    • Video longer than audio: cut_off is the default
    • Audio longer than video: bounce is the default
  • Segmented generations (using segments_secs or segments_frames): remap is the default and is recommended, since it avoids abrupt cuts mid-video.
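The default rules above can be written as a small lookup. This is a sketch for reasoning about requests on the client side: default_sync_mode is a hypothetical helper, not part of any SDK, and returning None for equal durations is an assumption, since the rules only cover mismatched durations.

```python
from typing import Optional


def default_sync_mode(video_s: float, audio_s: float, segmented: bool = False) -> Optional[str]:
    """Return the default sync_mode described above for the given inputs."""
    if segmented:
        # Applies when segments_secs or segments_frames is used.
        return "remap"
    if video_s > audio_s:
        return "cut_off"
    if audio_s > video_s:
        return "bounce"
    # Equal durations: no mismatch to resolve (assumption, not stated in the docs).
    return None


print(default_sync_mode(12.0, 8.0))                  # cut_off
print(default_sync_mode(5.0, 8.0))                   # bounce
print(default_sync_mode(5.0, 8.0, segmented=True))   # remap
```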

Choosing the Right Sync Labs Mode: Use bounce or loop for short videos with longer audio, cut_off when you want to prioritize video length, silence when you want to preserve the full video, and remap when you need to preserve all content from both video and audio.
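One way to turn this guidance into code is a helper that picks a sync_mode value from what you want to preserve. The function and its parameters (suggest_sync_mode, keep_all_video, keep_all_audio, extend_with) are hypothetical and only encode the advice above; they are not part of the API.

```python
def suggest_sync_mode(
    video_s: float,
    audio_s: float,
    keep_all_video: bool = False,
    keep_all_audio: bool = False,
    extend_with: str = "bounce",
) -> str:
    """Suggest a sync_mode value based on the guidance above (illustrative only)."""
    if extend_with not in ("bounce", "loop"):
        raise ValueError("extend_with must be 'bounce' or 'loop'")
    if keep_all_video and keep_all_audio:
        # Preserve all content from both inputs.
        return "remap"
    if video_s < audio_s:
        # Short video with longer audio: extend the video to keep the audio,
        # or cut the audio to prioritize video length.
        return extend_with if keep_all_audio else "cut_off"
    # Video is at least as long as the audio: pad the audio with silence to keep
    # the full video, otherwise fall back to cut_off (the documented default when
    # the video is longer than the audio).
    return "silence" if keep_all_video else "cut_off"


print(suggest_sync_mode(5.0, 8.0, keep_all_audio=True))                      # bounce
print(suggest_sync_mode(5.0, 8.0, keep_all_audio=True, extend_with="loop"))  # loop
print(suggest_sync_mode(12.0, 8.0, keep_all_video=True))                     # silence
print(suggest_sync_mode(5.0, 8.0, keep_all_video=True, keep_all_audio=True)) # remap
```

The returned string is what you would pass as the sync_mode parameter; how that parameter is nested in a request is not covered here.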

Related Resources

  • Media Formats Support — supported video and audio formats, codecs, and recommended input properties
  • Lipsync Model — learn about available lip sync models and how input quality affects output