Media Formats Support

Supported Formats

Video

MIME Type	Extension	Format
`video/mp4`	`.mp4`	MP4
`video/quicktime`	`.mov`, `.qt`	QuickTime
`video/webm`	`.webm`	WebM
`video/x-msvideo`	`.avi`	AVI

Audio

Full Support

MIME Type	Extension	Format
`audio/wav`	`.wav`	WAV
`audio/mpeg`	`.mp3`	MP3
`audio/ogg`	`.ogg`	OGG
`audio/flac`	`.flac`	FLAC
`audio/alac`	`.alac`	ALAC
`audio/mp4`	`.mp4`	MP4 Audio

Limited Support

The following formats have partial support due to licensing or legal restrictions:

MIME Type	Extension	Format
`audio/x-ms-wma`	`.wma`	WMA
`audio/x-m4a`	`.m4a`	M4A
`audio/x-m3a`	`.m3a`	M3A
`audio/aac`	`.aac`	AAC

For best compatibility, use MP4 for video and WAV or MP3 for audio.

Output Quality

All output is re-encoded to H.264 using libx264 (-crf 17 -preset slow), regardless of the input codec. Frames are processed in RGB color space internally during generation, so the original bitrate, frame rate, and color grading may differ in the output.

Don’t rely on bitrate for quality preservation: Outputs are re-encoded and bitrate may change.

HDR is not fully supported: HDR videos are normalized to SDR, which may affect color grading in the output.

Alpha channels are removed: The H.264/RGB pipeline does not support transparency. Alpha channels are replaced with a solid background.

Preserving Color

For color-sensitive H.264 workflows - especially when compositing generated output back onto source footage - use explicit SDR color metadata and 4:4:4 chroma sampling.

Tag SDR BT.709 metadata explicitly: Set color_space, color_transfer, color_range, and color_primaries. Untagged or partially tagged files can be interpreted differently across decoders. The pipeline uses ffmpeg 7.1 for color metadata detection.
Prefer yuv444p when color accuracy matters: The pipeline operates in RGB. 4:2:0 and 4:2:2 inputs require chroma upsampling during YUV→RGB conversion, which can cause color shifts or compositing seams. 4:4:4 preserves full chroma resolution through that conversion.
Export 4:4:4 from your source tool: Converting an existing 4:2:0 file to 4:4:4 cannot restore discarded chroma detail.
For lossless or pixel-level workflows, contact support.

Example FFmpeg command for a tagged SDR BT.709 H.264 export:

$ ffmpeg -i input.mov \
>   -c:v libx264 -pix_fmt yuv444p \
>   -color_range tv -colorspace bt709 -color_primaries bt709 -color_trc bt709 \
>   -crf 17 -preset slow \
>   -c:a aac -b:a 192k \
>   output.mp4

Recommended Input Properties

Video

Property	Recommended Value
Codec	H.264 High Profile for general use; H.264 4:4:4 for color-sensitive work
Resolution	1920×1080
Average Bitrate	≥ 10 Mbps
Frame Rate	24, 25, or 30 fps (constant)
Color Space	8-bit SDR BT.709
Color Metadata	Explicitly tag range, primaries, transfer, and matrix
Chroma Sampling	`yuv420p` or `yuv422p` for general use; `yuv444p` when color preservation is critical

4K maximum: Videos above 4096×2160 are rejected. Downscale to 4K or below before uploading.

Audio

Sample rate: 44.1 kHz or 48 kHz. Higher rates are downsampled to 48 kHz, which may reduce quality.
Bit depth: Up to 32-bit float.
Channels: Up to 7.1. Spatial audio is not supported.
Multiple streams: Only the first audio stream is processed; all others are discarded.

Codec Quality Comparison

All input codecs are transcoded to a standard format, so processing speed is consistent. Quality loss varies by codec, measured using VMAF:

Input Codec	Output Quality
H.264	Best (least quality loss)
MPEG-2	Good (up to 15% quality loss)
H.265	Good (up to 15% quality loss)
VP9	Fair (up to 20% quality loss)
AV1	Fair (over 20% quality loss)

Frequently Asked Questions

What is the maximum file size for uploads?

Direct file uploads are limited to 20 MB. If your file exceeds this limit:

Host the file at a publicly accessible URL (S3 bucket, CDN, or any web server).
Pass the URL in the url field of your video or audio input instead of uploading directly.

There is no file size limit for URL-based inputs - the file is downloaded from your URL during processing. For production pipelines, URL-based inputs are recommended regardless of file size, as they avoid upload timeouts and are more reliable.

Hosted files must be publicly accessible without authentication headers.

What is the maximum video duration?

Maximum duration depends on your plan:

Plan	Maximum Duration
Free	20 seconds
Hobbyist	1 minute
Scale+	30 minutes

Check the pricing page for your plan’s specific limit.

react-1 has a hard limit of 15 seconds regardless of plan - it is designed for short-form expressive content.
For videos exceeding your plan’s limit, use the Segments API to split and process them in shorter chunks.

Does Sync Labs support vertical or portrait videos?

Yes - any aspect ratio is supported, including vertical (9:16), horizontal (16:9), square (1:1), and custom dimensions.

The pipeline extracts the face region at 512×512 for processing, then composites it back into the original frame. The output always matches the input dimensions and orientation.

For best face detection in vertical videos, ensure the speaker’s face is clearly visible and well-lit.

Media Content Tips — best practices for preparing your video and audio content for optimal lip sync results
Lipsync Model — learn about supported models and their input requirements
Quickstart — get started with your first Sync Labs generation

Supported Formats

Video

Audio

Full Support

Limited Support

Output Quality

Preserving Color

Recommended Input Properties

Video

Audio

Codec Quality Comparison

Frequently Asked Questions

What is the maximum file size for uploads?

What is the maximum video duration?

Does Sync Labs support vertical or portrait videos?

Related Resources