Generation Times & Performance
Processing time for lip sync generations depends on the model you choose, the duration of your input video, the video’s resolution, and current system load. Most generations complete within a few minutes. This page covers what to expect, what influences speed, how to monitor progress, and what to do when things take longer than expected.
Expected Generation Times
The table below shows typical generation times for each model under standard input conditions.
These are estimates. Actual times vary based on system load, input complexity, and queue depth. During peak hours, generations may take longer.
What Affects Generation Speed
Several factors influence how long a generation takes to complete:
- Video duration — Longer videos require more frames to process. A 2-minute video takes roughly 4x longer than a 30-second video on the same model.
- Resolution — Higher resolution inputs require more compute. While all models process faces at 512x512, the overall pipeline still handles the full frame for composition.
- Model choice — lipsync-1.9.0-beta is the fastest model, designed for high-volume use. lipsync-2 balances speed and quality. lipsync-2-pro uses diffusion-based super resolution and is 1.5-2x slower than lipsync-2.
- System load — During peak usage periods, jobs may queue longer before processing begins. The PENDING status reflects time spent in the queue.
- Audio complexity — Audio with overlapping speakers, background noise, or unusual patterns may require additional processing cycles for accurate lip-to-audio alignment.
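As a rough worked example of how duration and model choice combine, the sketch below starts from the one concrete figure quoted on this page (lipsync-1.9.0-beta processing a 30-second video in roughly 30-60 seconds) and applies the relative speed multipliers given for the other models. It assumes processing time scales linearly with duration, as the 4x example above implies. The ranges for lipsync-2 and lipsync-2-pro are derived here rather than published, and queue time, resolution, and audio complexity are ignored. The names `REALTIME_FACTORS` and `estimate_processing_seconds` are hypothetical.

```python
from typing import Tuple

# Processing seconds per second of input video, as (low, high) ranges.
# Derived from figures quoted on this page; rough assumptions, not guarantees.
BETA_RANGE = (1.0, 2.0)  # a 30 s video in roughly 30-60 s

REALTIME_FACTORS = {
    "lipsync-1.9.0-beta": BETA_RANGE,
    # lipsync-1.9.0-beta is quoted as 2-3x faster than lipsync-2
    "lipsync-2": (BETA_RANGE[0] * 2, BETA_RANGE[1] * 3),
    # lipsync-2-pro is quoted as 1.5-2x slower than lipsync-2
    "lipsync-2-pro": (BETA_RANGE[0] * 2 * 1.5, BETA_RANGE[1] * 3 * 2),
}


def estimate_processing_seconds(model: str, video_seconds: float) -> Tuple[float, float]:
    """Return a (low, high) processing-time estimate, excluding queue time."""
    low, high = REALTIME_FACTORS[model]
    return (low * video_seconds, high * video_seconds)
```

The derived ranges are wide, so they are better suited to setting client-side timeouts than to predicting completion times.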
How to Check Generation Status
You can monitor the progress of a generation using any of these methods:
Polling
Call the GET /v2/generate/{id} endpoint at regular intervals. We recommend polling every 10 seconds.
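A minimal polling loop might look like the sketch below. Several details are assumptions rather than facts from this page: the base URL, the `x-api-key` header name, the `status` field in the response, and the `COMPLETED` and `REJECTED` status names (only `FAILED` is named here). The helper names and the 30-minute overall deadline are hypothetical.

```python
import json
import time
import urllib.request

API_BASE = "https://api.sync.so"  # assumption: base URL is not stated on this page
# FAILED is named on this page; the other terminal names are assumptions.
TERMINAL_STATUSES = {"COMPLETED", "FAILED", "REJECTED"}


def fetch_generation(generation_id: str, api_key: str) -> dict:
    """GET /v2/generate/{id}. The x-api-key header name is an assumption."""
    request = urllib.request.Request(
        f"{API_BASE}/v2/generate/{generation_id}",
        headers={"x-api-key": api_key},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def poll_until_done(fetch, interval_seconds=10.0, max_wait_seconds=1800.0,
                    sleep=time.sleep):
    """Call fetch() every interval_seconds until a terminal status appears.

    Raises TimeoutError once another wait would exceed max_wait_seconds.
    """
    waited = 0.0
    while True:
        generation = fetch()
        if generation.get("status") in TERMINAL_STATUSES:
            return generation
        if waited + interval_seconds > max_wait_seconds:
            raise TimeoutError(f"no terminal status after {waited:.0f}s")
        sleep(interval_seconds)
        waited += interval_seconds
```

Typical use would be `poll_until_done(lambda: fetch_generation("your-generation-id", "your-api-key"))`. Passing `fetch` and `sleep` as arguments keeps the loop easy to test without network access.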
Webhooks
Configure a webhook URL in your generation request to receive an HTTP POST notification when the job completes or fails. This is more efficient than polling for production workflows.
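On the receiving side, a webhook endpoint can be as small as the sketch below, which uses only the Python standard library. The payload shape is an assumption: it is taken to be a JSON object with `id`, `status`, and (on failure) `error` fields, mirroring the polling response, and the real payload may differ. The names `summarize_webhook` and `WebhookHandler` are hypothetical.

```python
import json
from http.server import BaseHTTPRequestHandler


def summarize_webhook(body: bytes) -> str:
    """Turn a webhook body into a one-line summary.

    Assumes a JSON object with `id`, `status`, and optionally `error`.
    """
    payload = json.loads(body)
    generation_id = payload.get("id", "<unknown>")
    status = payload.get("status", "<unknown>")
    if status == "FAILED":
        return f"{generation_id} failed: {payload.get('error', 'no error given')}"
    return f"{generation_id} finished with status {status}"


class WebhookHandler(BaseHTTPRequestHandler):
    """Accepts the HTTP POST notification and logs a summary."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        print(summarize_webhook(self.rfile.read(length)))
        self.send_response(204)  # acknowledge quickly; do heavy work elsewhere
        self.end_headers()

# To try it locally (blocks until interrupted):
#   from http.server import HTTPServer
#   HTTPServer(("0.0.0.0", 8000), WebhookHandler).serve_forever()
```

A production endpoint would also need HTTPS and some way to authenticate the sender, neither of which is covered here.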
Studio
If you submitted the generation through Sync Studio, the current status is shown in the generations list. Refresh the page to see updated statuses.
Terminal Statuses
A generation ends in one of three terminal statuses:
Troubleshooting Slow Generations
If your generation is taking longer than expected, here are steps to diagnose and resolve the issue:
- Stuck in PENDING for more than 5 minutes — The processing queue may be busy during peak hours. Wait and monitor the status. Avoid resubmitting the same job, as this adds more jobs to the queue without canceling the original.
- Generation FAILED — Check the error response from the GET /v2/generate/{id} endpoint. The error and error_code fields describe what went wrong. See the Error Handling guide for a full list of error codes.
- Time-sensitive workloads — Use lipsync-1.9.0-beta for the fastest processing. It is 2-3x faster than lipsync-2 and costs less per second.
- Large batch processing — For high-volume jobs, use the Batch API to submit up to 500 generations in a single request with a 24-hour turnaround time.
- Check service status — Visit status.sync.so to see if there are any ongoing incidents or degraded performance affecting the platform.
- Visit the support KB — For additional troubleshooting, see "Why is my generation taking so long or stuck in processing?" in the support knowledge base.
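For the large-batch case above, a workload bigger than the 500-generation limit has to be split across several requests. The sketch below only does the grouping. The Batch API's request format is not described on this page, so the job dictionaries are placeholders and `chunk_for_batch` is a hypothetical helper.

```python
from typing import Iterator, List, Sequence

MAX_BATCH_SIZE = 500  # per-request limit quoted on this page


def chunk_for_batch(jobs: Sequence[dict], size: int = MAX_BATCH_SIZE) -> Iterator[List[dict]]:
    """Yield consecutive groups of at most `size` jobs, preserving order."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(jobs), size):
        yield list(jobs[start:start + size])
```

For example, 1,201 jobs would be grouped into requests of 500, 500, and 201.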
Frequently Asked Questions
Why is my generation stuck in PENDING?
The PENDING status means your generation is in the processing queue waiting for available compute resources. During peak usage periods, the queue may be deeper than usual, causing longer wait times. Generations typically move from PENDING to PROCESSING within 1-2 minutes under normal conditions. If your generation has been in PENDING for more than 5 minutes, the system may be experiencing higher-than-normal demand. You can check status.sync.so for any reported incidents. Avoid resubmitting the same generation request, as this creates additional queue entries without canceling the original job. If the generation remains stuck after 10 minutes, contact support at support@sync.so with your generation ID.
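The thresholds in this answer can be encoded as a small decision helper, for instance inside a monitoring script. The function name and return values below are hypothetical. Note that no branch resubmits the job, since resubmitting only adds queue entries.

```python
def pending_action(minutes_in_pending: float) -> str:
    """Suggest a next step for a generation that is still PENDING."""
    if minutes_in_pending < 0:
        raise ValueError("minutes_in_pending cannot be negative")
    if minutes_in_pending <= 5:
        return "wait"  # 1-2 minutes is typical; up to 5 can occur under load
    if minutes_in_pending <= 10:
        return "check_status_page"  # status.sync.so
    return "contact_support"  # support@sync.so, include the generation ID
```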
Can I speed up generation time?
The most effective way to reduce generation time is to choose a faster model. lipsync-1.9.0-beta is the fastest option, processing a 30-second video in roughly 30-60 seconds. If you need higher quality, lipsync-2 offers a good balance of speed and output quality. You can also reduce processing time by shortening your input video, lowering the resolution, or splitting long videos into shorter segments using the Segments API. For batch workloads where latency is less critical, the Batch API processes up to 500 jobs efficiently with a 24-hour turnaround. Submitting jobs during off-peak hours can also help reduce queue wait times.
What happens if a generation takes too long?
Generations have internal processing time limits that vary by video duration and model. If a generation exceeds its time limit, it automatically transitions to a FAILED status with a timeout error. Long videos are divided into 30-40 second chunks for processing, and a generation can time out if the video has too many scene changes within these chunks or if many scenes lack detectable faces. If your generation fails due to a timeout, try reducing the video length, ensuring faces are visible throughout the video, and minimizing rapid scene cuts. You can also retry the generation, as transient infrastructure issues occasionally cause timeouts that do not recur on subsequent attempts.
Does video resolution affect processing time?
Yes, higher resolution inputs require more processing time. The Sync pipeline re-encodes all input video using the H.264 codec, and higher resolution frames take longer to decode, process, and re-encode. While the face region is processed at 512x512 resolution regardless of input dimensions, the overall composition step that blends the generated face back into the original frame scales with resolution. For the fastest processing times, 1080p input is recommended. Videos up to 4K (4096x2160) are supported but will take longer. Videos above 4K are rejected. If speed is a priority and your use case allows it, downscale your input video to 1080p or 720p before submitting.
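A pre-submission check along these lines can decide whether to downscale and to what size. This is a sketch under stated assumptions: the 4K limit is applied to the long and short sides so that portrait video is handled symmetrically, "1080p" is interpreted as a 1080-pixel short side, and dimensions are rounded to even numbers because common H.264 pixel formats require them. The function name is hypothetical, and the actual resizing would be done with a separate video tool.

```python
from typing import Tuple

MAX_LONG_SIDE = 4096   # 4K limit quoted above
MAX_SHORT_SIDE = 2160


def downscale_target(width: int, height: int, target_short_side: int = 1080) -> Tuple[int, int]:
    """Return the (width, height) to submit, preserving aspect ratio.

    Raises ValueError for inputs above 4K, which would be rejected.
    """
    long_side, short_side = max(width, height), min(width, height)
    if long_side > MAX_LONG_SIDE or short_side > MAX_SHORT_SIDE:
        raise ValueError(f"{width}x{height} exceeds the 4K limit")
    if short_side <= target_short_side:
        return (width, height)  # already small enough; leave untouched
    scale = target_short_side / short_side

    def to_even(pixels: int) -> int:
        # Round the scaled dimension to the nearest even number.
        return max(2, int(round(pixels * scale / 2)) * 2)

    return (to_even(width), to_even(height))
```

For a 3840x2160 input this yields 1920x1080, while a 1280x720 input is returned unchanged.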

