Generation Times & Performance

Processing time for lip sync generations depends on the model you choose, the duration of your input video, the video’s resolution, and current system load. Most generations complete within a few minutes. This page covers what to expect, what influences speed, how to monitor progress, and what to do when things take longer than expected.

Expected Generation Times

The table below shows typical generation times for each model. These are estimates based on standard input conditions and may vary during peak usage or with complex inputs.

Model	Typical Time (30s video)	Typical Time (2min video)	Notes
lipsync-2	~1-3 min	~4-8 min	Best general-purpose quality
lipsync-2-pro	~2-5 min	~6-12 min	Enhanced facial detail, diffusion-based super resolution
sync-3	~10-15 min	~45-55 min	Most powerful model, 4K native output
react-1	~1-2 min	N/A (max 15s input)	Adds emotion and facial expressions
lipsync-1.9.0-beta	~30-60s	~2-4 min	Legacy model

Actual times vary with system load, input complexity, and queue depth, and generations may take longer during peak hours.

What Affects Generation Speed

Several factors influence how long a generation takes to complete:

Video duration — Longer videos require more frames to process. Per the table above, a 2-minute video takes roughly 3-4x longer than a 30-second video on the same model.
Resolution — Higher resolution inputs require more compute. While all models process faces at 512x512, the overall pipeline still handles the full frame for composition.
Model choice — lipsync-2 balances speed and quality. lipsync-2-pro uses diffusion-based super resolution and is 1.5-2x slower than lipsync-2. sync-3 delivers significantly higher quality but takes the longest: roughly 10-15 minutes for a 30-second video.
System load — During peak usage periods, jobs may queue longer before processing begins. The PENDING status reflects time spent in the queue.
Audio complexity — Audio with overlapping speakers, background noise, or unusual patterns may require additional processing cycles for accurate lip-to-audio alignment.
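As a rough illustration of how duration and model choice combine, the sketch below interpolates linearly between the 30-second and 2-minute figures from the table above. The function name, the linear interpolation, and the clamping to the 30-120 second range are assumptions made for this example; they are not a documented formula, and real times also depend on load and input complexity.

```python
from typing import Tuple

# Typical ranges in minutes, copied from the table above:
# model -> ((low, high) for a 30s video, (low, high) for a 2min video)
TYPICAL_MINUTES = {
    "lipsync-2": ((1, 3), (4, 8)),
    "lipsync-2-pro": ((2, 5), (6, 12)),
    "sync-3": ((10, 15), (45, 55)),
}


def estimate_minutes(model: str, duration_s: float) -> Tuple[float, float]:
    """Return a rough (low, high) estimate in minutes for a video of duration_s.

    Assumption: time grows linearly between the two table entries.
    Durations outside 30-120 seconds are clamped to the nearest entry.
    """
    (low_30, high_30), (low_120, high_120) = TYPICAL_MINUTES[model]
    fraction = (duration_s - 30) / (120 - 30)
    fraction = max(0.0, min(1.0, fraction))  # clamp to the table's range
    low = low_30 + fraction * (low_120 - low_30)
    high = high_30 + fraction * (high_120 - high_30)
    return (low, high)
```

For example, `estimate_minutes("lipsync-2", 75)` lands halfway between the two table rows for that model.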

How to Check Generation Status

You can monitor the progress of a generation using any of these methods:

Polling

Call the GET /v2/generate/{id} endpoint at regular intervals. We recommend polling every 10 seconds.

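import time
from sync import Sync

sync = Sync()

# Placeholder: replace with the ID returned when you created the generation.
job_id = "YOUR_GENERATION_ID"

# Poll every 10 seconds until the generation reaches a terminal status.
generation = sync.generations.get(job_id)
while generation.status not in ["COMPLETED", "FAILED", "REJECTED"]:
    time.sleep(10)
    generation = sync.generations.get(job_id)

print(f"Final status: {generation.status}")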
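The loop above runs until a terminal status arrives, however long that takes. One possible variation adds a client-side deadline so a script cannot wait forever. In the sketch below, `wait_for_terminal`, its parameters, and the 30-minute default are hypothetical names and values chosen for illustration. The status lookup is passed in as a callable, so the helper makes no assumptions about the SDK beyond what the snippet above already uses.

```python
import time
from typing import Callable

TERMINAL_STATUSES = {"COMPLETED", "FAILED", "REJECTED"}


def wait_for_terminal(
    fetch_status: Callable[[], str],
    interval_s: float = 10.0,      # recommended polling interval
    timeout_s: float = 1800.0,     # assumed client-side limit, not an API limit
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll fetch_status() until it returns a terminal status or the deadline passes."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Still {status} after {timeout_s} seconds")
        sleep(interval_s)
```

With the client from the snippet above, this could be called as `wait_for_terminal(lambda: sync.generations.get(job_id).status)`.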

Webhooks

Configure a webhook URL in your generation request to receive an HTTP POST notification when the job completes or fails. This is more efficient than polling for production workflows.
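To make the webhook flow concrete, here is a minimal receiver built only on the Python standard library. It is a sketch under stated assumptions: the payload is assumed to be JSON with `id` and `status` keys, which this page does not confirm, and `parse_notification`, `WebhookHandler`, and `serve` are hypothetical names. Check the webhook payload reference before relying on any field.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def parse_notification(body: bytes):
    """Return (generation_id, status) from a JSON webhook body.

    Assumption: the payload carries "id" and "status" keys.
    """
    payload = json.loads(body)
    return payload.get("id"), payload.get("status")


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read exactly the number of bytes the sender declared.
        length = int(self.headers.get("Content-Length", 0))
        generation_id, status = parse_notification(self.rfile.read(length))
        print(f"Generation {generation_id} ended with status {status}")
        # Acknowledge quickly; do heavy work outside the request handler.
        self.send_response(204)
        self.end_headers()


def serve(port: int = 8000) -> None:
    """Start the receiver on the given port (blocks until interrupted)."""
    HTTPServer(("", port), WebhookHandler).serve_forever()
```

A production receiver would sit behind HTTPS and verify that the request really came from the platform, which this sketch does not attempt.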

Studio

If you submitted the generation through Sync Labs Studio, the current status is shown in the generations list. Refresh the page to see updated statuses.

Terminal Statuses

A generation ends in one of three terminal statuses:

Status	Meaning
COMPLETED	Generation finished successfully. The output video URL is available.
FAILED	Generation encountered an error. Check the error response for details.
REJECTED	Generation was rejected due to invalid input or policy violation.

Troubleshooting Slow Generations

If your generation is taking longer than expected, here are steps to diagnose and resolve the issue:

Stuck in PENDING for more than 5 minutes — The processing queue may be busy during peak hours. Wait and monitor the status. Avoid resubmitting the same job, as this adds more jobs to the queue without canceling the original.
Generation FAILED — Check the error response from the GET /v2/generate/{id} endpoint. The error and error_code fields describe what went wrong. See the Error Handling guide for a full list of error codes.
Time-sensitive workloads — Use lipsync-2 for a good balance of speed and quality.
Large batch processing — For high-volume jobs, use the Batch API to submit up to 500 generations in a single request with a 24-hour turnaround time.
Check service status — Visit status.sync.so to see if there are any ongoing incidents or degraded performance affecting the platform.
Visit the support KB — For additional troubleshooting, see Why is my generation taking so long or stuck in processing? in the support knowledge base.

Frequently Asked Questions

Why is my generation stuck in PENDING?

The PENDING status means your generation is in the processing queue waiting for available compute resources. During peak usage periods, the queue may be deeper than usual, causing longer wait times. Generations typically move from PENDING to PROCESSING within 1-2 minutes under normal conditions. If your generation has been in PENDING for more than 5 minutes, the system may be experiencing higher-than-normal demand. You can check status.sync.so for any reported incidents. Avoid resubmitting the same generation request, as this creates additional queue entries without canceling the original job. If the generation remains stuck after 10 minutes, contact support and include your generation ID.
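The thresholds in this answer can be encoded as a small helper for monitoring scripts. This is only a sketch: `pending_advice` and the strings it returns are hypothetical, and the 5-minute and 10-minute cutoffs are taken directly from the paragraph above.

```python
def pending_advice(minutes_in_pending: float) -> str:
    """Map time spent in PENDING to the action suggested in the text above."""
    if minutes_in_pending > 10:
        return "contact_support"    # include the generation ID
    if minutes_in_pending > 5:
        return "check_status_page"  # status.sync.so; do not resubmit
    return "keep_waiting"           # 1-2 minutes is typical
```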

Can I speed up generation time?

The most effective way to reduce generation time is to choose a faster model. lipsync-2 offers a good balance of speed and output quality. You can also reduce processing time by shortening your input video, lowering the resolution, or splitting long videos into shorter segments using the Segments API. For batch workloads where latency is less critical, the Batch API processes up to 500 jobs efficiently with a 24-hour turnaround. Submitting jobs during off-peak hours can also help reduce queue wait times.
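If a workload exceeds the 500-generation limit mentioned above, the request list has to be split before submission. The helper below is a minimal sketch of that splitting step only. It does not call the Batch API, and `chunk_requests` is a hypothetical name.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")

MAX_BATCH_SIZE = 500  # per-request limit stated above


def chunk_requests(requests: Sequence[T], size: int = MAX_BATCH_SIZE) -> Iterator[List[T]]:
    """Yield consecutive slices of at most `size` requests, preserving order."""
    for start in range(0, len(requests), size):
        yield list(requests[start:start + size])
```

For 1,200 queued requests this yields three groups of 500, 500, and 200.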

What happens if a generation takes too long?

Generations have internal processing time limits that vary by video duration and model. If a generation exceeds its time limit, it automatically transitions to a FAILED status with a timeout error. Long videos are divided into 30-40 second chunks for processing, and a generation can time out if the video has too many scene changes within these chunks or if many scenes lack detectable faces. If your generation fails due to a timeout, try reducing the video length, ensuring faces are visible throughout the video, and minimizing rapid scene cuts. You can also retry the generation, as transient infrastructure issues occasionally cause timeouts that do not recur on subsequent attempts.
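To get a feel for the chunking described above, the arithmetic below estimates how many chunks a video of a given length would produce. It assumes chunks are uniformly 30 or 40 seconds long, which is a simplification. The actual chunk boundaries are chosen by the pipeline and are not described on this page.

```python
import math
from typing import Tuple


def chunk_count_range(duration_s: float) -> Tuple[int, int]:
    """Return (fewest, most) chunks, assuming uniform 40s and 30s chunks."""
    fewest = math.ceil(duration_s / 40)
    most = math.ceil(duration_s / 30)
    return (fewest, most)
```

Under these assumptions, a 2-minute video is split into three or four chunks.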

Does video resolution affect processing time?

Yes, higher resolution inputs require more processing time. The Sync Labs pipeline re-encodes all input video using the H.264 codec, and higher resolution frames take longer to decode, process, and re-encode. While the face region is processed at 512x512 resolution regardless of input dimensions, the overall composition step that blends the generated face back into the original frame scales with resolution. For the fastest processing times, 1080p input is recommended. Videos up to 4K (4096x2160) are supported but will take longer. Videos above 4K are rejected. If speed is a priority and your use case allows it, downscale your input video to 1080p or 720p before submitting.
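The resolution guidance above can be applied before upload with a pre-flight check and an ffmpeg downscale. The sketch below is one possible approach, not an official tool. `is_supported` and `downscale_command` are hypothetical helpers, the 4096x2160 limit is treated as a landscape width-by-height bound (an assumption), and the command is only built, not executed. The `scale=-2:1080` filter asks ffmpeg to keep the aspect ratio and pick an even width.

```python
from typing import List

MAX_WIDTH, MAX_HEIGHT = 4096, 2160  # 4K limit stated above


def is_supported(width: int, height: int) -> bool:
    """True if the frame fits within the 4K limit (assumed landscape bound)."""
    return width <= MAX_WIDTH and height <= MAX_HEIGHT


def downscale_command(src: str, dst: str, target_height: int = 1080) -> List[str]:
    """Build an ffmpeg command that rescales to target_height and keeps the audio."""
    return [
        "ffmpeg", "-i", src,
        "-vf", f"scale=-2:{target_height}",  # -2 keeps aspect ratio, even width
        "-c:v", "libx264",                   # H.264, matching the pipeline's codec
        "-c:a", "copy",                      # leave the audio stream untouched
        dst,
    ]
```

The resulting list can be passed to `subprocess.run(..., check=True)` on a machine with ffmpeg installed.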
