Generation Times & Performance
Processing time for lip sync generations depends on the model you choose, the duration of your input video, the video’s resolution, and current system load. Most generations complete within a few minutes. This page covers what to expect, what influences speed, how to monitor progress, and what to do when things take longer than expected.
The table below shows typical generation times for each model. These are estimates based on standard input conditions. Actual times vary with system load, input complexity, and queue depth, and generations may take longer during peak hours.
Several factors influence how long a generation takes to complete:
You can monitor the progress of a generation using any of these methods:
Call the GET /v2/generate/{id} endpoint at regular intervals. We recommend polling every 10 seconds.
Configure a webhook URL in your generation request to receive an HTTP POST notification when the job completes or fails. This is more efficient than polling for production workflows.
If you submitted the generation through Sync Labs Studio, the current status is shown in the generations list. Refresh the page to see updated statuses.
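The polling approach described above can be sketched as follows. This is a minimal illustration, not an official client: `fetch_status` is a hypothetical callable that you supply to wrap a request to GET /v2/generate/{id} and return the parsed JSON, and the `status` field name in the response is an assumption. The loop keeps polling while the job is in PENDING or PROCESSING (the two in-progress statuses named on this page) and treats anything else as terminal.

```python
import time
from typing import Any, Callable, Dict

# In-progress statuses named on this page; any other value is treated as terminal.
IN_PROGRESS = {"PENDING", "PROCESSING"}


def wait_for_generation(
    fetch_status: Callable[[], Dict[str, Any]],
    interval_s: float = 10.0,      # recommended polling interval
    max_wait_s: float = 1800.0,    # client-side cap; an arbitrary assumption
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll until the job leaves PENDING/PROCESSING, then return the last response.

    fetch_status is a hypothetical wrapper around GET /v2/generate/{id};
    the "status" key is an assumed field name.
    """
    waited = 0.0
    while True:
        job = fetch_status()
        if job.get("status") not in IN_PROGRESS:
            return job
        if waited >= max_wait_s:
            raise TimeoutError(f"generation still in progress after {waited:.0f}s")
        sleep(interval_s)
        waited += interval_s
```

Passing `sleep` as a parameter keeps the loop easy to exercise in tests without real delays. In production, a webhook avoids the repeated requests entirely.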
A generation ends in one of three terminal statuses:
If your generation is taking longer than expected, here are steps to diagnose and resolve the issue:
The error and error_code fields describe what went wrong. See the Error Handling guide for a full list of error codes.
The PENDING status means your generation is in the processing queue waiting for available compute resources. During peak usage periods, the queue may be deeper than usual, causing longer wait times. Generations typically move from PENDING to PROCESSING within 1-2 minutes under normal conditions. If your generation has been in PENDING for more than 5 minutes, the system may be experiencing higher-than-normal demand. You can check status.sync.so for any reported incidents. Avoid resubmitting the same generation request, as this creates additional queue entries without canceling the original job. If the generation remains stuck after 10 minutes, contact support at [email protected] with your generation ID.
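The waiting guidance above can be encoded as a small helper for monitoring scripts. This is only a sketch: the function name and the returned action labels are hypothetical, and the thresholds are the 5-minute and 10-minute marks mentioned in the text.

```python
def pending_action(minutes_pending: float) -> str:
    """Suggest a next step for a generation that is still in PENDING.

    Thresholds follow the guidance in the text; the labels are illustrative.
    """
    if minutes_pending > 10:
        # Still stuck after 10 minutes: contact support with the generation ID.
        return "contact_support"
    if minutes_pending > 5:
        # Likely high demand: check the status page for reported incidents.
        return "check_status_page"
    # Queue waits of a few minutes are normal; do not resubmit the job.
    return "keep_waiting"
```

Note that none of the branches resubmits the request, since a duplicate submission adds a queue entry without canceling the original job.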
The most effective way to reduce generation time is to choose a faster model. lipsync-2 offers a good balance of speed and output quality. You can also reduce processing time by shortening your input video, lowering the resolution, or splitting long videos into shorter segments using the Segments API. For batch workloads where latency is less critical, the Batch API processes up to 500 jobs efficiently with a 24-hour turnaround. Submitting jobs during off-peak hours can also help reduce queue wait times.
Generations have internal processing time limits that vary by video duration and model. If a generation exceeds its time limit, it automatically transitions to a FAILED status with a timeout error. Long videos are divided into 30-40 second chunks for processing, and generations can time out if the video has too many scene changes within these chunks or if many scenes lack detectable faces. If your generation fails due to a timeout, try reducing the video length, ensuring faces are visible throughout the video, and minimizing rapid scene cuts. You can also retry the generation, as transient infrastructure issues occasionally cause timeouts that do not recur on subsequent attempts.
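To get a rough sense of how many chunks a long video produces, the 30-40 second chunk size can be turned into a simple range estimate. This is back-of-the-envelope arithmetic only: the actual chunking logic is not documented here, and the function name is hypothetical.

```python
import math
from typing import Tuple


def estimate_chunk_count(
    duration_s: float,
    min_chunk_s: float = 30.0,
    max_chunk_s: float = 40.0,
) -> Tuple[int, int]:
    """Return (fewest, most) chunks for a video, assuming 30-40 second chunks."""
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    fewest = math.ceil(duration_s / max_chunk_s)  # every chunk as long as possible
    most = math.ceil(duration_s / min_chunk_s)    # every chunk as short as possible
    return fewest, most
```

For example, under these assumptions a 120-second video falls somewhere between 3 and 4 chunks, and any video of 30 seconds or less fits in a single chunk.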
Yes, higher-resolution inputs require more processing time. The Sync Labs pipeline re-encodes all input video using the H.264 codec, and higher-resolution frames take longer to decode, process, and re-encode. While the face region is processed at 512x512 resolution regardless of input dimensions, the overall composition step that blends the generated face back into the original frame scales with resolution. For the fastest processing times, 1080p input is recommended. Videos up to 4K (4096x2160) are supported but will take longer. Videos above 4K are rejected. If speed is a priority and your use case allows it, downscale your input video to 1080p or 720p before submitting.
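A pre-submission check along these lines can flag inputs that would be rejected or slow. This is a sketch under stated assumptions: the function names and labels are hypothetical, "above 4K" is interpreted as exceeding 4096 on the long side or 2160 on the short side (the exact server-side rule is not specified here), and target dimensions are rounded down to even numbers because common H.264 encoder settings expect them.

```python
from typing import Tuple

MAX_LONG, MAX_SHORT = 4096, 2160   # 4K limit from the text
REC_LONG, REC_SHORT = 1920, 1080   # 1080p, recommended for speed


def classify_resolution(width: int, height: int) -> str:
    """Label an input as 'rejected', 'supported_slower', or 'recommended'."""
    long_side, short_side = max(width, height), min(width, height)
    if long_side > MAX_LONG or short_side > MAX_SHORT:
        return "rejected"            # assumed interpretation of "above 4K"
    if long_side > REC_LONG or short_side > REC_SHORT:
        return "supported_slower"    # accepted, but slower to process
    return "recommended"


def fit_within(width: int, height: int,
               max_long: int = REC_LONG, max_short: int = REC_SHORT) -> Tuple[int, int]:
    """Return dimensions scaled down (never up) to fit the target, keeping aspect ratio."""
    long_side, short_side = max(width, height), min(width, height)
    scale = min(max_long / long_side, max_short / short_side, 1.0)
    # Round down to even values, which common H.264 settings require.
    return int(width * scale) // 2 * 2, int(height * scale) // 2 * 2
```

The resulting dimensions can then be passed to whatever tool you use for downscaling before upload.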