Concurrency & Rate Limiting

Understanding rate limiting and concurrency

Last updated June 1, 2026

Rate Limiting

Rate limits are enforced per IP address.

API Endpoints

| Endpoint | Limit |
| --- | --- |
| POST /v2/generate | 100 requests/min (prod), 60 requests/min (dev) |
| GET /v2/generate/* | 600 requests/min |
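
One way to stay under a per-minute limit is to throttle on the client before sending. The sketch below is a generic sliding-window limiter, not part of the Sync SDK; the class name `SlidingWindowThrottle` and its parameters are hypothetical, and the default of 100 calls per 60 seconds simply mirrors the production limit for POST /v2/generate.

```python
import time
from collections import deque


class SlidingWindowThrottle:
    """Client-side limiter: at most `limit` calls per `window` seconds."""

    def __init__(self, limit=100, window=60.0, clock=time.monotonic, sleep=time.sleep):
        self.limit = limit
        self.window = window
        self._clock = clock      # injectable so the class can be tested without waiting
        self._sleep = sleep
        self._sent = deque()     # timestamps of calls inside the current window

    def acquire(self) -> float:
        """Block until a call is allowed; return the number of seconds waited."""
        waited = 0.0
        while True:
            now = self._clock()
            # Drop timestamps that have left the window.
            while self._sent and now - self._sent[0] >= self.window:
                self._sent.popleft()
            if len(self._sent) < self.limit:
                self._sent.append(now)
                return waited
            # Window is full: sleep until the oldest call expires.
            delay = self.window - (now - self._sent[0])
            self._sleep(delay)
            waited += delay
```

Call `acquire()` immediately before each request. Because limits are enforced per IP address, a throttle like this only helps when it covers every process sending from that address.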

Auth Endpoints

| Endpoint | Limit |
| --- | --- |
| POST /api/auth/sign-in/*, POST /api/auth/sign-up/*, POST /api/auth/otp/* | 10 requests/min |
| GET /api/auth/session | 60 requests/min |

When rate limited, the API returns a 429 response:

```json
{
  "statusCode": 429,
  "message": "Rate limit exceeded. Please try again later.",
  "error": "Too Many Requests"
}
```

Concurrency

Concurrency is the number of generations you can have in PENDING or PROCESSING state at the same time. A request to create a new generation fails with a 429 error when your concurrency limit has already been reached.

To see which of your generations are currently in PENDING or PROCESSING state, use the List Generations endpoint.

Concurrency limits are defined in the subscription plan. Current limits are:

| Plan | Concurrent Requests |
| --- | --- |
| Free | 1 |
| Hobbyist | 1 |
| Creator | 3 |
| Growth | 6 |
| Scale | 15 |
| Enterprise | Custom |
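
As a rough illustration of how these limits translate into available slots, the sketch below counts active generations and subtracts them from the plan limit. It assumes each generation record is a dict with a `status` field, which is a guess at the List Generations response shape rather than a documented schema; `free_slots` and the `COMPLETED` status used in the usage example are likewise hypothetical.

```python
# Limits copied from the plan table; Enterprise is custom, so it is omitted.
PLAN_CONCURRENCY = {"Free": 1, "Hobbyist": 1, "Creator": 3, "Growth": 6, "Scale": 15}

# States that occupy a concurrency slot.
ACTIVE_STATES = {"PENDING", "PROCESSING"}


def free_slots(plan: str, generations: list) -> int:
    """Return how many more generations can be submitted right now."""
    # Assumption: each record is a dict carrying a "status" key.
    active = sum(1 for g in generations if g.get("status") in ACTIVE_STATES)
    return max(0, PLAN_CONCURRENCY[plan] - active)
```

For example, on the Creator plan with one PENDING, one PROCESSING, and one finished generation, `free_slots("Creator", records)` leaves one slot.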

Handling 429 Errors

When you exceed either a per-minute rate limit or your plan's concurrent generation limit, the API returns a 429 Too Many Requests response.

What to do when you hit a 429:

  • Rate limit (requests per minute): Wait briefly and retry. The limit resets every minute, so a short pause is usually enough.
  • Concurrency limit: You already have the maximum number of generations in PENDING or PROCESSING state for your plan. Wait for an existing generation to complete before submitting a new one, or upgrade your plan for higher limits.

Do not retry 429 responses immediately in a tight loop. This wastes requests and delays recovery. Use the retry strategy below instead.

Retry Strategies

Exponential backoff is the recommended approach for handling both rate-limit errors and transient errors from the Sync Labs lip sync API. Each retry waits twice as long as the previous one, which reduces pressure on the API and improves your success rate.

Python
```python
import time

from sync import Sync
from sync.common import Audio, Video
from sync.core.api_error import ApiError

sync = Sync()


def create_with_retry(video_url: str, audio_url: str, max_retries: int = 5):
    """Create a generation, backing off exponentially on 429 responses."""
    for attempt in range(max_retries):
        try:
            return sync.generations.create(
                input=[Video(url=video_url), Audio(url=audio_url)],
                model="lipsync-2",
            )
        except ApiError as e:
            # Retry only on 429, and only while attempts remain.
            if e.status_code == 429 and attempt < max_retries - 1:
                wait = 2 ** attempt  # 1s, 2s, 4s, 8s with the default max_retries
                print(f"Rate limited. Retrying in {wait}s...")
                time.sleep(wait)
            else:
                raise


response = create_with_retry(
    "https://assets.sync.so/docs/example-video.mp4",
    "https://assets.sync.so/docs/example-audio.wav",
)
print(f"Job submitted: {response.id}")
```
TypeScript
```typescript
import { SyncClient, SyncError } from "@sync.so/sdk";

const sync = new SyncClient();

async function createWithRetry(videoUrl: string, audioUrl: string, maxRetries = 5) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    try {
      return await sync.generations.create({
        input: [
          { type: "video", url: videoUrl },
          { type: "audio", url: audioUrl },
        ],
        model: "lipsync-2",
      });
    } catch (err) {
      // Retry only on 429, and only while attempts remain.
      if (err instanceof SyncError && err.statusCode === 429 && attempt < maxRetries - 1) {
        const wait = 2 ** attempt * 1000; // 1s, 2s, 4s, 8s with the default maxRetries
        console.log(`Rate limited. Retrying in ${wait / 1000}s...`);
        await new Promise((r) => setTimeout(r, wait));
      } else {
        throw err;
      }
    }
  }
  // Only reachable when maxRetries is 0 or negative; keeps the return type non-undefined.
  throw new Error("createWithRetry: maxRetries must be at least 1");
}

const response = await createWithRetry(
  "https://assets.sync.so/docs/example-video.mp4",
  "https://assets.sync.so/docs/example-audio.wav",
);
console.log(`Job submitted: ${response.id}`);
```
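
The examples above wait a fixed 1s, 2s, 4s, ... between attempts. When many clients back off on the same schedule they can retry in lockstep, so a common variation (not something this guide prescribes) is to randomize each delay. The sketch below shows a "full jitter" delay function; the name `backoff_delay` and the 30-second cap are assumptions chosen for illustration.

```python
import random


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0, rng=random.random) -> float:
    """Return a random delay in [0, min(cap, base * 2**attempt)) seconds."""
    # The exponential ceiling grows with each attempt but never exceeds `cap`.
    ceiling = min(cap, base * (2 ** attempt))
    # `rng` is injectable so the function can be tested deterministically.
    return rng() * ceiling
```

To use it, replace the fixed `2 ** attempt` wait in a retry loop with `backoff_delay(attempt)`.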

Optimizing Concurrency

To get the most out of your plan’s concurrent generation slots:

  • Monitor active jobs. Use the List Generations endpoint to check how many jobs are currently in PENDING or PROCESSING state before submitting new ones.
  • Queue on your side. Maintain a local queue and only submit a new generation when a slot frees up. This avoids wasted 429 responses.
  • Use webhooks. Configure a webhook to get notified when a generation completes, so you can immediately submit the next job without polling.
  • Batch when possible. If you are on a Scale or Enterprise plan, the Batch API handles queueing and concurrency for you: submit up to 500 generations in one call.
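
The "queue on your side" and "use webhooks" tips can be combined into a small local queue. The sketch below is one possible shape, not an SDK feature: `SlotQueue`, its `submit` callback (which should send one job and return its generation id), and `on_complete` (which you would call from your webhook handler or polling loop) are all hypothetical names.

```python
from collections import deque
from typing import Callable


class SlotQueue:
    """Holds jobs locally and submits them only while concurrency slots are free."""

    def __init__(self, limit: int, submit: Callable[[dict], str]):
        self.limit = limit        # your plan's concurrency limit
        self.submit = submit      # hypothetical: sends one job, returns its id
        self.pending = deque()    # jobs waiting locally
        self.active = set()       # ids of generations still in flight

    def add(self, job: dict) -> None:
        """Queue a job and submit it right away if a slot is free."""
        self.pending.append(job)
        self._fill()

    def on_complete(self, job_id: str) -> None:
        """Call when a generation finishes, e.g. from a webhook handler."""
        self.active.discard(job_id)
        self._fill()

    def _fill(self) -> None:
        # Submit queued jobs until every slot is taken or the queue is empty.
        while self.pending and len(self.active) < self.limit:
            self.active.add(self.submit(self.pending.popleft()))
```

With `limit=2`, adding three jobs submits the first two immediately and holds the third until `on_complete` reports that one of them has finished. This sketch is single-threaded; a multi-threaded webhook server would need a lock around `add` and `on_complete`.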

For API rate limiting best practices and general error handling, see the Error Handling guide.