Concurrency & Rate Limiting
Rate Limiting
The Sync API enforces the following rate limits:
- POST /v2/generate: 60 requests per minute
- All other endpoints: 600 requests per minute
If you exceed these limits, the API returns a 429 error.
Concurrency
Concurrency is the number of generations that can be submitted and processed at the same time. Once the concurrency limit is reached, requests to create new generations fail with a 429 error.
To see which of your generations are currently in the PENDING or PROCESSING state, use the List Generations endpoint.
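Before submitting a new job, you can count how many generations are still occupying a concurrency slot. The sketch below is a minimal example: the base URL, endpoint path, auth header, and response shape are assumptions, so check the List Generations reference for the exact contract.

```python
import json
import urllib.request

API_KEY = "your-api-key"             # placeholder
BASE_URL = "https://api.sync.so/v2"  # assumed base URL -- verify in the API reference

ACTIVE_STATES = {"PENDING", "PROCESSING"}

def count_active(generations: list[dict]) -> int:
    """Count generations still occupying a concurrency slot."""
    return sum(1 for g in generations if g.get("status") in ACTIVE_STATES)

def fetch_generations() -> list[dict]:
    """Call the List Generations endpoint. The path, auth header, and
    JSON-list response shape here are assumptions."""
    req = urllib.request.Request(f"{BASE_URL}/generate",
                                 headers={"x-api-key": API_KEY})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

With this in place, `count_active(fetch_generations())` tells you how many slots are in use before you submit another generation.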
Concurrency limits are defined by your subscription plan; check your plan details for your current limit.
Handling 429 Errors
When you exceed a rate limit or concurrency limit, the API returns a 429 Too Many Requests response. This applies to both per-minute rate limits and concurrent generation limits.
What to do when you hit a 429:
- Rate limit (requests per minute): Wait briefly and retry. The limit resets every minute, so a short pause is usually enough.
- Concurrency limit: You already have the maximum number of generations in PENDING or PROCESSING state for your plan. Wait for an existing generation to complete before submitting a new one, or upgrade your plan for higher limits.
Do not retry 429 responses immediately in a tight loop. This wastes requests and delays recovery. Use the retry strategy below instead.
Retry Strategies
Exponential backoff is the recommended approach for handling both rate limit and transient errors from the Sync lip sync API. Each retry waits progressively longer, reducing pressure on the API and improving your success rate.
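As a minimal sketch of that strategy in Python: the delay doubles on each attempt, with full jitter so many clients retrying at once don't synchronize, and a cap so waits stay bounded. `RateLimitError` is a stand-in name for whatever exception your HTTP layer raises on a 429; adapt it to your client code.

```python
import random
import time

class RateLimitError(Exception):
    """Stand-in: raised by your HTTP layer when the API responds with 429."""

def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Wait time before retry `attempt` (0-indexed): exponential growth
    with full jitter, capped so waits stay bounded."""
    return random.uniform(0, min(cap, base * 2 ** attempt))

def with_retries(call, max_attempts: int = 5, base: float = 1.0):
    """Run `call()`, retrying on RateLimitError with exponential backoff.
    Re-raises after the final attempt so callers still see the failure."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise
            time.sleep(backoff_delay(attempt, base=base))
```

For example, `with_retries(lambda: create_generation(payload))` retries a submission up to five times, waiting roughly 0–1s, 0–2s, 0–4s, then 0–8s between attempts.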
Optimizing Concurrency
To get the most out of your plan’s concurrent generation slots:
- Monitor active jobs. Use the List Generations endpoint to check how many jobs are currently in PENDING or PROCESSING state before submitting new ones.
- Queue on your side. Maintain a local queue and only submit a new generation when a slot frees up. This avoids wasted 429 responses.
- Use webhooks. Configure a webhook to get notified when a generation completes, so you can immediately submit the next job without polling.
- Batch when possible. If you are on a Scale or Enterprise plan, the Batch API handles queueing and concurrency for you — submit up to 500 generations in one call.
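The "queue on your side" idea can be sketched with a local worker pool that never has more than your plan's slot count in flight. `submit_and_wait` is a hypothetical callback standing in for: submit one generation, then block (via webhook notification or polling) until it completes; the limit of 4 is an assumed value, not a real plan number.

```python
import queue
import threading

MAX_CONCURRENT = 4  # assumed plan limit -- use your plan's actual value

def run_with_slots(jobs, submit_and_wait, max_concurrent=MAX_CONCURRENT):
    """Drain `jobs` with at most `max_concurrent` generations in flight.
    Each worker thread holds one concurrency slot at a time, so the API
    never sees more than `max_concurrent` active generations from us."""
    work = queue.Queue()
    for job in jobs:
        work.put(job)

    results = []
    lock = threading.Lock()

    def worker():
        while True:
            try:
                job = work.get_nowait()
            except queue.Empty:
                return  # no jobs left; release this slot
            result = submit_and_wait(job)
            with lock:
                results.append(result)

    threads = [threading.Thread(target=worker) for _ in range(max_concurrent)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Because each new submission happens only after a previous one finishes, this pattern avoids burning requests on 429 responses entirely.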
For API rate limiting best practices and general error handling, see the Error Handling guide.

