Troubleshooting
Running into an issue? Check the common problems and solutions below. Most issues can be resolved quickly with the right steps.
Why am I getting a 401 Unauthorized error?
A 401 error means your API key is missing or invalid. To fix this:
- Make sure you’re including the x-api-key header in every API request
- Verify your API key is correct: copy it directly from Settings > API Keys
- If your key still doesn’t work, generate a new one in the Sync Studio
See the Authentication guide for full details on setting up your API key.
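As a rough illustration of the first bullet, the sketch below attaches the x-api-key header to a status request using only the Python standard library. The base URL `https://api.sync.so`, the helper name `build_status_request`, and the sample ID and key are assumptions made for this example; check the API Reference for the real host.

```python
import urllib.request

# Assumed base URL for illustration; confirm it against the API Reference.
API_BASE = "https://api.sync.so"


def build_status_request(generation_id: str, api_key: str) -> urllib.request.Request:
    """Build a GET request for one generation with the x-api-key header set."""
    if not api_key or api_key != api_key.strip():
        # An empty key, or whitespace picked up while copying, will not match the stored key.
        raise ValueError("API key is empty or has leading/trailing whitespace")
    return urllib.request.Request(
        f"{API_BASE}/v2/generate/{generation_id}",
        headers={"x-api-key": api_key},
        method="GET",
    )


# Hypothetical generation ID and key, used only to show the call shape.
request = build_status_request("gen_123", "sk-example")
# Pass `request` to urllib.request.urlopen() to actually send it.
```

Building the request in one helper means every call path gets the header, which removes the most common way of forgetting it.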
My generation is stuck in PENDING
Generations typically complete in 30 to 120 seconds depending on video length and model. If your generation appears stuck:
- Use polling or webhooks to check the current status
- Wait at least 2 minutes before assuming something is wrong — longer videos take more time
- If a generation is still in PENDING after 5 minutes, contact support at support@sync.so
Avoid re-submitting the same job repeatedly, as this may increase queue times.
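The polling advice above could be sketched as follows. `fetch_status` is a hypothetical callable that you supply (for example, one that calls GET /v2/generate/{id} and returns the status field). Only PENDING and PROCESSING are named in this guide, so treating any other status as final is an assumption of this sketch.

```python
import time
from typing import Callable

# PENDING and PROCESSING appear in this guide; treating every other
# status as terminal is an assumption of this sketch.
IN_PROGRESS = {"PENDING", "PROCESSING"}


def wait_for_generation(
    fetch_status: Callable[[], str],
    timeout_s: float = 300.0,
    interval_s: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the job leaves the in-progress states or the timeout elapses."""
    waited = 0.0
    while True:
        status = fetch_status()
        if status not in IN_PROGRESS:
            return status
        if waited >= timeout_s:
            # Escalate instead of re-submitting the same job.
            raise TimeoutError(f"still {status} after {waited:.0f}s; contact support")
        sleep(interval_s)
        waited += interval_s
```

The default timeout of 300 seconds mirrors the 5-minute guidance above. The loop submits nothing itself, so it cannot add duplicate jobs to the queue.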
I'm hitting 429 Rate Limit errors
A 429 status code means you’ve exceeded your plan’s concurrency limits. To handle this:
- Implement exponential backoff — wait progressively longer between retries
- Check your plan’s concurrency limits on the Concurrency & Rate Limits page
- Consider upgrading your plan for higher concurrency limits
- For large workloads, use the Batch API to queue multiple jobs efficiently
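A minimal sketch of exponential backoff for 429 responses is shown here. `send` is a hypothetical callable that performs one request and returns `(status_code, body)`. The retry count, base delay, and cap are illustrative defaults, not values documented by Sync.

```python
import random
import time
from typing import Callable, Tuple


def call_with_backoff(
    send: Callable[[], Tuple[int, object]],
    max_retries: int = 5,
    base_delay_s: float = 1.0,
    max_delay_s: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, object]:
    """Call send() and retry on HTTP 429, doubling the wait before each retry."""
    attempt = 0
    while True:
        status, body = send()
        if status != 429 or attempt >= max_retries:
            return status, body
        delay = min(max_delay_s, base_delay_s * 2 ** attempt)
        # A little jitter spreads out clients that were throttled at the same moment.
        sleep(delay + random.uniform(0, delay * 0.1))
        attempt += 1
```

After the last retry the function returns the 429 response rather than raising, so the caller can decide whether to queue the work for later or surface the error.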
The lip sync quality is poor
Output quality depends heavily on your input video, audio, and model choice. For best results, ensure the speaker’s face is front-facing, well-lit, and occupies a reasonable portion of the frame at a minimum of 480p resolution. Avoid obstructions like hands, microphones, or hair covering the mouth area. Use clean audio without background music or overlapping speakers, as noise degrades lip-to-audio alignment. For model selection, lipsync-2 handles the majority of videos well and preserves natural speaking style, while lipsync-2-pro uses diffusion-based super resolution for the best results with beards, teeth, and fine facial detail. For longer videos, audio-video duration mismatches can cause drift — use the sync_mode parameter (e.g., cut_off, bounce, or remap) to control how mismatches are handled.
See Media Content Tips for detailed guidance on input quality.
My media format is not supported
Sync supports a wide range of video and audio formats. If your file isn’t accepted:
- Check the Media Formats Support page for the full list of supported formats
- Convert your file using FFmpeg:
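One way to script the conversion is sketched below. It wraps the FFmpeg command that this page gives for unsupported codecs (`ffmpeg -i input.avi -c:v libx264 -c:a aac output.mp4`) in Python. The helper names are made up for this example, and FFmpeg must already be installed and on your PATH.

```python
import subprocess
from typing import List


def ffmpeg_convert_command(src: str, dst: str) -> List[str]:
    """Build an FFmpeg command that re-encodes to H.264 video and AAC audio."""
    return ["ffmpeg", "-i", src, "-c:v", "libx264", "-c:a", "aac", dst]


def convert(src: str, dst: str) -> None:
    """Run the conversion; raises CalledProcessError if FFmpeg exits non-zero."""
    subprocess.run(ffmpeg_convert_command(src, dst), check=True)
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.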
My webhook isn't receiving events
If your webhook endpoint isn’t getting called:
- HTTPS required — Your endpoint must be a publicly accessible HTTPS URL
- Check firewall rules — Make sure incoming POST requests from Sync’s servers aren’t blocked
- Verify the URL — Double-check the webhook URL you passed in your API call
- Test locally — Use a tool like ngrok to expose a local server for testing
- Check response codes — Your endpoint must return a 2xx status code to acknowledge receipt
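For local testing, a receiver along these lines acknowledges every valid POST with a 2xx status. This is a sketch built on Python's standard library. The webhook payload schema is not described in this guide, so the handler only parses and logs the JSON body, and the port is arbitrary. It serves plain HTTP, so it needs a tunnel such as ngrok or a TLS-terminating proxy in front of it to satisfy the HTTPS requirement.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Tuple


def handle_event(raw_body: bytes) -> Tuple[int, bytes]:
    """Return the HTTP status and body to send back for one webhook delivery."""
    try:
        event = json.loads(raw_body or b"{}")
    except ValueError:  # covers malformed JSON and undecodable bytes
        return 400, b"invalid JSON"
    # Payload fields are not specified in this guide, so the event is only logged.
    print("webhook event:", event)
    return 200, b"ok"


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        status, body = handle_event(self.rfile.read(length))
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8000) -> None:
    """Serve plain HTTP locally; put a tunnel or TLS proxy in front for HTTPS."""
    HTTPServer(("0.0.0.0", port), WebhookHandler).serve_forever()
```

Call `serve()` to start listening. Keeping `handle_event` separate from the HTTP plumbing makes it easy to unit test the acknowledgement logic without opening a socket.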
Python SDK installation fails
If you’re having trouble installing the Python SDK:
- Check your Python version — The SDK requires Python 3.8 or higher
- Upgrade the package — Try reinstalling with the latest version:
- Use a virtual environment — Avoid conflicts with other packages:
See the Python SDK Guide for full setup instructions.
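To rule out the version requirement quickly, a check like the following can be run with the same interpreter you install into. It is a small sketch; `meets_minimum` is a name made up for this example.

```python
import sys
from typing import Sequence, Tuple

MIN_PYTHON: Tuple[int, int] = (3, 8)  # minimum stated in this guide


def meets_minimum(
    version: Sequence[int] = sys.version_info,
    minimum: Tuple[int, int] = MIN_PYTHON,
) -> bool:
    """True when the (major, minor) version is at least the minimum."""
    return tuple(version[:2]) >= minimum


if not meets_minimum():
    print(f"Python {sys.version_info.major}.{sys.version_info.minor} is too old; 3.8+ is required")
```

Running it inside the virtual environment confirms that the environment was created from the interpreter you expected.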
TypeScript SDK issues
If the TypeScript SDK isn’t working as expected:
- Check your Node.js version — The SDK requires Node.js 18 or higher
- Reinstall the package:
- Check your package.json: make sure @sync.so/sdk is listed in your dependencies
- TypeScript version: if your project uses TypeScript, ensure you’re on TypeScript 4.7 or higher
See the TypeScript SDK Guide for full setup instructions.
Output has watermarks
Watermarks appear on videos generated with free or Hobbyist accounts. To remove watermarks:
- Upgrade to the Creator plan or higher — watermark removal is included on Creator+
- See our Billing page for plan details and pricing
Existing videos generated on a free or Hobbyist plan will retain their watermarks. Generate new videos after upgrading to get unwatermarked output.
Face not detected or wrong face selected
If Sync can’t detect a face or selects the wrong person:
- Ensure face is clearly visible — The face should be unobstructed, well-lit, and occupy a reasonable portion of the frame
- Check face angle — Frontal or near-frontal faces work best; extreme side profiles may not be detected
- Multi-person videos — If there are multiple faces in the frame, use the Speaker Selection feature to target the correct person
- Resolution — Very low-resolution video may make face detection unreliable; use at least 480p
See the Speaker Selection guide for details on selecting specific faces in multi-person videos.
How long does generation take and what if it seems stuck?
Generation time depends on the model, video length, and resolution. As a general guide:
- lipsync-1.9.0-beta is the fastest model, typically completing a 30-second clip in well under a minute
- lipsync-2 takes a few minutes for most videos and is the recommended default
- lipsync-2-pro is 1.5 to 2x slower than lipsync-2 due to its diffusion-based super resolution step, so expect longer waits for premium quality
Higher-resolution inputs and longer videos increase processing time proportionally. To monitor progress, use polling (check the status field on GET /v2/generate/{id}) or set up webhooks for real-time status callbacks when the job completes.
If your generation remains in PENDING or PROCESSING for more than 10 minutes, the job may have encountered an infrastructure issue. Avoid resubmitting the same request repeatedly, as this creates duplicate queue entries and slows processing further. Instead, contact support@sync.so with your generation ID for investigation.
I have a billing or payment issue — what should I do?
Sync uses a subscription-plus-usage billing model processed through Stripe. Common payment issues include declined cards, unexpected charges after cancellation (usage charges still apply until the end of your billing cycle), and unpaid usage invoices blocking new generations.
- Declined card: update your payment method at sync.so/billing/subscription. Stripe retries failed charges for up to 5 days before automatically cancelling the subscription.
- Unrecognized charge: check your usage history at sync.so/billing/usage. Usage invoices are generated automatically each time your accumulated spend hits your tier’s threshold ($6 for Hobbyist, $20 for Creator, $50 for Growth, $250 for Scale).
- Refund request: go to your billing page and click Manage billing to access Stripe’s Cancel + refund flow directly.
For any billing issue not resolved through the dashboard, email support@sync.so. See the Billing page for full pricing and payment details.
Why does the lip sync look mismatched or out of sync on longer videos?
Lip sync drift on longer videos typically happens when the audio and video durations do not match precisely, or when the video contains segments where the speaker is not actively talking. Sync processes long videos in 30 to 40 second chunks internally, so scene changes or cuts within those chunks can confuse face tracking and cause brief misalignment. To fix drift, set the sync_mode parameter to cut_off (trims audio to video length) or remap (adjusts video speed to match audio). For videos over 1 minute with multiple scenes, consider splitting them into segments using the Segments API, where each segment gets its own audio input for tighter control. Using lipsync-2-pro also improves quality in challenging footage. Ensure the input video shows the speaker actively talking throughout, since static or still frames cannot produce good lip movements.
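As an illustration, a request body that sets sync_mode might be assembled like this. The overall layout (`model`, an `input` list of typed URLs, and an `options` object) is an assumption made for this sketch, as are the example URLs. Only the `sync_mode` parameter, its values, and the model names come from this guide, so confirm the exact schema in the API Reference.

```python
import json
from typing import Any, Dict

# Modes named in this guide; the API may accept others.
SYNC_MODES = {"cut_off", "bounce", "remap"}


def build_generation_body(
    video_url: str,
    audio_url: str,
    sync_mode: str = "cut_off",
    model: str = "lipsync-2",
) -> Dict[str, Any]:
    """Assemble a request body; the field layout here is an assumption."""
    if sync_mode not in SYNC_MODES:
        raise ValueError(f"unknown sync_mode: {sync_mode!r}")
    return {
        "model": model,
        "input": [
            {"type": "video", "url": video_url},
            {"type": "audio", "url": audio_url},
        ],
        "options": {"sync_mode": sync_mode},
    }


body = build_generation_body(
    "https://example.com/in.mp4", "https://example.com/in.wav", sync_mode="remap"
)
print(json.dumps(body, indent=2))
```

Validating the mode locally catches a misspelled value before the request is sent.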
My text-to-speech generation failed or produced no audio
TTS works out of the box on all plans using Sync’s built-in ElevenLabs integration — no setup required. If you want more control, you can optionally bring your own ElevenLabs API key on Creator plans and above by configuring it at Integrations settings. TTS failures most commonly stem from an invalid ElevenLabs voice ID or exceeding the 5,000-character script limit. Verify your voiceId is a valid ElevenLabs voice ID string (not a voice name or display label), and keep your script under 5,000 characters per generation request. For longer scripts, use the Segments API to split text across multiple TTS inputs with different time ranges. If the generation completes but produces audio-only output without video, ensure you included a valid video input in your request alongside the TTS input. For persistent generation_text_length_exceeded or generation_input_validation_failed errors, see the Error Handling page for detailed resolution steps.
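Before sending a long script, it can help to split it locally so that no single TTS input exceeds the character limit. The sketch below is one way to do that; `split_script` is a hypothetical helper, not part of the SDK. It breaks on sentence boundaries where possible and hard-wraps any single sentence that is longer than the limit on its own.

```python
import re
from typing import List

TTS_CHAR_LIMIT = 5000  # per-request script limit stated above


def split_script(script: str, limit: int = TTS_CHAR_LIMIT) -> List[str]:
    """Split a script into chunks of at most `limit` characters, preferring sentence ends."""
    sentences = re.split(r"(?<=[.!?])\s+", script.strip())
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        # Hard-wrap any single sentence that is longer than the limit on its own.
        pieces = [sentence[i:i + limit] for i in range(0, len(sentence), limit)] or [""]
        for piece in pieces:
            candidate = f"{current} {piece}".strip()
            if len(candidate) <= limit:
                current = candidate
            else:
                chunks.append(current)
                current = piece
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk can then become its own TTS input. Pass a smaller `limit` if you want headroom below the documented maximum.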
My video or audio file won't upload or is rejected
Sync accepts MP4, MOV, WebM, and AVI for video, and WAV, MP3, OGG, FLAC, ALAC, and MP4 audio with full support (WMA, M4A, and AAC have limited support due to licensing restrictions). If your file is rejected, first check the format against the Media Formats Support page. For API uploads, all media must be hosted at a publicly accessible URL — private, authenticated, or expired URLs will fail silently. The recommended video codec is H.264 (High Profile) at a maximum resolution of 4K (4096x2160 pixels); videos above 4K are rejected outright. Audio should use a 44.1kHz or 48kHz sample rate for best results. If your file uses an unsupported codec, convert it with FFmpeg: ffmpeg -i input.avi -c:v libx264 -c:a aac output.mp4. Videos missing an audio track or required metadata fields (duration, frame rate) will return a generation_media_metadata_missing error. Note that HDR (10-bit color) video is automatically normalized to SDR, which may alter your color grading.
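Some of these checks can be run before submitting a job. The sketch below covers the URL scheme, the container (guessed from the file extension), and the 4K ceiling. Mapping the listed formats to the extensions `.mp4`, `.mov`, `.webm`, and `.avi` is an assumption, and the function cannot tell whether a URL is actually reachable without authentication.

```python
from pathlib import PurePosixPath
from typing import List
from urllib.parse import urlparse

# Assumed extensions for the video formats listed above.
VIDEO_EXTENSIONS = {".mp4", ".mov", ".webm", ".avi"}
MAX_WIDTH, MAX_HEIGHT = 4096, 2160  # 4K ceiling stated above


def preflight_video(url: str, width: int, height: int) -> List[str]:
    """Return a list of problems found; an empty list means no obvious issue."""
    problems: List[str] = []
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        problems.append("URL must be an http(s) link reachable without authentication")
    extension = PurePosixPath(parsed.path).suffix.lower()
    if extension not in VIDEO_EXTENSIONS:
        problems.append(f"unexpected container {extension or '(none)'}")
    if width > MAX_WIDTH or height > MAX_HEIGHT:
        problems.append(f"{width}x{height} exceeds 4K (4096x2160)")
    return problems
```

For example, `preflight_video("https://example.com/clip.mp4", 1920, 1080)` returns an empty list. A non-empty result is a hint to convert or re-host the file before calling the API.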
How do I lip sync a video with multiple speakers?
For videos with multiple people visible in the frame, use the Speaker Selection feature to target the correct face. Set options.active_speaker_detection.auto_detect to true to let Sync automatically identify the active speaker, or provide a manual frame_number and coordinates pointing to the target speaker’s face for fully deterministic control. You can also supply per-frame bounding_boxes if you already run your own face detection. If your video has multiple speakers talking at different times (such as a two-person podcast or interview), use the Segments API to assign different audio inputs to different time ranges within the video — each segment can target a different speaker with its own audio. For best results, ensure each speaker’s face is clearly visible and front-facing during their speaking segment. If Sync selects the wrong face, provide explicit coordinates rather than relying on auto-detection. See the Speaker Selection API guide and Segments guide for complete code examples.
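The two selection styles described above could be expressed as an options object along these lines. Several details are assumptions made for this sketch: that `frame_number` and `coordinates` sit inside `active_speaker_detection`, that coordinates are an `[x, y]` pair, and that `auto_detect` is set to false for manual selection. The Speaker Selection API guide has the authoritative schema.

```python
from typing import Any, Dict, Optional, Tuple


def speaker_options(
    auto_detect: bool = True,
    frame_number: Optional[int] = None,
    coordinates: Optional[Tuple[int, int]] = None,
) -> Dict[str, Any]:
    """Build options.active_speaker_detection for automatic or manual selection."""
    if frame_number is not None and coordinates is not None:
        # Manual selection: point at the target face in one reference frame.
        detection: Dict[str, Any] = {
            "auto_detect": False,
            "frame_number": frame_number,
            "coordinates": list(coordinates),
        }
    elif frame_number is None and coordinates is None:
        detection = {"auto_detect": auto_detect}
    else:
        raise ValueError("frame_number and coordinates must be given together")
    return {"active_speaker_detection": detection}
```

Calling `speaker_options()` yields the automatic variant, while `speaker_options(frame_number=0, coordinates=(640, 360))` yields the manual one. The result would be merged into the `options` field of the generation request.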
Still Need Help?
If your issue isn’t covered above, here are more resources:
- Error Handling Guide — Detailed error code reference and resolution steps
- Billing — Subscription plans, usage-based billing, refunds, and payment troubleshooting
- Media Formats Support — Full list of supported video and audio formats
- Media Content Tips — Best practices for input video and audio quality
- Text-to-Speech Guide — ElevenLabs integration setup and TTS troubleshooting
- Speaker Selection — Target specific faces in multi-person videos
- Segments Guide — Multi-segment lip sync for long-form and multi-speaker content
- API Reference — Complete endpoint documentation
- Email support — Reach us at support@sync.so
- Discord Community — Get help from the Sync community and team
Support Knowledge Base
For step-by-step walkthroughs and additional troubleshooting, visit the Sync Support Knowledge Base:
- Why is my lip sync not working? — No mouth movement or static face
- How to use TTS with Sync — Step-by-step text-to-speech setup
- Generation taking too long? — Processing times and stuck jobs
- Upload errors — File format and upload troubleshooting
- Failed payments — Card declined and payment issues
- Unpaid invoices — Debt payment and blocked generations
- How to cancel — Step-by-step cancellation guide
- How to get a refund — Refund eligibility and process

