Personalized Video Messaging
Personalized video messaging uses AI lip sync to create unique video content for individual recipients at scale. By combining text-to-speech with Sync’s lip sync API, a single recorded video becomes thousands of personalized messages, each with the speaker appearing to address the recipient by name. Compared to generic video messages, this approach is intended to drive higher engagement, better conversion rates, and a more personal connection.
Follow these steps to run the personalized video messaging example:
Configure API Keys
You will need SYNC_API_KEY and ELEVEN_LABS_API_KEY to run the example. Update the constants.py file with your own API keys:
- SYNC_API_KEY: Your API key for Sync.
- ELEVEN_LABS_API_KEY: Your API key for ElevenLabs (used for voice generation).
Ensure the file is saved after adding your keys.
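As a rough sketch, constants.py might look like the following. The layout, the placeholder values, the environment-variable fallback, and the missing_keys helper are all assumptions made for illustration; the actual file in the repository may differ.

```python
# constants.py (hypothetical layout; the real file in the repository may differ)
import os

# Read each key from the environment if it is set, otherwise fall back to a
# placeholder string that you replace with your own key.
SYNC_API_KEY = os.environ.get("SYNC_API_KEY", "your-sync-api-key")
ELEVEN_LABS_API_KEY = os.environ.get("ELEVEN_LABS_API_KEY", "your-elevenlabs-api-key")


def missing_keys(sync_key=SYNC_API_KEY, eleven_key=ELEVEN_LABS_API_KEY):
    """Return the names of keys that are empty or still set to a placeholder."""
    named = {"SYNC_API_KEY": sync_key, "ELEVEN_LABS_API_KEY": eleven_key}
    return [
        name
        for name, value in named.items()
        if not value or value.startswith("your-")
    ]
```

Calling missing_keys() before running the example gives an early, readable error instead of a failed API request later on.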
Prepare Input Data (Optional)
The repository includes a sample input file, example.csv, for a quick start.
You can modify it with your own inputs if desired. Each row typically represents one recipient.
The input file should have the following columns:
- video: URL of the video to be personalized.
- text: Text to be personalized in the video.
- segment_start: Start time of the video segment to be personalized.
- segment_end: End time of the video segment to be personalized.
- output_format: Output format of the video.
- sync_mode: Mode to sync text to video segment if lengths don’t match. options
- voice_id: ElevenLabs voice ID to use for the video. If empty, audio from the video will be cloned.
- lipsync_model: sync.so lipsync model to use for the video. Default: lipsync-2
- tts_model: ElevenLabs text-to-speech model. Default: eleven_multilingual_v2
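The column layout above can be checked before a run with a short script. This is a minimal sketch using only Python's standard csv module. The load_rows helper is hypothetical, and the sample row (URL, text, timings in seconds, mp4 format) is made up. It also assumes that an empty lipsync_model or tts_model cell means "use the default".

```python
import csv
import io

# Column names as listed above.
EXPECTED_COLUMNS = [
    "video", "text", "segment_start", "segment_end", "output_format",
    "sync_mode", "voice_id", "lipsync_model", "tts_model",
]

# Defaults stated above; applying them to empty cells is an assumption.
DEFAULTS = {"lipsync_model": "lipsync-2", "tts_model": "eleven_multilingual_v2"}


def load_rows(csv_file):
    """Read recipient rows, verify the header, and fill the documented defaults."""
    reader = csv.DictReader(csv_file)
    missing = [c for c in EXPECTED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    rows = []
    for row in reader:
        for column, default in DEFAULTS.items():
            if not (row.get(column) or "").strip():
                row[column] = default
        rows.append(row)
    return rows


# Hypothetical sample data: one header line and one recipient row.
SAMPLE = io.StringIO(
    "video,text,segment_start,segment_end,output_format,"
    "sync_mode,voice_id,lipsync_model,tts_model\n"
    "https://example.com/intro.mp4,Hi Alex,0,2.5,mp4,,,,\n"
)
rows = load_rows(SAMPLE)
print(rows[0]["lipsync_model"], rows[0]["tts_model"])
```

To check a real file, pass an open file handle instead of the in-memory sample, for example open("example.csv", newline="").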
Frequently Asked Questions
How many personalized videos can I generate at once?
Sync’s Batch API handles up to 500 video generations per request. You can submit a CSV of recipients and the system processes them in parallel, returning output URLs for each completed video. For volumes above 500, submit multiple batch requests.
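For volumes above the limit, the recipient list has to be split before submission. The sketch below shows only the splitting step. The chunk helper and the recipient names are illustrative, and submitting each batch to the API is deliberately left out.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")

BATCH_LIMIT = 500  # per-request limit stated above


def chunk(items: Sequence[T], size: int = BATCH_LIMIT) -> Iterator[List[T]]:
    """Yield consecutive slices of at most `size` items each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


# Hypothetical recipient list; in practice these would be rows from the CSV.
recipients = [f"recipient-{i}" for i in range(1200)]
batches = list(chunk(recipients))
print([len(batch) for batch in batches])  # [500, 500, 200]
```

Each resulting batch can then be submitted as its own request.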
What TTS providers work with Sync?
The example project includes built-in ElevenLabs integration for text-to-speech. However, any TTS provider that outputs standard WAV or MP3 audio files works with Sync’s lip sync API — simply pass the generated audio URL as the audio_url parameter.
How much does personalized video messaging cost?
Cost depends on the lip sync model and video duration. Use the /v2/generate/estimate-cost endpoint to get exact pricing before submitting a batch. lipsync-2 is the most cost-effective option for high-volume personalized messaging.
Related Resources
- Quickstart — get up and running with your first Sync API call
- Python SDK — use the official Python SDK for streamlined API integration
- Webhooks — receive real-time notifications when your video generations complete

