Voice Cloning
The Voices API lets you list the voices available to your account, clone a new voice from an audio or video sample, and delete clones you no longer need. Cloning returns a voiceId that you can reuse anywhere a voice is accepted: in POST /v2/tts and in text inputs on POST /v2/generate.
The headline flow: clone a speaker’s voice from a talking-head video, synthesize a brand-new line in that voice, then lip sync the result onto a different video — all on a single API key. See The flagship flow below.
Listing voices
GET /v2/voices returns every voice available to you: sync. labs’ built-in voices plus any clones you have created. Use a voice’s id as the voiceId in text-to-speech and in generation text inputs.
The response is an array of voice objects:
The voice identifier. Pass this as voiceId in POST /v2/tts and in generation text inputs.
sync. labs’ internal identifier for the voice. Present on some voices; prefer id for API calls.
Provider-side voice identifier. Present on some voices.
Human-readable voice name.
The voice provider. Always "elevenlabs".
A URL to a short audio preview of the voice, when available.
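A minimal Python sketch of listing voices and looking one up by name. The base URL https://api.sync.so and the x-api-key header are assumptions not stated on this page, as is the name key on each voice object. find_voice_id is a hypothetical helper, not part of the API.

```python
import json
import urllib.request

# Assumptions (not stated on this page): base URL and auth header name.
BASE_URL = "https://api.sync.so"
API_KEY_HEADER = "x-api-key"


def list_voices(api_key):
    """GET /v2/voices and return the decoded array of voice objects."""
    req = urllib.request.Request(
        f"{BASE_URL}/v2/voices",
        headers={API_KEY_HEADER: api_key},
        method="GET",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def find_voice_id(voices, name):
    """Return the id of the first voice whose name matches, else None.

    Assumes each voice object carries "id" and "name" keys.
    """
    for voice in voices:
        if voice.get("name") == name:
            return voice.get("id")
    return None
```

The returned id is the value to pass as voiceId in later calls.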
Cloning a voice
POST /v2/voices clones a new voice from a sample and returns a voiceId you can use immediately. Provide a name plus either a sync. labs-hosted url or an assetId — not both.
The source sample must be hosted in sync. labs storage. Public third-party URLs are not accepted. Upload local files first with POST /v2/assets/upload and pass the returned assetId, or pass the url of an asset already in sync. labs storage.
Both audio and video sources are supported. For video sources, the audio track is extracted automatically and the first 2 minutes are used for cloning.
Clone from an uploaded asset
The recommended path: upload the sample with the Asset Uploads flow, then clone from the returned assetId.
Clone from a hosted URL
If your sample already lives in sync. labs storage, pass its url instead of an assetId.
Request body
A label for the cloned voice.
URL of an audio or video sample hosted in sync. labs storage. Provide either url or assetId, not both.
ID of an asset previously uploaded via POST /v2/assets/upload. Provide either assetId or url, not both.
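The body rules above (a name plus exactly one of url or assetId) can be sketched as a small request builder. This assumes a JSON request body, the base URL https://api.sync.so, and an x-api-key header, none of which are stated on this page. build_clone_request is a hypothetical helper.

```python
import json
import urllib.request

# Assumptions (not stated on this page): base URL and auth header name.
BASE_URL = "https://api.sync.so"
API_KEY_HEADER = "x-api-key"


def build_clone_request(api_key, name, url=None, asset_id=None):
    """Build a POST /v2/voices request with exactly one sample source."""
    # Reject both "neither given" and "both given" before any network call.
    if (url is None) == (asset_id is None):
        raise ValueError("provide exactly one of url or asset_id")
    body = {"name": name}
    if asset_id is not None:
        body["assetId"] = asset_id
    else:
        body["url"] = url
    return urllib.request.Request(
        f"{BASE_URL}/v2/voices",
        data=json.dumps(body).encode("utf-8"),
        headers={API_KEY_HEADER: api_key, "Content-Type": "application/json"},
        method="POST",
    )
```

Sending the request with urllib.request.urlopen and decoding the JSON response should yield the voiceId described under Response. Validating on the client side avoids a round trip for a request the API would reject anyway.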
Response
A 201 response returns the new voice:
The cloned voice identifier. Use it as voiceId in POST /v2/tts and in generation text inputs.
The name you supplied for the clone.
sync. labs’ internal identifier for the voice, when available.
Clone slots are limited by your plan. When you hit the limit, POST /v2/voices returns a 403. Delete a voice you no longer need to free a slot, then retry the clone.
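One way to handle the slot limit is to free a slot and retry once. The sketch below assumes the HTTP calls are made with urllib, so a 403 surfaces as urllib.error.HTTPError. The clone and delete_voice callables are hypothetical wrappers supplied by the caller, and the choice of which voice to give up is left to the caller as well.

```python
import urllib.error


def clone_with_slot_retry(clone, delete_voice, voice_to_free):
    """Try to clone; on a 403, free one slot and retry once.

    clone: zero-argument callable that performs POST /v2/voices and
        returns the decoded response.
    delete_voice: callable taking a voice id and performing
        DELETE /v2/voices/{id}.
    voice_to_free: id of a clone the caller has chosen to give up.
    """
    try:
        return clone()
    except urllib.error.HTTPError as err:
        # Only the 403 case is treated as "out of clone slots".
        if err.code != 403:
            raise
        delete_voice(voice_to_free)
        # A second failure propagates to the caller unchanged.
        return clone()
```

Retrying only once keeps a misconfigured key or plan from deleting voices in a loop.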
Deleting a voice
DELETE /v2/voices/{id} removes a clone and frees a clone slot. Use the voice’s id (the voiceId returned at clone time).
A 200 response confirms the voice was deleted and the slot is available for a new clone.
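A short sketch of building the delete request in Python. As elsewhere, the base URL https://api.sync.so and the x-api-key header are assumptions not stated on this page, and build_delete_request is a hypothetical helper.

```python
import urllib.parse
import urllib.request

# Assumptions (not stated on this page): base URL and auth header name.
BASE_URL = "https://api.sync.so"
API_KEY_HEADER = "x-api-key"


def build_delete_request(api_key, voice_id):
    """Build a DELETE /v2/voices/{id} request for a cloned voice."""
    # Percent-encode the id so unexpected characters cannot alter the path.
    quoted = urllib.parse.quote(voice_id, safe="")
    return urllib.request.Request(
        f"{BASE_URL}/v2/voices/{quoted}",
        headers={API_KEY_HEADER: api_key},
        method="DELETE",
    )
```

Pass the result to urllib.request.urlopen; a 200 status on the response indicates the slot is free.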
The flagship flow
Clone a voice from a talking-head video, synthesize a new line in that voice with text-to-speech, then lip sync that audio onto a different video. The entire pipeline runs on one API key.
Upload the source video (if it's a local file)
Voice sources must be hosted in sync. labs storage. If your talking-head video lives locally, upload it first with the Asset Uploads flow and keep the returned assetId. If it already lives in sync. labs storage, skip ahead and use its url.
Clone the voice from the video
Call POST /v2/voices with the assetId (or url). The audio track is extracted from the video automatically — the first 2 minutes are used — and you get back a voiceId.
Synthesize a new line in the cloned voice
Pass the returned voiceId to POST /v2/tts to generate audio of a brand-new script in that voice.
Poll the TTS job until it completes, then take the resulting synthesizedAudioUrl and use it as the audio input when you lip sync onto the different video.
The end-to-end version of this pipeline in Python and TypeScript:
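Only a Python sketch is shown here, and it rests on several assumptions that this page does not confirm: the text field on the TTS request, the id and status fields on the TTS job, the GET /v2/tts/{id} polling path, the status values, and the model and input shape of the POST /v2/generate body. The names name, assetId, voiceId, and synthesizedAudioUrl come from this page. run_flagship_flow and the call transport are hypothetical; call(method, path, body) is expected to send an authenticated request and return the decoded JSON.

```python
import time


def run_flagship_flow(call, asset_id, script, target_video_url, model,
                      poll_seconds=2.0, max_polls=60, sleep=time.sleep):
    """Clone a voice, synthesize `script` in it, then request a lip sync.

    call(method, path, body) is a caller-supplied transport. Anything
    marked ASSUMED is not stated on this page; confirm it against the
    API reference before relying on it.
    """
    # 1. Clone the voice from the uploaded talking-head video.
    voice = call("POST", "/v2/voices",
                 {"name": "flagship-clone", "assetId": asset_id})
    voice_id = voice["voiceId"]

    # 2. Start a TTS job in the cloned voice.
    #    ASSUMED: the "text" request field and the "id" response field.
    job = call("POST", "/v2/tts", {"voiceId": voice_id, "text": script})
    job_id = job["id"]

    # 3. Poll until the job finishes.
    #    ASSUMED: the polling path and the "status" values.
    for _ in range(max_polls):
        job = call("GET", f"/v2/tts/{job_id}", None)
        status = job.get("status")
        if status == "COMPLETED":
            break
        if status in ("FAILED", "REJECTED"):
            raise RuntimeError(f"TTS job ended with status {status}")
        sleep(poll_seconds)
    else:
        raise TimeoutError("TTS job did not complete in time")
    audio_url = job["synthesizedAudioUrl"]

    # 4. Lip sync the new audio onto a different video.
    #    ASSUMED: the "model" field and the "input" list shape.
    return call("POST", "/v2/generate", {
        "model": model,
        "input": [
            {"type": "video", "url": target_video_url},
            {"type": "audio", "url": audio_url},
        ],
    })
```

Injecting the transport and the sleep function keeps the flow testable without network access, and makes it easy to swap in whichever HTTP client and authentication scheme the API reference specifies.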
FAQ
What sources can I clone from?
Audio and video samples hosted in sync. labs storage. For video, the audio track is extracted automatically and the first 2 minutes are used. Sources hosted outside sync. labs storage are not accepted — upload local files via POST /v2/assets/upload first and pass the returned assetId, or pass the url of an asset already in sync. labs storage.
Why did my clone return a 403?
Clone slots are limited per plan. A 403 from POST /v2/voices means you have reached your clone limit. Delete a voice you no longer need with DELETE /v2/voices/{id}, then retry; the slot is freed immediately.
Where do I use the returned voiceId?
Anywhere a voice is accepted: as voiceId in POST /v2/tts to synthesize speech, and in text inputs on POST /v2/generate. You can also retrieve it later from GET /v2/voices, where it appears as the voice’s id.
Do I pass both url and assetId?
No — provide exactly one. Use assetId when you have uploaded the sample through the Asset Uploads flow, or url when the sample already lives in sync. labs storage.
Related
- Text-to-Speech: synthesize speech with a cloned voiceId.
- Asset Uploads: upload local audio or video into sync. labs storage before cloning.
- Voices API reference: full request and response schemas for list, clone, and delete.

