sync. x Sieve: powering AI dubbing
sync. is partnering with Sieve to power their AI dubbing pipeline with sync-1.9.0-beta: voice cloning, culturally relevant translation, and natural lip sync, available as one API.
Sieve builds the infrastructure layer for AI video. Their work turns models into composable tools that any developer can use to understand, manipulate, and generate video. They’re SF-based, used by some of the biggest video platforms in the world, and backed by Matrix Partners, Swift Ventures, Y Combinator, and AI Grant.
We just partnered with them to power their AI video dubbing pipeline with sync-1.9.0-beta, the most natural lip sync model on the market. It's one of the steps that gets us closer to a world where language is no longer the barrier between someone and a story they want to watch.
Professional-grade dubbing with flawless lipsync
The promise of AI dubbing is millions of people accessing knowledge, entertainment, and connection without their language being the wall.
Traditional dubbing is the opposite of fast. The chain looks like this: translator, script adaptation writer, voice actor, casting director, recording engineer, sound editor, dubbing director, QA. Translated scripts get bent so the words land near the right lip shapes. Cultural references get reworked. When any link in that chain breaks, you get a “bad dub”, and anyone who grew up watching western media outside the West knows exactly what that feels like.
Below are a few examples of how lipsynced translation drives noticeably more engagement than the version without it:
President of Ukraine talking with Lex Fridman in fluent English (from Ukrainian)
Sieve’s AI dubbing pipeline gives developers the highest-quality, most flexible API to build dubbing experiences on. The headline features:
- natural voice cloning that preserves the original speaker in the target language
- precise, culturally relevant translations
- lipsync automatically applied to whoever is actively speaking in a scene
- styling controls and custom vocab for fine-grained translation control
- output modes and custom transcript inputs for human-in-the-loop workflows
“sync. has the most natural video-to-video lipsyncing models in the world, and the best part is there’s no training data required to use them. This opens up many possibilities with the types of content our workflows can target, and we’re excited to see what developers create with this new capability.” – Mokshith Voodarla, CEO of Sieve
How we move the world forward, together
AI dubbing is the first surface we're partnering on. Once you can edit the recorded word and combine it with strong video editing primitives, you can compose workflows that touch nearly every kind of content production. We're excited to keep deepening this partnership across the Sieve ecosystem.