
Audio-Visual Face Reenactment
An identity-aware talking-head video generation method that combines a dense motion field derived from learnable keypoints with audio conditioning on the mouth region, and reports state-of-the-art results on unseen faces, languages, and voices.

