Character: AI Girl
Vocal Source: Celine Dion - My Heart Will Go On. Covered by Emma Heesters
Character: AI Marilyn Monroe
Vocal Source: Unholy - Sam Smith, Kim Petras
Character: AI Storm Trooper
Vocal Source: INTRO CINEMATIC - HELLDIVERS™ 2
Character: Dr. Emmett Brown in "Back to the Future"
Vocal Source: Rick and Morty
We propose a novel audio-driven talking head method capable of simultaneously generating highly expressive facial expressions and hand gestures. Unlike existing methods that focus on generating full-body or half-body poses, we investigate the challenges of audio-driven gesture generation and identify the weak correspondence between audio features and full-body gestures as a key limitation. To address this, we redefine the task as a two-stage process. In the first stage, we generate hand poses directly from audio input, leveraging the stronger correlation between audio signals and hand movements. In the second stage, we employ a diffusion model to synthesize video frames, incorporating the hand poses generated in the first stage to produce realistic facial expressions and body movements.
The motivation behind our method. Human motion, similar to that of robots, involves planning the "end-effector" (EE), typically the hands, towards the target situation. The rest of the body then cooperates accordingly with the EE, using inverse kinematics principles.
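The inverse-kinematics intuition above can be illustrated with a toy example. The sketch below is not part of our method: it is a minimal, textbook two-link planar arm, where the function names and the link lengths `l1` and `l2` are assumptions chosen purely for illustration. Given a target hand position, it recovers the shoulder and elbow angles that place the hand there, showing how the rest of the limb can follow from the end-effector.

```python
import math


def two_link_ik(x, y, l1=0.30, l2=0.25):
    """Return (shoulder, elbow) angles in radians that place the hand at (x, y).

    Toy planar arm: l1 is the upper-arm length, l2 the forearm length
    (both hypothetical values, in metres).
    """
    d2 = x * x + y * y
    # The law of cosines gives the elbow angle from the shoulder-to-hand distance.
    cos_elbow = (d2 - l1 * l1 - l2 * l2) / (2.0 * l1 * l2)
    if not -1.0 <= cos_elbow <= 1.0:
        raise ValueError("target is out of reach for this arm")
    elbow = math.acos(cos_elbow)
    # Shoulder angle: direction to the target minus the offset caused by the bent elbow.
    shoulder = math.atan2(y, x) - math.atan2(
        l2 * math.sin(elbow), l1 + l2 * math.cos(elbow)
    )
    return shoulder, elbow


def forward_kinematics(shoulder, elbow, l1=0.30, l2=0.25):
    """Return the hand position (x, y) reached by the given joint angles."""
    hx = l1 * math.cos(shoulder) + l2 * math.cos(shoulder + elbow)
    hy = l1 * math.sin(shoulder) + l2 * math.sin(shoulder + elbow)
    return hx, hy


if __name__ == "__main__":
    angles = two_link_ik(0.40, 0.20)
    # Feeding the angles back through forward kinematics recovers the target.
    print(angles, forward_kinematics(*angles))
```

In this toy setting the hand target fully determines the arm configuration (up to the elbow-up or elbow-down choice), which mirrors why hand poses serve as a compact intermediate signal.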
By inputting a single character image and vocal audio, such as singing, our method can generate vocal avatar videos featuring not only expressive facial expressions but also a variety of body poses.
Character: Karina
Vocal Source: Yonezu Kenshi 「LOSER」┃Cover by Raon Lee
Character: AI girl
Vocal Source: Charlie Puth - Attention (Emma Heesters Cover)
Character: Elon Musk
Vocal Source: OneRepublic - Apologize (Live)
Character: Arthur Morgan from Red Dead Redemption
Vocal Source: Black Myth: Wukong - Headless Guy Singing Scene
Character: AI Queen
Vocal Source: Imagine Dragons & JID - Enemy. Covered by 东京炸市松
Character: AI girl
Vocal Source: YUQI - Freak
Our method supports vocal audio in multiple languages and brings images to life by responding to tonal variations in the audio, enabling the creation of dynamic, expressive avatars.
Character: KA KA
Vocal Source: 서울의 봄
Character: Elon Musk
Vocal Source: Musk's Speech
Character: Elon Musk
Vocal Source: Trevor's Talkshow
Character: Taylor Swift
Vocal Source: Iliza Shlesinger's Talkshow
Our method can generate complex and smooth hand movements, bringing the avatar to life with a vivid performance.
Character: Karina
Vocal Source: 明明 (Cut)
Character: Jang Won Young
Vocal Source: 想你 (Cut)
One potential application of our method is to enable designated characters to act out relevant scripts in film and game scenarios, with performances that align with their character profiles.
Character: Will Smith
Vocal Source: GTA 5
Character: AI commander
Vocal Source: Red Alert 2
Character: Donald Trump
Vocal Source: The Great Shenyang Street
Character: Donald Trump
Vocal Source: The Boys
Character: Jensen Huang
Vocal Source: House Votes to Ban TikTok & RFK’s Unexpected VP Contender
Character: AI girl
Vocal Source: Genshin
Character: Elon Musk
Vocal Source: The Wolf of Wall Street
Character: AI girl
Vocal Source: Honkai
Character: Albert Einstein
Vocal Source: Rick and Morty
Character: AI girl
Vocal Source: Genshin
Character: AI girl
Vocal Source: "Yes, one; and in this manner." by: Octavia Selena Alexandru
Comparison with Vlogger
Comparison with CyberHost
Check out our lighthearted video, created using our method. This video serves as a demonstration of potential application scenarios for our research. We hope you like it, and it will truly raise me up.