This AI Face Animator Turns a Single Snapshot Into a Convincing Talking Head

Yet another AI genie’s out of the bottle. The…geniuses over at Microsoft have unveiled their new VASA technology for “Lifelike audio-driven talking faces generated in real time.” The tech can use a single snapshot, then animate it to match whatever audio is plugged into it. “Make anyone say anything,” is how this reviewer describes it.

It has to be seen to be believed (you’ll need to turn the sound on):

While it has a slight air of videogame-cut-scene, would you have noticed these aren’t real if you weren’t told in advance?

Watching them while knowing they’re fake, it’s chilling just how dead the eyes look. I don’t know if it’s the blinking rate, or if human eyes IRL have some subtle twinkling that we’re not aware of, but the demo is downright creepy. Even worse, it’s just a matter of time before they attend to this and get it “right.”

And as this example shows, it looks like some animation jobs are going to evaporate:

A full demonstration:

Source: core77

