The AI genius, who has built out his virtual BabyX from a laughing, crying head, sees a symbiotic relationship between humans and machines.
People get up to weird things in New Zealand. At the University of Auckland, if you want to run hours upon hours of experiments on a baby trapped in a high chair, that’s cool. You can even have a conversation with her surprisingly chatty disembodied head.
BabyX, the virtual creation of Mark Sagar and his researchers, looks impossibly real. The child, a 3D digital rendering based on images of Sagar’s daughter at 18 months, has rosy cheeks, warm eyes, a full head of blond hair, and a soft, sweet voice. When I visited the computer scientist’s lab last year, BabyX was stuck inside a computer but could still see me sitting in front of the screen with her “father.” To get her attention, we’d call out, “Hi, baby. Look at me, baby,” and wave our hands. When her gaze locked onto our faces, we’d hold up a book filled with words (such as “apple” or “ball”) and pictures (sheep, clocks), then ask BabyX to read the words and identify the objects. When she got an answer right, we praised her, and she smiled with confidence. When she got one wrong, chiding her would turn her teary and sullen.
If it sounds odd to encounter a virtual child that can read words from a book, it’s much more disorienting to feel a sense of fatherly pride after she nails a bunch in a row and lights up with what appears to be authentic joy. BabyX and I seemed to be having a moment, learning from each other while trading expressions and subtle cues so familiar to the human experience. That’s the feeling Sagar is after with his research and his new company Soul Machines Ltd.
The term “artificial intelligence” has become a catchall for impersonal, mysterious calculations performed behind closed doors. Huge farms of computers crank away at piles of data, using statistics to analyze our internet history, driving habits, and speech to produce targeted ads, better maps, and Apple Inc.’s Siri. This sense of AI as an amorphous shadow falling over more and more of our lives has left people from Stephen Hawking to Elon Musk skeptical of the technology, which tends to feel unnatural, somehow less than real.
Sagar is a leading figure in the camp trying to humanize AI, which he says has the potential to yield a more symbiotic relationship between humans and machines. While he wasn’t the first to this idea, his approach is unique, a synthesis of his early years as a computer scientist and later ones in the world of Hollywood special effects. The face, he’s concluded, is the key to barreling through the uncanny valley and making virtual beings feel truly lifelike. Soul Machines’ creations are unparalleled in this respect, able to wince and grin with musculature and features that move shockingly like ours. They have human voices, too, and are already contracted for use as online helpers for companies ranging from insurance providers to airlines. Soul Machines wants to produce the first wave of likable, believable virtual assistants that work as customer service agents and breathe life into hunks of plastic such as Amazon.com’s Echo and Google Inc.’s Home.
Companies with similar aspirations throughout Japan and the U.S. have produced a wide array of virtual avatars, assistants, and holograms. Many of the people behind these projects say AI systems and robots can achieve their full potential only if they become more humanlike. They need to have memories, the thinking goes, plus something resembling emotions, to propel them to seek out their own experiences.
Sagar’s approach on this front may be his most radical contribution to the field. Behind the exquisite faces he builds are unprecedented biological models and simulations. When BabyX smiles, it’s because her simulated brain has responded to stimuli by releasing a cocktail of virtual dopamine, endorphins, and serotonin into her system. This is part of Sagar’s larger quest, using AI to reverse-engineer how humans work. He wants to get to the roots of emotion, desire, and thought and impart the lessons to computers and robots, making them more like us.
“Since my 20s, I’ve had these thoughts of can a computer become intelligent, can it have consciousness, burning in my mind,” he says. “We want to build a system that not only learns for itself but that is motivated to learn and motivated to interact with the world. And so I set out with this crazy goal of trying to build a computational model of human consciousness.”
Here’s what should really freak you out: He’s getting there a lot quicker than anybody would have thought. Since last year, BabyX has, among other things, sprouted a body and learned to play the piano. They grow up so fast.
Unlike most of those working in Silicon Valley, Sagar doesn’t reflexively defer to engineering. “When scientists see the world and artists see the world, they are looking at the same thing,” he says, “using a different language and viewpoint to describe it. But it’s all true. Everything is interconnected.”
He got that idea early. When he was born in Nairobi in 1966, his father was working for the East African Railways and Harbours Corp. as a systems analyst, programming punch-card computers to run the train infrastructure. His mother, a painter, took him to game reserves every Thursday to practice drawing animals. A few years later, the family moved to New Zealand, where Sagar started helping his dad DIY around the house—fixing the TV, monkeying with the plumbing, tuning up the cars. He kept honing his drawing skills, too, paying particular attention to his mom’s portrait work. “She was able to almost capture somebody’s likeness with about three lines, getting someone’s curves just right,” he says. “It made me really conscious of the importance of the exact curves of people’s eyes and mouths and things like that.”
Sagar made use of those observations as a young man abroad, when he sketched portraits for cash on the street and in restaurants. Like many youngsters from his part of the world, he took an extended break between high school and college. For four years he crisscrossed the globe, drawing, bartending, selling door to door, even filling sandbags for the Australian army to pay his way. After returning to New Zealand, he earned a Ph.D. in engineering from the University of Auckland, then pursued postdoctoral work at MIT. In Massachusetts, he and some colleagues built digital models of the human eye that were detailed and lifelike enough for surgeons to use for practice. By 1998, Hollywood had called on Sagar to try to make computer-generated imagery, or CGI, look less CG.
His first project was a remake of The Incredible Mr. Limpet, which called for Sagar’s team to morph Jim Carrey into a talking fish capable of hunting Nazi U-boats. (Yes, really. The original starred Don Knotts.) Warner Bros. Entertainment Inc. abandoned the project after paying for $10 million in digital Carrey-fish expressions, deeming it too costly for a full-length film. Sagar, however, wasn’t ready to stop working on digital faces. For a couple of years he used the creatures as the basis of a virtual assistant startup called Life F/X and had his faces read emails aloud. The company died with the dot-com bubble, so Sagar took a job doing special effects for Sony Pictures Imageworks Inc. (Spider-Man 2). That made him well-known in the movie business and led him back to New Zealand in 2004.
At Weta Digital, the effects shop run by Lord of the Rings director and fellow Kiwi Peter Jackson, Sagar won two Academy Awards in seven years, overseeing the digital character creation for Jackson’s King Kong remake and James Cameron’s Avatar. His synthesis of engineering and artistry had provided him with an advantage in making Kong and the alien Na’vi seem real. Years of drawing portraits and crafting virtual eyeballs had given him insights into the nuances of the face that are uncommon among CGI specialists, while his effects software has made it relatively easy to film an actor going through a range of emotions and to automatically fuse the expressions into, say, a giant gorilla. “It’s these almost imperceptible movements in the eye and face that we pick up on as something having a soul behind it,” he says.
Feeling he’d solved the riddles of the face, Sagar dreamed bigger. He’d kept an eye on advancements in AI technology and saw an opportunity to marry it with his art. In 2011 he left the film business and returned to academia to see if he could go beyond replicating emotions and expressions. He wanted to get to the heart of what caused them. He wanted to start modeling humans from the inside out.
At the University of Auckland, Sagar created the Laboratory for Animate Technologies and recruited about a dozen researchers. Far from Weta—or his Life F/X office on Hollywood Boulevard, with Bob Marley’s star out front—the Animate team worked in a cramped room kept permanently hot and sticky by the heat from their powerful computers. When I saw the space last year, the engineers were surrounded by giant animated faces projected onto the walls, every pore and eyebrow hair distinctly rendered. Far from being lifeless, the faces appeared eager to strike up conversations, their muscles contracting and relaxing with each breath.
At the back corner of the lab, Sagar sat amid a clutter of notes and books such as The Archaeology of Mind and Principles of Computational Modelling in Neuroscience. It was there, on his pair of massive computer monitors, that he put BabyX through her virtual paces. The baby represented the culmination of much of the lab’s efforts, combining Sagar’s facial artistry with the latest in AI learning and speech software. Underneath that cherubic face, there was also some pioneering, and borderline horrifying, technology.
With a click of his mouse, Sagar stripped away BabyX’s skin, leaving a floating pair of eyes—bloody veins and all—attached to a finely detailed brain with a brain stem running down the back. This version of BabyX could still see out into the world and interact with us. When we showed her words, the part of the brain that deals with language glowed purple. When we praised her, the pleasure center lit up yellow. “Researchers have built lots of computational models of cognition and pieces of this, but no one has stuck them together,” he said. “This is what we’re trying to do: wire them together and put them in an animated body. We are trying to make a central nervous system for human computing.”
Sagar clicked again, and the tissue of the brain and eyes vanished to reveal an intricate picture of the neurons and synapses within BabyX’s brain—a supercomplex highway of fine lines and nodules that glowed with varying degrees of intensity as BabyX did her thing. This layer of engineering owes its existence to the years Sagar’s team spent studying and synthesizing the latest research into how the brain works. The basal ganglia connect to the amygdala, which connects to the thalamus, and so on, with their respective functions (tactile processing, reward processing, memory formation) likewise laid out. In other words, the Auckland team has built what may be the most detailed map of the human brain in existence and has used it to run a remarkable set of simulations.
BabyX isn’t just an intimate picture; she’s more like a live circuit board. Virtual hits of serotonin, oxytocin, and other chemicals can be pumped into the simulation, activating virtual neuroreceptors. You can watch in real time as BabyX’s virtual brain releases virtual dopamine, lighting up certain regions and producing a smile on her facial layer. All the parts work together through an operating system called Brain Language, which Sagar and his team invented. Since we first spoke last year, his goals haven’t gotten any more modest. “We want to know what makes us tick, what drives social learning, what is the nature of free will, what gives rise to curiosity and how does it manifest itself in the world,” he says. “There are these fantastic questions about the nature of human beings that we can try and answer now because the technology has improved so much.”
Not long after my first play date with BabyX, Sagar packed up his lab and researchers and moved them to the top floor of Auckland’s iconic Ferry Building, where he started Soul Machines to commercialize his team’s breakthroughs. By his standards, the near-term commercial applications are pretty straightforward. About 45 staffers, including artists, AI experts, language experts, and coders, are building a cast of virtual assistants. For the most part, these are refined versions of Sagar’s Hollywood work, only they’re smart enough to understand spoken language and respond to queries, with less of the creep factor characteristic of virtual people.
The first face Soul Machines revealed to the world, in February, is Nadia, a pretty white woman with pulled-back brown hair, greenish eyes, pink lipstick, and Cate Blanchett’s voice. Sagar’s team developed her for Australia’s National Disability Insurance Agency, which plans to employ her as an online aid for the country’s 500,000 people with disabilities. The hope is that those interacting with Nadia on the agency’s website will find her more personable and usable than text-based chatbots or the menu trees on its automated phone line.
The interactivity goes both ways, according to Sagar. Nadia gives a subtle nod to signal understanding and appears quizzical when confused, but she also interprets viewers’ expressions through the cameras on their PCs or mobile devices. “If you look confused, it can see that and proactively guide you,” Sagar says. “You can also still yell at these things, but they will respond in the most gracious way. People are good at dealing with irate customers and adjust their body language for the situation. We can do the same thing.”
Sagar had some help with Nadia, using International Business Machines Corp.’s Watson technology as the basis for her speech recognition. His company recruited Blanchett to spend 15 hours recording phrases that the software can turn into a much wider variety of responses to questions. Nadia has already been tested on 10,000 people, who taught her to refine her answers and the emotions she displays at certain times. The Australian government expects her to start full-time work early next year.
Soul Machines has 10 trials under way with airlines, health-care providers, and financial-services firms. In the early going, the company’s biggest test will be whether users find its software realistic enough to be as satisfying as human conversation. Even successful customer-relations experiences with chatbots, ones where the bot gives the right answer, tend to leave people dissatisfied because they feel like they’ve been pawned off on an inferior being.
For now, Sagar’s team has been developing each of its first few virtual assistants in a one-off fashion, a bit like a consulting company. “Most of our clients today see their first digital employee as an extension of their brand,” says Chief Business Officer Greg Cross. “They are going through a design process that is similar to selecting a spokesperson for your TV advertising campaign.”
To make its process easier to repeat, Soul Machines is writing character creation software that reduces development to a series of simple menus. By sliding a few dials, Sagar can transform a young, thin avatar into an older, chubbier one and alter complexion and other features. Each menu-built result looks as lifelike as a character that a film production or video game developer might spend millions of dollars and many months to create. The company has paid actors to record hundreds of hours of monologue, assembling an audio library it can use to give voice to characters such as a troll meant for a client in Scandinavia or an animated, anthropomorphic strawberry that’ll be used on an educational site for children.
As the technology matures, Cross expects it to travel further from the PC screen. Automakers are already thinking about putting the characters on screens in their self-driving cars, where they would field questions from riders. Similarly, Amazon, Apple, and Google parent Alphabet will likely want faces to go with their voice-activated virtual assistants. “We’re also exploring the idea of creating a digital celebrity,” Cross says. “What if you could take one of the top recording artists or sports people and build a digital version that fans could interact with in a very emotionally intelligent way?”
Like Cross, Sagar often appears oblivious that his pitch might sound creepy. In August, when I pay a visit to Soul Machines to see Sagar’s latest creations, he’s wearing a T-shirt that depicts two fetuses sharing a womb, arranged head-to-toe in a kind of yin-yang pose. One of the fetuses is human; the other has a distinctly artificial brain filled with circuitry. He wanted to make this design the company logo. The investors who gave him $7.5 million last November said no.
Sagar comes off like a visionary academic, at times almost possessed. Ask a basic question, and you’re likely to get an impassioned 30-minute response that weaves in AI, art, psychology, and Plato. It’s hard to imagine this man holding court with a car insurer, trying to sell a suit-wearing executive on a virtual avatar, without things getting weird. But Sagar says he relishes the commercial part of the business, because it’s helping him better understand what people like and don’t like about his avatars and zero in on the finer details of interpersonal interactions.
Version 5.0 of BabyX has gone far beyond the original floating head. BabyX now has a full body that sits in a high chair, legs bobbing back and forth while her hands look for something to do. For the next part, you’ll want to sit down and grab a pacifier, too.
Sagar’s software allows him to place a virtual pane of glass in front of BabyX. Onto this glass, he can project anything, including an internet browser. This means Sagar can present a piano keyboard from a site such as Virtual Piano or a drawing pad from Sketch.IO in front of BabyX to see what happens. It turns out she does what any other child would: She tries to smack her hands against the keyboard or scratch out a shabby drawing.
What compels BabyX to hit the keys? Well, when one of her hands nudges against a piano key, it produces a sound that the software turns into a waveform and feeds into her biological simulation. The software then triggers a signal within BabyX’s auditory system, mimicking the hairs that would vibrate in a real baby’s cochlea. Separately, the system sets off virtual touch receptors in her fingers and releases a dose of digital dopamine in her simulated brain. “The first time this happens, it’s a huge novelty because the baby has not had this reaction before when it touched something,” Sagar says. “We are simulating the feeling of discovery. That changes the plasticity of the sensory motor neurons, which allows for learning to happen at that moment.”
Does the baby get bored of the piano like your non-Mozart baby? Yes, indeed. As she bangs away at the keys, the amount of dopamine being simulated within the brain receptors decreases, and BabyX starts to ignore the keyboard.
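For readers who think in code, the novelty-then-boredom loop described above can be approximated with a toy model. This is only an illustrative sketch, not Soul Machines' software or anything from Brain Language: the class name, the halving decay rate, and the boredom threshold are all assumptions invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class NoveltyReward:
    """Toy habituation model: the reward shrinks each time a stimulus repeats.

    All parameters are hypothetical values chosen for illustration.
    """
    decay: float = 0.5              # assumed fraction of reward kept per repeat
    boredom_threshold: float = 0.1  # assumed level below which a stimulus is ignored
    exposures: dict = field(default_factory=dict)

    def dopamine(self, stimulus: str) -> float:
        """Return the simulated reward for a stimulus and record the exposure."""
        seen = self.exposures.get(stimulus, 0)
        self.exposures[stimulus] = seen + 1
        # First exposure yields the full reward (decay ** 0 == 1.0).
        return self.decay ** seen

    def is_bored(self, stimulus: str) -> bool:
        """True once the next reward would fall below the boredom threshold."""
        seen = self.exposures.get(stimulus, 0)
        return self.decay ** seen < self.boredom_threshold


baby = NoveltyReward()
rewards = [baby.dopamine("piano_key") for _ in range(4)]
print(rewards)                      # [1.0, 0.5, 0.25, 0.125]
print(baby.is_bored("piano_key"))   # True: the next reward would be 0.0625
print(baby.dopamine("drawing_pad")) # 1.0: a new stimulus is novel again
```

In this simplified picture, the first key press pays out in full, each repeat pays half as much, and after four presses the payoff is too small to hold the baby's attention, while a fresh stimulus such as the drawing pad starts over at full strength.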
Sagar has teamed up with Annette Henderson, a psychologist who runs a baby research lab in Auckland, to advance the technology. Henderson has filmed hundreds of hours of interactions between babies and caregivers while performing different experiments, such as teaching a baby a new word or ignoring him for a few minutes. The children’s response data—laughs, cries, hand movements, shifts in posture—are being digitized to create a better-informed behavioral model. “We know the exact movements, microexpressions, and responses,” Sagar says. “When we build our next models for BabyX, we should be able to generate this same behavior.”
In about 18 months, Henderson plans to use an upgraded version of BabyX to run experiments with caregivers and other children. She sees the virtual baby as a way to test new theories in previously unimaginable ways, by altering thousands of variables at will—what if a baby doesn’t smile, what if she won’t hold your gaze, and so on. Studying a virtual child’s response to stimuli, she says, may help researchers understand how to better engage with flesh-and-blood children who aren’t particularly social.
In return, Sagar gets to advance his quest to understand human nature. “We can record the mother interacting with a virtual baby and keep adding features to BabyX until she is so lifelike that we get a natural interaction,” he says. “At that point, we have achieved our goal.”
And then what?
Many of the world’s leading brain researchers have come away impressed by the types of simulations Sagar and other AI optimists are building. “I spend more and more time with these guys,” says Gary Lynch, a professor of neurobiology at the University of California at Irvine. “This is all real. It’s not an academic enterprise any longer.” The problem with work like Sagar’s, as Lynch sees it, is that the end result—a truly conscious virtual baby—is so complex and unique that it’s not a useful mirror of human behavior. “It will do something that nobody ever dreamed of,” he says. “It will head out the door and say, ‘Goodbye. I have stuff I want to do.’ ”
Other researchers caution that Sagar could be misleading people about the state of the technology through his cute, intricate faces. “Westerners tend to want to anthropomorphize these things, and we can get very enchanted by them,” says Ken Goldberg, a professor of industrial engineering and operations research at the University of California at Berkeley. “If you make it look human and act human, you almost have a double responsibility to be clear about its limitations.” He applauds Sagar for doing this type of research but doesn’t want people to get false hope about the near-term benefits of such technology. Sagar has a tendency to talk as though BabyX can already do all the things he’s dreaming.
While it seems reasonable to assume Sagar’s endgame is a world that ties humans inextricably with machines, he often spends weekends in the wilderness to get away from computers, and he won’t let his kids use the internet at night. This isn’t exactly the type of behavior one might expect from someone pushing AI as fast as he can into the unknown and hoping for the best. During one of our conversations, I point out that tales such as Frankenstein don’t usually end up well for the humans. “We’re not digging up dead bodies,” he says, neatly dodging the real moral of the playing-God story.
You don’t have to be paranoid to believe the rise of AI could turn out quite badly for humans. The computers might start making decisions for themselves, and those decisions could include things detrimental to mankind. One minute, BabyX is eating a virtual pudding cup off a website; the next, she’s sold your house for personal amusement or decided she should be in charge.
Sagar remains sanguine about the lessons AI can learn from us and vice versa. “We’re searching for the basis of things like cooperation, which is the most powerful force in human nature,” he says. As he sees it, an intelligent robot that he’s taught cooperation will be easier for humans to work with and relate to and less likely to enslave us or harvest our bodies for energy. “If we are really going to take advantage of AI, we’re going to need to learn to cooperate with the machines,” he says. “The future is a movie. We can make it dystopian or utopian.” Let’s all pray for a heartwarming comedy.