The Mind Reader
How Frank Guenther turns thoughts into words
http://www.bu.edu/today/2011/the-mind-reader/
For thousands of years humans have spoken. Noam Chomsky and many other linguists argue that speech is what sets
Homo sapiens apart in the animal kingdom. “Speech,” wrote Aristotle, “is the representation of the mind.”
It is a complex process, the series of lightning-quick steps by which your thoughts form themselves into words and travel from your brain, via the tongue, lips, vocal folds, and jaw (together known as the articulators), to your listeners’ ears—and into their own brains.
Complex, but mappable. Over the course of two decades and countless experiments using functional magnetic resonance imaging (fMRI) and other methods of data collection, neuroscientist Frank Guenther has built a computer model describing just how your brain pulls off the trick of speaking.
And the information isn’t merely fascinating. Guenther (GRS’93), a Sargent College professor of speech, language and hearing sciences, believes his model will help patients suffering from apraxia (where the desire to speak is intact, but speech production is damaged), stuttering, Lou Gehrig’s disease, throat cancer, even paralysis.
“Having a detailed understanding of how a complex system works helps you fix that system when it’s broken,” says Guenther, a former engineer who left Raytheon (“I hated being a corporate cog”) to earn a PhD in cognitive and neural sciences at BU. He now directs that program. “And a model like this is what it takes to really start understanding some of these complicated communication disorders.”
Guenther’s virtual vocal tract, Directions into Velocities of Articulators (DIVA), is the field’s leading model of speech production. It is based on fMRI studies showing which groups of neurons are activated in which regions of the brain when humans speak various phonemes (the basic units of sound that make up all words). The DIVA system imitates the way we speak: moving its articulators, listening to its own output, and auto-correcting, much as we do unconsciously. When Guenther runs a fresh program, the model even goes through a babbling phase, teaching itself to produce phonemes just as human babies do.
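How such a feedback loop works can be sketched in a few lines of code. The toy script below is not DIVA itself, only an illustration of the idea with invented numbers: a simulated vocal tract “babbles” random articulator positions, learns roughly how those positions map to the sounds it hears, and then uses that learned mapping to steer toward an auditory target, correcting itself as it listens.

```python
# Toy sketch of a DIVA-style feedback loop (illustrative only; not Guenther's code).
# The "vocal tract" below is a made-up linear stand-in for the real, nonlinear system.
import numpy as np

rng = np.random.default_rng(0)

def vocal_tract(articulators):
    """Hypothetical forward map: articulator positions -> two formant frequencies (Hz)."""
    A = np.array([[600.0, -200.0],
                  [300.0,  900.0]])
    return np.array([500.0, 1500.0]) + A @ articulators

# Babbling phase: try random articulator settings and fit an inverse model
# (which articulator changes produce which changes in the heard sound).
babble_inputs = rng.uniform(-1, 1, size=(200, 2))                   # random articulator positions
babble_outputs = np.array([vocal_tract(x) for x in babble_inputs])  # the sounds "heard"
inverse_model, *_ = np.linalg.lstsq(
    babble_outputs - babble_outputs.mean(0),
    babble_inputs - babble_inputs.mean(0),
    rcond=None,
)

# Production with auditory feedback: listen, compare to the target, correct.
target = np.array([700.0, 1200.0])   # auditory target for an imaginary vowel
articulators = np.zeros(2)
for _ in range(20):
    heard = vocal_tract(articulators)                 # "listen to yourself"
    error = target - heard                            # auditory error
    articulators += 0.5 * inverse_model.T @ error     # corrective articulator movement
print("final formants:", vocal_tract(articulators).round(1), "target:", target)
```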
Guenther moved his lab to Sargent from the College of Arts & Sciences this year, when the cognitive and neural sciences department was dissolved and its activities were distributed to other BU teaching and research units. He and his colleagues continue to perfect the model, but primarily they’re focused on “using insights from the model to help us address disorders like stuttering,” Guenther says. “What we’ll do is modify the model by damaging it to mimic what’s going on in these disorders.” As they learn more about the physiological differences in the brains of stutterers, for example, his team comes closer to “having more precise hypotheses about which receptor systems a drug should target, which should lead us more quickly to a drug that doesn’t cause other behavioral problems.”
Pick a letter and these caps can probably guess which one you’re thinking of. Sensors in the caps—the red one manufactured by Frank Guenther, the gray one modified by his team from an existing product—pick up the brain’s electrical signals and transmit them to a computer screen.
Giving voice to a thought
A large part of Guenther’s work consists of devising “brain-computer interface methods for augmentative communication,” he says. The most dramatic example has been a collaboration with pioneering neuroscientist Phil Kennedy of Neural Signals, Inc., in Georgia, in which software developed by Guenther’s lab helped a paralyzed man articulate vowels with his mind.
Guenther explains the condition of a patient who is physically paralyzed but mentally sound: “In locked-in syndrome, the cortex, the main parts of the brain that the model addresses, are actually intact. What’s messed up is the motor output part of the brain. So the planning of speech goes on fine, but there’s no output.” Guenther had speculated that “if we knew what their neural signals were, how they were representing the speech, then we should be able to decode the speech. And it turned out that Kennedy and his team had implanted somebody with an electrode in that part of the brain—the speech motor cortex—but were unable to decode the signals.”
The volunteer who received the implant was Erik Ramsey, who had suffered a severe stroke following a car crash and could communicate only by answering questions with “yes” or “no” using eye movements. With a grant from the National Institutes of Health, Guenther and colleagues built Ramsey a neural prosthesis in 2008. With his electrodes hooked up to a wireless transmitter, Ramsey imagined speaking vowels, activating neurons that powered a real-time speech synthesizer (emitting a robotic “ahhhhoooooeeee…”) while the researchers watched his progress on a monitor that showed his formant plane, an X-Y axis graph representing “what we call the formant frequencies—where the tongue is, basically,” Guenther says.
“By the end of the experiment,” Guenther says, “he was hitting the auditory targets about 80 percent to 90 percent correctly.”
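The basic decoding step, mapping a couple of neural channels onto a point in the formant plane and checking whether it lands near a vowel’s auditory target, can be illustrated with a small sketch. The weights, baseline, and tolerance below are invented for illustration; the lab’s actual decoder was fit to Ramsey’s own neural recordings.

```python
# Illustrative sketch only (not the lab's decoder): map two recorded firing
# rates linearly onto the formant plane (F1, F2) and score a "hit" if the
# decoded point lands near the vowel's auditory target.
import numpy as np

# Hypothetical decoding weights and baseline, assumed here for illustration.
W = np.array([[40.0, -5.0],    # Hz of F1 per unit firing rate on each channel
              [10.0, 60.0]])   # Hz of F2 per unit firing rate on each channel
baseline = np.array([400.0, 1100.0])   # formants produced at resting activity

def decode_formants(firing_rates):
    """Map a two-channel firing-rate vector to (F1, F2) in Hz."""
    return baseline + W @ firing_rates

def hit(formants, target, tolerance_hz=150.0):
    """Did the decoded point land within tolerance of the auditory target?"""
    return float(np.linalg.norm(formants - target)) < tolerance_hz

# One example trial: an /a/-like target (values are illustrative only).
target = np.array([700.0, 1200.0])
rates = np.array([7.0, 2.0])            # firing rates on the two channels
formants = decode_formants(rates)
print(formants.round(1), "hit" if hit(formants, target) else "miss")
```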
"It won’t cost patients $50,000, and they won’t have to undergo brain surgery," says Guenther. "It’s the kind of off-the-shelf thing that they can buy and use to communicate within a day or two of practicing."
Fuzzy mind reading
There are less invasive neural-prosthetic options, which Guenther’s lab is also pursuing. Electroencephalography, or EEG, involves picking up the brain’s electrical signals through external sensors resting on the subject’s head. Guenther’s colleague Jon Brumberg, a SAR research assistant professor, is testing an EEG system in which one imagines moving one’s left or right hand or foot, thereby moving a cursor on a screen. Another method involves choosing letters by staring at them on an alphabet grid.
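A rough sense of the motor-imagery variant, classifying a short EEG window as an imagined left- or right-hand movement and nudging a cursor accordingly, is sketched below. The data, sensor grouping, and decision rule are all synthetic placeholders, not Brumberg’s system.

```python
# Toy sketch of motor-imagery cursor control (synthetic data; not the SAR lab's system).
import numpy as np

rng = np.random.default_rng(1)

def band_power(channel):
    """Crude power estimate for one channel's signal segment."""
    return float(np.mean(channel ** 2))

def classify(left_sensors, right_sensors):
    """Imagining a right-hand movement suppresses rhythms over the left motor
    cortex (and vice versa), so compare power between the two sensor groups."""
    left_power = np.mean([band_power(ch) for ch in left_sensors])
    right_power = np.mean([band_power(ch) for ch in right_sensors])
    return "right" if left_power < right_power else "left"

# Simulate one window in which the user imagines a right-hand movement:
# power drops over the left-hemisphere sensors.
left_hemisphere = rng.normal(0, 0.5, size=(4, 256))    # 4 channels, 256 samples
right_hemisphere = rng.normal(0, 1.0, size=(4, 256))

cursor_x = 0
decision = classify(left_hemisphere, right_hemisphere)
cursor_x += 1 if decision == "right" else -1
print("decoded:", decision, "-> cursor position:", cursor_x)
```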
These laborious methods have advantages, Guenther says. “First of all, it won’t cost patients $50,000, and they won’t have to undergo brain surgery. It’s the kind of off-the-shelf thing that they can buy and use to communicate”—albeit slowly—“within a day or two of practicing.”
However, because of interference from the skull, EEG signals have limited value. “Imagine an old TV antenna where you get a fuzzy picture,” Guenther says. “That’s what EEG is like. For real-time control of a synthesizer to produce conversational speech, I think the best way is going to be intracortical, intracranial, because you’re always going to get higher-resolution signals.” And Ramsey succeeded in producing vowels with only two output channels, while “the next system will have up to 96 channels,” Guenther says.
He points out that “these are the initial attempts. It’s like the first rockets that went up but didn’t even go into orbit. This is going to get more and more refined over the next decades. But it will happen. I can imagine a day when these surgeries become so routine that it’s not a big deal. Somebody might wear such a device as a necklace with a speaker on it.”
Guenther relishes his work as a pioneer at the nexus of engineering, neuroscience, and now rehabilitation. “Coming to Sargent College has been good timing for me because my earlier career was building up this model of normal human brain function,” he says, “and now that we’re starting to look at the disorders, like stuttering, we’re getting insights by talking to clinicians, and getting access to clinical populations, at Sargent.”
What hasn’t changed is Guenther’s fascination with the human brain. “It’s such an unbelievable machine. I’ve studied computers, and the brain does many things so much better than computers. And if you figure out how the brain works, you understand the mind, and you understand some of life’s great mysteries.”
Patrick L. Kennedy can be reached at plk@bu.edu.
This article originally appeared in the 2011-2012 edition of Inside Sargent.