Phrase archive restores lost voices
Staff Writer
“I bake sweet-chestnut bread,” a volunteer says into a microphone.
“I no longer understand what’s going on,” she carefully reads out next.
The volunteer, Kotobuki Hayashi, 56, is reading short lines of text popping up on a computer screen in front of her. The phrases have been taken randomly from newspapers and books. Studio staff check for any misreads.
In an hourlong session, Hayashi gets through about 150 phrases that will be used to create synthesized voices for people with amyotrophic lateral sclerosis, also known as ALS or Lou Gehrig’s disease, who can no longer speak.
“I thought it would be great if I could help those people simply by recording my voice,” Hayashi said. “Also, it’s exciting to imagine that fragments of my speech will be used to reconstruct voices.”
Hayashi is one of around 200 volunteers who have participated in a so-called voice bank project that kicked off in November last year. The goal is to reconstruct the voices of ALS sufferers by creating synthetic ones using an archive of other people’s voices.
The technology was developed by a team of researchers led by Junichi Yamagishi, an associate professor at the National Institute of Informatics who specializes in speech synthesis.
ALS is a progressive neurological disease that attacks the nervous system and paralyzes muscles. As it develops, patients lose the ability to speak, and in some cases can lose their voices within six months of diagnosis, experts say. One of the world’s best-known people with ALS is physicist Stephen Hawking.
According to the Japan Intractable Diseases Information Center, there are some 9,000 people with ALS in Japan. Many communicate by typing into a personal computer or tablet PC, using whatever muscles they can still move, and having a synthesized voice read the text aloud. But the voice sounds impersonal and robotic.
“Patients very much needed to communicate in their own natural voice. But no such system existed that could provide a personalized synthesized voice for them,” Yamagishi said in a recent interview in Tokyo with The Japan Times.
Yamagishi and his team set out to re-create patients’ own voices, initially in trials at Britain’s Edinburgh University in 2011.
The British project has been running for three years and has seen about 600 volunteers take part, with 10 patients using the software. It is considered to be in its evaluation phase.
In Japan, the project is still in its initial phase and Yamagishi needs to collect as many voices as possible. The recording is done at rented studios in Tokyo, Osaka and Nagoya.
He anticipates it will take one or two years to develop a synthesizer but thinks it could help people with ALS and possibly other disorders.
Yamagishi’s system analyzes the recordings, processing them by using statistical models of the components of speech, and produces a basic voice model for each age group, sex and dialect. This model then serves as the framework for synthesizing the patient’s voice.
“It’s like transplanting part of the volunteers’ voices,” he said. “We find donors whose background matches the patients’ voices, such as in terms of age and home town. We then transplant elements of the donors’ voices, such as the speed with which they move their tongues,” to reconstruct the patients’ voices.
Some companies in Japan already conduct personalized voice synthesis by cutting and pasting recordings of the patients’ own voices. However, this requires hours of recording and is physically impossible for some ALS patients or for those who are already mute.
Yamagishi’s technology requires just a 5-minute recording of a patient’s voice. Even if some of the words cannot be pronounced, the system can draw on examples from volunteers to guess at the patient’s original pronunciation.
It also helps to hear the voices of the patients’ siblings, Yamagishi said, since close relatives often have a similar accent or tone.
“There is no particular cure for ALS and it’s really hard for their families to see (their loved ones) develop the illness,” Yamagishi said. It’s important for families to have something to help improve the quality of the patients’ lives even a little, and the voice bank may be one of those things, he said.
“Now we are conducting a large-scale demonstration experiment . . . I want many volunteers from all the regions across Japan,” he said.