Google has developed an advanced, new-generation speech synthesizer. It is called Tacotron 2 and is based on neural networks.
Tacotron 2 is a system for converting text into natural-sounding speech, and it handles this task more effectively than its predecessors, Tacotron and WaveNet.
Earlier speech generation systems had a number of significant drawbacks. WaveNet, for example, produced a very harsh sound. Tacotron coped better with intonation, but could not deliver a complete “speech product”.
The Tacotron 2 algorithm, presented by a team of Google engineers including Jonathan Shen, is built on two neural networks. The written text is first converted into a special Tacotron spectrogram that lays out rhythm and stress, and the spoken words are then generated from it by a WaveNet-like network. In addition, a data collection system for training the neural networks is attached.
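The two-stage flow described above (text to spectrogram, then spectrogram to audio) can be illustrated with a rough sketch. It is not Google's code and contains no neural networks: both stages are simple stand-in functions, and every name and constant here (`text_to_spectrogram`, `spectrogram_to_waveform`, `synthesize`, `N_BANDS`, `FRAMES_PER_CHAR`, `SAMPLES_PER_FRAME`, `SAMPLE_RATE`) is a hypothetical choice made only to show how data passes from one stage to the next.

```python
import math

# All constants below are illustrative assumptions, not values from the article.
N_BANDS = 80             # frequency bands per spectrogram frame
FRAMES_PER_CHAR = 5      # frames emitted for each input character
SAMPLES_PER_FRAME = 256  # audio samples generated for each frame
SAMPLE_RATE = 24000      # samples per second


def text_to_spectrogram(text):
    """Stand-in for the first network: text -> list of spectrogram frames.

    A real model would learn rhythm and stress; here each character simply
    lights up one band for a fixed number of frames.
    """
    frames = []
    for ch in text:
        band = ord(ch) % N_BANDS
        for _ in range(FRAMES_PER_CHAR):
            frame = [0.0] * N_BANDS
            frame[band] = 1.0
            frames.append(frame)
    return frames


def spectrogram_to_waveform(frames):
    """Stand-in for the second, WaveNet-like network: frames -> audio samples.

    Each frame becomes a short sine tone whose pitch follows the strongest band.
    """
    samples = []
    for frame in frames:
        band = max(range(len(frame)), key=frame.__getitem__)
        freq = 100.0 + 50.0 * band
        for n in range(SAMPLES_PER_FRAME):
            samples.append(math.sin(2.0 * math.pi * freq * n / SAMPLE_RATE))
    return samples


def synthesize(text):
    """Chain the two stages: text -> spectrogram -> waveform."""
    return spectrogram_to_waveform(text_to_spectrogram(text))
```

Calling `synthesize("hello")` returns a list of 6400 samples (5 characters × 5 frames × 256 samples). The point of the split is that the intermediate spectrogram is the only thing the second stage ever sees, which is why the two networks can be treated as separate components.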
The audio really does resemble the speech of living people. The pace of speech sounds very convincing, and the main stumbles occur in words with unusual pronunciation. However, some people in the comments claim that the system pronounces some phrases in a “broken” way.
Samples of Tacotron 2's output can be heard on the official Google site. The company is likely to start using the technology in its products right away.
One of the main problems of the new algorithm is that the tone of speech cannot be adjusted. It is impossible to predict whether a phrase will be pronounced in an upbeat tone or a rather rude one.
© 2017 – 2019, paradox. All rights reserved.