Speech synthesis: Cosmos — All that is, or was, or ever will be

Speech synthesis

cosmos 21st December 2017 at 3:22am

aka text-to-speech, TTS

Early video about TTS using Artificial neural networks (NETtalk)

WaveNet: A Generative Model for Raw Audio

"Interestingly, we found that training on many speakers made it better at modelling a single speaker than training on that speaker alone, suggesting a form of Transfer learning." Reminds me of the idea of contrasting cases

Deep Generative Models for Speech and Images

Generative Model-Based Text-to-Speech Synthesis. Novel idea (at the end): train the synthesizer to minimize listener error

https://google.github.io/tacotron/publications/tacotron2/

Baidu's Deep Voice

https://www.wikiwand.com/en/Speech_synthesis