Cartesia - Poe
OpenAI on Twitter / X
BASE TTS: Lessons from building a billion-parameter Text-to-Speech model on 100K hours of data
NaturalSpeech 2: Latent Diffusion Models are Natural and Zero-Shot Speech and Singing Synthesizers
Neural Codec Language Models are Zero-Shot Text to Speech Synthesizers