FlashSpeech: Efficient Zero-Shot Speech Synthesis · #Generative Speech #Machine Learning #Audio #Paper #PDF · arxiv.org · Apr 24, 2024
Proactive Detection of Voice Cloning with Localized Watermarking · #Watermark #Audio #Authentication #Paper #PDF · arxiv.org · Jan 31, 2024
MERT: Acoustic Music Understanding Model with Large-Scale Self-supervised Training · #Machine Learning #Audio #Paper #PDF · arxiv.org · Jun 4, 2023
NaturalSpeech 2: Latent Diffusion Models are Natural and Zero-Shot Speech and Singing Synthesizers · #Text-to-Speech #Microsoft #Generative Models #Audio #Paper #PDF · arxiv.org · Apr 30, 2023