Audio computing

134 bookmarks
Plachtaa/VALL-E-X: An open source implementation of Microsoft's VALL-E X zero-shot TTS model. Demo is available in https://plachtaa.github.io/vallex/
An open source implementation of Microsoft's VALL-E X zero-shot TTS model. Demo is available in https://plachtaa.github.io/vallex/ - Plachtaa/VALL-E-X
·github.com·
rsxdalv/tts-generation-webui: TTS Generation Web UI (Bark, MusicGen + AudioGen, Tortoise, RVC, Vocos, Demucs, SeamlessM4T, MAGNet, StyleTTS2, MMS, Stable Audio, Mars5, F5-TTS, ParlerTTS)
TTS Generation Web UI (Bark, MusicGen + AudioGen, Tortoise, RVC, Vocos, Demucs, SeamlessM4T, MAGNet, StyleTTS2, MMS, Stable Audio, Mars5, F5-TTS, ParlerTTS) - rsxdalv/tts-generation-webui
·github.com·
OpenAI.fm
An interactive demo for developers to try the new text-to-speech model in the OpenAI API
·openai.fm·
Bland AI | Automate Phone Calls with Conversational AI for Enterprises
Transform your enterprise communication with Bland AI. Automate inbound and outbound phone calls using AI that sounds human. Perfect for sales, customer support, and operations with customizable voices and seamless integrations.
·bland.ai·
Transcriptor
Convert voice to text in real time! The UI couldn't be simpler! You can edit, search and share all your transcriptions. Your transcriptions are automatically saved to iCloud. Supported languages: English, Arabic, Chinese, Dutch, French, German, Hindi, Indonesian, Italian, Japanese, Korean, Polish…
·apps.apple.com·
TTSMaker - Free Text to Speech Online
TTSMaker is a free text-to-speech tool and online text reader that converts text to speech. As an AI voice generator, it supports 100+ languages and 300+ voice styles, and its neural network makes speech sound more natural. You can listen online or download audio files in mp3 or wav format.
·ttsmaker.com·
open-mmlab/Amphion: Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audi...
·github.com·
Parrot AI - Celebrity Voice Generator
Parrot AI is the top celebrity voice generator. Create fun audio clips to roast your friends, send birthday messages, and light up your group chat!
·tryparrotai.com·
w-okada/voice-changer · GitHub
Realtime Voice Changer. Contribute to w-okada/voice-changer development by creating an account on GitHub.
·github.com·
VALL-E
VALL-E is a neural codec language model that uses discrete codes derived from an off-the-shelf neural audio codec model and treats TTS as a conditional language modeling task. VALL-E exhibits emergent in-context learning capabilities and can synthesize high-quality personalized speech with only a 3-second enrolled recording of an unseen speaker as a prompt. We also extend VALL-E and train a multilingual conditional codec language model, VALL-E X. VALL-E X can generate high-quality speech in the target language from just one speech utterance in the source language as a prompt, while preserving the unseen speaker’s voice, emotion, and acoustic environment.
·microsoft.com·
Voicery Text-to-Speech
Voicery creates natural-sounding Text-to-Speech (TTS) engines and custom brand voices for enterprise. Our solutions leverage cutting-edge deep-learning research optimized for your business use-case and technical infrastructure.
·voicery.com·