voxforge.org - Free Speech... Recognition (Linux, Windows and Mac)
Audio computing
VALL-E
VALL-E is a neural codec language model trained on discrete codes derived from an off-the-shelf neural audio codec model; it treats TTS as a conditional language modeling task rather than as continuous signal regression. VALL-E exhibits emergent in-context learning capabilities and can synthesize high-quality personalized speech using only a 3-second enrolled recording of an unseen speaker as a prompt. We also extend VALL-E and train a multilingual conditional codec language model, VALL-E X. VALL-E X can generate high-quality speech in the target language from just one speech utterance in the source language as a prompt, while preserving the unseen speaker’s voice, emotion, and acoustic environment.
WhisperSpeech/WhisperSpeech · Hugging Face
huggingface/parler-tts · GitHub
Inference and training library for high-quality TTS models. - huggingface/parler-tts
DiTTo-TTS
coqui-ai/TTS · GitHub
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production - coqui-ai/TTS
Voicery Text-to-Speech
Voicery creates natural-sounding Text-to-Speech (TTS) engines and custom brand voices for enterprise. Our solutions leverage cutting-edge deep-learning research optimized for your business use-case and technical infrastructure.
fishaudio/fish-speech: Brand new TTS solution
Brand new TTS solution.
Audiomatic
Audio translated automatically using AI voice cloning technology.
Parlatype
GNOME audio player for transcription
bigWav.app - Private audio transcription & annotation
bigWav: free and private audio transcription and annotation
Lovo - AI Voice Generator: Realistic Text to Speech & Voice Cloning
Award-winning AI Voice Generator and text to speech software with 500+ voices in 100 languages. Realistic AI Voices with Online Video Editor. Clone your own voice.
mkiol/dsnote: Speech Note Linux app
Speech Note Linux app. Note taking, reading and translating with offline Speech to Text, Text to Speech and Machine translation. - mkiol/dsnote
Speech to Note - Voice to Text, Note Speech & Speak Writer Solution
Explore Speech to Note for top-notch voice to text, note speech, and speak writer solutions. Our AI technology powered by GPT-4o ensures easy conversion of your voice into written notes.
ihuguet/picotts · GitHub
Pico TTS: text to speech voice synthesizer from SVox, included in Android AOSP - ihuguet/picotts
IBM Watson - Text to Speech
Watson Speech to Text is an API that transcribes speech to text in a variety of languages. It’s available as SaaS or for self-hosting.
Google Cloud - Text-to-Speech AI
Turn text into natural-sounding speech in 220+ voices across 40+ languages and variants with an API powered by Google’s machine learning technology.
Free German AI TTS voice with Thorsten-Voice
The Thorsten-Voice project provides free, AI-generated German text-to-speech (TTS) voices that work without an internet connection.
thorstenMueller/Thorsten-Voice · GitHub
Thorsten-Voice: a free-to-use, offline, high-quality German TTS voice that should be available for every project without any licensing struggles. - thorstenMueller/Thorsten-Voice
Open Voices
Home page of OVOS
nvaccess/nvda: NVDA, the free and open source Screen Reader for Microsoft Windows
NVDA, the free and open source Screen Reader for Microsoft Windows - nvaccess/nvda
evuraan/mintPiper: Make Linux speak what's on the screen: clearly and securely.
Make Linux speak what's on the screen: clearly and securely. - evuraan/mintPiper
davidacm/NVDA-IBMTTS-Driver · GitHub
This project is aimed at developing and maintaining the NVDA IBMTTS driver. IBMTTS is a synthesizer similar to Eloquence. Please send your ideas and contributions here! - davidacm/NVDA-IBMTTS-Driver
mush42/sonata-nvda · GitHub
This add-on implements a speech synthesizer driver for NVDA using neural TTS models. It supports Piper - mush42/sonata-nvda
CrashXBETAX/Text_To_Speech_Live_WinUI3_Public · GitHub
RHVoice/RHVoice: a free and open source speech synthesizer for Russian and other languages
a free and open source speech synthesizer for Russian and other languages - RHVoice/RHVoice
RHVoice.org
muflone/gespeaker · GitHub
A text to speech GTK+ front-end for eSpeak and mbrola to play a text in many languages with settings for voice, pitch, volume and speed - muflone/gespeaker
marytts/marytts · GitHub
MARY TTS -- an open-source, multilingual text-to-speech synthesis system written in pure Java - marytts/marytts
brailcom/speechd: Common high-level interface to speech synthesis
Common high-level interface to speech synthesis.