GitHub - yl4579/StyleTTS2: StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
GitHub - huakunyang/SummerTTS: SummerTTS is a standalone C++ speech synthesis (TTS) project for Chinese and English. It runs locally without a network connection, has no additional dependencies, and is ready for Chinese and English speech synthesis after a single build step.
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audi...
VanillaVoice - Turn Text into Human-Sounding Speech
This free text-to-speech tool reads any text aloud in a natural, human-sounding voice. You can choose from several voices, including male, female, or child.
SevaSk/ecoute: Ecoute is a live transcription tool that provides real-time transcripts for both the user's microphone input (You) and the user's speaker output (Speaker) in a textbox. It also uses OpenAI's GPT-3.5 to generate a suggested response for the user to say, based on the live transcription of the conversation.
We developed an online text-to-speech synthesis tool that converts text into natural, fluent human speech. It offers 100+ voices, supports multiple languages and dialects as well as mixed Chinese-English text, and lets you configure audio parameters flexibly. It is widely used in news reading, travel navigation, smart hardware, and notification broadcasts. It can also convert text into MP3 files that you can download and save.