NotebookLM’s automatically generated podcasts are surprisingly effective
Audio Overview is a fun new feature of Google’s NotebookLM which is getting a lot of attention right now. It generates a one-off custom podcast against content you provide, where …
GitHub - ictnlp/LLaMA-Omni: LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.
LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.
AI-generated Tabs, chords, lyrics, melodies. Edit, transpose, separate tracks easily. Explore over 40M songs. Also includes interactive learning, turns any music or song (YouTube, Deezer, SoundCloud, MP3) into chords. Play along with guitar, ukulele, or piano.
GitHub - collabora/WhisperFusion: WhisperFusion builds upon the capabilities of WhisperLive and WhisperSpeech to provide seamless conversations with an AI.
WhisperFusion builds upon the capabilities of WhisperLive and WhisperSpeech to provide seamless conversations with an AI.
yl4579/StyleTTS2: StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
An early look at the possibilities as we experiment with AI and Music
Today we’re sharing a sneak peek at our first set of AI-related music experiments, Dream Track for Shorts and Music AI tools, built in collaboration with Google DeepMind.
ElevenLabs - Generative AI Text to Speech & Voice Cloning
AI Voice Research Lab and AI Voice Generator. Generate high-quality spoken audio in any voice, style and language with the most powerful AI speech tool ever.
facebookresearch/audiocraft: Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.
SevaSk/ecoute: Ecoute is a live transcription tool that provides real-time transcripts for both the user's microphone input (You) and the user's speakers output (Speaker) in a textbox. It also generates a suggested response using OpenAI's GPT-3.5 for the user to say based on the live transcription of the conversation.