Search AI/ML

Found 36 bookmarks

FastRTC: The Real-Time Communication Library for Python

We’re on a journey to advance and democratize artificial intelligence through open source and open science.

#voice #audio #python

·huggingface.co·Mar 7, 2025

📚 Convert E-books into audiobooks with Kokoro - Claudio Santini

A guide on how to convert .epub e-books into high-quality audiobooks narrated by neural text-to-speech

#audio #tools

·claudio.uk·Jan 16, 2025

GitHub - lifeiteng/OmniSenseVoice: Omni SenseVoice: High-Speed Speech Recognition with words timestamps 🗣️🎯

Omni SenseVoice: High-Speed Speech Recognition with words timestamps 🗣️🎯 - lifeiteng/OmniSenseVoice

#voice #audio

·github.com·Oct 14, 2024

Riffusion

#audio #music

·riffusion.com·Oct 9, 2024

Ted Benson

#safety #security #audio #voice

·edwardbenson.com·Oct 8, 2024

Wispr Flow | Effortless Voice Dictation

Flow makes writing quick and clear with seamless voice dictation. It is the fastest, smartest way to type with your voice.

#voice #automation #agent #audio

·flowvoice.ai·Oct 4, 2024

NotebookLM’s automatically generated podcasts are surprisingly effective

Audio Overview is a fun new feature of Google’s NotebookLM which is getting a lot of attention right now. It generates a one-off custom podcast against content you provide, where …

#RAG #audio

·simonwillison.net·Sep 30, 2024

GitHub - ictnlp/LLaMA-Omni: LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.

LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level. - ictnlp/LLaMA-Omni

#speech #audio

·github.com·Sep 21, 2024

GitHub - fixie-ai/ultravox

Contribute to fixie-ai/ultravox development by creating an account on GitHub.

#voice #audio

·github.com·Jun 10, 2024

Suno

Suno is building a future where anyone can make great music.

#music #audio

·suno.com·May 12, 2024

Cleft Notes - Turn Voice Memos Into Shared Notes

Cleft AI moves messy ideas to organized content instantly

#voice #audio

·cleftnotes.com·Mar 31, 2024

DoMusic/Hybrid-Net: Real-time audio source separation, generate lyrics, chords, beat.

Real-time audio source separation, generate lyrics, chords, beat. - DoMusic/Hybrid-Net

#music #audio

·github.com·Mar 29, 2024

Lamucal : AI-Enhanced Tabs & Chords for Any Song

AI-generated tabs, chords, lyrics, melodies. Edit, transpose, separate tracks easily. Explore over 40M songs. Also includes interactive learning, turns any music or song (YouTube, Deezer, SoundCloud, MP3) into chords. Play along with guitar, ukulele, or piano.

#audio #music

·lamucal.ai·Mar 29, 2024

Aqua Voice - Voice-only Document Editor

dictation AI

#audio

·withaqua.com·Mar 27, 2024

Simply News

AI-Powered Conversations on Current Events, Sports, and Beyond!

#news #audio

·simplynews.ai·Mar 9, 2024

EMO

EMO: Emote Portrait Alive - Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions

#image #video #audio

·humanaigc.github.io·Mar 2, 2024

GitHub - collabora/WhisperFusion: WhisperFusion builds upon the capabilities of WhisperLive and WhisperSpeech to provide a seamless conversations with an AI.

WhisperFusion builds upon the capabilities of WhisperLive and WhisperSpeech to provide a seamless conversations with an AI. - GitHub - collabora/WhisperFusion: WhisperFusion builds upon the capabil...

#audio #speech

·github.com·Jan 30, 2024

Home

#voice #audio

·research.myshell.ai·Jan 15, 2024

superwhisper

AI powered voice to text for macOS

#mac #audio #voice #language

·superwhisper.com·Jan 11, 2024

Music-Map - Find Similar Music

Music-Map is the similar music finder that helps you find similar bands and artists to the ones you love.

#audio #music

·music-map.com·Dec 26, 2023

Suno AI

We are building a future where anyone can make great music. No instrument needed, just imagination. From your mind to music.

#audio #music

·suno.ai·Dec 25, 2023

yl4579/StyleTTS2: StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models

StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models - yl4579/StyleTTS2: StyleTTS 2: Towards Human-Level Text-to-Speech ...

#speech #audio

·github.com·Nov 20, 2023

An early look at the possibilities as we experiment with AI and Music

Today we’re sharing a sneak peek at our first set of AI-related music experiments - Dream Track for Shorts and Music AI tools – built in collaboration with Google DeepMind.

#audio

·blog.youtube·Nov 19, 2023

Transforming the future of music creation

Announcing our most advanced music generation model and two new AI experiments, designed to open a new playground for creativity

#audio

·deepmind.google·Nov 19, 2023

whisper.cpp/examples/talk-llama at master · ggerganov/whisper.cpp

Port of OpenAI's Whisper model in C/C++. Contribute to ggerganov/whisper.cpp development by creating an account on GitHub.

#audio

·github.com·Nov 3, 2023

ElevenLabs - Generative AI Text to Speech & Voice Cloning

AI Voice Research Lab and AI Voice Generator. Generate high-quality spoken audio in any voice, style and language with the most powerful AI speech tool ever.

#audio

·elevenlabs.io·Aug 18, 2023

Sir Paul McCartney says artificial intelligence has enabled a 'final' Beatles song - BBC News

#music #audio

·bbc.com·Jun 13, 2023

MusicGen: Simple and Controllable Music Generation

#audio #music

·ai.honu.io·Jun 11, 2023

facebookresearch/audiocraft: Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.

#audio #music

·github.com·Jun 11, 2023

SevaSk/ecoute: Ecoute is a live transcription tool that provides real-time transcripts for both the user's microphone input (You) and the user's speakers output (Speaker) in a textbox. It also generates a suggested response using OpenAI's GPT-3.5 for the user to say based on the live transcription of the conversation.

#audio

·github.com·May 31, 2023