MedCoT: Medical Chain of Thought via Hierarchical Expert · arxiv.org · Jan 4, 2025
LeviTor: 3D Trajectory Oriented Image-to-Video Synthesis · ppetrichor.github.io · Jan 4, 2025
Scene Co-pilot: Procedural Text to Video Generation with Human in the Loop · arxiv.org · Dec 6, 2024
Scaling Transformers for Low-Bitrate High-Quality Speech Coding · arxiv.org · Dec 3, 2024
LARP: Tokenizing Videos 🎬 with a Learned Autoregressive Generative Prior 🚀 · hywang66.github.io · Nov 15, 2024
Retrieval-Augmented Diffusion Models for Time Series Forecasting · arxiv.org · Nov 15, 2024
Continuous Speech Synthesis using per-token Latent Diffusion · arxiv.org · Nov 15, 2024
F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with... · arxiv.org · Oct 16, 2024
Spontaneous Style Text-to-Speech Synthesis with Controllable Spontaneous Behaviors Based on Language Models · arxiv.org · Oct 6, 2024
FastTalker: Jointly Generating Speech and Conversational Gestures from Text · arxiv.org · Oct 6, 2024
Intelligent Conversational Bot for Massive Online Open Courses (MOOCs) · arxiv.org · Oct 6, 2024
MasterKey: Automated Jailbreak Across Multiple Large Language... · arxiv.org · Sep 30, 2024
SPARK: Self-supervised Personalized Real-time Monocular Face Capture · arxiv.org · Sep 25, 2024
Training Spiking Neural Networks Using Lessons From Deep Learning · browse.arxiv.org · Sep 19, 2024
Apollo: Band-sequence Modeling for High-Quality Music Restoration in Compressed Audio · cslikai.cn · Sep 19, 2024
AudioBERT: Audio Knowledge Augmented Language Model · arxiv.org · Sep 19, 2024
Prompt2Fashion: An automatically generated fashion dataset · arxiv.org · Sep 19, 2024
Startup success prediction and VC portfolio simulation using... · arxiv.org · Sep 17, 2024
LiDAR-Event Stereo Fusion with Hallucinations · eventvppstereo.github.io · Sep 16, 2024
Loopy: Taming Audio-Driven Portrait Avatar with Long-Term Motion Dependency · arxiv.org · Sep 5, 2024
Accelerating Scientific Discovery with Generative Knowledge... · arxiv.org · Aug 1, 2024
Proxemics and Social Interactions in an Instrumented Virtual... · arxiv.org · Jul 31, 2024
Florence-2: Advancing a Unified Representation for a Variety of... · arxiv.org · Jul 14, 2024
Large Motion Model for Unified Multi-Modal Motion Generation · arxiv.org · Jun 4, 2024
Simple, unified analysis of Johnson-Lindenstrauss with applications · arxiv.org · Feb 23, 2024
Spontaneous Theory of Mind for Artificial Intelligence · arxiv.org · Feb 23, 2024