Search 2024

Found 26 bookmarks

Scene Co-pilot: Procedural Text to Video Generation with Human in the Loop

·arxiv.org·Dec 6, 2024

Scaling Transformers for Low-Bitrate High-Quality Speech Coding

·arxiv.org·Dec 3, 2024

LARP: Tokenizing Videos 🎬 with a Learned Autoregressive Generative Prior 🚀

·hywang66.github.io·Nov 15, 2024

MeshRet

·abcyzj.github.io·Nov 15, 2024

Retrieval-Augmented Diffusion Models for Time Series Forecasting

·arxiv.org·Nov 15, 2024

Continuous Speech Synthesis using per-token Latent Diffusion

·arxiv.org·Nov 15, 2024

F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with...

·arxiv.org·Oct 16, 2024

Spontaneous Style Text-to-Speech Synthesis with Controllable Spontaneous Behaviors Based on Language Models

·arxiv.org·Oct 6, 2024

FastTalker: Jointly Generating Speech and Conversational Gestures from Text

·arxiv.org·Oct 6, 2024

Intelligent Conversational Bot for Massive Online Open Courses (MOOCs)

·arxiv.org·Oct 6, 2024

MasterKey: Automated Jailbreak Across Multiple Large Language...

·arxiv.org·Sep 30, 2024

SPARK: Self-supervised Personalized Real-time Monocular Face Capture

·arxiv.org·Sep 25, 2024

Training Spiking Neural Networks Using Lessons From Deep Learning

SEP #arxiv.org #2024 #SEP

·browse.arxiv.org·Sep 19, 2024

Apollo: Band-sequence Modeling for High-Quality Music Restoration in Compressed Audio

SEP #arxiv.org #2024 #SEP

·cslikai.cn·Sep 19, 2024

AudioBERT: Audio Knowledge Augmented Language Model

SEP #arxiv.org #2024 #SEP

·arxiv.org·Sep 19, 2024

Prompt2Fashion: An automatically generated fashion dataset

SEP #arxiv.org #2024 #SEP

·arxiv.org·Sep 19, 2024

Startup success prediction and VC portfolio simulation using...

SEP #arxiv.org #2024 #SEP

·arxiv.org·Sep 17, 2024

LiDAR-Event Stereo Fusion with Hallucinations

SEP #arxiv.org #2024 #SEP

·eventvppstereo.github.io·Sep 16, 2024

Loopy: Taming Audio-Driven Portrait Avatar with Long-Term Motion Dependency

SEP #arxiv.org #2024 #SEP

·arxiv.org·Sep 5, 2024

Accelerating Scientific Discovery with Generative Knowledge...

JUL #arxiv.org #2024 #JUL

·arxiv.org·Aug 1, 2024

Proxemics and Social Interactions in an Instrumented Virtual...

JUL #arxiv.org #2024 #JUL

·arxiv.org·Jul 31, 2024

Florence-2: Advancing a Unified Representation for a Variety of...

JUL #arxiv.org #2024 #JUL

·arxiv.org·Jul 14, 2024

Large Motion Model

JUN #arxiv.org #2024 #JUN #W23

·mingyuan-zhang.github.io·Jun 4, 2024

Large Motion Model for Unified Multi-Modal Motion Generation

JUN #arxiv.org #2024 #JUN #W23

·arxiv.org·Jun 4, 2024

Simple, unified analysis of Johnson-Lindenstrauss with applications

FEB #arxiv.org #W08 #2024 #FEB

·arxiv.org·Feb 23, 2024

Spontaneous Theory of Mind for Artificial Intelligence

FEB #arxiv.org #W08 #2024 #FEB

·arxiv.org·Feb 23, 2024
