MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training
Apple Announces MM1: A Family of Multimodal LLMs Up To 30B Parameters that are SoTA in Pre-Training Metrics and Perform Competitively after Fine-Tuning
Ferret: Refer and Ground Anything Anywhere at Any Granularity
New models and developer products announced at DevDay
LLaVA
NExT-GPT: Any-to-Any Multimodal LLM
HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in HuggingFace
Stable Bias: Analyzing Societal Representations in Diffusion Models
ChatGPT is on the horizon: Could a large language model be all we need for Intelligent Transportation?
MusicLM
Point-E: A System for Generating 3D Point Clouds from Complex Prompts
Baidu Research
AI's New Creative Streak Sparks a Silicon Valley Gold Rush
What the new wave of machine learning libraries means for SEO, marketing
DreamFusion: Text-to-3D using 2D Diffusion
Is This The Death of VFX?
Foundation models: 2022’s AI paradigm shift
The AI Unbundling
DALL-E 2's Failures Are the Most Interesting Thing About It
Unified-IO: A Unified Model for Vision, Language, and Multi-Modal Tasks
Unified-IO, a new general purpose model from AI2
How AI creates photorealistic images from text
Artificial intelligence system learns concepts shared across video, audio, and text
Vision Language models: towards multi-modal deep learning
Experts Say That Soon, Almost the Entire Internet Could Be Generated by AI
CM3: A Causal Masked Multimodal Model of the Internet (Paper Explained w/ Author Interview)