LIBMoE: A Library for comprehensive benchmarking Mixture of... · #Mixture of Experts #Benchmark #Large Language Models #Paper #PDF · arxiv.org · Nov 6, 2024
Multi-Head Mixture-of-Experts · #Mixture of Experts #Machine Learning #Microsoft #Paper #PDF · arxiv.org · Apr 24, 2024
Jamba: A Hybrid Transformer-Mamba Language Model · #Large Language Models #Paper #PDF #Mixture of Experts · arxiv.org · Apr 1, 2024
Hinton vs LeCun vs Ng vs Tegmark vs O · #Regulation #Mixture of Experts #Blog · garymarcus.substack.com · Nov 27, 2023
CS25 I Stanford Seminar - Mixture of Experts (MoE) paradigm and the Switch Transformer · #Transformers #Mixture of Experts #Machine Learning #Large Language Models · youtube.com · Jul 18, 2022