LIBMoE: A Library for comprehensive benchmarking Mixture of... · #Mixture of Experts #Benchmark #Large Language Models #Paper #PDF · arxiv.org · Nov 6, 2024
Multi-Head Mixture-of-Experts · #Mixture of Experts #Machine Learning #Microsoft #Paper #PDF · arxiv.org · Apr 24, 2024
Jamba: A Hybrid Transformer-Mamba Language Model · #Large Language Models #Paper #PDF #Mixture of Experts · arxiv.org · Apr 1, 2024