Data-Efficient Multimodal Fusion on a Single GPU · #Machine Learning #Computer Vision #Multimodal #Paper #PDF · arxiv.org · May 2, 2024
Gong, Y., Rouditchenko, A., Liu, A. H., Harwath, D., Karlinsky, L., Kuehne, H., & Glass, J. (2022). Contrastive audio-visual masked autoencoder. arXiv preprint arXiv:2210.07839. · #Machine Learning #Multimodal #Paper #PDF · openreview.net · Jun 10, 2023
Personalizing Stable Diffusion with Determined · #Machine Learning #Model #API #Multimodal · determined.ai · Nov 1, 2022
What the new wave of machine learning libraries means for SEO, marketing · #SEO #Machine Learning #Multimodal #MUM #Large Language Models · searchengineland.com · Oct 6, 2022
An AI used medical notes to teach itself to spot disease on chest x-rays · #Multimodal #Medical #Training #Machine Learning · technologyreview.com · Sep 15, 2022
Mapping Urban Trees Across North America with the Auto Arborist Dataset · #Forestry #Machine Learning #Multimodal #Google · ai.googleblog.com · Jun 23, 2022
Google AI Introduces 'LIMoE': One Of The First Large-Scale Architecture That Processes Both Images And Text Using A Sparse Mixture Of Experts · #Machine Learning #Multimodal #Subject Matter Experts · marktechpost.com · Jun 12, 2022
Vision Language models: towards multi-modal deep learning | AI Summer · #Multimodal #Machine Learning #Large Language Models #Computer Vision #Natural Language Processing #Transformers #Attention · theaisummer.com · Mar 4, 2022
Google, Cambridge U & Alan Turing Institute Propose PolyViT: A Universal Transformer for Image, Video, and Audio Classification | Synced · #Machine Learning #Multimodal · syncedreview.com · Dec 1, 2021
Artificial intelligence that understands object relationships · #Machine Learning #Multimodal · news.mit.edu · Nov 29, 2021