AI/ML

2433 bookmarks
The Illustrated Transformer
Discussions: Hacker News (65 points, 4 comments), Reddit r/MachineLearning (29 points, 3 comments). Translations: Arabic, Chinese (Simplified) 1, Chinese (Simplified) 2, French 1, French 2, Italian, Japanese, Korean, Persian, Russian, Spanish 1, Spanish 2, Vietnamese. Watch: MIT’s Deep Learning State of the Art lecture referencing this post. Featured in courses at Stanford, Harvard, MIT, Princeton, CMU, and others.

In the previous post, we looked at Attention – a ubiquitous method in modern deep learning models. Attention is a concept that helped improve the performance of neural machine translation applications. In this post, we will look at The Transformer – a model that uses attention to boost the speed with which these models can be trained. The Transformer outperforms the Google Neural Machine Translation model on specific tasks. The biggest benefit, however, comes from how The Transformer lends itself to parallelization: Google Cloud in fact recommends The Transformer as the reference model for its Cloud TPU offering. So let’s try to break the model apart and look at how it functions.

The Transformer was proposed in the paper Attention Is All You Need. A TensorFlow implementation is available as part of the Tensor2Tensor package, and Harvard’s NLP group created a guide annotating the paper with a PyTorch implementation. In this post, we will attempt to oversimplify things a bit and introduce the concepts one by one, to hopefully make them easier to understand for people without in-depth knowledge of the subject matter.

2020 Update: I’ve created a “Narrated Transformer” video which is a gentler approach to the topic.

A High-Level Look: Let’s begin by looking at the model as a single black box. In a machine translation application, it would take a sentence in one language and output its translation in another.
·jalammar.github.io·
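The attention operation the post builds toward can be sketched in a few lines. Below is a minimal, illustrative pure-Python version of scaled dot-product attention for a single query vector; real Transformers batch this with matrix multiplies and learned query/key/value projections, so treat this only as a conceptual sketch:

```python
import math

def softmax(xs):
    # Numerically stable softmax: subtract the max before exponentiating.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, keys, values):
    """Scaled dot-product attention for one query.

    Scores each key against the query, normalizes the scores with
    softmax, and returns the weighted sum of the value vectors.
    """
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    weights = softmax(scores)
    return [sum(w * v[i] for w, v in zip(weights, values))
            for i in range(len(values[0]))]
```

A query that closely matches one key yields a weight near 1 for that key’s value; a query equidistant from all keys returns roughly the average of the values.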
Neutral.News
Neutral.News is a tool that takes any news article URL and processes its content to remove potential biases, emotionally charged language, and other subjective elements. The end result is a set of unbiased, neutral summarized points that represent the core information of the article.
·neutral.news·
Wonder Tools 🏆 Best of 2023
The most useful new sites & most-shared posts. Plus tools to watch in 2024
·wondertools.substack.com·
AI #44: Copyright Confrontation — LessWrong
The New York Times has thrown down the gauntlet, suing OpenAI and Microsoft for copyright infringement. Others are complaining about recreated images…
·lesswrong.com·
Alchemite™ Analytics machine learning platform
Alchemite™ Analytics accelerates innovation through applied machine learning. Get deep insights into real-world, sparse and noisy, experimental and process data, and reduce the number of experiments required to achieve your goals by 50–80%.
·intellegens.com·
Many options for running Mistral models in your terminal using LLM
Mistral AI is the most exciting AI research lab at the moment. They’ve now released two extremely powerful smaller Large Language Models under an Apache 2 license, and have a …
·simonwillison.net·
GroqChat
A GroqLabs AI Language Interface
·chat.groq.com·
apple/ml-ferret
Contribute to apple/ml-ferret development by creating an account on GitHub.
·github.com·
Suno AI
We are building a future where anyone can make great music. No instrument needed, just imagination. From your mind to music.
·suno.ai·