mlabonne/llm-course: Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.

AI/ML
Fine-tune a Mistral-7b model with Direct Preference Optimization
Boost the performance of your supervised fine-tuned models
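The article's recipe centers on TRL's DPOTrainer. Below is a minimal sketch under assumed names (base checkpoint, toy preference data, hyperparameters); the article itself covers additional details such as data formatting that are omitted here.

```python
# Minimal DPO sketch with Hugging Face TRL. The checkpoint, dataset and
# hyperparameters are illustrative assumptions, not the article's exact recipe.
from datasets import Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import DPOTrainer

model_name = "mistralai/Mistral-7B-v0.1"  # assumed base checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token

model = AutoModelForCausalLM.from_pretrained(model_name)

# DPO trains on (prompt, chosen, rejected) triples; this toy in-memory dataset
# stands in for a real preference set, which would need the same three columns.
dataset = Dataset.from_dict({
    "prompt":   ["What is DPO?"],
    "chosen":   ["Direct Preference Optimization aligns a model to preferences."],
    "rejected": ["No idea."],
})

trainer = DPOTrainer(
    model,
    ref_model=None,  # TRL keeps a frozen copy of the policy as the reference
    args=TrainingArguments(output_dir="mistral-dpo", per_device_train_batch_size=1),
    beta=0.1,        # strength of the KL penalty toward the reference model
    train_dataset=dataset,
    tokenizer=tokenizer,
)
trainer.train()
```

The beta knob trades off reward maximization against staying close to the supervised fine-tuned reference model.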
CultriX/MistralTrix-v1 · Hugging Face
Leverage KeyBERT, HDBSCAN and Zephyr-7B-Beta to Build a Knowledge Graph
LLM-enhanced natural language processing and traditional machine learning techniques are used to extract structure and to build a knowledge…
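As background on the building blocks named in the title, here is a hedged sketch of the extract-then-cluster stage (KeyBERT for candidate concepts, sentence embeddings, HDBSCAN to merge near-duplicates into graph nodes). It is an outline under assumed defaults, not the article's exact pipeline.

```python
# Sketch of an extraction/clustering stage for knowledge-graph nodes.
# Assumes keybert, sentence-transformers and hdbscan are installed;
# the two-document corpus is a placeholder.
from keybert import KeyBERT
from sentence_transformers import SentenceTransformer
import hdbscan

docs = ["Transformers use self-attention.", "HDBSCAN clusters dense regions."]

# 1. Pull candidate concepts out of each document.
kw_model = KeyBERT()
keywords = [kw for doc in docs for kw, _score in kw_model.extract_keywords(doc)]

# 2. Embed the concepts and group near-duplicates into candidate graph nodes.
embedder = SentenceTransformer("all-MiniLM-L6-v2")
embeddings = embedder.encode(keywords)
clusterer = hdbscan.HDBSCAN(min_cluster_size=2).fit(embeddings)
print(list(zip(keywords, clusterer.labels_)))  # label -1 means noise
```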
Eyes on tokenize
I was writing a tokenizer for SMILES and came across a recent paper by the IBM Research team on reaction standardisation which contained a ...
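For context on what such a tokenizer typically looks like, below is the well-known regex-based SMILES tokenizer popularized in the Molecular Transformer literature; it is shown as general background, not necessarily the approach taken in the post or the IBM paper.

```python
# Regex-based SMILES tokenizer: splits a SMILES string into chemically
# meaningful tokens (bracket atoms, two-letter elements, bonds, ring digits).
import re

SMILES_PATTERN = re.compile(
    r"(\[[^\]]+\]|Br?|Cl?|N|O|S|P|F|I|b|c|n|o|s|p|\(|\)|\."
    r"|=|#|-|\+|\\|\/|:|~|@|\?|>|\*|\$|%\d{2}|\d)"
)

def tokenize_smiles(smiles: str) -> list[str]:
    """Split a SMILES string into tokens."""
    return SMILES_PATTERN.findall(smiles)

print(tokenize_smiles("CC(=O)Oc1ccccc1C(=O)O"))  # aspirin
# ['C', 'C', '(', '=', 'O', ')', 'O', 'c', '1', 'c', 'c', 'c', 'c', 'c', '1', ...]
```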
The Narrated Transformer Language Model
AI/ML has seen a rapid acceleration in model improvement over the last few years. The majority of state-of-the-art models in the field are based on the Transformer architecture. Examples include BERT (which, when applied to Google Search, resulted in what Google calls "one of the biggest leaps forward in the history of Search") and OpenAI's GPT-2 and GPT-3 (which are able to generate coherent text and essays).
This video by the author of the popular "Illustrated Transformer" guide introduces the Transformer architecture and its various applications. It is a visual presentation accessible to people with various levels of ML experience; a toy code sketch of the pipeline the chapters cover follows the chapter list below.
Intro (0:00)
The Architecture of the Transformer (4:18)
Model Training (7:11)
Transformer LM Component 1: FFNN (10:01)
Transformer LM Component 2: Self-Attention (12:27)
Tokenization: Words to Token Ids (14:59)
Embedding: Breathe meaning into tokens (19:42)
Projecting the Output: Turning Computation into Language (24:11)
Final Note: Visualizing Probabilities (25:51)
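As a rough companion to those chapters, here is a toy, untrained PyTorch model tracing the same pipeline (token ids, embedding, self-attention, FFNN, output projection). Every size and layer choice is an illustrative assumption, not the video's actual model.

```python
# Toy sketch of the narrated pipeline: token ids -> embeddings ->
# self-attention + FFNN -> projection back to vocabulary probabilities.
import torch
import torch.nn as nn

vocab_size, d_model = 100, 32
token_ids = torch.tensor([[5, 17, 42]])            # "Words to Token Ids"

embed = nn.Embedding(vocab_size, d_model)          # "Breathe meaning into tokens"
attn = nn.MultiheadAttention(d_model, num_heads=4, batch_first=True)
ffnn = nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.ReLU(),
                     nn.Linear(4 * d_model, d_model))
project = nn.Linear(d_model, vocab_size)           # "Turning Computation into Language"

x = embed(token_ids)
x, _ = attn(x, x, x)                               # Component 2: Self-Attention
x = ffnn(x)                                        # Component 1: FFNN
probs = project(x).softmax(dim=-1)                 # "Visualizing Probabilities"
print(probs.shape)                                 # torch.Size([1, 3, 100])
```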
The Illustrated Transformer:
https://jalammar.github.io/illustrated-transformer/
Simple transformer language model notebook:
https://github.com/jalammar/jalammar.github.io/blob/master/notebooks/Simple_Transformer_Language_Model.ipynb
Philosophers On GPT-3 (updated with replies by GPT-3):
https://dailynous.com/2020/07/30/philosophers-gpt-3/
-----
Twitter: https://twitter.com/JayAlammar
Blog: https://jalammar.github.io/
Mailing List: https://jayalammar.substack.com/
More videos by Jay:
Jay's Visual Intro to AI
https://www.youtube.com/watch?v=mSTCzNgDJy4
How GPT-3 Works - Easily Explained with Animations
https://www.youtube.com/watch?v=MQnJZuBGmSQ
Will AI Change Our Memories?
The I in LLM stands for intelligence | daniel.haxx.se
The Illustrated Transformer
In the previous post, we looked at Attention – a ubiquitous method in modern deep learning models. Attention is a concept that helped improve the performance of neural machine translation applications. In this post, we will look at The Transformer – a model that uses attention to boost the speed with which these models can be trained. The Transformer outperforms the Google Neural Machine Translation model in specific tasks. The biggest benefit, however, comes from how The Transformer lends itself to parallelization. In fact, Google Cloud recommends The Transformer as a reference model for its Cloud TPU offering. So let’s try to break the model apart and look at how it functions.
The Transformer was proposed in the paper Attention is All You Need. A TensorFlow implementation of it is available as a part of the Tensor2Tensor package. Harvard’s NLP group created a guide annotating the paper with a PyTorch implementation. In this post, we will attempt to oversimplify things a bit and introduce the concepts one by one to hopefully make them easier to understand for people without in-depth knowledge of the subject matter.
2020 Update: I’ve created a “Narrated Transformer” video which is a gentler approach to the topic:
A High-Level Look
Let’s begin by looking at the model as a single black box. In a machine translation application, it would take a sentence in one language, and output its translation in another.
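In code, that black-box view is just "sentence in, translation out". A minimal sketch with a pretrained seq2seq Transformer via the transformers library; the model choice here is an assumption for illustration, not something from the post.

```python
# The black-box view: a sentence in one language goes in, its translation
# comes out. The model choice is an illustrative assumption.
from transformers import pipeline

translator = pipeline("translation_en_to_fr", model="t5-small")
result = translator("The transformer lends itself to parallelization.")
print(result[0]["translation_text"])
```

The rest of the post then opens that box up, component by component.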
What’s up with LLMs representing XORs of arbitrary features? — LessWrong
Thanks to Clément Dumas, Nikola Jurković, Nora Belrose, Arthur Conmy, and Oam Patel for feedback. …
Scikit-Mol – Easy Embedding of RDKit into Scikit-Learn | Cheminformania
Large language models on a mobile device? – Macs in Chemistry
LLMs and Programming in the first days of 2024 - antirez
dosco/llm-client: LLMClient - A simple library to build RAG + Reasoning + Function calling Agents + LLM Proxy + Tracing + Logging
srush/MiniChain: A tiny library for coding with large language models.
Issue 41: Reflecting on 2023 and Predictions for 2024
The Holiday Issue Part II
Neutral.News
Neutral.News is a tool that takes any news article URL and processes its content to remove potential biases, emotionally charged language, and other subjective elements. The end result is a set of unbiased, neutral summarized points that represent the core information of the article.
What I Learned Using Private LLMs to Write an Undergraduate History Essay
Contents: TL;DR · Context · Writing A 1996 Essay Again in 2023, This Time With Lots More Transistors · ChatGPT 3 · Gathering the Sources · PrivateGPT · Ollama (and Llama2:70b) · Hallucinations · What I Learned. TL;DR: I used …
An Intuition for How Models like ChatGPT Work
Wonder Tools 🏆 Best of 2023
The most useful new sites & most-shared posts. Plus tools to watch in 2024
GitHub makes Copilot Chat generally available, letting devs ask questions about code | TechCrunch
Copilot Chat, a feature of GitHub's gen AI coding tool Copilot that lets devs ask natural language questions about code, is now generally available.
AI #44: Copyright Confrontation — LessWrong
The New York Times has thrown down the gauntlet, suing OpenAI and Microsoft for copyright infringement. Others are complaining about recreated images…
jasonjmcghee/rem: An open source approach to locally record and enable searching everything you view on your Apple Silicon.
Dark Visitors - A list of known AI agents on the internet
Insight into the hidden ecosystem of autonomous chatbots and data scrapers crawling across the web
AI Is Scarily Good at Guessing the Location of Random Photos - Schneier on Security
Pushing ChatGPT's Structured Data Support To Its Limits
“Function calling” with ChatGPT is ChatGPT’s best feature since ChatGPT.
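For context, function calling works by passing the model a JSON schema it can choose to "call" with structured arguments instead of prose. A minimal sketch with the OpenAI v1 Python client; the get_weather schema is a hypothetical example, not from the post.

```python
# Minimal function-calling sketch with the OpenAI v1 Python client.
# get_weather is a hypothetical function used purely for illustration.
from openai import OpenAI

client = OpenAI()
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

resp = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=tools,
)
# If the model chooses the tool, it replies with structured JSON arguments.
print(resp.choices[0].message.tool_calls[0].function.arguments)  # e.g. {"city": "Paris"}
```

The linked post pushes this schema mechanism well beyond this basic pattern.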
Alchemite™ Analytics machine learning platform
Alchemite™ Analytics accelerates innovation through applied machine learning. Get deep insights into real-world, sparse and noisy, experimental and process data, and reduce the number of experiments required to achieve your goals by 50–80%.
Apple M2 Max GPU vs Nvidia V100, P100 and T4 | by Fabrice Daniel | No…
Music-Map - Find Similar Music
Music-Map is the similar music finder that helps you find similar bands and artists to the ones you love.
Many options for running Mistral models in your terminal using LLM
Mistral AI is the most exciting AI research lab at the moment. They’ve now released two extremely powerful smaller Large Language Models under an Apache 2 license, and have a …
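Besides the terminal usage the post covers, the llm tool also exposes a Python API. A small sketch, assuming a Mistral plugin such as llm-mistral is installed and an API key is configured; the "mistral-tiny" model alias is an assumption here.

```python
# Sketch of the llm Python API. Assumes a Mistral plugin (e.g. llm-mistral)
# is installed and configured; the model alias is an assumption.
import llm

model = llm.get_model("mistral-tiny")
response = model.prompt("Summarize the Apache 2 license in one sentence.")
print(response.text())
```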