Deep Learning
Learn AI
The Illustrated Transformer
In the previous post, we looked at Attention – a ubiquitous method in modern deep learning models. Attention is a concept that helped improve the performance of neural machine translation applications. In this post, we will look at The Transformer – a model that uses attention to boost the speed with which these models can be trained. The Transformer outperforms the Google Neural Machine Translation model in specific tasks. The biggest benefit, however, comes from how The Transformer lends itself to parallelization. Google Cloud in fact recommends The Transformer as the reference model for its Cloud TPU offering. So let’s try to break the model apart and look at how it functions.
The Transformer was proposed in the paper Attention is All You Need. A TensorFlow implementation of it is available as part of the Tensor2Tensor package. Harvard’s NLP group created a guide annotating the paper with a PyTorch implementation. In this post, we will attempt to oversimplify things a bit and introduce the concepts one by one, to hopefully make them easier to understand for people without in-depth knowledge of the subject matter.
2020 Update: I’ve created a “Narrated Transformer” video, which takes a gentler approach to the topic.
A High-Level Look
Let’s begin by looking at the model as a single black box. In a machine translation application, it would take a sentence in one language, and output its translation in another.
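As a rough illustration of this black-box view, here is a minimal sketch using PyTorch’s nn.Transformer rather than the original Tensor2Tensor code; the tensor shapes and dimensions below are placeholders chosen only to show the interface (already-embedded token vectors in, one vector per target position out).

```python
import torch
import torch.nn as nn

# Black-box view: a Transformer maps a source sequence to a target sequence.
# nn.Transformer's default layout is (seq_len, batch, d_model).
model = nn.Transformer(d_model=512, nhead=8,
                       num_encoder_layers=6, num_decoder_layers=6)

src = torch.rand(10, 32, 512)  # 10 source tokens, batch of 32, already embedded
tgt = torch.rand(9, 32, 512)   # 9 target tokens generated so far

out = model(src, tgt)          # (9, 32, 512): one 512-d vector per target position
print(out.shape)
```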
Embeddings - LlamaIndex 🦙 0.9.7
ivanfioravanti/chatbot-ollama: Chatbot Ollama is an open source chat UI for Ollama.
Vector Databases and How to Pick the Right One | by Ketan | in FabricHQ - Freedium
In the 1960s when the need for efficient data management arose with the growing volume of...
What is a Vector Database? - Zilliz Vector database blog
A vector database is a dedicated solution for storing, indexing and searching across a massive dataset of unstructured data used in AI applications.
What is a Vector Database? (2021) | Hacker News
Python Vector Databases and Vector Indexes: Architecting LLM Apps - KDnuggets
Vector databases enable fast similarity search and scale across data points. For LLM apps, vector indexes can simplify architecture over full vector databases by attaching vectors to existing storage. Choosing indexes vs databases depends on specialized needs, existing infrastructure, and broader enterprise requirements.
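To make the “vector index attached to existing storage” idea concrete, here is a minimal sketch of brute-force cosine-similarity search over an in-memory set of vectors; the embeddings, ids, and dimensions are made up, and a real index would use an approximate-nearest-neighbour structure instead of a full scan.

```python
import numpy as np

# Toy vector index: keep embeddings alongside keys into existing storage
# and answer nearest-neighbour queries by brute-force cosine similarity.
vectors = np.random.rand(1000, 384).astype("float32")   # pretend embeddings
ids = [f"doc-{i}" for i in range(1000)]                  # keys into existing records

def search(query: np.ndarray, k: int = 5):
    # Normalise so the dot product equals cosine similarity.
    v = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    q = query / np.linalg.norm(query)
    scores = v @ q
    top = np.argsort(-scores)[:k]
    return [(ids[i], float(scores[i])) for i in top]

print(search(np.random.rand(384)))
```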
Implementing Vector Database for AI
Contents: What are vector databases · Why are vector databases important to AI · Core concepts of a vector database · Factors to consider when choosing a vector database · Popular vector databases for your consideration · Step-by-step guide to implementing a vector database (Step 1: Installing Milvus; Step 2: Creating a Milvus client; Step 3: Creating a collection; Step 4: Inserting data)
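A condensed sketch of those steps using the pymilvus client; the collection name, vector dimension, and rows are placeholders, and the exact client arguments may differ between Milvus versions, so treat this as an outline of the guide rather than its code.

```python
from pymilvus import MilvusClient

# Step 2: create a client (here pointing at a local Milvus Lite database file)
client = MilvusClient("milvus_demo.db")

# Step 3: create a collection sized for 384-dimensional embeddings
client.create_collection(collection_name="demo_collection", dimension=384)

# Step 4: insert a few records; each row carries an id, a vector, and optional metadata
rows = [
    {"id": i, "vector": [0.0] * 384, "text": f"document {i}"}
    for i in range(3)
]
client.insert(collection_name="demo_collection", data=rows)
```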
Introducing Vector-Storage: A Lightweight Vector Database for the Browser | by Nitai Aharoni 🎾 - Freedium
In the world of natural language processing (NLP), vector embeddings have become a powerful tool...
Storage Limitations: The browser's local storage has a size limit of approximately 5MB, which may limit the number of document vectors that can be stored. Vector Storage addresses this by implementing an LRU mechanism to manage storage size.
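The LRU idea itself is independent of the browser: evict the least recently used vectors once a size budget is exceeded. A minimal Python sketch of that policy, with the ~5 MB budget and float32 entry sizes chosen purely for illustration:

```python
from collections import OrderedDict

class LRUVectorStore:
    """Keep the most recently used vectors, evicting old ones past a byte budget."""

    def __init__(self, max_bytes: int = 5 * 1024 * 1024):  # ~5 MB, like localStorage
        self.max_bytes = max_bytes
        self.used = 0
        self.items = OrderedDict()  # key -> (vector, size_in_bytes)

    def put(self, key, vector):
        size = len(vector) * 4  # assume float32 components
        if key in self.items:
            self.used -= self.items.pop(key)[1]
        self.items[key] = (vector, size)
        self.used += size
        while self.used > self.max_bytes:
            _, (_, evicted) = self.items.popitem(last=False)  # drop least recently used
            self.used -= evicted

    def get(self, key):
        vector, size = self.items.pop(key)   # re-insert to mark as recently used
        self.items[key] = (vector, size)
        return vector
```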
Top 10 best vector databases and libraries
A vector database is a type of database that stores data as high-dimensional vectors, which are mathematical representations of features or attributes.
Elasticsearch
Encoding data with Transformers | by Michelangiolo Mazzeschi | in Towards Data Science - Freedium
How to use transformer-based technology to perform data encoding
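As one common way to do transformer-based encoding, here is a short sketch with the sentence-transformers library; the model name is a widely used example and the sentences are placeholders, not the article’s own code.

```python
from sentence_transformers import SentenceTransformer

# Encode text into fixed-size vectors with a pretrained transformer encoder.
model = SentenceTransformer("all-MiniLM-L6-v2")

sentences = ["Vector databases store embeddings.",
             "Transformers can encode text into vectors."]
embeddings = model.encode(sentences)

print(embeddings.shape)  # (2, 384) for this particular model
```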
Leveraging LLMs in your Obsidian Notes · Ollama Blog
This post walks through how you could incorporate a local LLM using Ollama in Obsidian, or potentially any note taking tool.
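The integration boils down to calling Ollama’s local HTTP API from the note-taking tool. A minimal sketch, assuming Ollama is already running on its default port and the model has been pulled; the model name and prompt are placeholders.

```python
import requests

# Ollama serves a local HTTP API on port 11434 by default.
response = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "mistral",                  # any model already pulled locally
        "prompt": "Summarize this note: ...",
        "stream": False,                     # return one JSON object instead of a stream
    },
)
print(response.json()["response"])
```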
Ollama
Get up and running with large language models, locally.
Text to Speech & AI Voice Generator
Create premium AI voices in any style and language with the most powerful AI speech tool ever. Clone your voice and generate voiceovers in minutes with our AI voice generator.
Mistral 7B Tutorial: Step-by-Step Guide for Using Mistral 7B
The tutorial covers accessing, quantizing, fine-tuning, merging, and saving this powerful 7.3 billion parameter open-source language model.
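A minimal sketch of the “accessing and quantizing” part using Hugging Face transformers; 4-bit loading requires the bitsandbytes and accelerate packages plus a GPU, and these arguments are one common recipe rather than the tutorial’s exact code.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "mistralai/Mistral-7B-v0.1"

# Load the 7.3B-parameter model in 4-bit so it fits on a single consumer GPU.
quant_config = BitsAndBytesConfig(load_in_4bit=True,
                                  bnb_4bit_compute_dtype=torch.float16)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id,
                                             quantization_config=quant_config,
                                             device_map="auto")

inputs = tokenizer("The Transformer architecture", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```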
Bring Your Own Data to LLMs Using LangChain & LlamaIndex
Unlocking the Power of Large Language Models — GenAI, LLMs, RAG — ChatGPT
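The core “bring your own data” loop in LlamaIndex (as of the 0.9.x line referenced above) fits in a few lines; the data directory and query below are placeholders, and the defaults assume an OpenAI API key unless another LLM and embedding model are configured.

```python
from llama_index import SimpleDirectoryReader, VectorStoreIndex

# Load your own documents, embed them into a vector index, and query over them.
documents = SimpleDirectoryReader("data").load_data()   # any folder of text/PDF files
index = VectorStoreIndex.from_documents(documents)      # uses default embeddings/LLM

query_engine = index.as_query_engine()
print(query_engine.query("What do these documents say about transformers?"))
```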
Ask Your Web Pages Using Mistral-7b & LangChain
Chat with Web Pages — Mistral-7b, Hugging Face, LangChain, ChromaDB
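The web-page variant follows the same retrieval pattern with LangChain and ChromaDB. A hedged sketch of the indexing half (imports follow the pre-0.1 langchain layout, and the URL and question are placeholders); the retrieved chunks are then passed to Mistral-7B as context for generation, which is omitted here.

```python
from langchain.document_loaders import WebBaseLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import Chroma

# 1. Load and chunk the web page.
docs = WebBaseLoader("https://example.com/article").load()
chunks = RecursiveCharacterTextSplitter(chunk_size=1000,
                                        chunk_overlap=100).split_documents(docs)

# 2. Embed the chunks and store them in a local Chroma collection.
vectordb = Chroma.from_documents(chunks, HuggingFaceEmbeddings())

# 3. Retrieve the chunks most relevant to a question; these become the context
#    handed to Mistral-7B in the generation step.
relevant = vectordb.similarity_search("What is the article about?", k=3)
print(relevant[0].page_content[:200])
```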
Learn to Build — Towards AI Community Newsletter #2 – Towards AI
Originally published on Towards AI. What a weekend and week in AI… You missed out if you haven’t followed the OpenAI drama over the past few days. Something unbelievable...
AI Expert Roadmap
Roadmaps to becoming a Full-Stack AI Developer, Data Scientist, Machine Learning Engineer, and more - KDnuggets
As the fields related to AI and Data Science expand, they are becoming complex with more options and specializations to consider. If you are beginning your journey toward becoming an expert in Artificial Intelligence, this roadmap will guide you to find your path along what to learn next while steering…
Alphaa Home
Jupyter Notebook Tutorial: Introduction, Setup, and Walkthrough
In this Python Tutorial, we will be learning how to install, setup, and use Jupyter Notebooks. Jupyter Notebooks have become very popular in the last few yea...
How to Become an AI Engineer: A Comprehensive Roadmap | by Madani Bezoui | Medium
AI, or Artificial Intelligence, is the buzzword of the 21st century. With its transformative potential, AI is reshaping industries, from…
CS50's Introduction to Artificial Intelligence with Python | Harvard University
Learn to use machine learning in Python in this introductory course on artificial intelligence.
NVIDIA AI Essentials Learning Series
Start your career journey as a developer or tech specialist with this summer technology session series. Check out our “getting started” videos and explore today’s hottest technologies.
How to learn Artificial Intelligence as a Beginner in 2023?
Want to learn AI? Here’s a guide that will give you better insights into Artificial Intelligence.
Jerry Liu on Twitter / X
As a fun Thanksgiving project we’re featuring an LLM-powered resume screener 🧑‍💼 - given a candidate’s resume, decide if it matches a JD. It’s a LlamaPack template 🦙 - easily use it as is or customize! As a fun example let’s see if I qualify for the role of @OpenAI CEO,… https://t.co/1HRcik8IWn pic.twitter.com/VHuuuH1gSg — Jerry Liu (@jerryjliu0) November 23, 2023
[1hr Talk] Intro to Large Language Models
This is a 1 hour general-audience introduction to Large Language Models: the core technical component behind systems like ChatGPT, Claude, and Bard. What the...
Cloning my voice with ElevenLabs
Charlie Holtz published an astonishing demo today, where he hooked together GPT-Vision and a text-to-speech model trained on his own voice to produce a video of Sir David Attenborough narrating his life as observed through his webcam.