Large language models like GPT-3 work by representing words as vectors of numbers and using neural networks with attention and transformer layers.
Word vectors allow language models to measure similarity and perform arithmetic on word meanings, operations that raw strings of letters do not support.
Attention heads allow words to share contextual information with each other, helping the model resolve ambiguities and predict the next word.
Feed-forward layers act as a database of facts that the model has learned, enabling it to make predictions based on that knowledge.
Language models are trained by trying to predict the next word in text, requiring huge amounts of training data.
The performance of language models scales dramatically with their size, the amount of training data, and the compute used for training.
As language models get larger, they develop the ability to perform more complex reasoning and tasks requiring abstract thought.
Researchers do not fully understand how language models accomplish their abilities, and fully explaining them remains a huge challenge.
Language models appear to spontaneously develop capabilities like theory of mind as a byproduct of increasing language ability.
There is debate over whether language models truly "understand" language in the same sense that humans do.
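The word-vector idea above can be sketched in a few lines of Python. The three-dimensional vectors here are toy values made up for illustration; real models learn vectors with hundreds or thousands of dimensions from data.

```python
import math

# Toy, hand-made "word vectors" -- illustrative only.
vectors = {
    "cat": [0.9, 0.8, 0.1],
    "dog": [0.85, 0.75, 0.2],
    "car": [0.1, 0.2, 0.95],
}

def cosine(a, b):
    """Cosine similarity: 1.0 means same direction, near 0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

sim_cat_dog = cosine(vectors["cat"], vectors["dog"])
sim_cat_car = cosine(vectors["cat"], vectors["car"])
print(sim_cat_dog > sim_cat_car)  # True: related words sit closer in vector space
```

Because similarity is a geometric property of the vectors, "cat" and "dog" end up near each other while "car" points in a different direction, which is what lets models generalize across related words.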
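The attention mechanism mentioned above can be sketched as scaled dot-product attention over a tiny sequence. The dimensions and random weights here are placeholders for the learned matrices in a real transformer.

```python
import numpy as np

np.random.seed(0)
d = 4    # tiny embedding size (real models use thousands)
seq = 3  # three "words"

x = np.random.randn(seq, d)                      # one vector per word
Wq, Wk, Wv = (np.random.randn(d, d) for _ in range(3))  # learned in practice

Q, K, V = x @ Wq, x @ Wk, x @ Wv
scores = Q @ K.T / np.sqrt(d)                    # how relevant each word is to each other
weights = np.exp(scores) / np.exp(scores).sum(axis=-1, keepdims=True)  # softmax
out = weights @ V                                # each word's vector, updated with context

print(weights.sum(axis=-1))                      # each row sums to 1.0
```

Each row of `weights` is a probability distribution saying how much one word "looks at" every other word; the output mixes the value vectors accordingly, which is how contextual information gets shared.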
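The training objective, predicting the next word, can be illustrated with a deliberately crude stand-in: a bigram count table over a toy corpus. Real LLMs optimize the same task with a neural network over vast text corpora rather than a count table.

```python
from collections import Counter, defaultdict

# Toy corpus; a real model trains on billions of documents.
corpus = "the cat sat on the mat the cat ate the fish".split()

# "Training": count which word follows which.
counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def predict(word):
    """Return the most frequent next word seen after `word`."""
    return counts[word].most_common(1)[0][0]

print(predict("the"))  # "cat" -- the word that followed "the" most often
```

The self-supervised nature of the task is the point: the next word in ordinary text is its own training label, which is why such huge amounts of unlabeled data can be used.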
Greg Rutkowski Was Removed From Stable Diffusion, But AI Artists Brought Him Back - Decrypt
chenhunghan/ialacol: 🪶 Lightweight 🦄 Self hosted, private, 🐟 scalable, 🤑 commercially usable, 💬 LLM chat streaming service with 1-click Kubernetes cluster installation on any cloud
Anthropic, Google, Microsoft and OpenAI launch Frontier Model Forum
Universal and Transferable Adversarial Attacks on Aligned Language Models
[P] Run Llama 2 Locally in 7 Lines! (Apple Silicon Mac) : r/MachineLearning
mlc-ai/mlc-llm: Enable everyone to develop, optimize and deploy AI models natively on everyone's devices.
Running PyTorch on the M1 GPU
Sparta Is No Model for U.S. Soldiers
Guide to running Llama 2 locally | Hacker News
Large language models, explained with a minimum of math and jargon
The Best GPUs for Deep Learning in 2023 — An In-depth Analysis
Here, I provide an in-depth analysis of GPUs for deep learning/machine learning and explain which GPU is best for your use case and budget.
A simple guide to fine-tuning Llama 2 | Brev docs
Today Brev releases support for Lambda Cloud
Real-Real-World Programming with ChatGPT
Taking AI Far Beyond Small Self-Contained Coding Tasks
What might LLMs/generative AI mean for public benefits and the safety net/tech?
What We Know About LLMs (Primer)
A comprehensive guide to running Llama 2 locally
How to run Llama 2 on Mac, Linux, Windows, and your phone.
GPU Cloud, Clusters, Servers, Workstations | Lambda
Building My Own Deep Learning Rig · Den Delimarsky
Build A Capable Machine For LLM and AI | by Andrew Zhu | CodeX | Medium
MyScale | Run Vector Search with SQL
Teach your LLM to always answer with facts not fiction | MyScale | Blog
GitHub - karpathy/llama2.c: Inference Llama 2 in one file of pure C
OpenAI’s Karpathy Creates Baby Llama Instead of GPT-5
The person who could easily build GPT-5 over a weekend is surprisingly spending time testing out the capabilities of the open-source Llama 2
SDXL - Stable Diffusion XL - NightCafe Creator
facebookresearch/fastText: Library for fast text representation and classification.
examples/learn/generation/llm-field-guide/llama-2-70b-chat-agent.ipynb at master · pinecone-io/examples
LLAMA 2: an incredible open-source LLM - by Nathan Lambert
Llama 2: Open Foundation and Fine-Tuned Chat Models | Meta AI Research
OpenAI's head of trust and safety steps down | Reuters
AI and Microdirectives - Schneier on Security