RedPajama, a project to create leading open-source models, starts by reproducing the LLaMA training dataset of over 1.2 trillion tokens — TOGETHER
RedPajama is a project to create a set of leading, fully open-source models. Today, we are excited to announce the completion of the first step of this project: the reproduction of the LLaMA training dataset of over 1.2 trillion tokens.
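For readers who want to poke at the data itself, here is a minimal sketch of streaming a few documents with the Hugging Face datasets library; the dataset identifier, configuration, and field name are assumptions rather than details confirmed by the announcement.

```python
# Minimal sketch: stream a handful of documents instead of downloading the full ~1.2T-token corpus.
# Assumes the data is published as "togethercomputer/RedPajama-Data-1T" with a "text" field.
from datasets import load_dataset

ds = load_dataset("togethercomputer/RedPajama-Data-1T", split="train", streaming=True)
for i, example in enumerate(ds):
    print(example["text"][:200])  # first 200 characters of each document
    if i >= 4:
        break
```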
This bot is trained on over 200k comments and posts from BuyItForLife subreddits to embody the collective knowledge of the Reddit BuyItForLife community.
Make loading weights 10-100x faster by jart · Pull Request #613 · ggerganov/llama.cpp
This is a breaking change that's going to give us three benefits:
1. Your inference commands should load 100x faster
2. You may be able to safely load models 2x larger
3. You can run many concurrent infere...
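The benefits listed are the classic signature of memory-mapped loading: the OS pages the weights in lazily rather than copying them into freshly allocated buffers, and processes mapping the same file share physical pages. A rough Python illustration of that general idea, not the PR's actual C++ code; the file name and dtype here are placeholders.

```python
# Rough illustration of mmap-style weight loading: nothing is bulk-copied up front,
# and only the pages actually touched are faulted in from disk by the OS.
# "model-weights.bin" and float32 are placeholders, not llama.cpp's real format.
import numpy as np

weights = np.memmap("model-weights.bin", dtype=np.float32, mode="r")
print(weights.shape)   # available immediately; no full read has happened yet
print(weights[:8])     # touching data faults in just those pages
```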
Cerebras-GPT: A Family of Open, Compute-efficient, Large Language Models - Cerebras
Cerebras open sources seven GPT-3 models from 111 million to 13 billion parameters. Trained using the Chinchilla formula, these models set new benchmarks for accuracy and compute efficiency.
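As a back-of-the-envelope check on what the Chinchilla formula implies, the usual heuristic is roughly 20 training tokens per parameter, so the compute-optimal token budget grows linearly with model size; the snippet below works that out for a few sizes in the announced range (the exact ratios Cerebras used are not stated here).

```python
# Chinchilla rule of thumb: ~20 training tokens per parameter (assumed ratio).
# 111M and 13B come from the announcement; 1.3B is just an illustrative midpoint.
TOKENS_PER_PARAM = 20

for params in (111e6, 1.3e9, 13e9):
    tokens = params * TOKENS_PER_PARAM
    print(f"{params / 1e9:6.3f}B params -> ~{tokens / 1e9:.0f}B tokens")
```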
Lightning-AI/lit-llama: Implementation of the LLaMA language model based on nanoGPT. Supports quantization, LoRA fine-tuning, pre-training. Apache 2.0-licensed.
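Since LoRA fine-tuning is one of the repo's headline features, here is a generic sketch of the LoRA idea, a frozen weight matrix plus a small trainable low-rank update; this is plain PyTorch to show the concept, not lit-llama's own API.

```python
# Generic LoRA sketch (not lit-llama's API): keep W frozen and learn a low-rank update B @ A.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, in_features, out_features, rank=8, alpha=16):
        super().__init__()
        # Frozen base weight; in real fine-tuning this would come from the pretrained model.
        self.weight = nn.Parameter(torch.randn(out_features, in_features), requires_grad=False)
        self.lora_A = nn.Parameter(torch.randn(rank, in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(out_features, rank))  # zero init: starts as a no-op
        self.scale = alpha / rank

    def forward(self, x):
        return x @ self.weight.T + (x @ self.lora_A.T @ self.lora_B.T) * self.scale

layer = LoRALinear(512, 512)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(trainable)  # only the low-rank A and B matrices are trained
```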
Hello Dolly: Democratizing the magic of ChatGPT with open models
Introducing Dolly, a breakthrough LLM from Databricks. Learn how Databricks open-sourced the model and all of its training code, enabling organizations to re-create Dolly at minimal cost.
tatsu-lab/stanford_alpaca: Code and documentation to train Stanford's Alpaca models, and generate the data.
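The repo covers both generating the instruction data and fine-tuning on it; below is a minimal sketch of rendering one instruction/input/output record into a training prompt. The field names follow the released data's structure, while the prompt wording is paraphrased rather than quoted from the repo.

```python
# Sketch of an Alpaca-style record and prompt construction.
# Record fields (instruction/input/output) follow the released data's structure;
# the prompt text below is paraphrased, not the repo's exact template.
example = {
    "instruction": "Summarize the following sentence.",
    "input": "Large language models are having their Stable Diffusion moment.",
    "output": "Capable language models are suddenly cheap and easy to build on.",
}

def build_prompt(rec):
    if rec["input"]:
        return ("Below is an instruction paired with an input. Write a response.\n\n"
                f"### Instruction:\n{rec['instruction']}\n\n"
                f"### Input:\n{rec['input']}\n\n### Response:\n")
    return ("Below is an instruction. Write a response.\n\n"
            f"### Instruction:\n{rec['instruction']}\n\n### Response:\n")

print(build_prompt(example) + example["output"])
```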
Stanford Alpaca, and the acceleration of on-device large language model development
On Saturday 11th March I wrote about how Large language models are having their Stable Diffusion moment. Today is Monday. Let’s look at what’s happened in the past three days. …
Wizard of Wikipedia is a large dataset of conversations directly grounded in knowledge retrieved from Wikipedia. It is used to train and evaluate dialogue systems for knowledgeable open dialogue with clear grounding.