Azure ChatGPT: Private and secure ChatGPT for internal enterprise use | Hacker News
Fine-Tuning Llama-2: A Comprehensive Case Study for Tailoring Models to Unique Applications
In this blog, we provide a thorough analysis and a practical guide for fine-tuning. We examine the Llama-2 models under three real-world use cases, and show that fine-tuning yields significant accuracy improvements across the board (in some niche cases, better than GPT-4).
Pythia: A Suite of 16 LLMs for In-Depth Research - KDnuggets
A simple guide to fine-tuning Llama 2 | Brev docs
Today Brev releases support for Lambda Cloud
GitHub - karpathy/llama2.c: Inference Llama 2 in one file of pure C
OpenAI’s Karpathy Creates Baby Llama Instead of GPT-5
The person who could easily build GPT-5 over a weekend is, surprisingly, spending his time testing the capabilities of the open-source Llama 2
Running Llama 2 on CPU Inference Locally for Document Q&A | by Kennet…
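The document-Q&A pattern in the article above works by retrieving the most relevant document chunks first and only then prompting the model with them. A minimal sketch of that retrieval step, using plain bag-of-words cosine similarity as a stand-in (the article itself uses embedding models and a vector store; the function names and toy chunks here are illustrative assumptions):

```python
import math
from collections import Counter

def vectorize(text):
    # Bag-of-words term counts, lowercased
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse count vectors
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def top_k(query, chunks, k=2):
    # Rank document chunks by similarity to the query, keep the best k
    qv = vectorize(query)
    return sorted(chunks, key=lambda c: cosine(qv, vectorize(c)), reverse=True)[:k]

chunks = [
    "Llama 2 is a family of open large language models.",
    "Quarterly revenue grew by twelve percent.",
    "CPU inference is possible with quantized GGML weights.",
]
print(top_k("run llama 2 inference on cpu", chunks, k=1))
```

The retrieved chunks would then be pasted into the LLM prompt as context, which is what keeps the whole pipeline local and private.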
PoisonGPT: How we hid a lobotomized LLM on Hugging Face to spread fake news
We will show in this article how one can surgically modify an open-source model, GPT-J-6B, and upload it to Hugging Face to make it spread misinformation while being undetected by standard benchmarks.
GitHub - h2oai/h2ogpt: Join us at H2O.ai to make the world's best open-source GPT with document and image Q&A, 100% private chat, no data leaks, Apache 2.0 https://arxiv.org/pdf/2306.08161.pdf Live Demo: https://gpt.h2o.ai/
GitHub - InternLM/InternLM: InternLM has open-sourced a 7 billion parameter base model, a chat model tailored for practical scenarios and the training system.
What Is a Transformer Model? | NVIDIA Blogs
Recurrent Neural Networks, Explained and Visualized from the Ground Up | by Andre Ye | Jun, 2023 | Towards Data Science
bentoml/OpenLLM: An open platform for operating large language models (LLMs) in production. Fine-tune, serve, deploy, and monitor any LLMs with ease.
artidoro/qlora: QLoRA: Efficient Finetuning of Quantized LLMs
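QLoRA's core move is storing the frozen base weights in 4-bit precision while training small adapters on top. A toy sketch of the quantization half of that idea, using simple blockwise absmax quantization to a uniform 4-bit grid (the real method uses the NF4 code plus double quantization of the scales, so this is a simplification, not the library's algorithm):

```python
# Blockwise 4-bit absmax quantization: each block of weights is scaled so
# its largest magnitude maps onto the signed 4-bit range [-7, 7].

def quantize_block(block):
    absmax = max(abs(x) for x in block) or 1.0
    scale = absmax / 7.0
    q = [round(x / scale) for x in block]   # 4-bit integer codes
    return q, scale                          # one float scale per block

def dequantize_block(q, scale):
    # Reconstruct approximate weights for the forward pass
    return [v * scale for v in q]

weights = [0.31, -0.88, 0.05, 0.47, -0.12, 0.93, -0.41, 0.20]
q, scale = quantize_block(weights)
restored = dequantize_block(q, scale)
max_err = max(abs(a - b) for a, b in zip(weights, restored))
```

Because the rounding error is bounded by half a quantization step (`scale / 2`), the dequantized weights stay close enough to the originals for gradients to flow through sensibly into the LoRA adapters.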
It’s infuriatingly hard to understand how closed models train on their input
rmihaylov/falcontune: Tune any FALCON in 4-bit
How to create a private ChatGPT that interacts with your local documents - TechTalks
State of GPT | BRK216HFS - YouTube
Understanding Parameter-Efficient Finetuning of Large Language Models: From Prefix Tuning to LLaMA-Adapters
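The LoRA family of methods surveyed above shares one idea: freeze the base weight matrix W and learn only a low-rank update B·A. A minimal plain-Python sketch with hypothetical toy dimensions (real adapters attach to the attention projection matrices with d in the thousands):

```python
# LoRA in miniature: instead of updating a d x d matrix W directly,
# train A (r x d) and B (d x r) with r << d, and use W + (alpha/r) * B @ A.

def matmul(X, Y):
    # Plain-Python matrix multiply
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

d, r, alpha = 4, 1, 2.0
W = [[0.0] * d for _ in range(d)]   # frozen base weight (toy values)
A = [[0.1, 0.2, 0.3, 0.4]]          # r x d, trainable
B = [[1.0], [0.5], [0.0], [-1.0]]   # d x r, trainable

delta = matmul(B, A)                # d x d low-rank update
W_eff = [[w + (alpha / r) * u for w, u in zip(wr, ur)] for wr, ur in zip(W, delta)]

full_params = d * d                 # parameters if W were trained directly
lora_params = r * d + d * r         # parameters actually trained
```

Here the adapter trains 8 parameters instead of 16; at realistic sizes (say d = 4096, r = 8) the ratio is roughly 16.8M versus 65k per matrix, which is why these methods fit on a single consumer GPU.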
Gorilla
Gorilla is an LLM that can provide appropriate API calls. It is trained on three massive machine learning hub datasets: Torch Hub, TensorFlow Hub and HuggingFace. We are rapidly adding new domains, including Kubernetes, GCP, AWS, OpenAPI, and more. Zero-shot Gorilla outperforms GPT-4, ChatGPT and Claude. Gorilla is extremely reliable, and significantly reduces hallucination errors.
How To Finetune GPT Like Large Language Models on a Custom Dataset - Lightning AI
Google's newest A.I. model uses nearly five times more text data for training than its predecessor
In announcing its PaLM 2 large language model, Google neglected to say how much training data was used for its most advanced LLM.
The Rise of Generative AI Large Language Models (LLMs) like ChatGPT — Information is Beautiful
Charting the unsilence of the LLMs
Vicuna: An Open-Source Chatbot Impressing GPT-4 with 90%* ChatGPT Quality | LMSYS Org
We introduce Vicuna-13B, an open-source chatbot trained by fine-tuning LLaMA on user-shared conversations collected from ShareGPT. Preliminary evaluation ...
How to run Llama 13B with a 6GB graphics card
imartinez/privateGPT: Interact privately with your documents using the power of GPT, 100% privately, no data leaks
nlpxucan/WizardLM: WizardLM: Empowering Large Pre-Trained Language Models to Follow Complex Instructions
Introducing MPT-7B: A New Standard for Open-Source, Commercially Usable LLMs
Introducing MPT-7B, the latest entry in our MosaicML Foundation Series. MPT-7B is a transformer trained from scratch on 1T tokens of text and code. It is open source, available for commercial use, and matches the quality of LLaMA-7B. MPT-7B was trained on the MosaicML platform in 9.5 days with zero human intervention at a cost of ~$200k. Starting today, you can train, finetune, and deploy your own private MPT models, either starting from one of our checkpoints or training from scratch. For inspiration, we are also releasing three finetuned models in addition to the base MPT-7B: MPT-7B-Instruct, MPT-7B-Chat, and MPT-7B-StoryWriter-65k+, the last of which uses a context length of 65k tokens!
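MPT-7B-StoryWriter's 65k-token context builds on ALiBi (Attention with Linear Biases), which drops learned positional embeddings in favor of a distance-proportional penalty on attention scores, letting the model extrapolate beyond its training length. A minimal sketch of that bias term, with a single made-up slope (real models use a geometric series of slopes, one per attention head):

```python
def alibi_bias(seq_len, slope):
    # ALiBi adds -slope * (query_pos - key_pos) to each attention logit,
    # so attention to more distant (earlier) tokens is penalized linearly.
    # Future positions are masked out for causal decoding.
    return [[-slope * (q - k) if k <= q else float("-inf")
             for k in range(seq_len)]
            for q in range(seq_len)]

bias = alibi_bias(4, slope=0.5)
# Each row is one query position: zero penalty on itself,
# growing penalties on keys further back in the sequence.
```

Because the penalty is a simple linear function of distance rather than a table indexed by position, nothing in the mechanism caps the sequence length, which is what makes contexts like 65k tokens feasible.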
Chatbot Arena: The LLM Benchmark Platform - KDnuggets
Chatbot Arena is a benchmark platform for large language models, where the community can contribute new models and evaluate them.
Chat with Open Large Language Models