Azure ChatGPT: Private and secure ChatGPT for internal enterprise use | Hacker News
Fine-Tuning Llama-2: A Comprehensive Case Study for Tailoring Models to Unique Applications
In this blog, we provide a thorough analysis and a practical guide for fine-tuning. We examine the Llama-2 models under three real-world use cases, and show that fine-tuning yields significant accuracy improvements across the board (in some niche cases, better than GPT-4).
Pythia: A Suite of 16 LLMs for In-Depth Research - KDnuggets
A simple guide to fine-tuning Llama 2 | Brev docs
Today Brev releases support for Lambda Cloud
GitHub - karpathy/llama2.c: Inference Llama 2 in one file of pure C
OpenAI’s Karpathy Creates Baby Llama Instead of GPT-5
The person who could easily build GPT-5 over a weekend is, surprisingly, spending his time testing the capabilities of the open-source Llama 2
Running Llama 2 on CPU Inference Locally for Document Q&A | by Kennet…
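The document-Q&A pattern in the article above works by retrieving the most relevant document chunks first and only then prompting the model with them. A minimal sketch of that retrieval step, using plain bag-of-words cosine similarity as a stand-in (the article itself uses embedding models and a vector store; the function names and toy chunks here are illustrative assumptions):

```python
import math
from collections import Counter

def vectorize(text):
    # Bag-of-words term counts, lowercased
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse count vectors
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def top_k(query, chunks, k=2):
    # Rank document chunks by similarity to the query, keep the best k
    qv = vectorize(query)
    return sorted(chunks, key=lambda c: cosine(qv, vectorize(c)), reverse=True)[:k]

chunks = [
    "Llama 2 is a family of open large language models.",
    "Quarterly revenue grew by twelve percent.",
    "CPU inference is possible with quantized GGML weights.",
]
print(top_k("run llama 2 inference on cpu", chunks, k=1))
```

The retrieved chunks would then be pasted into the LLM prompt as context, which is what keeps the whole pipeline local and private.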
PoisonGPT: How we hid a lobotomized LLM on Hugging Face to spread fake news
We will show in this article how one can surgically modify an open-source model, GPT-J-6B, and upload it to Hugging Face to make it spread misinformation while being undetected by standard benchmarks.
GitHub - h2oai/h2ogpt: Join us at H2O.ai to make the world's best open-source GPT with document and image Q&A, 100% private chat, no data leaks, Apache 2.0 https://arxiv.org/pdf/2306.08161.pdf Live Demo: https://gpt.h2o.ai/
GitHub - InternLM/InternLM: InternLM has open-sourced a 7 billion parameter base model, a chat model tailored for practical scenarios and the training system.
What Is a Transformer Model? | NVIDIA Blogs
Recurrent Neural Networks, Explained and Visualized from the Ground Up | by Andre Ye | Jun, 2023 | Towards Data Science
bentoml/OpenLLM: An open platform for operating large language models (LLMs) in production. Fine-tune, serve, deploy, and monitor any LLMs with ease.
artidoro/qlora: QLoRA: Efficient Finetuning of Quantized LLMs
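QLoRA's core move is storing the frozen base weights in 4-bit precision while training small adapters on top. A toy sketch of the quantization half of that idea, using simple blockwise absmax quantization to a uniform 4-bit grid (the real method uses the NF4 code plus double quantization of the scales, so this is a simplification, not the library's algorithm):

```python
# Blockwise 4-bit absmax quantization: each block of weights is scaled so
# its largest magnitude maps onto the signed 4-bit range [-7, 7].

def quantize_block(block):
    absmax = max(abs(x) for x in block) or 1.0
    scale = absmax / 7.0
    q = [round(x / scale) for x in block]   # 4-bit integer codes
    return q, scale                          # one float scale per block

def dequantize_block(q, scale):
    # Reconstruct approximate weights for the forward pass
    return [v * scale for v in q]

weights = [0.31, -0.88, 0.05, 0.47, -0.12, 0.93, -0.41, 0.20]
q, scale = quantize_block(weights)
restored = dequantize_block(q, scale)
max_err = max(abs(a - b) for a, b in zip(weights, restored))
```

Because the rounding error is bounded by half a quantization step (`scale / 2`), the dequantized weights stay close enough to the originals for gradients to flow through sensibly into the LoRA adapters.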
It’s infuriatingly hard to understand how closed models train on their input
rmihaylov/falcontune: Tune any FALCON in 4-bit
How to create a private ChatGPT that interacts with your local documents - TechTalks
State of GPT | BRK216HFS - YouTube
Understanding Parameter-Efficient Finetuning of Large Language Models: From Prefix Tuning to LLaMA-Adapters
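The LoRA family of methods surveyed above shares one idea: freeze the base weight matrix W and learn only a low-rank update B·A. A minimal plain-Python sketch with hypothetical toy dimensions (real adapters attach to the attention projection matrices with d in the thousands):

```python
# LoRA in miniature: instead of updating a d x d matrix W directly,
# train A (r x d) and B (d x r) with r << d, and use W + (alpha/r) * B @ A.

def matmul(X, Y):
    # Plain-Python matrix multiply
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

d, r, alpha = 4, 1, 2.0
W = [[0.0] * d for _ in range(d)]   # frozen base weight (toy values)
A = [[0.1, 0.2, 0.3, 0.4]]          # r x d, trainable
B = [[1.0], [0.5], [0.0], [-1.0]]   # d x r, trainable

delta = matmul(B, A)                # d x d low-rank update
W_eff = [[w + (alpha / r) * u for w, u in zip(wr, ur)] for wr, ur in zip(W, delta)]

full_params = d * d                 # parameters if W were trained directly
lora_params = r * d + d * r         # parameters actually trained
```

Here the adapter trains 8 parameters instead of 16; at realistic sizes (say d = 4096, r = 8) the ratio is roughly 16.8M versus 65k per matrix, which is why these methods fit on a single consumer GPU.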
Gorilla
Gorilla is an LLM that can provide appropriate API calls. It is trained on three massive machine learning hub datasets: Torch Hub, TensorFlow Hub and HuggingFace. We are rapidly adding new domains, including Kubernetes, GCP, AWS, OpenAPI, and more. Zero-shot Gorilla outperforms GPT-4, ChatGPT and Claude. Gorilla is extremely reliable, and significantly reduces hallucination errors.
How To Finetune GPT Like Large Language Models on a Custom Dataset - Lightning AI
Google's newest A.I. model uses nearly five times more text data for training than its predecessor
In announcing its PaLM 2 large language model, Google neglected to say how much training data was used for its most advanced LLM.
The Rise of Generative AI Large Language Models (LLMs) like ChatGPT — Information is Beautiful
Charting the unsilence of the LLMs
Vicuna: An Open-Source Chatbot Impressing GPT-4 with 90%* ChatGPT Quality | LMSYS Org
We introduce Vicuna-13B, an open-source chatbot trained by fine-tuning LLaMA on user-shared conversations collected from ShareGPT. Preliminary evaluation ...
How to run Llama 13B with a 6GB graphics card
imartinez/privateGPT: Interact privately with your documents using the power of GPT, 100% privately, no data leaks
nlpxucan/WizardLM: WizardLM: Empowering Large Pre-Trained Language Models to Follow Complex Instructions
Introducing MPT-7B: A New Standard for Open-Source, Commercially Usable LLMs
Introducing MPT-7B, the latest entry in our MosaicML Foundation Series. MPT-7B is a transformer trained from scratch on 1T tokens of text and code. It is open source, available for commercial use, and matches the quality of LLaMA-7B. MPT-7B was trained on the MosaicML platform in 9.5 days with zero human intervention at a cost of ~$200k. Starting today, you can train, finetune, and deploy your own private MPT models, either starting from one of our checkpoints or training from scratch. For inspiration, we are also releasing three finetuned models in addition to the base MPT-7B: MPT-7B-Instruct, MPT-7B-Chat, and MPT-7B-StoryWriter-65k+, the last of which uses a context length of 65k tokens!
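MPT-7B-StoryWriter's 65k-token context builds on ALiBi (Attention with Linear Biases), which drops learned positional embeddings in favor of a distance-proportional penalty on attention scores, letting the model extrapolate beyond its training length. A minimal sketch of that bias term, with a single made-up slope (real models use a geometric series of slopes, one per attention head):

```python
def alibi_bias(seq_len, slope):
    # ALiBi adds -slope * (query_pos - key_pos) to each attention logit,
    # so attention to more distant (earlier) tokens is penalized linearly.
    # Future positions are masked out for causal decoding.
    return [[-slope * (q - k) if k <= q else float("-inf")
             for k in range(seq_len)]
            for q in range(seq_len)]

bias = alibi_bias(4, slope=0.5)
# Each row is one query position: zero penalty on itself,
# growing penalties on keys further back in the sequence.
```

Because the penalty is a simple linear function of distance rather than a table indexed by position, nothing in the mechanism caps the sequence length, which is what makes contexts like 65k tokens feasible.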
Chatbot Arena: The LLM Benchmark Platform - KDnuggets
Chatbot Arena is a benchmark platform for large language models, where the community can contribute new models and evaluate them.
Chat with Open Large Language Models