RedPajama is “a project to create leading open-source models”, which “starts by reproducing [the] LLaMA training dataset of over 1.2 trillion tokens”. It’s a collaboration between Together, Ontocord.ai, ETH DS3Lab, Stanford CRFM, …
mallahyari/llm-hub: A curated collection of interesting applications, repos, and tutorials using large language models (LLM) like GPT-3
Running GPT4All On a Mac Using Python langchain in a Jupyter Notebook
Over the last three weeks or so I’ve been following the crazy rate of development around locally run large language models (LLMs), starting with llama.cpp, then alpaca and most recently (?!) …
RedPajama, a project to create leading open-source models, starts by reproducing LLaMA training dataset of over 1.2 trillion tokens — TOGETHER
RedPajama is a project to create a set of leading, fully open-source models. Today, we are excited to announce the completion of the first step of this project: the reproduction of the LLaMA training dataset of over 1.2 trillion tokens.
How I Used Stable Diffusion and Dreambooth to Create A Painted Portrait of My Dog
In this post, we walk through my entire workflow/process for bringing Stable Diffusion to life as a high-quality framed art print. We’ll touch on making art with Dreambooth, Stable Diffusion, Outpainting, Inpainting, Upscaling, preparing for print with Photoshop, and finally printing on fine-art paper with an Epson XP-15000 printer.
Neural Radiance Field training can be accelerated through the use of grid-based representations in NeRF's learned mapping from spatial coordinates to colors and volumetric density. However, these grid-based approaches lack an explicit understanding of scale and therefore often introduce aliasing, usually in the form of jaggies or missing scene content. Anti-aliasing has previously been addressed by mip-NeRF 360, which reasons about sub-volumes along a cone rather than points along a ray, but this approach is not natively compatible with current grid-based techniques. We show how ideas from rendering and signal processing can be used to construct a technique that combines mip-NeRF 360 and grid-based models such as Instant NGP to yield error rates that are 8%-76% lower than either prior technique, and that trains 22x faster than mip-NeRF 360.
Prompt injection: what’s the worst that can happen?
Activity around building sophisticated applications on top of LLMs (Large Language Models) such as GPT-3/4/ChatGPT/etc is spreading like wildfire right now. Many of these applications are potentially vulnerable to prompt …
microsoft/JARVIS: JARVIS, a system to connect LLMs with the ML community. Paper: https://arxiv.org/pdf/2303.17580.pdf
albumentations-team/albumentations: Fast image augmentation library and an easy-to-use wrapper around other libraries. Documentation: https://albumentations.ai/docs/ Paper about the library: https://www.mdpi.com/2078-2489/11/2/125
ClippyGPT - How I Built Supabase’s OpenAI Doc Search (Embeddings)
Supabase hired me to build ClippyGPT - their next-generation doc search. You can ask our old friend Clippy anything you want about Supabase, and it will answer using natural language. Powered by OpenAI + prompt engineering.
In this video I will be showing you exactly how I did this, and how you can do the same in your projects. We'll be covering:
- Prompt engineering and best practices
- Working with a custom knowledge base via context injection + OpenAI embeddings
- How to store embeddings in Postgres using pgvector
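The pgvector part of the list above boils down to nearest-neighbor search over embedding vectors. As a rough sketch of the underlying math (not Supabase's actual code): pgvector's `<=>` operator computes cosine *distance*, i.e. one minus cosine similarity.

```typescript
// Illustrative sketch of the similarity math behind pgvector's `<=>` operator.
// Function names here are my own, not part of pgvector or Supabase's codebase.

function dot(a: number[], b: number[]): number {
  // Sum of element-wise products.
  return a.reduce((sum, x, i) => sum + x * b[i], 0);
}

function cosineSimilarity(a: number[], b: number[]): number {
  // Cosine of the angle between the two vectors: 1 = same direction, 0 = orthogonal.
  return dot(a, b) / (Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b)));
}

function cosineDistance(a: number[], b: number[]): number {
  // pgvector's `<=>` returns cosine distance, so smaller = more similar.
  return 1 - cosineSimilarity(a, b);
}
```

In SQL, the matching query would order candidate rows by `embedding <=> query_embedding` ascending and take the top few as context for the prompt.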
Supabase blog post:
https://supabase.com/blog/chatgpt-supabase-docs
pgvector extension:
https://github.com/pgvector/pgvector
Generate embeddings implementation:
https://github.com/supabase/supabase/blob/54d39d4958575e5b58aa1d5d2a02db863ab4673c/apps/docs/scripts/generate-embeddings.ts
Clippy edge function implementation:
https://github.com/supabase/supabase/blob/54d39d4958575e5b58aa1d5d2a02db863ab4673c/supabase/functions/clippy-search/index.ts
Clippy frontend implementation:
https://github.com/supabase/supabase/blob/54d39d4958575e5b58aa1d5d2a02db863ab4673c/packages/ui/src/components/Command/AiCommand.tsx
Prompt engineering:
https://prmpts.ai/blog/what-is-prompt-engineering
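The "context injection" technique covered in the video amounts to pasting the best-matching doc sections into the prompt ahead of the user's question, up to some budget. A minimal sketch, assuming hypothetical names (`buildPrompt`, `maxChars`) and a simple character budget in place of real token counting:

```typescript
// Illustrative context-injection sketch; not Supabase's actual implementation.
// Matched doc sections are concatenated into the prompt until a rough size
// budget is hit, then the user's question is appended.
function buildPrompt(question: string, sections: string[], maxChars = 6000): string {
  let context = "";
  for (const section of sections) {
    // Stop adding sections once the budget would be exceeded.
    if (context.length + section.length > maxChars) break;
    context += section + "\n---\n";
  }
  return (
    "Answer the question using only the context below.\n\n" +
    `Context:\n${context}\n` +
    `Question: ${question}\nAnswer:`
  );
}
```

A production version would count model tokens rather than characters and add instructions for refusing out-of-context questions, which is part of the prompt-engineering discussion in the video.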
00:00 Why?
01:40 Let's get started
03:15 Custom knowledge base
04:49 Context injection
06:13 Pre-process MDX files
13:40 Embeddings
15:40 Storing in Postgres + pgvector
22:21 API endpoint (edge function)
23:44 Calculating similarity in pgvector
27:55 Prompt engineering
33:15 Prompt best practices
38:37 Demo time!
41:32 Thanks for watching!
In the journal Nature today, my colleagues and I published an article on the future directions of generative A.I. (a.k.a. large language or foundation models) for the practice of medicine. These new AI models have created a multitude of new and exciting opportunities in healthcare that we didn’t have before, along with many challenges and liabilities. I’ll briefly explain how we got here and what’s in store.