Found 511 bookmarks
turbopuffer: fast search on object storage
turbopuffer is a vector database built on top of object storage, which makes it 10x-100x cheaper, with usage-based pricing and massive scalability.
·turbopuffer.com·
Extrinsic Hallucinations in LLMs
Hallucination in large language models usually refers to the model generating unfaithful, fabricated, inconsistent, or nonsensical content. As a term, hallucination has been somewhat generalized to any case where the model makes a mistake. Here, I would like to narrow the problem of hallucination to cases where the model output is fabricated and not grounded in either the provided context or world knowledge. There are two types of hallucination: in-context hallucination, where the model output should be consistent with the source content in context, and extrinsic hallucination, where the output should be grounded in world knowledge (e.g. the pre-training dataset).
·lilianweng.github.io·
How to self-host and hyperscale AI with Nvidia NIM
Try out Nvidia NIM in the free playground: https://nvda.ws/4avifod. Learn how to build a futuristic workforce of AI agents, then self-host and scale them for an...
·youtube.com·
Consistency Large Language Models: A Family of Efficient Parallel Decoders
TL;DR: LLMs have traditionally been regarded as sequential decoders, producing one token after another. In this blog, we show that pretrained LLMs can easily be taught to operate as efficient parallel decoders. We introduce Consistency Large Language Models (CLLMs), a new family of parallel decoders that reduce inference latency by efficiently decoding an $n$-token sequence per inference step. Our research shows this process – mimicking the human cognitive process of forming complete sentences in mind before articulating them word by word – can be effectively learned by simply finetuning pretrained LLMs.
·hao-ai-lab.github.io·
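The mechanism CLLMs build on can be sketched without any real model: Jacobi decoding guesses an n-token block and refines every position in parallel until a fixed point, which matches what autoregressive decoding would have produced. The toy below (the deterministic `next_token` function is a made-up stand-in for an LLM's greedy prediction, not the paper's code) shows the iteration:

```python
# Toy sketch of Jacobi (fixed-point) parallel decoding, the process CLLMs
# are finetuned to converge on quickly. `next_token` is a hypothetical
# stand-in for an LLM's greedy next-token prediction over integer ids.

def next_token(prefix):
    # Deterministic dummy "model": any function of the prefix works here.
    return (sum(prefix) * 31 + len(prefix)) % 1000

def ar_decode(prompt, n):
    """Standard autoregressive decoding: n sequential model calls."""
    seq = list(prompt)
    for _ in range(n):
        seq.append(next_token(seq))
    return seq[len(prompt):]

def jacobi_decode(prompt, n, max_iters=50):
    """Guess an n-token block, then refine all n positions in parallel.

    Each iteration recomputes every position from the current guess and
    stops at a fixed point. With a deterministic model, position i is
    provably correct after at most i+1 iterations, so the fixed point
    equals the autoregressive output.
    """
    guess = [0] * n
    for it in range(1, max_iters + 1):
        new = [next_token(list(prompt) + guess[:i]) for i in range(n)]
        if new == guess:  # fixed point reached
            return guess, it
        guess = new
    return guess, max_iters

out, iters = jacobi_decode([1, 2, 3], 8)
assert out == ar_decode([1, 2, 3], 8)  # fixed point == AR result
assert iters <= 9  # converges in at most n+1 iterations
```

In practice each Jacobi iteration is one batched forward pass, so the speedup comes from needing far fewer iterations than n; CLLM finetuning trains the model to collapse many positions per iteration.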
The AI Backend
Work in progress; please provide feedback so we can improve. Just like in 1995 it was obvious that every business needed an internet presence to stay competitive, in 2024 it's obvious that every piece of software needs intelligence to stay competitive. Software products generally have 3 c...
·docs.google.com·
The Surprising Power of Next Word Prediction: Large Language Models Explained, Part 1 | Center for Security and Emerging Technology
Large language models (LLMs), the technology that powers generative artificial intelligence (AI) products like ChatGPT or Google Gemini, are often thought of as chatbots that predict the next word. But that isn't the full story of what LLMs are and how they work. This is the first blog post in a three-part series explaining some key elements of how LLMs function. This blog post covers pre-training—the process by which LLMs learn to predict the next word—and why it’s so surprisingly powerful.
·cset.georgetown.edu·
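The pre-training objective the blurb describes is just next-token prediction: maximize the probability of each token given its prefix, i.e. minimize cross-entropy. A minimal illustration (a toy bigram count model, not a transformer; the corpus is made up) shows the quantity being optimized:

```python
import math
from collections import Counter, defaultdict

# Toy illustration of the pre-training objective: learn to predict the
# next word from observed text, then score text by the average negative
# log-likelihood (cross-entropy) of each next word. This is a bigram
# count model standing in for an LLM; the corpus is invented.

corpus = "the cat sat on the mat . the dog sat on the rug .".split()

# "Training": count how often each word follows each context word.
counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def p_next(prev, nxt):
    # Model's predicted probability of `nxt` given the context `prev`.
    c = counts[prev]
    total = sum(c.values())
    return c[nxt] / total if total else 0.0

def cross_entropy(tokens):
    # The loss pre-training minimizes, averaged over positions.
    nll = [-math.log(p_next(p, n)) for p, n in zip(tokens, tokens[1:])]
    return sum(nll) / len(nll)

print(p_next("the", "cat"))  # → 0.25: "the" is followed by cat/mat/dog/rug once each
```

Real pre-training differs in scale and architecture (subword tokens, transformers, gradient descent), but the objective is the same shape: lower cross-entropy means better next-word prediction.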
A Survey of Techniques for Maximizing LLM Performance
Join us for a comprehensive survey of techniques designed to unlock the full potential of Large Language Models (LLMs). Explore strategies such as fine-tunin...
·youtube.com·
OpenAI Platform
Explore developer resources, tutorials, API docs, and dynamic examples to get the most out of OpenAI's platform.
·platform.openai.com·
metaskills/experts at labnotes.org
Experts.js is the easiest way to create and deploy OpenAI Assistants and link them together as Tools, creating advanced multi-AI-agent systems with expanded memory and attention to detail. - ...
·github.com·
RAG - jxnl.co
Notes about my hobbies and other things I find interesting.
·jxnl.co·
Why I'm Staying Away from Crew AI: My Honest Opinion
Crew AI is not suitable for production use cases. I'll go through why I believe this is the case and what you should do instead when building your own ...
·youtube.com·