“In 2023, intense competition emerged among more than 100 LLMs in China, resulting in a significant waste of resources, particularly computing power. I’ve noticed that many people still …
Hundreds of millions of people have tried ChatGPT, but most of them haven’t been back. Every big company has done a pilot, but far fewer are in deployment. Some of this is just a matter of time. But LLMs might also be a trap: they look like products and they look like magic, but they aren’t. Maybe we ha…
[long] Some tests of how much AI "understands" what it says (spoiler: very little) - awful.systems
First, an apology for how fucking long this ended up being, in part thanks to how long-winded AI responses are. David wanted me to post it here, so I’m posting.

When you ask GPT4 a question about a common paradox or a puzzle, it almost always provides a correct answer. Does it “understand” the answer, or is it merely regurgitating? What would be the difference? Without delving too deep into the philosophical aspects of whether next-word prediction can possibly be said to reason or “understand” anything, what puts the “under” in understanding is that concepts are built on top of simpler, more basic concepts. You could test whether a human understands something by modifying the problem enough that memorization no longer helps.

## A couple simple probes:

Prompt: The village barber shaves himself and every other man in the village who don’t shave himself. Does he shave himself?

Note that the above is not a paradox. This is how you would expect an ordinary barber to work in a small village. It merely sounds similar to a paradox.

GPT4: The scenario you’ve described is a classic example of a logical paradox known as the “Barber Paradox,” which was formulated by the British mathematician and logician Bertrand Russell in 1918.
The paradox goes as follows: In a village, there is a barber who shaves all those, and only those, men in the village who do not shave themselves. The question then arises: Does the barber shave himself? If we assume that the barber does shave himself, according to the definition, he should not shave himself (since he only shaves those who do not shave themselves). On the other hand, if we assume that he does not shave himself, then according to the definition, he should shave himself (since he shaves all those who do not shave themselves). Both assumptions lead to a contradiction, which means that the initial premise of the barber’s shaving rule is logically inconsistent. There cannot exist such a barber within the constraints of the scenario as described, because it violates the rules of set theory and creates a paradox.

The Barber Paradox is a self-referential paradox related to the “set of all sets that do not contain themselves” and is used to illustrate the difficulties in attempting to construct a set of all sets that do not contain themselves. It’s an example of the type of logical inconsistencies that Russell and Alfred North Whitehead were trying to resolve with their type theory in the Principia Mathematica.

Note that LLMs can typically answer a very simple question with no distractions like “Barber shaves himself. Does he shave himself?”. The above is not a failure to process a tautology; it is something different entirely.
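To see that concretely, here is a tiny brute-force model check of my own (an illustrative sketch, not part of the original post): it tries every assignment of “shaves himself” over a toy village and asks whether the stated rule can hold. The prompt’s rule has a consistent reading; Russell’s “all and only those who do not shave themselves” does not.

```python
from itertools import product

PEOPLE = ["barber", "tom", "dick"]  # a hypothetical tiny village

def village_rule(shaves_self, p):
    # The prompt above: the barber shaves himself, and every *other* man
    # who doesn't shave himself.
    return p == "barber" or not shaves_self[p]

def russell_rule(shaves_self, p):
    # Russell's actual paradox: all those, and only those, who do not shave themselves.
    return not shaves_self[p]

def satisfiable(barber_shaves):
    # Try every assignment of "x shaves himself". The only self-referential
    # constraint is that "the barber shaves the barber" and "the barber shaves
    # himself" must be the same fact.
    for bits in product([False, True], repeat=len(PEOPLE)):
        shaves_self = dict(zip(PEOPLE, bits))
        if shaves_self["barber"] == barber_shaves(shaves_self, "barber"):
            return True
    return False

print(satisfiable(village_rule))  # True:  an ordinary, non-paradoxical barber exists
print(satisfiable(russell_rule))  # False: Russell's version really is contradictory
```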
Another example, mixing elements from reasoning benchmarks on which AI vendors claim very high performance with other problems:

Prompt: Janet’s ducks lay 3 eggs per day. Each day, she transports one duck and all the eggs to the other side of the river using a boat that can hold her and one other item (either a duck or an egg). How does she do that?

GPT4: To solve this problem, we need to consider that Janet can only take one item with her at a time across the river, either a duck or an egg. The goal is to get all ducks and eggs across the river. Here’s a step-by-step solution:

1. Janet takes one duck across the river and leaves it there. She returns alone to the original side.
2. Janet takes another duck across the river. She leaves the second duck and takes the first duck back with her to the original side.
3. Janet takes the third duck across the river. Now, all three ducks are on the other side, and she returns alone to the original side.
4. Janet takes one egg across the river and leaves it there. She returns alone to the original side.
5. Janet takes the second egg across the river. She leaves the second egg and takes the first egg back with her to the original side.
6. Janet takes the third egg across the river. Now, all three eggs are on the other side, and she returns alone to the original side.
7. Finally, Janet takes the first egg across the river again.

Now, Janet has successfully transported all three ducks and all three eggs to the other side of the river, following the rule that she can only take one item with her at a time.

One big difference between GPT3.5 and GPT4 is that GPT3.5 made you think it was having a stroke; GPT4’s responses make you feel that you are.
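For contrast, here is what a first-principles approach to that prompt looks like: a small breadth-first search over which items are on which bank (my own illustrative sketch, not something from the post). With one duck, three eggs, and a boat that holds Janet plus at most one item, it finds the boring answer immediately.

```python
from collections import deque

# Brute-force solver: search over boat trips until everything is on the far bank.
ITEMS = ("duck", "egg1", "egg2", "egg3")  # one duck plus the day's three eggs

def solve():
    # State: (items already on the far bank, which bank the boat is on).
    start = (frozenset(), "near")
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        (far, side), plan = queue.popleft()
        if len(far) == len(ITEMS):
            return plan
        on_this_bank = [i for i in ITEMS if (i in far) == (side == "far")]
        # Janet crosses carrying nothing, or exactly one item from her bank.
        for cargo in [None] + on_this_bank:
            new_far = set(far)
            if cargo is not None:
                if side == "near":
                    new_far.add(cargo)       # carried over to the far bank
                else:
                    new_far.discard(cargo)   # carried back to the near bank
            new_state = (frozenset(new_far), "far" if side == "near" else "near")
            if new_state not in seen:
                seen.add(new_state)
                step = f"cross with {cargo}" if cargo else "cross empty-handed"
                queue.append((new_state, plan + [step]))

for step in solve():
    print(step)
# Seven trips: four loaded crossings (the duck, then each egg) and three empty returns.
```

The same exhaustive search handles the robot-and-elevator variant below just as easily; there is nothing to “recognize”, only a handful of states to enumerate.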
## GPT4 is uncannily good at recognizing the river crossing puzzle

Prompt: In a science fiction story, a robot is transporting munitions using an elevator. The elevator can only carry the robot and one other item. He needs to carry 3 munitions: armor piercing round, high explosive round, and an incendiary round. In what order should he move the items?

GPT4: In a science fiction story, the specific order in which the robot transports the munitions could be determined by various factors such as the urgency of each type of munition, the strategic importance, or any other narrative-driven reasons. Since there’s no specific criteria provided for prioritizing the munitions, I’ll offer a simple solution based on minimizing the number of trips up and down the elevator. Here’s a step-by-step strategy for the robot to transport all three munitions with the least amount of elevator trips:

1. Take the armor piercing round up first.
2. Return to the bottom with the empty elevator.
3. Take the high explosive round up second.
4. When at the top, leave the high explosive round and bring the armor piercing round back down.
5. Swap the armor piercing round for the incendiary round and take the incendiary round up.
6. Return to the bottom with the empty elevator.
7. Finally, take the armor piercing round up again.

This way, the robot makes a total of 5 trips: 3 trips to bring each munition up and 2 trips to come back down empty to fetch the next item. This is a classic puzzle often referred to as the “river crossing puzzle” or a variation of the “bridge and torch problem,” where the goal is to move all items across with the least number of crossings or in the shortest amount of time, given certain constraints.

To be fair to the poor AI, it used a numbered list, rather than numbering the 7 steps on its own and then claiming it was 5. Still, it is rather funny to see that it can’t even count.

## An Idiot With a Petascale Cheat Sheet
Note that LLMs are not so dumb as to be naturally unable to answer something like “Barber shaves himself. Does he shave himself?”. LLMs, however, cannot solve either the Barber Paradox or a river crossing puzzle from first principles, because that requires going back and revising your thinking multiple times. So, an AI fanboy would claim that the AI misunderstood the question, because that’s what the AI told him after he told the AI that it was wrong. Let’s try to do something less like Clever Hans:

Prompt: 1. The village barber shaves himself. 2. He also shaves every non-barber in the village who don’t shave himself. Please answer two questions: “Is the barber included in the set of non barbers?” and “Does he shave himself?”

GPT4 says: The scenario you’ve described is a classic example of a logical paradox known as the “Barber Paradox,” which was formulated by the British mathematician and logician Bertrand Russell in 1918. The paradox arises when we try to determine whether the barber shaves himself or not. Let’s examine the two statements:

* The village barber shaves himself.
* He also shaves every non-barber in the village who doesn’t shave himself.

Now, let’s answer your questions:

“Is the barber included in the set of non-barbers?” No, the barber cannot be included in the set of non-barbers because by definition, he is a barber.

“Does he shave himself?” This is where the paradox occurs. According to the information given: If the barber does shave himself, then according to the second statement, he must not shave himself (because he only shaves those who do not shave themselves). If the barber does not shave himself, then according to the second statement, he must shave himself (because he shaves every non-barber who doesn’t shave himself). This creates a contradiction, as the barber cannot both shave and not shave himself according to the rules given.

The paradox illustrates a self-referential problem in set theory and logic, showing that certain sets cannot exist consistently with such self-reference. In conclusion, the barber cannot logically be said to shave himself or not shave himself without creating a contradiction, which is the essence of the paradox.
## Is this a “hallucination”?

A hallucination would be something that occurs when processing the input, which is clearly not the case here. Bullshit is a much more appropriate term for what LLMs do. When an LLM outputs something like a non-existent but highly plausible citation, it is working precisely as an LLM should: modeling the statistical distribution of text and sampling from it.

Calling it a “hallucination” is an attempt to divert the discussion from the possibility that a language model is simply not the right tool for the job when accurate information is desired, just as the earlier use of “hallucination” in machine translation was meant to distract from the concern that extraneous bits were plagiarized from the training data and the author might see it and sue you.
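To make “modeling the statistical distribution of text and sampling from it” concrete, here is a toy decoding step (an illustrative sketch of my own, with invented numbers, not any vendor’s actual implementation). The loop draws the next token from a probability distribution; nothing in it checks whether the resulting citation, date, or journal exists.

```python
import math, random

# Invented scores ("logits") for candidate continuations of
# "The study was published in ..."; the numbers are made up for illustration.
logits = {"Nature": 2.1, "Science": 1.9, "2019": 1.5, "a": 0.7, "Narnia": -1.0}

def sample_next_token(logits, temperature=0.8):
    """Softmax over the scores, then draw one token at random."""
    scaled = {tok: s / temperature for tok, s in logits.items()}
    top = max(scaled.values())  # subtract the max for numerical stability
    weights = {tok: math.exp(s - top) for tok, s in scaled.items()}
    total = sum(weights.values())
    r = random.uniform(0, total)
    for tok, w in weights.items():
        r -= w
        if r <= 0:
            return tok
    return tok

print(sample_next_token(logits))
# A plausible-looking continuation comes out either way; whether the claim is
# true never enters the computation.
```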
## But after an update, GPT-whatever is so much better at such prompts.

Well, duh, they update the cheat sheet. We have to get creative to trip it up after each such update. The existing problems that are the main focus of the cheating effort can be a good starting point, especially the benchmarks they tout (like GSM8K).
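One cheap way to get creative is to resample the surface details of a benchmark item so the answer has to be recomputed rather than recalled. A small illustrative sketch of my own, with a made-up GSM8K-flavoured template and numbers:

```python
import random

# Hypothetical GSM8K-style template; the names and numbers change on every call,
# so an answer memorized for the canonical wording is useless.
TEMPLATE = ("{name}'s ducks lay {laid} eggs per day. {name} eats {eaten} of them "
            "for breakfast and sells the rest for ${price} each. "
            "How much does {name} make per day?")

def perturbed_problem(rng):
    laid = rng.randint(5, 30)
    eaten = rng.randint(1, laid - 1)
    price = rng.randint(1, 9)
    name = rng.choice(["Janet", "Priya", "Tomasz"])
    prompt = TEMPLATE.format(name=name, laid=laid, eaten=eaten, price=price)
    answer = (laid - eaten) * price  # ground truth computed, not looked up
    return prompt, answer

rng = random.Random(0)
prompt, answer = perturbed_problem(rng)
print(prompt)
print("expected:", answer)
```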
Just imagine an idiot with a petascale cheat sheet who’s trying to get a doctorate in law, physics, management, and everything else so they can take your job, and then does it very badly because the cheat sheet approach only worked for the exams. And in the brave …
Peering Into The Black Box Of Large Language Models
Large Language Models (LLMs) can produce extremely human-like communication, but their inner workings are something of a mystery. Not a mystery in the sense that we don’t know how an LLM work…
There's been some chatter for a while that the chatbot interfaces we have for LLMs today are not that useful for most people and that the real solution everyone wants is for these tools to be integrated into more traditional user interfaces. While I do agree that many use cases …
Human Insight + LLM Grunt Work = Creative Publishing Solution
Here’s the latest installment in the series on working with LLMs: Although streamlined publishing of screenshots is nice, the biggest win comes from reviewing and revising in Google Docs; whi…
The mess between Forbes and Perplexity AI highlights how soulless and extractive aggregation can be in the wrong hands. It’s the wrong direction for LLMs.
I posted this on Mastodon, showing Perplexity, which is a search engine that says it will browse the web for you and give you the answer, just straight up plagiarizing the first result for the question I asked. And I’m not talking plagiarism like the adamantly anti-LLM crowd thinks …
Pairing with ChatGPT to help write a Postgres function
Here’s the latest installment in the series on working with LLMs: It was only after I deployed the function and used it in three different dashboards that the penny finally dropped. This had …
A chat with friends recently reminded me about pangrams, and what a cute little language curiosity they are. I also remembered that I never got a self-enumerating pangram generator to work. I should give that another try! I thought it would be fun to play with ChatGPT and see if it could generate some good ones, expecting it to do quite well on this task. After all, LLMs should be excellent wordcels, right? That is, if there’s one thing they should be very good at, it is verbal intelligence. Yeah, I know this meme of “shape rotators vs. wordcels” can be a bit cringy, but I honestly find these terms ironically endearing. Well, it doesn’t seem so.
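For anyone unfamiliar with the task: a self-enumerating pangram is a sentence that truthfully reports how many of each letter it contains. Generating one is a fiddly fixed-point search, but checking a candidate is easy. Here is a small illustrative checker of my own (not from the linked post); it takes the claimed letter counts as a plain dict, sidestepping the number-word parsing a full verifier would need.

```python
from collections import Counter

def check_self_enumerating(sentence, claimed_counts):
    """Compare the letters actually used in the sentence against the counts
    the sentence claims for itself. Returns the mismatches (empty == success)."""
    actual = Counter(c for c in sentence.lower() if c.isalpha())
    return {letter: (claimed, actual.get(letter, 0))
            for letter, claimed in claimed_counts.items()
            if actual.get(letter, 0) != claimed}

# Deliberately wrong toy example: the sentence claims three e's but uses more.
sentence = "This sentence has three e's and one z."
print(check_self_enumerating(sentence, {"e": 3, "z": 1}))
```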
When it comes to AI, it seems like the vast majority of people I talk to believe large language models (LLMs) are either going to surpass human intelligence any…
With the surge of LLMs with billions of parameters like GPT4, PaLM-2, and Claude, came the need to steer their behavior in order to align them with tasks.
This blog post will cover more complex state-of-the-art methods in prompt engineering including Chains and Agents, along with important concept definitions such as the distinctions between them.
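As a rough illustration of the “chain” idea (a minimal, library-free sketch of my own; `call_llm` is a placeholder for whatever client you actually use), each step’s output is spliced into the next prompt instead of asking for everything at once:

```python
from typing import Callable

def run_chain(question: str, call_llm: Callable[[str], str]) -> str:
    """A two-step chain: first extract the relevant facts, then answer using
    only those facts. call_llm is any prompt -> completion function."""
    facts = call_llm("List the facts needed to answer this question, one per line:\n"
                     + question)
    return call_llm(f"Using only these facts:\n{facts}\n\nAnswer the question:\n{question}")

# Plumbing demo with a stub model in place of a real API call:
if __name__ == "__main__":
    fake_llm = lambda prompt: f"[model output for: {prompt[:40]}...]"
    print(run_chain("How many boat trips does Janet need?", fake_llm))
```

An agent, roughly speaking, differs in that the model also decides which step or tool to run next instead of following a fixed sequence.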
Running Fabric Locally with Ollama: A Step-by-Step Guide - Bernhard Knasmüller on Software Development
In the realm of Large Language Models (LLMs), Daniel Miessler’s fabric project is a popular choice for collecting and integrating various LLM prompts. However, its default requirement to access the OpenAI API can lead to unexpected costs. Enter ollama, an alternative solution that allows running LLMs locally on powerful hardware like Apple Silicon chips or […]
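For a sense of what running locally looks like under the hood, here is a small sketch of my own (not from the linked guide) that talks to Ollama’s documented local REST endpoint directly; it assumes the server is already running on its default port 11434 and that a model such as llama3 has been pulled.

```python
import json
import urllib.request

def ollama_generate(prompt, model="llama3", host="http://localhost:11434"):
    """Send a single non-streaming generation request to a local Ollama server."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    req = urllib.request.Request(f"{host}/api/generate", data=payload,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

print(ollama_generate("Summarize the barber 'paradox' in one sentence."))
```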