So you wanna build a local RAG?
You Should Write An Agent
They're like riding a bike: easy, and you don't get it until you try.
Understanding Transformers Using A Minimal Example
Visualizing the internal state of a Transformer model
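The core operation such walkthroughs trace, scaled dot-product attention, fits in a few lines of plain Python. This is a toy single-query sketch for orientation, not code from the article:

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, keys, values):
    """Scaled dot-product attention for one query vector.

    scores[i] = (query . keys[i]) / sqrt(d)
    output    = sum_i softmax(scores)[i] * values[i]
    """
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    weights = softmax(scores)
    out = [sum(w * v[j] for w, v in zip(weights, values))
           for j in range(len(values[0]))]
    return weights, out
```

The attention weights always sum to 1, and a query attends most strongly to the key it is most aligned with; that is exactly the pattern these visualizations display.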
llama.cpp guide: running gpt-oss with llama.cpp
Really useful official guide to running the OpenAI gpt-oss models using llama-server from llama.cpp - which provides an OpenAI-compatible localhost API and a neat web interface for interacting with the …
DSPy: Replacing LLM Musings with Modules
LLMs have unlocked powerful new capabilities, but the dominant pattern for using them, writing increasingly desperate prompts, is fragile, unscalable, and of...
LLM Embeddings Explained: A Visual and Intuitive Guide - a Hugging Face Space by hesamation
This app explains how language models transform text into meaningful representations through embeddings. It provides a visual guide to help you understand traditional and modern language model tech...
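The key intuition, that nearby vectors mean related text, reduces to cosine similarity. A minimal sketch with made-up 3-d "embeddings" (illustrative values, not output from any real model):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy vectors: "king" and "queen" point the same way; "apple" points elsewhere.
king = [0.9, 0.8, 0.1]
queen = [0.85, 0.82, 0.12]
apple = [0.1, 0.2, 0.95]
```

Semantically related words end up with higher cosine similarity, which is the property vector search and RAG systems build on.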
DSPy: Build and Optimize Agentic Apps - DeepLearning.AI
Build, debug, and optimize AI agents using DSPy and MLflow.
DSPy Tutorial | Build AI Agents with Python (Fundamentals)
Complete introduction to the simplest, most efficient, and yet most powerful way I’ve found to create AI agents, AI workflows, and AI programs in Python. Instead of manual prompting, we use automatic prompt optimization with DSPy and its concept of signatures.
Timestamps / Outline:
00:00 How to Call LLMs from Python, the Simple Way
0:21 Declare Your First AI Program (in 1 LOC)
2:24 Setting Up Your Large Language Model Backend
6:10 Program 2: Processing Images
9:14 Deeper Dive into Signatures
14:01 Program 3: Processing Entities from Paragraphs
19:19 Fetching text from wikipedia with Attachments
20:39 Setting Up a DataFrame
22:22 Apply Gemini Flash lite to each paragraph
23:02 Creating a Synthetic Gold Set
24:35 Quick Baseline Evaluation
25:11 Creating DSPy Examples
25:55 Evaluation Metric
26:10 Prompt Optimization with DSPy
29:10 Final Evaluation
Follow Max:
Twitter: [https://x.com/MaximeRivest](https://x.com/MaximeRivest)
GitHub: [https://github.com/MaximeRivest](https://github.com/MaximeRivest)
Links to Relevant Repositories:
Attachments: [https://github.com/MaximeRivest/attachments](https://github.com/MaximeRivest/attachments)
DSPy: [https://github.com/stanfordnlp/dspy](https://github.com/stanfordnlp/dspy)
FunnyDSPy: [https://github.com/MaximeRivest/funnydspy](https://github.com/MaximeRivest/funnydspy)
Docs:
[https://dspy.ai/](https://dspy.ai/)
[https://maximerivest.github.io/attachments/](https://maximerivest.github.io/attachments/)
If you’re new to my channel, my name is Maxime Rivest. I’m an Applied AI Engineer and Data Engineer. I like to educate people on the best tools in Data Analytics and AI Engineering.
Max
Transformer by hand ✍️
zebbern/claude-code-guide: A full guide to Claude Code tips and tricks, how to get the most out of Claude Code, and a catalogue of every command possible, including hidden ones!
Become a command-line superhero with Simon Willison’s llm tool
Christopher Smith ran a mini hackathon in Albany New York at the weekend around uses of my LLM - the first in-person event I'm aware of dedicated to that project! …
Proximal Policy Optimization (PPO) for LLMs Explained Intuitively
In this video, I break down Proximal Policy Optimization (PPO) from first principles, without assuming prior knowledge of Reinforcement Learning. By the end, you’ll understand the core RL building blocks that led to PPO, including:
🔵 Policy Gradient
🔵 Actor-Critic Models
🔵 The Value Function
🔵 The Generalized Advantage Estimate
In the LLM world, PPO was used to train reasoning models like OpenAI's o1/o3, and presumably Claude 3.7, Grok 3, etc. It's the backbone of Reinforcement Learning from Human Feedback (RLHF), which helps align AI models with human preferences, and of Reinforcement Learning with Verifiable Rewards (RLVR), which gives LLMs reasoning abilities.
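The clipping trick at the heart of PPO can be written as a single function. This is a generic sketch of the clipped surrogate objective, not code from the video:

```python
def ppo_objective(ratio, advantage, eps=0.2):
    """PPO's clipped surrogate objective for one action.

    ratio     = pi_new(a|s) / pi_old(a|s)  (importance-sampling ratio)
    advantage = A_t, the advantage estimate
    Returns min(ratio * A_t, clip(ratio, 1 - eps, 1 + eps) * A_t),
    with A_t inside the min, so large policy updates are never rewarded.
    """
    clipped_ratio = max(1.0 - eps, min(1.0 + eps, ratio))
    return min(ratio * advantage, clipped_ratio * advantage)
```

When the ratio strays outside [1 - eps, 1 + eps] in the direction the advantage favors, the clipped term wins the min and the gradient incentive disappears, which is what keeps each policy update "proximal".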
Papers:
- PPO paper: https://arxiv.org/pdf/1707.06347
- GAE paper: https://arxiv.org/pdf/1506.02438
- TRPO paper: https://arxiv.org/pdf/1502.05477
Well-written blogposts:
- https://danieltakeshi.github.io/2017/04/02/notes-on-the-generalized-advantage-estimation-paper/
- https://huggingface.co/blog/NormalUhr/rlhf-pipeline
- https://iclr-blog-track.github.io/2022/03/25/ppo-implementation-details/
Implementations:
- (Original) OpenAI Baselines: https://github.com/openai/baselines/blob/ea25b9e8b234e6ee1bca43083f8f3cf974143998/baselines/ppo2
- Hugging Face: https://github.com/huggingface/trl/blob/main/trl/trainer/ppo_trainer.py
- Hugging Face docs: https://huggingface.co/docs/trl/main/en/ppo_trainer
Mother of all RL books (Barto & Sutton):
http://incompleteideas.net/book/RLbook2020.pdf
00:00 Intro
01:21 RL for LLMs
05:53 Policy Gradient
09:23 The Value Function
12:14 Generalized Advantage Estimate
17:17 End-to-end Training Algorithm
18:23 Importance Sampling
20:02 PPO Clipping
21:36 Outro
Special thanks to Anish Tondwalkar for discussing some of these concepts with me.
Note: At 21:10, A_t should have been inside the min. Thanks @t.w.7065 for catching this.
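The Generalized Advantage Estimate covered in the video reduces to a short backward recursion over TD errors. A generic sketch, not code from the video:

```python
def gae(rewards, values, gamma=0.99, lam=0.95):
    """Generalized Advantage Estimation.

    delta_t = r_t + gamma * V(s_{t+1}) - V(s_t)   (TD error)
    A_t     = delta_t + gamma * lam * A_{t+1}      (backward recursion)
    `values` must have one extra entry: the value of the state after the
    final reward (0.0 for a terminal state).
    """
    advantages = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        running = delta + gamma * lam * running
        advantages[t] = running
    return advantages
```

With gamma = lam = 1 and a zero value function, each A_t collapses to the plain sum of future rewards, which is a handy sanity check on any implementation.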
Copilot VS Code Customizations
VS Code AI Customization: Learn to use custom instructions, prompt files, and custom chat modes to personalize AI code generation, reviews, and chat responses.
GraphRAG Local Setup Via vLLM and Ollama : A Detailed Integration Guide.
Introduction to GraphRAG
EASIEST Way to Fine-Tune a LLM and Use It With Ollama
Today, you'll learn how to fine-tune LLMs in Python for use in Ollama. I'll walk you through it step by step, give you all the code and show you how to test it out.
⏳ Timestamps ⏳
00:00 | What is Fine-Tuning?
02:25 | Gathering Data
05:52 | Google Colab Setup
09:17 | Fine-Tuning with Unsloth
16:58 | Model Setup in Ollama
🎞 Video Resources 🎞
Code in this video: https://drive.google.com/drive/folders/1p4ZilsJsdxB5lH6ZBMdIEJBt0WVUMsDq?usp=sharing
Google Colab Notebook: https://colab.research.google.com/drive/1NsRGmHVupulRzsq9iUTx8V8WgTSpO_04?usp=sharing
QWEN-3: EASIEST WAY TO FINE-TUNE WITH REASONING 🙌
Learn how to fine‑tune Qwen‑3‑14B on your own data—with LoRA adapters, Unsloth’s 4‑bit quantization, and just 12 GB of VRAM—while preserving its chain‑of‑thought reasoning. I’ll walk you through dataset prep, the key hyper‑parameters that prevent catastrophic forgetting, and the exact Colab notebook to get you running in minutes. Build a lightweight, reasoning‑ready Qwen‑3 model tailored to your project today!
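LoRA's core trick, adding a trainable low-rank update B·A to a frozen weight matrix, is small enough to sketch directly. Toy sizes and plain Python here, not the notebook's Unsloth code:

```python
def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def lora_forward(W, A, B, x, alpha=16, r=2):
    """y = (W + (alpha / r) * B @ A) @ x

    W: frozen pretrained weights (d_out x d_in)
    A: trainable (r x d_in), B: trainable (d_out x r) -- B starts at zero,
    so training begins from the unchanged pretrained model.
    """
    scale = alpha / r
    BA = matmul(B, A)  # low-rank update, d_out x d_in
    W_eff = [[w + scale * d for w, d in zip(w_row, d_row)]
             for w_row, d_row in zip(W, BA)]
    return [sum(w * xi for w, xi in zip(row, x)) for row in W_eff]
```

Only A and B (2 * r * d parameters instead of d * d) receive gradients, which is why a 14B model can be tuned in 12 GB of VRAM when combined with 4-bit quantization of the frozen weights.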
LINKS:
https://qwenlm.github.io/blog/qwen3/
https://docs.unsloth.ai/basics/unsloth-dynamic-2.0-ggufs
https://huggingface.co/datasets/unsloth/OpenMathReasoning-mini
https://docs.unsloth.ai/basics/qwen3-how-to-run-and-fine-tune
https://huggingface.co/datasets/mlabonne/FineTome-100k
https://docs.unsloth.ai/get-started/fine-tuning-guide
https://arxiv.org/html/2308.08747v5
https://heidloff.net/article/efficient-fine-tuning-lora/
NOTEBOOK: https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Qwen3_(14B)-Reasoning-Conversational.ipynb
Fine-tuning Playlist: https://www.youtube.com/playlist?list=PLVEEucA9MYhPjLFhcIoNxw8FkN28-ixAn
Website: https://engineerprompt.ai/
RAG Beyond Basics Course:
https://prompt-s-site.thinkific.com/courses/rag
Fine-Tuning Qwen-3 Models: Step-by-Step Guide
00:00 Introduction to Fine-Tuning Qwen-3
01:24 Understanding Catastrophic Forgetting and LoRA Adapters
03:06 Installing and Using Unsloth for Fine-Tuning
04:19 Code Walkthrough: Preparing Your Dataset
07:14 Combining Reasoning and Non-Reasoning Datasets
09:48 Prompt Templates and Fine-Tuning
16:13 Inference and Hyperparameter Settings
18:11 Saving and Loading LoRA Adapters
Using AI Right Now: A Quick Guide
Which AIs to use, and how to use them
copilot-instructions.md has helped me so much. : r/ChatGPTCoding
Your plan MUST include:
- All functions/sections that need modification
- The order in which changes should be applied
- Dependencies between changes
- Estimated number of separate edits required
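A minimal `.github/copilot-instructions.md` along those lines might look like this (hypothetical content, for illustration only):

```markdown
# Copilot Instructions

Before making any code changes, produce a plan. Your plan MUST include:

- All functions/sections that need modification
- The order in which changes should be applied
- Dependencies between changes
- Estimated number of separate edits required

Wait for approval of the plan before editing any files.
```

Copilot reads this file automatically for every chat request in the repository, so the plan requirement no longer has to be repeated in each prompt.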
Smaller prompts, better answers with GitHub Copilot Custom Instructions
Working with GitHub Copilot in VS Code amps up your efficiency as a programmer - but did you know that adding a simple markdown file can boost this efficiency even more, while *also* decreasing the size of your prompt? Custom Instructions can help you and your team do so much more with GitHub Copilot, and @rconery will show you how in this video.
🔎 Chapters:
00:12 Simple, automatic instructions
02:07 Custom Git commit messages
03:26 Customizing Copilot functionality in VS Code
05:00 Going all in with markdown files as instructions
🔗 Links:
Get Copilot: https://aka.ms/get-copilot
Instruction Snippets for JSONC: https://gist.github.com/robconery/f93d016ace16feb7156f9b7905f3f499
🎙️ Featuring: @rconery
MCP…. So What’s That All About?
✅ Learn how to build robust and scalable software architecture: https://arjan.codes/checklist.
Want your AI tools to actually *do* something? In this video, I’ll show you how to integrate external tools with language models using **MCP (Model Context Protocol)**. You’ll learn two common architecture patterns, see real code examples, and get tips on keeping your setup clean and scalable. Whether you’re building for Claude, ChatGPT, or any other LLM—this is how you connect your backend to AI.
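Under the hood, MCP is JSON-RPC 2.0: a tool invocation is just a structured message. A sketch of the wire format, using a hypothetical `get_transcript` tool name:

```python
import json

# What a client sends an MCP server to invoke a tool
# (method and params shape per the MCP spec; tool name is made up):
tool_call = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "get_transcript",
        "arguments": {"video_id": "abc123"},
    },
}

wire_message = json.dumps(tool_call)
```

The server replies with a JSON-RPC result carrying the tool's output, so any LLM client that speaks the protocol can use any conforming server, which is the whole point of standardizing this layer.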
🔥 GitHub Repository: https://git.arjan.codes/2025/mcp-server.
🎓 ArjanCodes Courses: https://www.arjancodes.com/courses.
🔖 Chapters:
0:00 Intro
0:46 What is MCP?
3:14 YouTube MCP Version 1
7:58 YouTube MCP Version 2
12:18 Final Thoughts
Model Context Protocol (MCP) using Ollama
MCP Servers using Local LLMs tutorial
Model Context Protocol (MCP) with Ollama and Llama 3: A Step-by-Step Guide — Part 2
In my previous article, we explored Model Context Protocol (MCP) — a standardized way for LLMs to invoke tools or APIs with structured…
Vector Search RAG Tutorial – Combine Your Data with LLMs with Advanced Search
Learn how to use vector search and embeddings to easily combine your data with large language models like GPT-4. You will first learn the concepts and then create three projects.
✏️ Course developed by Beau Carnes.
💻 Code: https://github.com/beaucarnes/vector-search-tutorial
🔗 Access MongoDB Atlas: https://cloud.mongodb.com/
🏗️ MongoDB provided a grant to make this course possible.
⭐️ Contents ⭐️
⌨️ (00:00) Introduction
⌨️ (01:18) What are vector embeddings?
⌨️ (02:39) What is vector search?
⌨️ (03:40) MongoDB Atlas vector search
⌨️ (04:30) Project 1: Semantic search for movie database
⌨️ (32:55) Project 2: RAG with Atlas Vector Search, LangChain, OpenAI
⌨️ (54:36) Project 3: Chatbot connected to your documentation
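The retrieve-then-generate loop these projects implement can be reduced to a few lines. A schematic sketch with a stand-in in-memory retriever; the course itself uses MongoDB Atlas Vector Search and real embeddings:

```python
def retrieve(query_vec, docs, top_k=2):
    """Rank documents by dot-product similarity to the query vector."""
    scored = sorted(docs,
                    key=lambda d: -sum(q * v for q, v in zip(query_vec, d["vec"])))
    return scored[:top_k]

def build_prompt(question, retrieved):
    """Stuff the retrieved passages into a grounded prompt for the LLM."""
    context = "\n".join(d["text"] for d in retrieved)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```

Everything else in a RAG stack (embedding models, vector indexes, chunking) exists to make these two steps fast and accurate at scale.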
How to Fine Tune your own LLM using LoRA (on a CUSTOM dataset!)
That gameboy blender animation...took 6 hours to render 😅. Anyway, had a ton of fun coding this up and finally getting back to some proper ML. I've been thi...
A practical guide to building agents
anthropics/prompt-eng-interactive-tutorial: Anthropic's Interactive Prompt Engineering Tutorial
Handwritten Text Recognition using OCR
In this article, we carry out handwritten text recognition using OCR. We fine-tune the TrOCR model on the GNHK dataset.
How sqlite-vec Works for Storing and Querying Vector Embeddings
Learn how `sqlite-vec` turns SQLite into a fast, embedded vector search engine. With support for float32, int8, and bit vectors, optimized distance metrics, and native SQL integration, it's ideal for offline AI, semantic search, and lightweight ML apps. This post walks through how it works and why it's surprisingly powerful.
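What a `sqlite-vec` KNN query (`MATCH ... ORDER BY distance LIMIT k`) computes can be approximated in plain Python. A brute-force sketch of the idea, not `sqlite-vec`'s actual implementation:

```python
import math

def l2_distance(a, b):
    """Euclidean (L2) distance, one of the metrics sqlite-vec supports."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn(query, vectors, k=3):
    """Return the k (rowid, distance) pairs nearest to `query`,
    like `SELECT rowid, distance ... ORDER BY distance LIMIT k`."""
    scored = [(rowid, l2_distance(query, vec)) for rowid, vec in vectors.items()]
    return sorted(scored, key=lambda pair: pair[1])[:k]
```

`sqlite-vec` does this scan in optimized C over packed float32, int8, or bit vectors, which is why it stays fast enough for embedded and offline use without a separate vector database.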
Finding the Best Open-Source Embedding Model for RAG
Looking for the best open-source embedding model for your RAG app? We share a comparison workflow so you can stop paying the OpenAI tax.
Unlock Gemma 3's Multi Image Magic
🎬 Ever wondered how AI could turn your life into a documentary? Watch as I create a seemingly professional documentary about myself in minutes using Gemma 3, Ollama, and ElevenLabs - no film crew needed!
🎯 In this video, you'll learn:
• How to use Gemma 3's multimodal capabilities with multiple images
• Building a simple CLI app with Deno/TypeScript for image processing
• Working with n8n workflows for AI integration
• Creating convincing AI-generated narratives with Ollama
• Complete workflow from capture to final video production
⏱️ Timestamps:
00:00 - Start
00:28 - I'm in a Documentary
01:56 - Gemma3
02:13 What's new in Gemma 3 with Ollama
02:26 - Tell a story with many images
03:54 - Creating the app with Windsurf
04:49 - It's not in x language
05:19 - Let's look at the code
08:32 - The backend in n8n
🛠️ Tools & Resources Mentioned:
• Gemma 3 27b
• Ollama (https://ollama.com)
• ElevenLabs (https://try.elevenlabs.io/tvlst)
• n8n
• Deno/TypeScript
Want to create your own AI-powered content? Drop a comment below with your ideas or questions!