So you wanna build a local RAG?
GitHub - ivanfioravanti/qwen-image-mps: Qwen Image models through MPS
Qwen Image models through MPS.
5 Thoughts on Kimi K2 Thinking
Quick thoughts on another fantastic open model from a rapidly rising Chinese lab.
My local LLM turns any file into a mind map and it’s actually brilliant
Turns PDFs into diagrams faster than I could think.
I hooked Obsidian to a local LLM and it beats NotebookLM at its own game
My notes now talk back and it’s terrifyingly useful.
Testing VLMs and LLMs for robotics w/ the Jetson Thor devkit
Exploring the Jetson Thor devkit w/ some local LLMs and VLMs. More info on the Jetson Thor Devkit: https://nvda.ws/45xIU4B Neural Networks from Scratch book: h...
Cline + LM Studio: the local coding stack with Qwen3 Coder 30B
An autonomous AI coding assistant for VS Code with Plan/Act modes, terminal execution, file editing, and Model Context Protocol for custom tools
llama.cpp guide: running gpt-oss with llama.cpp
Really useful official guide to running the OpenAI gpt-oss models using llama-server from llama.cpp - which provides an OpenAI-compatible localhost API and a neat web interface for interacting with the …
What's the strongest AI model you can train on a laptop in five minutes?
What’s the strongest model I can train on my MacBook Pro in five minutes? I’ll give the answer upfront: the best 5-minute model I could train was a ~1.8M-param…
qwen-image-mps
Ivan Fioravanti built this Python CLI script for running the Qwen/Qwen-Image image generation model on an Apple silicon Mac, optionally using the Qwen-Image-Lightning LoRA to dramatically speed up generation. Ivan …
OpenAI’s GPT-OSS Is Already Old News
That’s on OpenAI. I don’t schedule their product releases. Since it takes several days to gather my reports on new models, we are doing our coverage…
OpenAI's new open-source model is basically Phi-5
OpenAI just released its first ever open-source large language models, called gpt-oss-120b and gpt-oss-20b. You can talk to them here. Are they good models…
OpenAI’s new open weight (Apache 2) models are really good
The long promised OpenAI open weight models are here, and they are very impressive. They’re available under proper open source licenses—Apache 2.0—and come in two sizes, 120B and 20B. OpenAI’s …
Trying out Qwen3 Coder Flash using LM Studio and Open WebUI and LLM
Qwen just released their sixth model(!) for this July called Qwen3-Coder-30B-A3B-Instruct—listed as Qwen3-Coder-Flash in their chat.qwen.ai interface. It’s 30.5B total parameters with 3.3B active at any one time. This means …
SmolLM3: smol, multilingual, long-context reasoner
The American DeepSeek Project
What I think the next goal for the open-source AI community is.
GraphRAG Local Setup Via vLLM and Ollama: A Detailed Integration Guide.
Introduction to GraphRAG
Introducing Gemma 3n: The developer guide
Learn how to build with Gemma 3n, a mobile-first architecture, MatFormer technology, Per-Layer Embeddings, and new audio and vision encoders.
Introducing Gemma 3n: The developer guide
Extremely consequential new open weights model release from Google today: Multimodal by design: Gemma 3n natively supports image, audio, video, and text inputs and text outputs. Optimized for on-device: Engineered …
Model Context Protocol (MCP) using Ollama
MCP Servers using Local LLMs tutorial
Model Context Protocol(MCP) with Ollama and Llama 3 : A Step-by-Step Guide — Part 2
In my previous article, we explored Model Context Protocol (MCP), a standardized way for LLMs to invoke tools or APIs with structured…
Introducing the unified multi-modal MLX engine architecture in LM Studio
Leveraging `mlx-lm` and `mlx-vlm` to achieve unified multi-modal LLM inference in LM Studio's `mlx-engine`.
ollama-ocr
OCR package using Ollama vision language models.
petermg/Chatterbox-TTS-Extended: Modified version of Chatterbox that accepts text files as input and no character restrictions
Passing Images to a Vision-Language Model in Ollama | by Manyi | Apr,…
Trying out llama.cpp’s new vision support
This llama.cpp server vision support via libmtmd pull request—via Hacker News—was merged earlier today. The PR finally adds full support for vision models to the excellent llama.cpp project. It’s documented …
An Intro to RAG with sqlite-vec & llamafile!
A brief introduction to using llamafile (a single-file tool for working with large language models) and sqlite-vec (a SQLite extension for vector search) to build a Retrieval-Augmented Generation (RAG) application.
This was a live online event hosted on Dec 17th 2024 in the Mozilla AI Discord. Join us for the next event at https://discord.gg/Ve7WeCJFXk
LINKS:
- Doc w/ links to all mentioned projects/blog posts: https://docs.google.com/document/d/17GYLzlGUyJF9EDeaa1P-dFFZnkwxATnBcg5KnNgpvPE/edit?usp=sharing
- Slides: https://docs.google.com/presentation/d/14Szda-VnZzepL-1U9Nb7sXQg_TTf56OQ-KtUIMQ5xug/edit?usp=sharing
Olow304/memvid: Video-based AI memory library. Store millions of text chunks in MP4 files with lightning-fast semantic search. No database needed.
Chatterbox-TTS Apple Silicon - a Hugging Face Space by Jimmi42
Upload a reference audio file and enter text to create audio in that voice. The app automatically chunks long text and uses Apple Silicon's GPU for faster processing.
The best open source OCR models