The Illustrated GPT-2 (Visualizing Transformer Language Models)
Discussions: Hacker News (64 points, 3 comments), Reddit r/MachineLearning (219 points, 18 comments). Translations: Simplified Chinese, French, Korean, Russian, Turkish. This year, we saw a dazzling application of machine learning. The OpenAI GPT-2 exhibited an impressive ability to write coherent and passionate essays that exceed what we anticipated current language models could produce. The GPT-2 wasn’t a particularly novel architecture; its architecture is very similar to the decoder-only transformer. The GPT-2 was, however, a very large, transformer-based language model trained on a massive dataset. In this post, we’ll look at the architecture that enabled the model to produce its results. We will go into the depths of its self-attention layer. And then we’ll look at applications for the decoder-only transformer beyond language modeling. My goal here is to also supplement my earlier post, The Illustrated Transformer, with more visuals explaining the inner workings of transformers, and how they’ve evolved since the original paper. My hope is that this visual language will make it easier to explain later Transformer-based models as their inner workings continue to evolve.
·jalammar.github.io·
The Revenge of the Cataloguers
Over the past 15 years or so, libraries around the world have de-emphasized cataloguing. While budgetary concerns and technological efficien...
·go-to-hellman.blogspot.com·
Beam
Press a hotkey to chat anywhere on your Mac. No subscription fees. No logins. Your own API key.
·getbeam.ai·
Dust - Secure AI assistant with your company's knowledge
Dust is an AI assistant that safely brings the best large language models, continuously updated company knowledge, powerful collaboration applications, and an extensible platform to your team's fingertips.
·dust.tt·
Introducing Stable Zero123: Quality 3D Object Generation from Single Images — Stability AI
Stable Zero123 is an AI-powered model for generating novel views of 3D objects with improved quality. Released for non-commercial and research purposes, it uses an improved dataset and elevation conditioning for higher-quality predictions. Using the improved open-source code, researchers can use thi
·stability.ai·
FunSearch: Making new discoveries in mathematical sciences using Large Language Models
We introduce FunSearch, a method for searching for “functions” written in computer code, and find new solutions in mathematics and computer science. FunSearch works by pairing a pre-trained LLM, whose goal is to provide creative solutions in the form of computer code, with an automated “evaluator”, which guards against hallucinations and incorrect ideas.
·deepmind.google·
Data exfiltration from Writer.com with indirect prompt injection
This is a nasty one. Writer.com call themselves a "secure enterprise generative AI platform", offering collaborative generative AI writing assistance and question answering that can integrate with your company's private …
·simonwillison.net·
Mixtral 8X7B — Deploying an *Open* AI Agent
Mistral AI's new model, Mixtral 8x7B, is pretty impressive. We'll see how to get set up and deploy Mixtral 8X7B, the prompt format it requires, and how it performs when being used as an Agent; we even add in some Mixtral RAG at the end. As a bit of a spoiler, Mixtral is probably the first open-source LLM that is truly very very good. I say this considering the following key points:
- Benchmarks show it to perform better than GPT-3.5.
- My own testing shows Mixtral to be the first open weights model we can reliably use as an agent.
- Due to MoE architecture it is very fast given its size. If you can afford to run it on 2x A100s, latency is good enough for chatbot use cases.
📕 Mixtral 8X7B Page (I'll be publishing soon): https://www.pinecone.io/learn/
📌 Code Notebook: https://github.com/pinecone-io/examples/blob/master/learn/generation/llm-field-guide/mistral-ai/mixtral-8x7b/00-mixtral-8x7b-agent.ipynb
🌲 Subscribe for Latest Articles and Videos: https://www.pinecone.io/newsletter-signup/
👋🏼 AI Dev: https://aurelio.ai
👾 Discord: https://discord.gg/c5QtDB9RAP
Twitter: https://twitter.com/jamescalam
LinkedIn: https://www.linkedin.com/in/jamescalam/
00:00 Mixtral 8X7B is better than GPT 3.5
00:50 Deploying Mixtral 8x7B
03:21 Mixtral Code Setup
08:17 Using Mixtral Instructions
10:04 Mixtral Special Tokens
13:29 Parsing Multiple Agent Tools
14:28 RAG with Mixtral
17:01 Final Thoughts on Mixtral
#artificialintelligence #nlp #ai #chatbot #opensource
·youtube.com·
elfvingralf/macOSpilot-ai-assistant: Voice + Vision powered AI assistant that answers questions about any application, in context and in audio.
Voice + Vision powered AI assistant that answers questions about any application, in context and in audio.
·github.com·
Deep Learning - Foundations and Concepts
This book offers a comprehensive introduction to the central ideas that underpin deep learning. It is intended both for newcomers to machine learning and for those already experienced in the field.
·bishopbook.com·
Zero to Hero LLMs with M3 Max BEAST
M3 Max is a Machine Learning BEAST. So I took it for a spin with some LLMs running locally.
Temperature/fan on your Mac: https://www.tunabellysoftware.com/tgpro/index.php?fpr=alex (affiliate link)
Run Windows on a Mac: https://prf.hn/click/camref:1100libNI (affiliate) Use COUPON: ZISKIND20
🛒 Gear Links 🛒
* 🍏💥 New MacBook Air M1 Deal: https://amzn.to/3S59ID8
* 💻🔄 Renewed MacBook Air M1 Deal: https://amzn.to/45K1Gmk
* 🎧⚡ Great 40Gbps T4 enclosure: https://amzn.to/3JNwBGW
* 🛠️🚀 My nvme ssd: https://amzn.to/3YLEySo
* 📦🎮 My gear: https://www.amazon.com/shop/alexziskind
🎥 Related Videos 🎥
* 🌗 RAM torture test on Mac - https://youtu.be/l3zIwPgan7M
* 🛠️ Set up Conda on Mac - https://youtu.be/2Acht_5_HTo
* 👨‍💻 15" MacBook Air | developer's dream - https://youtu.be/A1IOZUCTOkM
* 🤖 INSANE Machine Learning on Neural Engine - https://youtu.be/Y2FOUg_jo7k
* 💻 M2 MacBook Air and temps - https://youtu.be/R7F-TxEukdY
* 💰 This is what spending more on a MacBook Pro gets you - https://youtu.be/iLHrYuQjKPU
* 🛠️ Developer productivity Playlist - https://www.youtube.com/playlist?list=PLPwbI_iIX3aQCRdFGM7j4TY_7STfv2aXX
👨‍💻 Primes - https://github.com/PlummersSoftwareLLC/Primes
💻 MacBooks in this video: M3 Max (16/40) 16" MacBook Pro 64GB/2TB
#m3max #macbook #macbookpro
📱 LET'S CONNECT ON SOCIAL MEDIA
ALEX ON TWITTER: https://twitter.com/digitalix
·youtube.com·
danny-avila/LibreChat: Enhanced ChatGPT Clone: Features OpenAI, GPT-4 Vision, Bing, Anthropic, OpenRouter, PaLM 2, AI model switching, message search, langchain, DALL-E-3, ChatGPT Plugins, OpenAI Functions, Secure Multi-User System, Presets, completely open-source for self-hosting. More features in development
Enhanced ChatGPT Clone: Features OpenAI, GPT-4 Vision, Bing, Anthropic, OpenRouter, PaLM 2, AI model switching, message search, langchain, DALL-E-3, ChatGPT Plugins, OpenAI Functions, Secure Multi-...
·github.com·
Generating Molecular Conformer Fields
In this paper we tackle the problem of generating conformers of a molecule in 3D space given its molecular graph. We parameterize these conformers as continuous functions that map elements from the molecular graph to points in 3D space. We then formulate the problem of learning to generate conformers as learning a distribution over these functions using a diffusion generative model, called Molecular Conformer Fields (MCF). Our approach is simple and scalable, and achieves state-of-the-art performance on challenging molecular conformer generation benchmarks while making no assumptions about the explicit structure of molecules (e.g. modeling torsional angles). MCF represents an advance in extending diffusion models to handle complex scientific problems in a conceptually simple, scalable and effective manner.
·arxiv.org·