AI as Algorithmic Thatcherism

AI/ML
Frooodle/Stirling-PDF: locally hosted web application that allows you to perform various operations on PDF files
made with AI
"Attention", "Transformers", in Neural Network "Large Language Models"
GroqChat
A GroqLabs AI Language Interface
apple/ml-ferret
Contribute to apple/ml-ferret development by creating an account on GitHub.
Suno AI
We are building a future where anyone can make great music. No instrument needed, just imagination. From your mind to music.
Building a graph convolutional network for molecular property prediction 978b0ae10ec4
Build a search engine, not a vector DB
If you want to build a RAG-based tool, first build search.
explosion/curated-transformers: 🤖 A PyTorch library of curated Transformer models and their composable components
SJTU-IPADS/PowerInfer: High-speed Large Language Model Serving on PCs with Consumer-grade GPUs
High-speed Large Language Model Serving on PCs with Consumer-grade GPUs - SJTU-IPADS/PowerInfer: High-speed Large Language Model Serving on PCs with Consumer-grade GPUs
Apple’s latest AI research could completely transform your iPhone
Apple researchers have introduced new techniques to create photorealistic 3D avatars from video and enable advanced AI systems to run efficiently on devices with limited memory, such as an iPhone or iPad.
Zoo: Introducing Text-to-CAD
How we built “Mistral 7B Fine-Tune Optimized,” the best 7B model for fine-tuning - OpenPipe
Convert expensive LLM prompts into fast, cheap fine-tuned models
Artificial intelligence can find your location in photos, worrying privacy experts
Three Stanford graduate students built an AI tool that can find a location by looking at pictures. Civil rights advocates warn more advanced versions will further erode online privacy.
aymenfurter/microagents: Agents Capable of Self-Editing Their Prompts / Python Code
Agents Capable of Self-Editing Their Prompts / Python Code - aymenfurter/microagents: Agents Capable of Self-Editing Their Prompts / Python Code
The Illustrated GPT-2 (Visualizing Transformer Language Models)
Discussions:
Hacker News (64 points, 3 comments), Reddit r/MachineLearning (219 points, 18 comments)
Translations: Simplified Chinese, French, Korean, Russian, Turkish
This year, we saw a dazzling application of machine learning. The OpenAI GPT-2 exhibited impressive ability of writing coherent and passionate essays that exceed what we anticipated current language models are able to produce. The GPT-2 wasn’t a particularly novel architecture – it’s architecture is very similar to the decoder-only transformer. The GPT2 was, however, a very large, transformer-based language model trained on a massive dataset. In this post, we’ll look at the architecture that enabled the model to produce its results. We will go into the depths of its self-attention layer. And then we’ll look at applications for the decoder-only transformer beyond language modeling.
My goal here is to also supplement my earlier post, The Illustrated Transformer, with more visuals explaining the inner-workings of transformers, and how they’ve evolved since the original paper. My hope is that this visual language will hopefully make it easier to explain later Transformer-based models as their inner-workings continue to evolve.
The Revenge of the Cataloguers
Over the past 15 years or so, libraries around the world have de-emphasized cataloguing. While budgetary concerns and technological efficien...
Advancements in machine learning for machine learning
Beam
Press a hotkey to chat anywhere on your Mac. No subscription fees. No logins. Your own API key.
Dust - Secure AI assistant with your company's knowledge
Dust is an AI assistant that safely brings the best large language models, continuously updated company knowledge, powerful collaboration applications, and an extensible platform to your team's fingertips.
MolSetRep – Macs in Chemistry
🌲 A Personal Take on Using LLMs
Navigating the ethics -- and effectiveness -- of new AI tools as a writer.
Introducing Stable Zero123: Quality 3D Object Generation from Single Images — Stability AI
Stable Zero123 is an AI-powered model for generating novel views of 3D objects with improved quality. Released for non-commercial and research purposes, it uses an improved dataset and elevation conditioning for higher-quality predictions. Using the improved open-source code, researchers can use thi
threestudio-project/threestudio: A unified framework for 3D content generation.
A unified framework for 3D content generation. Contribute to threestudio-project/threestudio development by creating an account on GitHub.
FunSearch: Making new discoveries in mathematical sciences using Large Language Models
We introduce FunSearch, a method for searching for “functions” written in computer code, and find new solutions in mathematics and computer science. FunSearch works by pairing a pre-trained LLM, whose goal is to provide creative solutions in the form of computer code, with an automated “evaluator”, which guards against hallucinations and incorrect ideas.
Data exfiltration from Writer.com with indirect prompt injection
Authors: PromptArmor and Kai Greshake
Data exfiltration from Writer.com with indirect prompt injection
This is a nasty one. Writer.com call themselves a "secure enterprise generative AI platform", offering collaborative generative AI writing assistance and question answering that can integrate with your company's private …
A Remake of the Google Gemini Fake Demo, Except Using GPT-4 and It's Real
update: https://www.youtube.com/watch?v=1RrkRA7wuoEcode: https://github.com/gregsadetsky/sagittarius
Mixtral 8X7B — Deploying an *Open* AI Agent
Mistral AI's new model — Mixtral 8x7B — is pretty impressive. We'll see how to get set up and deploy Mixtral 8X7B, the prompt format it requires, and how it performs when being used as an Agent — we even add in some Mixtral RAG at the end.
As a bit of a spoiler, Mixtral is probably the first open-source LLM that is truly very very good — I say this considering the following key points:
- Benchmarks show it to perform better than GPT-3.5.
- My own testing shows Mixtral to be the first open weights model we can reliably use as an agent.
- Due to MoE architecture it is very fast given its size. If you can afford to run on 2x A100s and latency is good enough to be used in chatbot use cases.
📕 Mixtral 8X7B Page (I'll be publishing soon):
https://www.pinecone.io/learn/
📌 Code Notebook:
https://github.com/pinecone-io/examples/blob/master/learn/generation/llm-field-guide/mistral-ai/mixtral-8x7b/00-mixtral-8x7b-agent.ipynb
🌲 Subscribe for Latest Articles and Videos:
https://www.pinecone.io/newsletter-signup/
👋🏼 AI Dev:
https://aurelio.ai
👾 Discord:
https://discord.gg/c5QtDB9RAP
Twitter: https://twitter.com/jamescalam
LinkedIn: https://www.linkedin.com/in/jamescalam/
00:00 Mixtral 8X7B is better than GPT 3.5
00:50 Deploying Mixtral 8x7B
03:21 Mixtral Code Setup
08:17 Using Mixtral Instructions
10:04 Mixtral Special Tokens
13:29 Parsing Multiple Agent Tools
14:28 RAG with Mixtral
17:01 Final Thoughts on Mixtral
#artificialintelligence #nlp #ai #chatbot #opensource
Google DeepMind used a large language model to solve an unsolvable math problem
They had to throw away most of what it produced but there was gold among the garbage.