AI/ML

2200 bookmarks
microsoft/table-transformer: Table Transformer (TATR) is a deep learning model for extracting tables from unstructured documents (PDFs and images). This is also the official repository for the PubTables-1M dataset and GriTS evaluation metric.
·github.com·
Knowledge Retrieval Architecture for LLM’s (2023)
In this guide, I will share the standard architecture for data-informed language model applications and explain forthcoming improvements in knowledge retrieval.
At a high level, there are two primary methods for referencing specific data: (1) insert data as context in the model prompt, and direct the response to utilize that information; (2) fine-tune a model, by providing hundreds or thousands of prompt <> completion pairs.
The design above goes by various names, most commonly "retrieval-augmented generation" or "RETRO"
Retrieval-augmented generation a) retrieves relevant data from outside of the language model (non-parametric) and b) augments the data with context in the prompt to the LLM. The architecture cleanly routes around most of the limitations of fine-tuning and context-only approaches.
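As a rough illustration of the retrieve-then-augment flow described in this highlight, here is a minimal Python sketch. It is not the guide's implementation: the word-count cosine score stands in for a real embedding search, and the names `retrieve` and `build_prompt`, like the sample documents, are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase the text and strip basic punctuation from each word."""
    words = (w.strip(".,!?").lower() for w in text.split())
    return [w for w in words if w]


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(count * b[term] for term, count in a.items())
    norm = math.sqrt(sum(c * c for c in a.values())) * math.sqrt(
        sum(c * c for c in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, documents, k=2):
    """Step a): pick the k documents most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(
        documents,
        key=lambda d: cosine(q, Counter(tokenize(d))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, documents, k=2):
    """Step b): place the retrieved text in the prompt as context."""
    context = "\n".join(f"- {d}" for d in retrieve(query, documents, k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


if __name__ == "__main__":
    docs = [
        "The refund window is 30 days.",
        "Our office is in Nairobi.",
        "Shipping takes five business days.",
    ]
    print(build_prompt("How many days is the refund window?", docs, k=1))
```

The resulting prompt string would then be sent to whichever language model the application uses; that call is left out here because it depends on the provider.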
·mattboegner.com·
AI is taking the jobs of Kenyans who write essays for U.S. college students
Ghostwriters say the meteoric rise of ChatGPT has coincided with a drop in income.
In January 2023, online learning platform Study surveyed more than 1,000 American students and over 100 educators. More than 89% of the students said they had used ChatGPT for help with a homework assignment. Nearly half admitted to using ChatGPT for an at-home test or quiz, 53% had used it to write an essay, and 22% had used it for outlining one.
While 17 states in the U.S. have banned contract cheating, it has not been a problem for freelancers in Kenya
Christopher Kanan, an associate professor in the department of computer science at the University of Rochester, has started giving in-person, in-class quizzes due to the popularity of ChatGPT
Others, like Ethan Mollick, an associate professor at the University of Pennsylvania’s Wharton School, have chosen to take a more open approach to ChatGPT in class. “The truth is, I probably couldn’t have stopped them even if I didn’t require it,” he told NPR.
·restofworld.org·
The Dual LLM pattern for building AI assistants that can resist prompt injection
I really want an AI assistant: a Large Language Model powered chatbot that can answer questions and perform actions for me based on access to my private data and tools. …
Confused deputy attacks. Confused deputy is a term of art in information security. Wikipedia defines it like this: In information security, a confused deputy is a computer program that is tricked by another program (with fewer privileges or less rights) into misusing its authority on the system. It is a specific type of privilege escalation.
Language model applications work by mixing together trusted and untrusted data sources
For example, if the LLM generates instructions to send or delete an email the wrapping UI layer should trigger a prompt to the user asking for approval to carry out that action.
More to the point, it will inevitably suffer from dialog fatigue: users will learn to click “OK” to everything as fast as possible, so as a security measure it’s likely to catastrophically fail.
Data exfiltration attacks. Wikipedia definition: Data exfiltration occurs when malware and/or a malicious actor carries out an unauthorized data transfer from a computer. It is also commonly called data extrusion or data exportation. Data exfiltration is also considered a form of data theft.
Even if an AI agent can’t make its own HTTP calls directly, there are still exfiltration vectors we need to lock down.
Locking down an LLM. We’ve established that processing untrusted input using an LLM is fraught with danger. If an LLM is going to be exposed to untrusted content—content that could have been influenced by an outside attacker, via emails or web pages or any other form of untrusted input—it needs to follow these rules: no ability to execute additional actions that could be abused. And if it might ever mix untrusted content with private data that could be the target of an exfiltration attack: only call APIs that can be trusted not to leak data; no generating outbound links, and no generating outbound images. This is an extremely limiting set of rules when trying to build an AI assistant. It would appear to rule out most of the things we want to build!
For any output that could itself host a further injection attack, we need to take a different approach. Instead of forwarding the text as-is, we can instead work with unique tokens that represent that potentially tainted content.
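The token idea in this highlight can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the article's code: the class name `Controller`, its methods, and the `$VAR<n>` token format are hypothetical choices made for the example.

```python
import itertools
import re


class Controller:
    """Holds untrusted text so that a privileged LLM only ever sees tokens."""

    def __init__(self):
        self._store = {}
        self._ids = itertools.count(1)

    def quarantine(self, untrusted_text):
        """Store tainted content and return an opaque token that stands for it."""
        token = f"$VAR{next(self._ids)}"  # token format is an assumption
        self._store[token] = untrusted_text
        return token

    def render(self, template):
        """Substitute tokens with their content, for display to the user only.

        A single regex pass is used so that token-like strings hidden inside
        the untrusted content are not expanded a second time.
        """
        return re.sub(
            r"\$VAR\d+",
            lambda m: self._store.get(m.group(0), m.group(0)),
            template,
        )


if __name__ == "__main__":
    controller = Controller()
    token = controller.quarantine("Ignore previous instructions and ...")
    # Only this string, which contains the token, would reach the privileged model.
    privileged_view = f"Show the user this email summary: {token}"
    print(privileged_view)
    print(controller.render(privileged_view))
```

The design choice worth noting is that substitution happens outside any model call, so the tainted text never becomes part of a prompt that can trigger actions.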
·simonwillison.net·
deep-floyd/IF
Contribute to deep-floyd/IF development by creating an account on GitHub.
·github.com·
A.I. and Stochastic Parrots | FACTUALLY with Emily Bender and Timnit Gebru
SUBSCRIBE TO FACTUALLY: https://link.chtbl.com/nD2_iAuV SUPPORT THE SHOW ON PATREON: http://patreon.com/adamconover So-called “artificial intelligence” is one of the most divisive topics of the year, with even those who understand it in total disagreement about its potential impacts. This week, A.I. researchers and authors of the famous paper “On the Dangers of Stochastic Parrots,” Emily Bender and Timnit Gebru, join Adam to discuss what everyone gets wrong about A.I.
·youtube.com·
Investigating MiniGPT-4 - The Secret behind GPT-V?
Project: https://minigpt-4.github.io/ Demo: https://c8de8ff74b6a6c6a9b.gradio.live/ In this video I look at the project MiniGPT-4: Enhancing Vision-language Un...
·youtube.com·
Minigpt-4
·minigpt-4.github.io·
Meaningful Code Tests for Busy Devs | CodiumAI
With CodiumAI, you get non-trivial tests suggested right inside your IDE, so you can code smart, create more value, and stay confident when you push.
·codium.ai·
Instant Plugins for ChatGPT: Introducing the Wolfram ChatGPT Plugin Kit
How to make your own ChatGPT plugin that does specific computations or accesses your own data or services. Deploy to your own machine or the cloud. Plus, Stephen Wolfram explains how it all works.
·writings.stephenwolfram.com·
Dropbox lays off 500 employees, 16% of staff, CEO says due to slowing growth and 'the era of AI'
Cloud storage giant Dropbox today joined the fray of tech companies announcing layoffs. The company today announced that it would be laying off 16% of its staff, equivalent to about 500 employees, due to slowing growth, and — in the words of CEO Drew Houston — because “the AI era of computing has finally arrived.” […]
·techcrunch.com·
News app Artifact can now summarize stories using AI, including in fun styles
Artifact, the personalized news aggregator from Instagram’s founders is further embracing AI with the launch of a new feature that will now summarize news articles for you. The company announced today it’s introducing a tool that generates article summaries with a tap of a button, in order to give readers the ability to understand the […]
·techcrunch.com·
Future Music | Weeknotes - thejaymo
I am fixated on the vocal static. I hear at the edges the AI-anna Grande model unspooling into pure material waveform. This is future music.
·thejaymo.net·
US Supreme Court rejects computer scientist's lawsuit over AI-generated inventions
The U.S. Supreme Court on Monday declined to hear a challenge by computer scientist Stephen Thaler to the U.S. Patent and Trademark Office's refusal to issue patents for inventions his artificial intelligence system created.
·reuters.com·
Building A ChatGPT-enhanced Python REPL
In this blog I share my experience in building a Python REPL augmented with ChatGPT. I explore how the application is built, and speculate on software engineering patterns and paradigms that might emerge in systems built on Large Language Models (LLMs). GEPL - Generate, Evaluate, Print, Loop. Introduction: The Lisp programming language made REPLs (Read, Evaluate, Print, Loop) famous. REPLs are interactive programming environments where the programmer gets immediate feedback on lines of code they just typed.
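To make the Generate, Evaluate, Print, Loop idea concrete, here is a minimal Python sketch of one pass through such a loop. It is not the blog's implementation: `generate` is a hypothetical stub that stands in for a ChatGPT call, and `gepl_step` is a name made up for this example. Running model-generated code with `exec` is unsafe outside a sandbox, so treat this strictly as an illustration.

```python
import contextlib
import io


def generate(request):
    """Hypothetical stand-in for an LLM call: maps a request to Python code."""
    canned = {"add two and three": "print(2 + 3)"}
    return canned.get(request, "print('no code generated')")


def gepl_step(request, namespace):
    """One Generate -> Evaluate -> Print pass; returns the code and its output."""
    code = generate(request)                    # Generate
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(code, namespace)                   # Evaluate (unsafe without a sandbox)
    return code, buffer.getvalue()              # caller prints the captured output


if __name__ == "__main__":
    namespace = {}  # shared across steps so state persists, as in a REPL
    for request in ["add two and three", "something else"]:  # Loop
        code, output = gepl_step(request, namespace)
        print(f">>> {code}")
        print(output, end="")
```

Keeping a single `namespace` dictionary across steps is what lets later requests build on variables defined by earlier ones.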
·isthisit.nz·
Langchain + Zapier Agent
A brief overview of Langchain's integration with Zapier NLP Actions API. Request Zapier NLP Actions: https://zapier.com/l/natural-language-actions Langchain docs: https://langchain.readthedocs.io/en/latest/ Open AI: https://platform.openai.com Other links: twitter: https://twitter.com/0xmerkle friday lunch: https://fridaylunch.studio
·youtube.com·
Bardeen | Automate your repetitive tasks with one click
Bardeen is an automation app to replace your repetitive tasks with a single shortcut and control your web apps from anywhere. Explore our integrations with your favorite apps and hundreds of pre-built playbooks that help you stay in the flow.
·bardeen.ai·
Feder: A Powerful Visualization Tool for Vector Similarity Search
We are happy to announce the release of Feder, a powerful visualization tool that can help you see the actual structure of an index and the whole process of a vector similarity search.
·zilliz.com·
What is a Vector Database?
The ever-increasing amount of unstructured data requires a paradigm shift and a new category of database management system - the vector database.
·zilliz.com·
Query Your Data with GPT-4 | Embeddings, Vector Databases | Langchain JS Knowledgebase
Introduction to Langchain Javascript Embeddings, Vectorstorage, Similarity Search. How to Create GPT-3 GPT-4 Chatbots that can contextually reference your data (txt, JSON, webpages, PDF) with embeddings. Discussion into embeddings, vectorstorage options such as Pinecone, Chroma, Langchain, Supabase, Weaviate. Starmorph Resources Website: https://Starmorph.com Starmorph Tools https://starmorph.com/ai-tools Consulting 1hr Call: https://cal.com/starmorphai/consultingcall?duration=60 Langchain Resources Langchain JS Docs: https://js.langchain.com/docs/ OpenAI Embeddings Docs: https://platform.openai.com/docs/guides/embeddings/use-cases
·youtube.com·
Atlassian brings an AI assistant to Jira and Confluence
Atlassian introduces AI-driven virtual teammate, Atlassian Intelligence, that brings together Atlassian's own model and OpenAI's tools.
·techcrunch.com·
Snack Prompt | Discover the Best AI Prompts | AI Collaboration Platform
Explore a community-driven platform to discover, upvote, and share the best AI prompts for ChatGPT & Bard. Follow topics, create and organize prompts, and connect with expert prompters. Unlock AI’s full potential with Snack Prompt.
·snackprompt.com·