Secure Your RAG App Against Prompt Injection Attacks
Don't skip securing your RAG app like you skip leg day at the gym! Here's what Prompt Injection is, how it works, and what you can do to secure your LLM-powered application.
Introduction to Augmenting LLMs with Private Data using LlamaIndex
In this post, we're going to take a top-level overview of how LlamaIndex bridges AI and private custom data from multiple sources (APIs, PDF, and more), enabling powerful applications.
Use LangChain, Deepgram, and Mistral 7B to Build a Youtube Video Summarization App - Koyeb
This guide explains how to build a YouTube video summarization using Langchain, Deepgram, and Mistral 7B. Deploy your AI workload on Koyeb to enjoy high-performance microVMs, seamless scaling, and fast global deployments.
Thanks for signing up to The Rundown built by @therundownai and @rowancheung! Enjoy our free Advanced ChatGPT Guide as a warm welcome into the world of AI!
Andrej Baranovskij on X: "Running Starling-7B LLM model on local CPU with @Ollama_ai and getting great results for invoice data extraction, even better than Zephyr, Mistral or Llama2. Prompt: retrieve gross worth value for each invoice item from the table. format response as following {\"gross_worth\":… https://t.co/QPPPrV27JU" / X
Running Starling-7B LLM model on local CPU with @Ollama_ai and getting great results for invoice data extraction, even better than Zephyr, Mistral or Llama2.Prompt: retrieve gross worth value for each invoice item from the table. format response as following {\"gross_worth\":… pic.twitter.com/QPPPrV27JU— Andrej Baranovskij (@andrejusb) November 30, 2023
In this third video of our series on Llama-index, we will explore how to use different vector stores in llama-index while building RAG applications. We will ...
Prompt Engineering for Developers: How AI Can Help With Architecture Decisions
Learn effective prompt engineering techniques for Large Language Models (LLMs) based on generative AI, like ChatGPT, to facilitate software architecture decisions.
Step by step guide to learn Prompt Engineering. We also have resources and short descriptions attached to the roadmap items so you can get everything you want to learn in one place.
Discussions:
Hacker News (65 points, 4 comments), Reddit r/MachineLearning (29 points, 3 comments)
Translations: Arabic, Chinese (Simplified) 1, Chinese (Simplified) 2, French 1, French 2, Italian, Japanese, Korean, Persian, Russian, Spanish 1, Spanish 2, Vietnamese
Watch: MIT’s Deep Learning State of the Art lecture referencing this post
In the previous post, we looked at Attention – a ubiquitous method in modern deep learning models. Attention is a concept that helped improve the performance of neural machine translation applications. In this post, we will look at The Transformer – a model that uses attention to boost the speed with which these models can be trained. The Transformer outperforms the Google Neural Machine Translation model in specific tasks. The biggest benefit, however, comes from how The Transformer lends itself to parallelization. It is in fact Google Cloud’s recommendation to use The Transformer as a reference model to use their Cloud TPU offering. So let’s try to break the model apart and look at how it functions.
The Transformer was proposed in the paper Attention is All You Need. A TensorFlow implementation of it is available as a part of the Tensor2Tensor package. Harvard’s NLP group created a guide annotating the paper with PyTorch implementation. In this post, we will attempt to oversimplify things a bit and introduce the concepts one by one to hopefully make it easier to understand to people without in-depth knowledge of the subject matter.
2020 Update: I’ve created a “Narrated Transformer” video which is a gentler approach to the topic:
A High-Level Look
Let’s begin by looking at the model as a single black box. In a machine translation application, it would take a sentence in one language, and output its translation in another.
Vector databases enable fast similarity search and scale across data points. For LLM apps, vector indexes can simplify architecture over full vector databases by attaching vectors to existing storage. Choosing indexes vs databases depends on specialized needs, existing infrastructure, and broader enterprise requirements.
What are vector databasesWhy are vector databases important to AICore concepts of a vector databaseFactors to consider when choosing a vector databasePopular Vector Databases for your considerationStep by Step guide to implementing a Vector databaseStep 1 : Installing MilvusStep 2: Creating a Milvus ClientStep 3 : Create a collectionStep 4: Inserting data
Introducing Vector-Storage: A Lightweight Vector Database for the Browser | by Nitai Aharoni 🎾 - Freedium
In the world of natural language processing (NLP), vector embeddings have become a powerful tool...
Storage Limitations: The browser's local storage has a size limit of approximately 5MB, which may limit the number of document vectors that can be stored. Vector Storage addresses this by implementing an LRU mechanism to manage storage size.