Found 581 bookmarks
Newest
TrOCR: Transformer-based Optical Character Recognition with Pre-trained Models - Microsoft Research
TrOCR: Transformer-based Optical Character Recognition with Pre-trained Models - Microsoft Research
Text recognition is a long-standing research problem for document digitalization. Existing approaches for text recognition are usually built based on CNN for image understanding and RNN for char-level text generation. In addition, another language model is usually needed to improve the overall accuracy as a post-processing step. In this paper, we propose an end-to-end text […]
·microsoft.com·
TrOCR: Transformer-based Optical Character Recognition with Pre-trained Models - Microsoft Research
How I'd Learn AI (If I Had to Start Over)
How I'd Learn AI (If I Had to Start Over)
AI Learning Roadmap (Notion) 👉 https://tinyurl.com/2m5bcyyvAI Learning Roadmap (PDF download) 👉 https://tinyurl.com/3hdjnbaa🔑 TIMESTAMPS==================...
·youtube.com·
How I'd Learn AI (If I Had to Start Over)
liuhaotian/LLaVA-13b-delta-v0 · Hugging Face
liuhaotian/LLaVA-13b-delta-v0 · Hugging Face
We’re on a journey to advance and democratize artificial intelligence through open source and open science.
·huggingface.co·
liuhaotian/LLaVA-13b-delta-v0 · Hugging Face
haotian-liu/LLaVA: [NeurIPS'23 Oral] Visual Instruction Tuning: LLaVA (Large Language-and-Vision Assistant) built towards GPT-4V level capabilities.
haotian-liu/LLaVA: [NeurIPS'23 Oral] Visual Instruction Tuning: LLaVA (Large Language-and-Vision Assistant) built towards GPT-4V level capabilities.
[NeurIPS'23 Oral] Visual Instruction Tuning: LLaVA (Large Language-and-Vision Assistant) built towards GPT-4V level capabilities. - GitHub - haotian-liu/LLaVA: [NeurIPS'23 Oral] Vis...
·github.com·
haotian-liu/LLaVA: [NeurIPS'23 Oral] Visual Instruction Tuning: LLaVA (Large Language-and-Vision Assistant) built towards GPT-4V level capabilities.
FlowiseAI - Build LLMs Apps Easily
FlowiseAI - Build LLMs Apps Easily
Open source UI visual tool to build your customized LLM flow using Langchain
·flowiseai.com·
FlowiseAI - Build LLMs Apps Easily
Jerry Liu on Twitter / X
Jerry Liu on Twitter / X
A big issue with trying to improve your RAG pipeline is that advanced techniques can require a ton of setup time.We spent this past weekend packaging 7+ advanced techniques in @llama_index, so that you can use all of them through a standardized interface - simply load in your… https://t.co/Pave05z3Ld pic.twitter.com/FnanyoQTfC— Jerry Liu (@jerryjliu0) November 28, 2023
·x.com·
Jerry Liu on Twitter / X
The Weekend AI Engineer: Hassan El Mghari
The Weekend AI Engineer: Hassan El Mghari
How *YOU* can - and should - build great multimodal AI apps that go viral and scale to millions in a weekend. Featuring the Vercel AI SDK and the new v0.dev AI frontend tool. Recorded live in San Francisco at the AI Engineer Summit 2023. See the full schedule of talks at https://ai.engineer/summit/schedule & join us at the AI Engineer World's Fair in 2024! Get your tickets today at https://ai.engineer/worlds-fair About Hassan Creator of RoomGPT
·youtube.com·
The Weekend AI Engineer: Hassan El Mghari
Secure Your RAG App Against Prompt Injection Attacks
Secure Your RAG App Against Prompt Injection Attacks
Don't skip securing your RAG app like you skip leg day at the gym! Here's what Prompt Injection is, how it works, and what you can do to secure your LLM-powered application.
·gettingstarted.ai·
Secure Your RAG App Against Prompt Injection Attacks
Introduction to Augmenting LLMs with Private Data using LlamaIndex
Introduction to Augmenting LLMs with Private Data using LlamaIndex
In this post, we're going to take a top-level overview of how LlamaIndex bridges AI and private custom data from multiple sources (APIs, PDF, and more), enabling powerful applications.
·gettingstarted.ai·
Introduction to Augmenting LLMs with Private Data using LlamaIndex
Advanced ChatGPT Guide.
Advanced ChatGPT Guide.
Thanks for signing up to The Rundown built by @therundownai and @rowancheung! Enjoy our free Advanced ChatGPT Guide as a warm welcome into the world of AI!
·vaulted-polonium-23c.notion.site·
Advanced ChatGPT Guide.
Andrej Baranovskij on X: "Running Starling-7B LLM model on local CPU with @Ollama_ai and getting great results for invoice data extraction, even better than Zephyr, Mistral or Llama2. Prompt: retrieve gross worth value for each invoice item from the table. format response as following {\"gross_worth\":… https://t.co/QPPPrV27JU" / X
Andrej Baranovskij on X: "Running Starling-7B LLM model on local CPU with @Ollama_ai and getting great results for invoice data extraction, even better than Zephyr, Mistral or Llama2. Prompt: retrieve gross worth value for each invoice item from the table. format response as following {\"gross_worth\":… https://t.co/QPPPrV27JU" / X
Running Starling-7B LLM model on local CPU with @Ollama_ai and getting great results for invoice data extraction, even better than Zephyr, Mistral or Llama2.Prompt: retrieve gross worth value for each invoice item from the table. format response as following {\"gross_worth\":… pic.twitter.com/QPPPrV27JU— Andrej Baranovskij (@andrejusb) November 30, 2023
Zephyr
·twitter.com·
Andrej Baranovskij on X: "Running Starling-7B LLM model on local CPU with @Ollama_ai and getting great results for invoice data extraction, even better than Zephyr, Mistral or Llama2. Prompt: retrieve gross worth value for each invoice item from the table. format response as following {\"gross_worth\":… https://t.co/QPPPrV27JU" / X
RAG with Llama-Index: Vector Stores
RAG with Llama-Index: Vector Stores
In this third video of our series on Llama-index, we will explore how to use different vector stores in llama-index while building RAG applications. We will ...
·youtube.com·
RAG with Llama-Index: Vector Stores
Prompt Engineering Roadmap - roadmap.sh
Prompt Engineering Roadmap - roadmap.sh
Step by step guide to learn Prompt Engineering. We also have resources and short descriptions attached to the roadmap items so you can get everything you want to learn in one place.
·roadmap.sh·
Prompt Engineering Roadmap - roadmap.sh
Public: Learning Map to become a Data & AI Scientist at Careem
Public: Learning Map to become a Data & AI Scientist at Careem
Sheet1 Step,Competence,Substep,How to grow,Expected Time 1,Math,Linear Algebra, Calculus, Mathematical Analysis,a href="https://www.coursera.org/specializations/mathematics-machine-learning#courses"https://www.coursera.org/specializations/mathematics-machine-learning#courses/a,1 month Differ...
·docs.google.com·
Public: Learning Map to become a Data & AI Scientist at Careem
The Illustrated Transformer
The Illustrated Transformer
Discussions: Hacker News (65 points, 4 comments), Reddit r/MachineLearning (29 points, 3 comments) Translations: Arabic, Chinese (Simplified) 1, Chinese (Simplified) 2, French 1, French 2, Italian, Japanese, Korean, Persian, Russian, Spanish 1, Spanish 2, Vietnamese Watch: MIT’s Deep Learning State of the Art lecture referencing this post In the previous post, we looked at Attention – a ubiquitous method in modern deep learning models. Attention is a concept that helped improve the performance of neural machine translation applications. In this post, we will look at The Transformer – a model that uses attention to boost the speed with which these models can be trained. The Transformer outperforms the Google Neural Machine Translation model in specific tasks. The biggest benefit, however, comes from how The Transformer lends itself to parallelization. It is in fact Google Cloud’s recommendation to use The Transformer as a reference model to use their Cloud TPU offering. So let’s try to break the model apart and look at how it functions. The Transformer was proposed in the paper Attention is All You Need. A TensorFlow implementation of it is available as a part of the Tensor2Tensor package. Harvard’s NLP group created a guide annotating the paper with PyTorch implementation. In this post, we will attempt to oversimplify things a bit and introduce the concepts one by one to hopefully make it easier to understand to people without in-depth knowledge of the subject matter. 2020 Update: I’ve created a “Narrated Transformer” video which is a gentler approach to the topic: A High-Level Look Let’s begin by looking at the model as a single black box. In a machine translation application, it would take a sentence in one language, and output its translation in another.
·jalammar.github.io·
The Illustrated Transformer