PatternBoost: Constructions in Mathematics with a Little Help from AI #Mathematics #Transformers #AI #Paper #PDF · arxiv.org · Nov 6, 2024
Etched is Making the Biggest Bet in AI #Hardware #Transformers · etched.com · Jun 25, 2024
STT: Stateful Tracking with Transformers for Autonomous Driving #AVs #Transformers #Machine Learning #Computer Vision #Paper #PDF · arxiv.org · May 2, 2024
Retrieval Head Mechanistically Explains Long-Context Factuality #Transformers #Machine Learning #Paper #PDF · arxiv.org · Apr 25, 2024
Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention #Transformers #Memory #Paper #PDF · arxiv.org · Apr 13, 2024
Does Transformer Interpretability Transfer to RNNs? #RNN #Transformers #Large Language Models #Paper #PDF #EleutherAI · arxiv.org · Apr 10, 2024
DSI++: Updating Transformer Memory with New Documents #Search #Transformers #Paper #PDF #Google #DeepMind · arxiv.org · Jan 29, 2024
Transformers at Work 2023: The Future of Transformers, LLMs and AI Agents #Transformers #Forecasting · zeta-alpha.com · Oct 1, 2023
Faith and Fate: Limits of Transformers on Compositionality #Large Language Models #Machine Learning #Compositionality #Paper #PDF #Transformers · arxiv.org · Jun 5, 2023
CodeTF: One-stop Transformer Library for State-of-the-art Code LLM #Machine Learning #Transformers #Paper #PDF · arxiv.org · Jun 4, 2023
Bytes Are All You Need: Transformers Operating Directly On File Bytes #Machine Learning #Transformers #Paper #PDF · arxiv.org · Jun 4, 2023
The Impact of Positional Encoding on Length Generalization in Transformers #Machine Learning #Transformers #Paper #PDF · arxiv.org · Jun 4, 2023
Can Transformers Learn to Solve Problems Recursively? #Transformers #Problem-Solving #Recursion #Paper #PDF #EleutherAI · arxiv.org · May 27, 2023
RWKV: Reinventing RNNs for the Transformer Era #Transformers #RNN #Natural Language Processing #Paper #PDF · arxiv.org · May 23, 2023
Scaling Transformer to 1M Tokens and Beyond with RMT #Transformers #BERT #Paper #PDF · arxiv.org · Apr 27, 2023
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale #Transformers #Image Recognition #Google #Paper #PDF · arxiv.org · Apr 27, 2023
Transformer Math 101 #EleutherAI #Transformers #Mathematics · blog.eleuther.ai · Apr 24, 2023
Scaling Vision Transformers to 22 Billion Parameters #Transformers #Computer Vision #Paper #PDF #Google · arxiv.org · Apr 23, 2023
Exphormer: Sparse Transformers for Graphs #Machine Learning #Transformers #Paper #PDF · arxiv.org · Mar 20, 2023
RT-1: Robotics Transformer for Real-World Control at Scale #Robotics #Transformers #Google Research · ai.googleblog.com · Dec 14, 2022
Probabilistic Time Series Forecasting with 🤗 Transformers #Transformers #Time Series #Probability #Hugging Face · huggingface.co · Dec 1, 2022
Wang, W. et al. (2021). Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction Without Convolutions #Computer Vision #Transformers #CNN · openaccess.thecvf.com · Nov 27, 2022
Teaching AI Advanced Mathematical Reasoning #Mathematics #Transformers #Meta · ai.facebook.com · Nov 19, 2022
Get Ready for Transformational Transformer Networks | EE Times #Transformers #Apps · eetimes.com · Oct 7, 2022
NielsRogge/Transformers-Tutorials: This repository contains demos I made with the Transformers library by HuggingFace #Transformers #Tutorials #Hugging Face #PyTorch · github.com · Sep 30, 2022
Getting Started with IPUs on Paperspace #Paperspace #Graphcore #Notebooks #Transformers #Computer Vision #Multimodal · graphcore.ai · Sep 30, 2022
Introducing Whisper #OpenAI #Speech Recognition #Transformers · openai.com · Sep 21, 2022
Release v4.22.0: Swin Transformer v2, VideoMAE, Donut, Pegasus-X, X-CLIP, ERNIE · huggingface/transformers #Transformers #Hugging Face #Video Generation · github.com · Sep 16, 2022
How AI Transformers Mimic Parts of the Brain | Quanta Magazine #Brain Science #Neural Networks #Transformers · quantamagazine.org · Sep 13, 2022
Stanford Seminar CS25: Mixture of Experts (MoE) Paradigm and the Switch Transformer #Transformers #Mixture of Experts #Machine Learning #Large Language Models · youtube.com · Jul 18, 2022