jasonjmcghee/rem: An open source approach to locally record and enable searching everything you view on your Apple Silicon Mac.
Many options for running Mistral models in your terminal using LLM
Mistral AI is the most exciting AI research lab at the moment. They’ve now released two extremely powerful smaller Large Language Models under an Apache 2 license, and have a …
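As a rough illustration of the workflow the post describes, here is a minimal sketch using the llm tool's Python API. It assumes llm is installed (pip install llm) along with a plugin that serves a local Mistral model, such as llm-gpt4all or llm-llama-cpp; the model id below is illustrative and depends on which plugin you use.

```python
# Minimal sketch: querying a local Mistral model through the llm Python API.
# Assumes a plugin (e.g. llm-gpt4all) has registered a Mistral model;
# the model id here is illustrative, not a guaranteed identifier.
import llm

model = llm.get_model("mistral-7b-instruct-v0")  # id varies by plugin
response = model.prompt("Summarize the Apache 2 license in one sentence.")
print(response.text())
```

The same call is available from the terminal as `llm -m <model-id> "<prompt>"`, which is the usage the post focuses on.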
SJTU-IPADS/PowerInfer: High-speed Large Language Model Serving on PCs with Consumer-grade GPUs
Apple’s latest AI research could completely transform your iPhone
Apple researchers have introduced new techniques to create photorealistic 3D avatars from video and enable advanced AI systems to run efficiently on devices with limited memory, such as an iPhone or iPad.
Artificial intelligence can find your location in photos, worrying privacy experts
Three Stanford graduate students built an AI tool that can find a location by looking at pictures. Civil rights advocates warn more advanced versions will further erode online privacy.
The Illustrated GPT-2 (Visualizing Transformer Language Models)
This year, we saw a dazzling application of machine learning. OpenAI's GPT-2 exhibited an impressive ability to write coherent and passionate essays that exceeded what we anticipated current language models could produce. GPT-2 wasn't a particularly novel architecture – its architecture is very similar to the decoder-only transformer. GPT-2 was, however, a very large, transformer-based language model trained on a massive dataset. In this post, we'll look at the architecture that enabled the model to produce its results. We will go into the depths of its self-attention layer. And then we'll look at applications for the decoder-only transformer beyond language modeling.
My goal here is also to supplement my earlier post, The Illustrated Transformer, with more visuals explaining the inner workings of transformers and how they've evolved since the original paper. My hope is that this visual language will make it easier to explain later Transformer-based models as their inner workings continue to evolve.
Dust - Secure AI assistant with your company's knowledge
Dust is an AI assistant that safely brings the best large language models, continuously updated company knowledge, powerful collaboration applications, and an extensible platform to your team's fingertips.