Inspired by progress in large-scale language modeling, we apply a similar approach towards building a single generalist agent beyond the realm of text outputs. The agent, which we refer to as Gato, works as a multi-modal, multi-task, multi-embodiment generalist policy. The same network with the same weights can play Atari, caption images, chat, stack blocks with a real robot arm and much more, deciding based on its context whether to output text, joint torques, button presses, or other tokens. In this report we describe the model and the data, and document the current capabilities of Gato.
Megatron-Turing Natural Language Generation Megatron-Turing Natural Language Generation model (MT-NLG), is the largest and the most powerful monolithic transformer English language model with 530 billion parameters. This 105-layer, transformer-based MT-NLG improves upon the prior state-of-the-art models in zero-, one-, and few-shot settings. It demonstrates unmatched accuracy in a broad set of natural language tasks such as, Completion prediction, Reading comprehension, Commonsense reasoning, Natural language inferences, Word sense disambiguation, etc.
We’ve created an improved version of OpenAI Codex, our AI system that translates natural language to code, and we are releasing it through our API in private beta starting today. Codex is the model that powers GitHub Copilot, which we built and launched in partnership with GitHub a month
Announcing AI21 Studio and Jurassic-1 language models
AI21 Labs’ new developer platform offers instant access to our 178B-parameter language model, to help you build sophisticated text-based AI applications at scale
A GPT-3 rival by Deepmind Researchers at DeepMind have proposed a new predicted compute-optimal model called Chinchilla that uses the same compute budge...
Language modelling at scale: Gopher, ethical considerations, and retrieval
Language, and its role in demonstrating and facilitating comprehension - or intelligence - is a fundamental part of being human. It gives people the ability to communicate thoughts and concepts, express ideas, create memories, and build mutual understanding. These are foundational parts of social intelligence. It’s why our teams at DeepMind study aspects of language processing and communication, both in artificial agents and in humans.
We’ve trained a model called ChatGPT which interacts in a conversational way. The dialogue format makes it possible for ChatGPT to answer followup questions, admit its mistakes, challenge incorrect premises, and reject inappropriate requests. ChatGPT is a sibling model to InstructGPT, which is trained to follow an instruction in