(LLM) Models

44 bookmarks
Introducing Llama 3.1: Our most capable models to date
Bringing open intelligence to all, our latest models expand context length, add support across eight languages, and include Meta Llama 3.1 405B, the first frontier-level open source AI model.
·ai.meta.com·
LGM
·me.kiui.moe·
What are Large Language Models (LLMs)?
In this article, we will understand the concept of Large Language Models (LLMs) and their importance in natural language processing.
·analyticsvidhya.com·
XLNet
We’re on a journey to advance and democratize artificial intelligence through open source and open science.
·huggingface.co·
DistilBERT
We’re on a journey to advance and democratize artificial intelligence through open source and open science.
·huggingface.co·
XLM-RoBERTa
We’re on a journey to advance and democratize artificial intelligence through open source and open science.
·huggingface.co·
RoBERTa: An optimized method for pretraining self-supervised NLP systems
Facebook AI’s RoBERTa is a new training recipe that improves on BERT, Google’s self-supervised method for pretraining natural language processing systems. By training longer, on more data, and dropping BERT’s next-sentence prediction objective, RoBERTa topped the GLUE leaderboard.
·ai.meta.com·
AI21 Studio
A powerful language model, with an API that makes you smile
·ai21.com·
ERNIE 3.0: Large-scale Knowledge Enhanced Pre-training for...
Pre-trained models have achieved state-of-the-art results in various Natural Language Processing (NLP) tasks. Recent works such as T5 and GPT-3 have shown that scaling up pre-trained language...
·arxiv.org·
GPT-NeoX
We’re on a journey to advance and democratize artificial intelligence through open source and open science.
·huggingface.co·
GPT-J
We’re on a journey to advance and democratize artificial intelligence through open source and open science.
·huggingface.co·
GPT-Neo — EleutherAI
A set of 3 decoder-only LLMs with 125M, 1.3B, and 2.7B parameters trained on the Pile.
·eleuther.ai·
Product
Our API platform offers our latest models and guides for safety best practices.
·openai.com·
Meta Open-Sources 175 Billion Parameter AI Language Model OPT
Meta AI Research released Open Pre-trained Transformer (OPT-175B), a 175B parameter AI language model. The model was trained on a dataset containing 180B tokens and exhibits performance comparable with GPT-3, while requiring only 1/7th of GPT-3's training carbon footprint.
·infoq.com·
OPT
We’re on a journey to advance and democratize artificial intelligence through open source and open science.
·huggingface.co·
google/mt5-base · Hugging Face
We’re on a journey to advance and democratize artificial intelligence through open source and open science.
·huggingface.co·
google-research/multilingual-t5
Contribute to google-research/multilingual-t5 development by creating an account on GitHub.
·github.com·
Google GLaM | Discover AI use cases
The Generalist Language Model GLaM is a mixture of experts (MoE) model, a type of model that can be thought of as having different submodels (or expert...
·gpt3demo.com·
Switch Transformers by Google Brain | Discover AI use cases
Scaling to Trillion Parameter Models with Simple and Efficient Sparsity. In deep learning, models typically reuse the same parameters for all inputs. Mi...
·gpt3demo.com·
Chinchilla by DeepMind | Discover AI use cases
A GPT-3 rival by DeepMind. Researchers at DeepMind have proposed a new predicted compute-optimal model called Chinchilla that uses the same compute budge...
·gpt3demo.com·