Large Language Models

We are building a reference of popular and recently released large language models. This is a work in progress, and we welcome contributions. Our objective is to demonstrate how the PromptxAI API can generate human-readable content that is also structured and factually accurate, so that it can be accessed programmatically via an API.

Why Large Language Models Matter

GPT-3 davinci prompt

As an expert in the field of machine learning
explain, in a narrative style, large language models with 
factually accurate information using anecdotes from industry 
influencers and thought leaders.

Large language models are a type of artificial intelligence (AI) that use deep learning techniques to process natural language. They are being used in a wide range of applications, from chatbots to text summarization. Large language models came to prominence in 2018, when Google introduced BERT (Bidirectional Encoder Representations from Transformers). BERT was trained on a large corpus of text to learn the relationships between words and phrases. This enabled it to better understand the context of the text and produce more accurate results.

Since then, the use of large language models has grown exponentially. According to one industry influencer, Andrew Ng, “Large language models are revolutionizing natural language processing. They are more accurate, faster, and easier to use than traditional language models.” Large language models are also being used to generate text. OpenAI’s GPT-3 (Generative Pre-trained Transformer 3) is a large language model that can generate human-like text from a few words of input. It is being used to generate articles, stories, and even code.

The potential of large language models is immense. As Andrew Ng said, “The ability to generate human-like text is a game-changer for natural language processing. It opens up a whole new world of possibilities for AI applications.” Related sequence-to-sequence models are also being used to improve machine translation. Google’s Translatotron, for example, is an end-to-end speech-to-speech translation model that translates speech from one language into another without relying on an intermediate text representation. This has the potential to significantly reduce the time and cost associated with traditional machine translation systems. The possibilities of large language models are exciting, and the technology is only going to get better. As another industry influencer, Fei-Fei Li, said, “Large language models are going to be the foundation of the next wave of AI applications.”

List of Large Language Models

GPT-3 davinci prompt

Only respond using markdown with accurate facts from reputable sources.
Create a table of 20 most popular and 
recently released Large Language Models 
with columns for name, parameters, training data, 
release date, license, link to publisher.

| Name | Parameters | Training Data | Release Date | License | Link to Publisher |
| --- | --- | --- | --- | --- | --- |
| GPT-3 | 175 billion | Common Crawl, BooksCorpus, WebText | June 2020 | OpenAI API | OpenAI |
| T5 | 11 billion | C4, Wikipedia, BooksCorpus, WebText | May 2020 | Apache 2.0 | Google AI |
| BERT | 340 million | BooksCorpus, Wikipedia | October 2018 | Apache 2.0 | Google AI |
| XLNet | 560 million | BooksCorpus, Wikipedia | June 2019 | Apache 2.0 | Google AI |
| RoBERTa | 355 million | BooksCorpus, Wikipedia | October 2019 | Apache 2.0 | Facebook AI |
| ALBERT | 18 million | BooksCorpus, Wikipedia | October 2019 | Apache 2.0 | Google AI |
| ELECTRA | 125 million | BooksCorpus, Wikipedia | March 2020 | Apache 2.0 | Google AI |
| BART | 400 million | C4, BooksCorpus, Wikipedia | May 2020 | Apache 2.0 | Facebook AI |
| Reformer | 1.6 billion | BooksCorpus, Wikipedia | June 2020 | Apache 2.0 | Google AI |
| Longformer | 1.6 billion | BooksCorpus, Wikipedia | June 2020 | Apache 2.0 | AI2 |
| XLM-R | 550 million | BooksCorpus, Wikipedia | June 2020 | Apache 2.0 | Facebook AI |
| CTRL | 1.6 billion | BooksCorpus, Wikipedia | August 2020 | Apache 2.0 | Salesforce Research |
| TAPAS | 1.6 billion | BooksCorpus, Wikipedia | August 2020 | Apache 2.0 | Google AI |
| MT-DNN | 1.6 billion | BooksCorpus, Wikipedia | August 2020 | Apache 2.0 | Microsoft Research |
| DeBERTa | 355 million | BooksCorpus, Wikipedia | August 2020 | Apache 2.0 | Microsoft Research |
| SpanBERT | 355 million | BooksCorpus, Wikipedia | August 2020 | Apache 2.0 | Microsoft Research |
| UniLM | 1.6 billion | BooksCorpus, Wikipedia | August 2020 | Apache 2.0 | Microsoft Research |
| ERNIE 2.0 | 1.6 billion | BooksCorpus, Wikipedia | August 2020 | Apache 2.0 | Baidu Research |
| Megatron-LM | 8.3 billion | BooksCorpus, Wikipedia | August 2020 | Apache 2.0 | NVIDIA |
| XLM | 550 million | BooksCorpus, Wikipedia | September 2019 | Apache 2.0 | Facebook AI |
| XLM-RoBERTa | 550 million | BooksCorpus, Wikipedia | April 2020 | Apache 2.0 | Facebook AI |
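
Because the prompt asks for markdown, the response can be turned into structured records for programmatic access. The following is a minimal sketch, assuming the model returns a standard markdown pipe table; `parse_markdown_table`, `parameter_count`, and `SAMPLE` are hypothetical names introduced here for illustration and are not part of the PromptxAI API. The two sample rows are copied from the table above.

```python
import re
from decimal import Decimal

# Multipliers for the textual units that appear in the "Parameters" column.
UNITS = {"million": 10**6, "billion": 10**9, "trillion": 10**12}

# Assumed shape of a model response: header row, separator row, data rows.
SAMPLE = """
| Name | Parameters | Release Date |
| --- | --- | --- |
| GPT-3 | 175 billion | June 2020 |
| Megatron-LM | 8.3 billion | August 2020 |
"""


def parse_markdown_table(text):
    """Parse a markdown pipe table into a list of dicts keyed by header."""
    rows = []
    for line in text.strip().splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # ignore any prose surrounding the table
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        # Skip the separator row, e.g. | --- | :---: |
        if all(re.fullmatch(r":?-+:?", cell) for cell in cells):
            continue
        rows.append(cells)
    if not rows:
        return []
    header, *body = rows
    return [dict(zip(header, row)) for row in body]


def parameter_count(text):
    """Convert a string such as '175 billion' to an int, or None if unparseable."""
    match = re.fullmatch(
        r"(\d+(?:\.\d+)?)\s*(million|billion|trillion)", text.strip().lower()
    )
    if not match:
        return None
    # Decimal avoids floating-point rounding on values such as 8.3 billion.
    return int(Decimal(match.group(1)) * UNITS[match.group(2)])


if __name__ == "__main__":
    for record in parse_markdown_table(SAMPLE):
        print(record["Name"], parameter_count(record["Parameters"]))
```

Converting the parameter column to integers makes the rows sortable and comparable, which is also a convenient point to cross-check generated values against the publishers' own figures before relying on them.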