Great timeline. Comments on inaccuracies concerning OpenAI models, plus spelling issues.
A. The GPT-1 model is dated 2016, but this cannot be correct, because it uses the transformer architecture described in the Google paper of 2017. The OpenAI announcement of GPT-1 is dated June 11, 2018. Reference here: https://openai.com/research/language-unsupervised . The paper itself has no date attached; at the bottom of page 1 it merely says "Preprint. Work in progress." So your timeline should have the entry for GPT-1 moved down by 2 years. Note: There is an OpenAI research web page on Generative Models dated June 16, 2016 here: https://openai.com/research/generative-models, but it discusses these models in the context of image generation (GANs and friends), not for text generation.
B. Misspelling: Universal Setnence Encoder - should be "Sentence"
C. Bert should always be spelled BERT.
D. GPT-2 was announced on February 14, 2019 - link here: https://openai.com/research/better-language-models. Your timeline dates it to 2018. The final release of the 1.5B parameter model was actually in November 2019.
E. The announcement of GPT-3 is dated May 28, 2020. Link: https://openai.com/research/language-models-are-few-shot-learners. Your timeline says June 2020.
F. The OpenAI announcement of Codex is dated July 07, 2021. Link here: https://openai.com/research/evaluating-large-language-models-trained-on-code. Your timeline says August 2021.
G. The evolution of GPT-3 includes WebGPT and InstructGPT. You may want to include their announcements by OpenAI on your timeline. WebGPT (Dec. 16, 2021): https://openai.com/research/webgpt; InstructGPT (Jan. 27, 2022): https://openai.com/research/instruction-following
Thanks. Appreciate it.
Quite right on the GPT dates. Not quite sure how that happened unless one typo led to the others in sequence. For Codex, they announced it in July but I believe launched it in August. So, I think July probably makes sense as you suggest. It was already in use even before that in limited release. Also, definitely agree on adding WebGPT and InstructGPT, particularly the latter.
I think there will be a tremendous demand for industry-specific LLMs, possibly further tuned to a particular state or even city user base. And given that we've already seen the capacity and processing power of yesterday's mainframes shrink, over time, to a size and cost that lets us carry them around in our pocket, I wonder whether we'll ever see the day when people have access to their own personal LLM, tuned to what each individual wants and needs most.
Industry-specific, company-specific, community-specific, and personalized versions are all handled via the concept of embeddings. Gather all the community-specific text, push it through the embeddings endpoint of the API (OpenAI and co:here provide such endpoints), then store the generated internal representation (dense numeric vectors) in a vector DB like Pinecone. You can then query the vector DB in natural language. This is called semantic search, and it allows you to point a model like ChatGPT at specific knowledge bases, including your own personal ones, with no additional training or fine-tuning required. This approach also helps mitigate hallucinations/confabulations.
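Here is a minimal sketch of that workflow in Python, assuming OpenAI's embeddings endpoint and the Pinecone client. The index name, sample documents, and placeholder keys are illustrative, and exact client signatures vary by library version.

```python
# Minimal sketch of the embeddings + vector DB (semantic search) workflow.
# Assumes: the openai (pre-1.0 style) and pinecone-client packages, an existing
# Pinecone index named "company-kb" with dimension 1536, and placeholder keys.
import openai
import pinecone

openai.api_key = "YOUR_OPENAI_KEY"
pinecone.init(api_key="YOUR_PINECONE_KEY", environment="us-east-1-aws")

EMBED_MODEL = "text-embedding-ada-002"   # OpenAI embeddings endpoint model
index = pinecone.Index("company-kb")     # hypothetical, pre-created index

def embed(texts):
    """Push text through the embeddings endpoint; returns dense vectors."""
    resp = openai.Embedding.create(model=EMBED_MODEL, input=texts)
    return [item["embedding"] for item in resp["data"]]

# 1. Gather the community-specific text and store its vectors in the DB.
docs = [
    "Refunds for enterprise customers are processed within 5 business days.",
    "The on-call rotation for the payments team changes every Monday.",
]
index.upsert(vectors=[
    (f"doc-{i}", vec, {"text": doc})
    for i, (doc, vec) in enumerate(zip(docs, embed(docs)))
])

# 2. Query the vector DB in natural language (semantic search).
query_vec = embed(["How long do enterprise refunds take?"])[0]
results = index.query(vector=query_vec, top_k=3, include_metadata=True)
for match in results["matches"]:
    print(match["score"], match["metadata"]["text"])
```

The retrieved passages can then be pasted into a ChatGPT prompt as context, which is how the model gets pointed at a specific knowledge base without any additional training or fine-tuning.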