GenAI: Large Language Models – How do they work?



Go behind the screen of Large Language Models like ChatGPT, Claude, Gemini, and Copilot to understand how this technology actually works!

This video walks you through creating your own Generative Pretrained Transformer (GPT) language model. First we explore the background science, then we look at tokens, the atomic units of these large language models. Finally, we move on to data preparation, training a model, and ultimately generating text with it.
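To give a feel for the token step described above, here is a minimal character-level tokenizer sketch in Python. This is an assumed setup for illustration (small GPT walkthroughs like nanoGPT often start character-level); the class notebook may use a different vocabulary or tokenizer.

```python
# Build a character-level vocabulary from a sample text (illustrative only).
text = "hello world"
chars = sorted(set(text))                      # unique characters = vocabulary
stoi = {ch: i for i, ch in enumerate(chars)}   # string -> integer id
itos = {i: ch for ch, i in stoi.items()}       # integer id -> string

def encode(s):
    """Turn a string into a list of token ids."""
    return [stoi[c] for c in s]

def decode(ids):
    """Turn a list of token ids back into a string."""
    return "".join(itos[i] for i in ids)

ids = encode("hello")
print(ids)            # integer token ids the model actually sees
print(decode(ids))    # round-trips back to "hello"
```

Production models use subword tokenizers (e.g. byte-pair encoding) rather than single characters, but the encode/decode round trip works the same way.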

Class Jupyter Notebook:
* https://github.com/jasonacox/ProtosAI/blob/master/notebooks/gpt.ipynb

Script to generate text:
* https://github.com/jasonacox/ProtosAI/blob/master/notebooks/gen.py
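Generation scripts like the one above follow an autoregressive loop: sample one token from the model's next-token distribution, append it to the context, and repeat. The sketch below illustrates that loop with a hypothetical stand-in model (`tiny_lm`, a hand-written probability table); a real GPT would return logits from a trained transformer instead.

```python
import random

# Hypothetical next-character probability table standing in for a trained
# model (for illustration only; a real model computes this with attention).
probs_after = {
    "h": {"e": 1.0},
    "e": {"l": 1.0},
    "l": {"l": 0.5, "o": 0.5},
    "o": {" ": 1.0},
    " ": {"h": 1.0},
}

def tiny_lm(context):
    """Return next-character probabilities given the last character."""
    return probs_after[context[-1]]

def generate(prompt, n_tokens, seed=0):
    """Sample n_tokens one at a time, feeding each back in as context."""
    random.seed(seed)
    out = prompt
    for _ in range(n_tokens):
        dist = tiny_lm(out)
        chars, weights = zip(*dist.items())
        out += random.choices(chars, weights=weights)[0]  # weighted sample
    return out

print(generate("h", 8))   # e.g. something like "hello hel"
```

The only model-specific part is `tiny_lm`; swap in a transformer's softmax output and the same loop generates text.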

Other References:
* Stanford University – CS231n: Deep Learning for Computer Vision – https://cs231n.github.io/neural-networks-1/
* Visual Transformer, Explained – https://poloclub.github.io/transformer-explainer/
* Coursera: Generative AI with LLMs – https://www.coursera.org/learn/generative-ai-with-llms
* Attention Is All You Need by Vaswani et al. (2017) – https://arxiv.org/abs/1706.03762
* The Illustrated Transformer by Jay Alammar – https://jalammar.github.io/illustrated-transformer/
* Visualizing Attention, a Transformer’s Heart – https://www.3blue1brown.com/lessons/attention
* Let’s build GPT: from scratch, in code, spelled out. – by Andrej Karpathy – https://www.youtube.com/watch?v=kCc8FmEb1nY
* nanoGPT by Andrej Karpathy – https://github.com/karpathy/nanoGPT
* OpenAI GPT-2 – https://github.com/openai/gpt-2/blob/master/src/model.py
