Building LLMs from the Ground Up: A 3-hour Coding Workshop

Sebastian Rasc…
2024-08-31 16 min read

If you’d like to spend a few hours this weekend diving into Large Language Models (LLMs) and understanding how they work, I've prepared a 3-hour coding workshop on implementing, training, and using LLMs.

Below, you'll find a table of contents to get an idea of what this video covers (the video itself has clickable chapter marks, allowing you to jump directly to topics of interest):

0:00 – Workshop overview

2:17 – Part 1: Intro to LLMs

9:14 – Workshop materials

10:48 – Part 2: Understanding LLM input data

23:25 – A simple tokenizer class

41:03 – Part 3: Coding an LLM architecture

45:01 – GPT-2 and Llama 2

1:07:11 – Part 4: Pretraining

1:29:37 – Part 5.1: Loading pretrained weights

1:45:12 – Part 5.2: Pretrained weights via LitGPT

1:53:09 – Part 6.1: Instruction finetuning

2:08:21 – Part 6.2: Instruction finetuning via LitGPT

2:26:45 – Part 6.3: Benchmark evaluation

2:36:55 – Part 6.4: Evaluating conversational performance

2:42:40 – Conclusion

It's a slight departure from my usual text-based content, but the last time I did this a few months ago, it was so well-received that I thought it might be nice to do another one!

Happy viewing!

References

  1. Build an LLM from Scratch book

  2. Build an LLM from Scratch GitHub repository

  3. GitHub repository with workshop code

  4. Lightning Studio for this workshop

  5. LitGPT GitHub repository


This magazine is a personal passion project. For those who wish to support me, please consider purchasing a copy of my Build a Large Language Model (From Scratch) book. (I am confident that you'll get a lot out of this book, as it explains how LLMs work at a level of detail that is not found anywhere else.)

Build a Large Language Model (From Scratch) now available on Amazon

If you read the book and have a few minutes to spare, I'd really appreciate a brief review. It helps us authors a lot!

Alternatively, I also recently enabled the paid subscription option on Substack to support this magazine directly.

Source: Ahead of AI Word count: 4807 words
Published on 2024-08-31 18:39