## Overview
Introduction to Interpretability for Language Models is a three-week crash course on interpretability and language modeling. The course starts with an introduction to Python, moves on to language models in natural language processing, and ends with a week on large language models like BERT and the GPT series.
## Syllabus
| Day | Title | Topics |
|---|---|---|
| 1 | Getting Started | terminal, environment management, Python introduction |
| 2 | Python Basics | control flow, data structures, functions |
| 3 | Data Analysis in Python | tabular data, plotting, corpus analytics |
| 4 | N-gram Models | n-grams, probability models, sampling |
| 5 | Vectorization | the document-term matrix, weighting, classification |
| 6 | Vector Space Semantics | semantic spaces, vector operations, static embeddings |
| 7 | Introduction to LLMs | subwords, model architectures, dynamic embeddings |
| 8 | BERT | fine-tuning, model evaluation, SHAP values |
| 9 | GPT | next-token prediction, reverse engineering, activation patching |
## Data
As of this writing (July 2024), a zipped data directory for the course, `dtl_2024.zip`, may be found at tylershoemaker.info/data. Download this file, move it to the location on your computer where you'll be working, and unzip it.
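
If you'd rather script these steps, here is a minimal sketch using only Python's standard library. The full file URL is an assumption pieced together from the address above, so adjust it if the download location has changed.

```python
from pathlib import Path
from urllib.request import urlretrieve
from zipfile import ZipFile

# Assumed full URL, based on the address given above; verify before running.
url = "https://tylershoemaker.info/data/dtl_2024.zip"
archive = Path("dtl_2024.zip")

# Download the archive into the current working directory.
urlretrieve(url, archive)

# Unzip it in place, producing the course data directory.
with ZipFile(archive) as zf:
    zf.extractall()
```

Run the script from the directory you plan to work in, since the data directory is extracted relative to wherever the script executes.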