Culture Magazine

Visualizing Attention: A Transformer's Heart

By Bbenzon @bbenzon

This is the second of three videos from 3Blue1Brown about how transformers work. Here's the first.

Timestamps:
0:00 - Recap on embeddings
1:39 - Motivating examples
4:29 - The attention pattern
11:08 - Masking
12:42 - Context size
13:10 - Values
15:44 - Counting parameters
18:21 - Cross-attention
19:19 - Multiple heads
22:16 - The output matrix
23:19 - Going deeper
24:54 - Ending

