The Hydra Effect: Emergent Self-repair in Language Model Computations
— AK (@_akhaliq) August 1, 2023
paper page: https://t.co/e8oycGaZCv
investigate the internal structure of language model computations using causal analysis and demonstrate two motifs: (1) a form of adaptive computation where ablations of… pic.twitter.com/0X92Jo0Bmp
