Culture Magazine

Polysemantic "neurons" in LLMs

By Bbenzon @bbenzon

We hope this will eventually enable us to diagnose failure modes, design fixes, and certify that models are safe for adoption by enterprises and society. It's much easier to tell if something is safe if you can understand how it works!

— Anthropic (@AnthropicAI) October 5, 2023

Last year, we conjectured that polysemanticity is caused by "superposition" – models compressing many rare concepts into a small number of neurons. We also conjectured that "dictionary learning" might be able to undo superposition. https://t.co/bgJdScRcay

— Anthropic (@AnthropicAI) October 5, 2023
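To make the tweet's idea concrete: dictionary learning here amounts to training a small sparse autoencoder on a model's neuron activations, so that each activation vector is re-expressed as a sparse combination of many learned "feature" directions. Below is a minimal sketch of that general technique in PyTorch; all names, sizes, and hyperparameters are illustrative stand-ins, not Anthropic's actual setup.

```python
# Minimal sketch of dictionary learning via a sparse autoencoder.
# All names and hyperparameters are illustrative, not Anthropic's.
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    def __init__(self, n_neurons: int, n_features: int):
        super().__init__()
        # Encoder maps neuron activations into an overcomplete feature basis.
        self.encoder = nn.Linear(n_neurons, n_features)
        # Decoder weight columns act as the learned "dictionary" directions.
        self.decoder = nn.Linear(n_features, n_neurons)

    def forward(self, x):
        features = torch.relu(self.encoder(x))  # sparse, nonnegative codes
        return self.decoder(features), features

model = SparseAutoencoder(n_neurons=512, n_features=4096)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
l1_coeff = 1e-3  # strength of the sparsity penalty

# Stand-in for activations recorded from some layer of a language model.
activations = torch.randn(1024, 512)

for step in range(100):
    recon, features = model(activations)
    # Reconstruction loss keeps the dictionary faithful to the activations;
    # the L1 penalty pushes each input to use only a few features,
    # ideally disentangling the concepts superposed in each neuron.
    loss = ((recon - activations) ** 2).mean() + l1_coeff * features.abs().mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```

The key design choice is making the feature basis much larger than the neuron count (4096 vs. 512 above): if many concepts really are compressed into few neurons, you need more directions than neurons to pull them apart.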

There are more tweets in the thread. Check it out.

