If you (like me) have wondered what the feed-forward layers in transformer models are actually doing, this is a pretty interesting paper on that topic: https://t.co/cqs1OksVR5 pic.twitter.com/BiplVDxS3e
— Karl Higley (@karlhigley) April 30, 2022