"Language is more 'semantically dense' than other training signals, leading to more data-efficient learning than either traditional classification pretraining or recent unsupervised pretraining (e.g. MoCo / PIRL)."
— Justin Johnson (@jcjohnss), June 12, 2020 (pic.twitter.com/F4KMunhlhL)
Hmmm... Are things getting interesting?
