Culture Magazine

DeepSeek's Approach Only Works in Limited Technical Domains

By Bbenzon @bbenzon

If you look at their excellent paper & code, the reward model is a logical function that was handcrafted & programmed by engineers.
DeepSeek RL approach is impressive in the sense that it reduces the need for tedious supervised fine tuning (SFT) but isn't really general.
— Chomba Bupe (@ChombaBupe) February 1, 2025
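
To make "a logical function that was handcrafted" concrete, here is a minimal sketch of what a rule-based reward can look like. It is an illustration only, not DeepSeek's actual code: the tag names, the exact-match check, the `format_weight` value, and all function names are assumptions made for this example.

```python
import re

# Assumed response layout (hypothetical): reasoning in <think> tags,
# followed by the final answer in <answer> tags.
RESPONSE_PATTERN = re.compile(
    r"^<think>(?P<reasoning>.+?)</think>\s*<answer>(?P<answer>.+?)</answer>$",
    re.DOTALL,
)


def format_reward(response: str) -> float:
    """Return 1.0 if the response follows the assumed tag layout, else 0.0."""
    return 1.0 if RESPONSE_PATTERN.match(response.strip()) else 0.0


def accuracy_reward(response: str, expected: str) -> float:
    """Return 1.0 if the extracted answer exactly matches the known answer."""
    match = RESPONSE_PATTERN.match(response.strip())
    if match is None:
        return 0.0
    return 1.0 if match.group("answer").strip() == expected.strip() else 0.0


def rule_based_reward(response: str, expected: str, format_weight: float = 0.5) -> float:
    """Combine both checks; the weighting is an arbitrary illustrative choice."""
    return accuracy_reward(response, expected) + format_weight * format_reward(response)


if __name__ == "__main__":
    sample = "<think>2 + 2 is 4</think> <answer>4</answer>"
    print(rule_based_reward(sample, "4"))
```

Every branch here is a rule written by a person, and the accuracy check needs a known answer to compare against. That is straightforward for arithmetic or for code with unit tests, and much harder to write for open-ended tasks, which is the limitation the tweet points at.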
