random thought...
It's nearly 4 years since Rich Sutton wrote his Bitter Lesson blog post: incompleteideas.net/IncIdeas/B
He wrote it before the explosion of interest in large transformer models. His claim was that "maximally general methods are always better". Maximally general means two things: 1) avoid human priors about cognition; 2) avoid human training data. It's interesting to reflect that he was both very right and very wrong.
- very right, in that simple algorithms massively scaled can give you (decent) systematicity, without the need for symbolic bells and whistles. That's not to say there aren't still important ingredients of intelligence missing from large transformer models. But the level of composition you get in large generative models is much more impressive than most of us predicted. So he was broadly right about (1).
- very wrong, because the really impressive stuff relies on huge volumes of human feedback. As RLHF comes to the fore, it's become clear that self-play is only going to work for a very narrow set of problems. You need human data in spades. So he was wrong about (2).
