It's interesting - Meta said that to achieve good performance on their top model, they had to throw away 95% of their SFT data! Less really is more for alignment (echoing their now several-year-old LIMA paper). That sticks out to me, because it raises the question: what other skills can be taught to the largest model with only a few examples and then distilled?