There's a funny thing you see in many scientific papers - especially #AI papers: The paper will prominently include a link to a GitHub repository, claiming the code will be available "soon", but when you go there (months after the paper was released) there's either just a placeholder or the paper text.

People use GitHub links to score brownie points for "doing open science", but most of it is just not there. It's especially bad with statistical systems: you don't get the training data, you don't get the code, you don't get the model weights. What you get are results and a "trust me bro".

@tante
An exception that proves the rule: llm360.ai/

They release everything you mentioned, plus intermediate checkpoints *mapped to the training data*, and various metrics.
