The reason "AI"s don't say "I don't know" is pretty obvious. Nobody writes books, or answers questions on forums if they don't know. It's the "I know" answers that get recorded and end up in training.
It'd be funny if there were more "I don't know" examples to learn from. LLMs would probably say "I don't know" even when they did know, just to look like the examples.
@dpiponi "to know" doesn't have meaning when applied to an LLM, but yes, it would respond "I don't know" even if the other responses modeled as having high likelihood represent factually correct text.
@jedbrown If someone asks me whether an LLM knows X, I'll answer yes or no according to whether it can give useful answers about X, with any necessary caveats, just as I would with a human. I care nothing for political slogans or metaphysics.
@jedbrown @dpiponi This is true only for a specific setup of model, training, and sampling. Unless you hit some idiosyncrasy, the amount of follow-up questioning needed to distinguish understanding can be made arbitrarily high. This method will "detect away" human understanding before the LLM-based system breaks.