“Potemkin Understanding in Large Language Models”
A detailed analysis of the incoherent application of concepts by LLMs, showing how benchmarks that reliably establish domain competence in humans can be passed by LLMs lacking similar competence.
H/T @acowley
"incoherent application of concepts"
Reminder that no concepts are involved in a random (Markov) walk through word space (Shannon 1948); a toy sketch of such a walk follows below.
From Pogo: "We could eat this picture of a chicken, if we had a picture of some salt."
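For concreteness, here is a minimal sketch of the word-level Markov walk @glc is pointing to, in the spirit of Shannon's 1948 bigram approximations: the next word is sampled from counts of what followed the previous word, with no representation of concepts anywhere. The toy corpus is just the Pogo line above; the function names and sampling choices are illustrative, not anyone's actual method.

```python
import random
from collections import defaultdict

def train_bigram_model(text):
    """Count word-to-word transitions (a Shannon-style bigram approximation)."""
    words = text.split()
    transitions = defaultdict(list)
    for prev, nxt in zip(words, words[1:]):
        transitions[prev].append(nxt)
    return transitions

def markov_walk(transitions, start, length=10):
    """Random walk through 'word space': each step depends only on the previous word."""
    word, output = start, [start]
    for _ in range(length):
        followers = transitions.get(word)
        if not followers:
            break  # dead end: the corpus never continues past this word
        word = random.choice(followers)  # sample uniformly among observed successors
        output.append(word)
    return " ".join(output)

corpus = "we could eat this picture of a chicken if we had a picture of some salt"
model = train_bigram_model(corpus)
print(markov_walk(model, "we"))
```

Each step here conditions only on the single preceding word, which is exactly the property the rest of the thread argues about.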
@glc @gregeganSF @acowley But that is not the word space they're walking...
I suppose you must be referring to Pogo, which is not, for the present purposes, even a word space (or: not fruitfully treated as such).
@glc @gregeganSF @acowley no, the LLMs aren't operating in **word**-space.
Are you trying to distinguish tokens and words?
Or do you have a point? If so, what is it?
@glc @gregeganSF @acowley No, bytes/tokens/words/whatever is irrelevant. What's wrong with the "word-space" model is that it misses the context. The "language" part is a red herring. What's really going on is a tangle of suspended code being executed step by step. And yes, there are concepts, entities, and all that stuff in there.
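To make the contrast concrete: in an autoregressive LLM, the next-token distribution at every step is conditioned on the entire preceding context, not on a fixed short window. A minimal sketch of that decoding loop, assuming the Hugging Face transformers library and the public GPT-2 checkpoint (the prompt, sampling scheme, and length are arbitrary choices for illustration):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

# Start from some context; the choice of prompt is incidental.
context = tokenizer("We could eat this picture of a chicken", return_tensors="pt").input_ids

with torch.no_grad():
    for _ in range(20):
        logits = model(context).logits[:, -1, :]        # conditioned on ALL prior tokens
        probs = torch.softmax(logits, dim=-1)
        next_id = torch.multinomial(probs, num_samples=1)
        context = torch.cat([context, next_id], dim=1)  # the context grows every step

print(tokenizer.decode(context[0]))
```

The loop still emits one token at a time, but `context` grows at every step, so nothing here reduces to a fixed-order Markov walk through word space.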
I'd say there is syntax without semantics (in the traditional sense of formal logic, that is).
You have some other view evidently.
That much is now clear.
I don't see much difference from Markov and Shannon here, apart from some compression tricks that are needed to get a working system.
@glc Perhaps. I just hope this is not another "X is/has/... Y" claim.
What's your favorite or most important consequence of this distinction?