C is a garrulous and compulsive liar and can't wait to spin you this incredible yarn about where they went last weekend. But C doesn't like to be caught in an obvious inconsistency.

Cleverly, you tricked C into opening their side of the conversation by revealing they were at home all last weekend. Now it's going to be a lot harder for C to spin their yarn. Never mind: as they say, the best way to lie is to tell the truth, and nobody will notice when C starts sneaking in the fibs.

This is exactly what LLMs are. They're devices that spew out random text with correlations between its different parts (rather than just independent draws of tokens). But if you start them off with a "prompt", they'll generate random text that's appropriately correlated with that prompt.
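A toy illustration of that idea (a bigram Markov chain, nothing like a real transformer, just to make "correlated draws conditioned on a prompt" concrete; the corpus and function names are invented for the sketch):

```python
import random

# Toy "language model": a bigram Markov chain over a tiny corpus.
# Each next token is drawn conditioned on the previous one, so the output
# is correlated text rather than independent draws of tokens.
corpus = "the cat sat on the mat and the dog sat on the rug".split()

# Count bigram transitions observed in the corpus.
transitions = {}
for prev, nxt in zip(corpus, corpus[1:]):
    transitions.setdefault(prev, []).append(nxt)

def generate(prompt, length=8, seed=0):
    """Continue `prompt` by sampling each token given the previous one."""
    rng = random.Random(seed)
    tokens = prompt.split()
    for _ in range(length):
        choices = transitions.get(tokens[-1])
        if not choices:
            break
        tokens.append(rng.choice(choices))
    return " ".join(tokens)

print(generate("the cat"))
```

Every adjacent pair in the output is a bigram seen in the corpus, so the continuation is always "appropriately correlated" with the prompt, even though each step is a random draw.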

@dpiponi It doesn't stop with association though. For some tasks it has a pretty strong causal model, i.e. you can test interventions and counterfactuals on it.

@dpwiz @dpiponi LLMs literally do not have a 'causal model' though. Like, you are making a claim that is just obviously false.

@flaviusb @dpwiz @dpiponi all the things I have seen proposed as demonstrating some kind of higher-level world model turn out to be things there are thousands of webpages about… e.g. the gear-rotation-direction thing is a standard textbook engineering problem

@rms80 @flaviusb @dpwiz @dpiponi there was this serious attempt to figure out whether model building is happening: thegradient.pub/othello/ not sure what came out of it though. They did find traces of some sort of internal model of the game. I can see this being possible in this highly constrained environment.
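The methodology in that Othello work is roughly: train a simple probe to read a board-state feature out of the network's hidden activations, and if the probe generalizes well above chance, the feature is linearly encoded there. A self-contained sketch of that logic on synthetic stand-in data (the "activations" below are fabricated, not from any real model):

```python
import numpy as np

# Sketch of linear probing: can a board-state feature ("this square is
# occupied") be read out of hidden activations with a linear map?
# All data here is synthetic, constructed so the feature IS encoded.
rng = np.random.default_rng(0)

n, d = 1000, 64
direction = rng.normal(size=d)           # hypothetical encoding direction
labels = rng.integers(0, 2, size=n)      # 1 = occupied, 0 = empty
# Activations = signal along `direction` plus substantial noise.
acts = np.outer(2 * labels - 1, direction) + rng.normal(scale=2.0, size=(n, d))

train, test = slice(0, 800), slice(800, n)
# Fit the probe by least squares against +/-1 targets.
w, *_ = np.linalg.lstsq(acts[train], 2 * labels[train] - 1, rcond=None)
preds = (acts[test] @ w > 0).astype(int)
accuracy = (preds == labels[test]).mean()
print(f"probe accuracy: {accuracy:.2f}")  # far above the 0.5 chance level
```

Held-out accuracy far above 50% is the kind of evidence the Othello paper reports as "traces of an internal model"; on activations with no encoded board state, the same probe would sit at chance.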

@uwedeportivo @rms80 @flaviusb @dpwiz I think there is a phenomenon here that is similar to what happens in mathematics.

Here's my favourite quote from Atiyah:

“Algebra is the offer made by the devil to the mathematician. The devil says: I will give you this powerful machine, it will answer any question you like. All you need to do is give me your soul: give up geometry and you will have this marvelous machine.”

Something similar holds if you replace algebra with language. We can replace skill in many domains with linguistic skill (and maybe allow the original skill to wither away). I don't have to visualise cube A on cube B on cube C to determine that A is above C, I can just manipulate words. The same applies for both machines and humans.
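The cube example can be made literal: "A is above C" falls out of pure rule-chaining on symbols (a transitive closure of the `on` relation), with no spatial model anywhere. A minimal sketch:

```python
# "A is on B, B is on C" -- derive "A is above C" by manipulating symbols
# alone: take the transitive closure of the `on` relation.
on = {("A", "B"), ("B", "C")}

def above(facts):
    """x is above z if x is on z, or x is on some y that is above z."""
    closure = set(facts)
    changed = True
    while changed:
        changed = False
        for x, y in list(closure):
            for y2, z in list(closure):
                if y == y2 and (x, z) not in closure:
                    closure.add((x, z))
                    changed = True
    return closure

print(("A", "C") in above(on))  # True, without visualising anything
```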

@dpiponi @rms80 @flaviusb @dpwiz I see what you mean. It's a little harsh on Algebra :-). A mathematician goes back and forth between different representations and uses them as needed (I imagine 🙂, I'm not a mathematician). I think that's also the case with humans and language. That's one of the weak spots of LLMs: they rely solely on the word/token representation, and that's what ultimately limits them.

@uwedeportivo @rms80 @flaviusb @dpwiz Yes, there's probably only so far you can go with just words and I look forward to when other kinds of models are hooked up to these things.
