C is a garrulous and compulsive liar and can't wait to spin you this incredible yarn about where they went last weekend. But C doesn't like to be caught in an obvious inconsistency.

Cleverly, you tricked C into starting their side of the conversation by revealing they were at home all last weekend. And now it's going to be a lot harder for C to spin their yarn. Never mind: they say the best way to lie is to tell the truth, and nobody will notice when C starts sneaking in the fibs.

This is exactly what LLMs are. They're devices to spew out random text with correlations between different parts (rather than just being independent draws of tokens). But if you get them to start with a "prompt" they'll generate random text that's appropriately correlated with that prompt.
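
A minimal sketch of that "correlated random text conditioned on a prompt" idea, using a toy character-level bigram model rather than anything resembling a real transformer; the corpus and prompt below are invented for illustration.

```python
# Toy character-level "language model": sample a continuation that is
# statistically correlated with the prompt. Real LLMs are transformers over
# tokens, but the interface is the same: condition on a prefix, sample onward.
import random
from collections import defaultdict, Counter

def train_bigram(corpus: str):
    counts = defaultdict(Counter)
    for a, b in zip(corpus, corpus[1:]):
        counts[a][b] += 1
    return counts

def sample(counts, prompt: str, length: int = 40) -> str:
    out = list(prompt)
    for _ in range(length):
        nxt = counts.get(out[-1])
        if not nxt:
            break
        chars, weights = zip(*nxt.items())
        out.append(random.choices(chars, weights=weights)[0])
    return "".join(out)

corpus = "i was at home all weekend. i was not at the lake. "
model = train_bigram(corpus)
print(sample(model, prompt="i was "))
```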


@dpiponi It doesn't stop with association though. For some tasks it has a pretty strong causal model, i.e. you can test interventions and counterfactuals on it.
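
One way to read "test interventions and counterfactuals" is the recipe sketched below: change exactly one fact in the prompt and see whether the answer changes accordingly. This is only an illustration; `query_llm`, the gear template, and the question are hypothetical placeholders, not any real API.

```python
# Hypothetical intervention test: flip one stated fact, compare the answers.
def query_llm(prompt: str) -> str:
    # Stand-in so the sketch runs end to end; replace with a real model call.
    return "clockwise"

def intervention_test(template: str, factual: str, counterfactual: str, question: str):
    base = query_llm(template.format(fact=factual) + "\n" + question)
    intervened = query_llm(template.format(fact=counterfactual) + "\n" + question)
    # A model with a working causal picture should flip its answer whenever the
    # intervened fact logically forces a different outcome.
    return base, intervened

template = "Three gears are meshed in a row. Gear 1 turns {fact}."
print(intervention_test(
    template, "clockwise", "counterclockwise",
    "Which way does gear 3 turn? Answer with one word.",
))
```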

@dpwiz @dpiponi LLMs literally do not have a 'causal model' though. Like, you are making a claim that is just obviously false.

@flaviusb @dpwiz @dpiponi all the things I have seen proposed as demonstrating some kind of higher-level world model turn out to be things there are thousands of webpages about… e.g. the gear-rotation-direction thing is a standard textbook engineering problem

@rms80 @flaviusb @dpwiz @dpiponi There was a serious attempt to figure out whether model building is happening: thegradient.pub/othello/ (not sure what came of it, though). They did find traces of some sort of internal model of the game. I can see this being possible in this highly constrained environment.
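
For context, the probing idea in that write-up looks roughly like the sketch below: fit a simple classifier from the model's hidden states to the board state and check whether it beats chance. The arrays here are random stand-ins, and the plain least-squares probe is a simplification of the probes the authors actually used.

```python
# Sketch of activation probing: can the board state be read off the network's
# hidden states? With random stand-in data the probe should sit at chance.
import numpy as np

rng = np.random.default_rng(0)
n, d_model = 2000, 128
hidden = rng.normal(size=(n, d_model))   # stand-in for real transformer activations
board = rng.integers(0, 3, size=n)       # stand-in labels: empty / black / white

# Fit a one-vs-rest linear probe on half the data, evaluate on the other half.
train, test = slice(0, n // 2), slice(n // 2, n)
onehot = np.eye(3)[board]
W, *_ = np.linalg.lstsq(hidden[train], onehot[train], rcond=None)
pred = (hidden[test] @ W).argmax(axis=1)
print("probe accuracy:", (pred == board[test]).mean())  # about 1/3 on random data
```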

@uwedeportivo @rms80 @flaviusb @dpwiz I think there is a phenomenon here that is similar to what happens in mathematics.

Here's my favourite quote from Atiyah:

“Algebra is the offer made by the devil to the mathematician. The devil says: I will give you this powerful machine, it will answer any question you like. All you need to do is give me your soul: give up geometry and you will have this marvelous machine.”

Something similar holds if you replace algebra with language. We can replace skill in many domains with linguistic skill (and maybe allow the original skill to wither away). I don't have to visualise cube A on cube B on cube C to determine that A is above C, I can just manipulate words. The same applies for both machines and humans.
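
The cube example really can be done by pure symbol manipulation, e.g. a transitive rule over "on" facts, with no spatial visualisation involved. A tiny sketch (the facts and names are invented):

```python
# Derive "A is above C" purely by manipulating symbolic facts.
on = {("A", "B"), ("B", "C")}   # "A is on B", "B is on C"

def above(x, y, facts):
    if (x, y) in facts:
        return True
    return any(a == x and above(b, y, facts) for a, b in facts)

print(above("A", "C", on))  # True, derived entirely at the word/symbol level
```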

@dpiponi @rms80 @flaviusb @dpwiz I see what you mean. It's a little harsh on algebra :-). A mathematician goes back and forth between different representations and uses them as needed (I imagine 🙂, I'm not a mathematician). I think that's also the case with humans and language. That's one of the weak spots of LLMs: they rely solely on the word/token representation, and that's what ultimately limits them.

@uwedeportivo @rms80 @flaviusb @dpwiz Yes, there's probably only so far you can go with just words and I look forward to when other kinds of models are hooked up to these things.

@dpiponi @uwedeportivo @rms80 @dpwiz This is a disanalogy. Algebra has a meaning grounding and it operates on a semantic level, as do the kinds of reasoning tasks 'using language' you mention. LLMs do not and can not have meaning grounding or work on a semantic level - they are literally built to not be able to do that, on the obviously wrong theory that the key part of something like 'algebra' that makes it 'work' is that it 'looks algebra-ey' rather than anything else.

@flaviusb @uwedeportivo @rms80 @dpwiz I think this is a difference of degree. I have solved problems, proved theorems, and derived novel (to me) algorithms purely by syntactic operations, making little use of the semantics of the symbols, and only after the fact tried to find a meaning for what I did, so that I eventually go "oh yeah, that's what's really going on". When using algebraic methods you can go back and forth between really understanding the meaning of what you are doing and operations like rewrite rules and substitutions.
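
As an illustration of "operations like rewrite rules and substitutions", here is a toy sketch that blindly rewrites an expression string using a couple of familiar identities, with no appeal to what the symbols mean. The rules are literal string patterns rather than proper pattern matching, which is a deliberate oversimplification.

```python
# Purely syntactic simplification: apply rewrite rules until nothing changes.
rules = [
    ("x*1", "x"),   # multiplicative identity
    ("x+0", "x"),   # additive identity
    ("x-x", "0"),   # self-cancellation
]

def rewrite(expr: str) -> str:
    changed = True
    while changed:
        changed = False
        for lhs, rhs in rules:
            if lhs in expr:
                expr = expr.replace(lhs, rhs)
                changed = True
    return expr

print(rewrite("(x*1)+(x-x)"))  # "(x)+(0)" -- pure string-level manipulation
```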

@dpiponi @uwedeportivo @rms80 @dpwiz Right, I think I see the disconnect. Mathematicians and logicians have done a huge amount of work over the centuries tying the syntax to the semantics of those formal systems; there are very deep connections, such that you can even transport theories (with the right setup). That is why common practice shifts as to where you prove certain properties of a system, depending on whether proofs at one level or the other are easier (e.g. ncatlab.org/nlab/show/syntax-s ).

@dpiponi @uwedeportivo @rms80 @dpwiz LLMs' 'use of language' is not like this, though. There is no meaning grounding, no duality proof, nothing like that; so it isn't 'like algebra'. And where you talk about 'using language to solve problems', that is like how an engineer or philosopher uses language to solve problems; LLMs are more like how a used-car salesman emits sentence-shaped objects to try and trick you, as you end up doing the work of 'supplying meaning'.

@dpiponi @uwedeportivo @rms80 @dpwiz And, as far as 'using LLMs as glorified execution engines for rewrite systems or etc but with googley eyes that make people think it is "smart" in a way that Mathematica or Maple are obviously not', my response here mastodon.social/@flaviusb/1102 in this thread also applies to that.

@rms80 @flaviusb @dpiponi Does it matter, though? There's no "fake addition" that brings you the same results in arithmetic, but without the "real math" behind it. The addition that works is just... addition. Ditto for every other task.

If the next token batch were merely predicted to be some bytes from a Metasploit payload, your system would be pwned.

@dpwiz @rms80 @dpiponi It's 'fake addition' because there is no addition behind it. Eg if you give it some addition task that is not in its training set, it gets the wrong answer. Like, testing LLMs - especially the ones being popularised like ChatGPT that have ingested ~ most of the internet - on 'addition problems' to prove what we might call 'math ability' or 'model building' suffers from both 'testing on the training set' and 'construct validity' issues.

@flaviusb @dpwiz @rms80 I'm not sure what you mean by "not in its training set" because I can give ChatGPT additions it probably hasn't seen before and get correct results.

(For my tests I used pseudorandomly generated integers because a neural network could, in principle, learn the kinds of biases humans have when generating "random" numbers.)
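
A sketch of the kind of test harness being described: pseudorandom addition problems checked against exact arithmetic. `ask_model` is a hypothetical placeholder to be swapped for a call to whichever model is being tested.

```python
# Pseudorandom addition test vectors, checked against exact arithmetic.
import random

def ask_model(prompt: str) -> str:
    # Placeholder so the harness runs end to end; swap in a real model call.
    a, b = (int(s) for s in prompt.split() if s.isdigit())
    return str(a + b)

def run_tests(n: int = 100, digits: int = 8, seed: int = 42) -> float:
    rng = random.Random(seed)
    correct = 0
    for _ in range(n):
        a = rng.randrange(10 ** digits)
        b = rng.randrange(10 ** digits)
        reply = ask_model(f"What is {a} + {b} ? Reply with just the number.")
        correct += reply.strip() == str(a + b)
    return correct / n

print("accuracy:", run_tests())
```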

@dpiponi @dpwiz @rms80 Interesting. I have seen people generate pseudorandom test vectors like that and get reasonable rates of failure. I wonder what the difference in test vector generation (or something else) was. (From my perspective this is quite frustrating, as the 'LLM' parts of e.g. ChatGPT can't train to produce arithmetic machines, but we have very little clarity on what e.g. 'ChatGPT' *actually* is, due to things like secrecy around parts of the architecture plus the pervasive use of ghost labour.)

@flaviusb @dpwiz @rms80 Like you, I'm not sure what ChatGPT actually is. That's part of why I downloaded llama.cpp - to see what other models are, and also what you could possibly attach to an LLM (and how).

@dpiponi @dpwiz @rms80 Like, to be clear, some nontrivial percentage of the things 'attached' to ChatGPT et al are more like the 'human fallback' of 'Brenda' (obligatory nplusone link: nplusonemag.com/issue-44/essay ), rather than like the integration of 'theories' you get with SMT, and their interventions only have continued effect through the rest of the conversation because of the same 'simulate memory via smuggling state in the input' technique that ChatGPT already uses for conversations.
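
The "simulate memory via smuggling state in the input" technique mentioned here is, roughly, the pattern below: the model itself remembers nothing between calls, and the whole transcript is simply prepended to each new request. `complete` is a hypothetical placeholder for any text-completion call.

```python
# "Memory" by replaying the conversation in every prompt.
def complete(prompt: str) -> str:
    # Placeholder so the sketch runs; replace with a real completion call.
    return "(model reply)"

def chat_turn(history: list, user_msg: str) -> str:
    history.append(f"User: {user_msg}")
    prompt = "\n".join(history) + "\nAssistant:"   # state rides along in the input
    reply = complete(prompt)
    history.append(f"Assistant: {reply}")
    return reply

history = []
chat_turn(history, "My name is C.")
print(chat_turn(history, "What is my name?"))  # earlier turns are in the prompt
```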
