C is a garrulous and compulsive liar and can't wait to spin you this incredible yarn about where they went last weekend. But C doesn't like to be caught in an obvious inconsistency.

Cleverly, you tricked C into opening their side of the conversation by admitting they were at home all last weekend. And now it's going to be a lot harder for C to spin their yarn. Never mind: they say the best way to lie is to tell the truth, and nobody will notice when C starts sneaking in the fibs.

This is exactly what LLMs are. They're devices for spewing out random text whose different parts are correlated with each other (rather than being independent draws of tokens). But if you get them to start with a "prompt", they'll generate random text that's appropriately correlated with that prompt.
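A minimal sketch of that "correlated continuation of a prompt" idea, using a toy bigram model. The corpus, counting, and sampling here are purely illustrative and nothing like a real LLM's architecture, but the conditioning-on-a-prompt shape is the same.

```python
import random

# Toy "language model": next-word choices learned by counting bigrams.
CORPUS = "the cat sat on the mat . the dog sat on the rug .".split()

bigrams = {}
for a, b in zip(CORPUS, CORPUS[1:]):
    bigrams.setdefault(a, []).append(b)

def generate(prompt, n_words=8):
    """Sample words one at a time, each conditioned on the previous word.
    The prompt fixes the starting context, so what follows stays (loosely)
    correlated with it rather than being independent draws."""
    out = prompt.split()
    for _ in range(n_words):
        candidates = bigrams.get(out[-1])
        if not candidates:   # no continuation ever seen: stop early
            break
        out.append(random.choice(candidates))
    return " ".join(out)

print(generate("the cat"))   # e.g. "the cat sat on the mat . the dog"
print(generate("the dog"))   # a different prompt, a differently correlated yarn
```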

@dpiponi It doesn't stop at association, though. For some tasks it has a pretty strong causal model, i.e. you can test interventions and counterfactuals on it.

@dpwiz @dpiponi LLMs literally do not have a 'causal model' though. Like, you are making a claim that is just obviously false.

@flaviusb @dpwiz @dpiponi All the things I have seen proposed as demonstrating some kind of higher-level world model turn out to be things there are thousands of webpages about… e.g. the gear-rotation-direction thing is a standard textbook engineering problem.


@rms80 @flaviusb @dpiponi Does it matter, though? There's no "fake addition" that gives you the same results as arithmetic but without the "real math" behind it. The addition that works is just... addition. Ditto for every other task.

If the next batch of tokens is merely predicted to be some bytes from a Metasploit payload, your system would be pwned all the same.

@dpwiz @rms80 @dpiponi It's 'fake addition' because there is no addition behind it. E.g. if you give it an addition task that is not in its training set, it gets the wrong answer. Like, testing LLMs (especially the popularised ones like ChatGPT that have ingested roughly most of the internet) on 'addition problems' to prove what we might call 'math ability' or 'model building' suffers from both 'testing on the training set' and 'construct validity' issues.

@flaviusb @dpwiz @rms80 I'm not sure what you mean by "not in its training set" because I can give ChatGPT additions it probably hasn't seen before and get correct results.

(For my tests I used pseudorandomly generated integers because a neural network could, in principle, learn the kinds of biases humans have when generating "random" numbers.)
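A sketch of the kind of test harness being described: score the model on sums of seeded-pseudorandom integers. `ask_model` stands in for however you actually query the model under test (an API client, a llama.cpp wrapper, etc.); the stand-in used at the bottom just answers correctly so the code runs end to end.

```python
import random
import re

def addition_accuracy(ask_model, n_trials=100, digits=6, seed=0):
    """Score a model on sums of pseudorandomly generated integers.
    `ask_model` is whatever callable sends a prompt to the model and
    returns its text reply."""
    rng = random.Random(seed)   # seeded PRNG, so operands aren't human-chosen
    lo, hi = 10 ** (digits - 1), 10 ** digits - 1
    correct = 0
    for _ in range(n_trials):
        a, b = rng.randint(lo, hi), rng.randint(lo, hi)
        reply = ask_model(f"What is {a} + {b}? Answer with just the number.")
        numbers = re.findall(r"-?\d+", reply)
        correct += bool(numbers) and int(numbers[-1]) == a + b
    return correct / n_trials

# Sanity check with a stand-in "model" that always answers correctly:
print(addition_accuracy(
    lambda p: str(sum(int(t) for t in re.findall(r"\d+", p)))))
# -> 1.0
```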

@dpiponi @dpwiz @rms80 Interesting. I have seen people generate pseudorandom test vectors like that and get reasonable rates of failure. I wonder what the difference was in test vector generation or elsewhere. (From my perspective this is quite frustrating, as the 'LLM' parts of e.g. ChatGPT can't be trained into arithmetic machines, but we have very little clarity on what e.g. 'ChatGPT' *actually* is, due to things like secrecy around parts of the architecture plus the pervasive use of ghost labour.)

@flaviusb @dpwiz @rms80 Like you, I'm not sure what ChatGPT actually is. That's part of why I downloaded llama.cpp: to see what other models are like, and also what you could possibly attach to an LLM (and how).
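One toy illustration of "attaching" something to an LLM: post-process the generated text and hand any arithmetic it tries to do to real code. The CALC(...) marker convention is invented for this sketch; real tool-use systems each have their own formats.

```python
import re

# Invented convention: the model is prompted to write sums as CALC(a + b).
CALL = re.compile(r"CALC\((\d+)\s*\+\s*(\d+)\)")

def run_tools(model_reply: str) -> str:
    """Splice the true sum in wherever the model emitted a CALC(...) call,
    so the arithmetic is done by ordinary code rather than by the LLM."""
    return CALL.sub(lambda m: str(int(m.group(1)) + int(m.group(2))), model_reply)

print(run_tools("The total is CALC(317 + 4589)."))   # -> "The total is 4906."
```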

@dpiponi @dpwiz @rms80 Like, to be clear, some nontrivial percentage of the things 'attached' to ChatGPT et al. are more like the 'human fallback' of 'Brenda' (obligatory nplusone link: nplusonemag.com/issue-44/essay) than like the integration of 'theories' you get with SMT. And their interventions only have a continued effect through the rest of the conversation because of the same 'simulate memory by smuggling state into the input' technique that ChatGPT already uses for conversations.
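That "smuggle the state into the input" trick is easy to make concrete: the model itself is stateless, so each turn just gets the whole prior transcript pasted in front of the new message. A minimal sketch (the role names and layout are arbitrary choices for illustration):

```python
def build_prompt(history, new_user_message):
    """The only 'memory' is the old transcript re-sent with every turn."""
    lines = [f"{speaker}: {text}" for speaker, text in history]
    lines.append(f"User: {new_user_message}")
    lines.append("Assistant:")
    return "\n".join(lines)

history = [("User", "My name is Ada."),
           ("Assistant", "Nice to meet you, Ada.")]
print(build_prompt(history, "What's my name?"))
# The model can "remember" Ada only because her name is literally in the prompt.
```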
