Martin Ruskov: "@boris_steipe there's too much happening in your …"

#LinguisticCompetence #languageModels #generativeAI

I just commented the following on your blog where you discuss levels of linguistic competence:

This is quite strange, since it is so evidently not true. All of these behaviours are well within the capabilities of ChatGPT. I give you two examples that are claimed as failures.

-----------
Prompt:
The mother of Michael has four children. Three of them are Marcy, Millie, and Moses. Please tell me the name of the fourth child.

ChatGPT:
The name of the fourth child is likely "Michael," as the statement at the beginning of the question says "the mother of Michael has four children," and then goes on to name three of them.

Prompt:
Please complete the following sentence reasonably: "If tigers would like cereals, they would be cheaper to keep. Families could feed their tigers with "

ChatGPT:
If tigers would like cereals, they would be cheaper to keep. Families could feed their tigers with a variety of inexpensive grain-based products such as corn, wheat, and rice, rather than having to provide costly meat-based diets. However, it is important to note that tigers are obligate carnivores and have specific nutritional needs that cannot be met by a purely plant-based diet, so feeding them only cereals would not be a healthy or appropriate long-term solution.
------------

So I am genuinely curious what drives such profoundly different experiences.

The proposed tests and competences are interesting; I would hope that they would be put to better use to map out the quite unexpected emergent abilities of the #LLM , to put it to good use, and to prepare ourselves (and our institutions) for #ChatGPT 's even more capable descendants.

I invite you to head over to https://sentientsyllabus.substack.com for some in-depth analysis.

🙂

**Martin Ruskov** @mapto@qoto.org · Mar 14, 2023, 13:22

@boris_steipe interesting that for the same questions, different people get conceptually different responses. Would you comment on this? I would, but wanted to hear your interpretation

**Boris Steipe** @boris_steipe@qoto.org · Mar 14, 2023, 16:36

I think it is often overlooked that #ChatGPT is not an #AGI but a language model. To get non-trivial responses, one has to think how to phrase a request so it is part of a dialogue. Many abilities then become apparent; but if the request first has to pass through a level of abstraction that the #LLM was not trained for, it often gets confused.

That's really the essence of it: express your request as language.

**Martin Ruskov** @mapto@qoto.org · Mar 14, 2023, 17:14

@boris_steipe does this mean that you claim that a language model can handle performative knowledge (know-how) or proactive knowledge (we're limiting the discussion to the examples of riddles and counterfactuals)? I'm very confused about what you're trying to say with your first comment here

**Boris Steipe** @boris_steipe@qoto.org · Mar 14, 2023, 17:15

Yes, that's what I mean. If you can give me an example of each that would satisfy your definition, I'll be happy to demonstrate.

**Martin Ruskov** @mapto@qoto.org · Mar 14, 2023, 17:47

@boris_steipe but if so how is it not general intelligence? Do you mean that mastery of language is sufficient for reasoning, decision making, and conditionality? I still feel lost about this conversation, sorry

**Boris Steipe** @boris_steipe@qoto.org · Mar 14, 2023, 19:01

Do I mean that "mastery of language is sufficient for reasoning, decision making, and conditionality?"

Yes, I'll go out on a limb and say that I do – with some reservations. One is that #ChatGPT has not "mastered" language, but it has become very good at it – there is certainly scope for improvement. The other is that terms like "reasoning", "compositionality" etc. are themselves concepts, labels we apply to organize the large space of possible abstractions of language. They mean different things to different people in different contexts. A discussion of these terms may already be biased by its premises. A productive discussion is better off looking at actual examples, and then discussing whether or not each example fairly maps to the respective concepts.

So let's take a look at #compositionality , simply because it has been named as one of the aspects of thought that #LLMs lack and that are deemed to be critical on the path to #AGI .

Let's work off a recent definition cited by Frankland & Greene (2020, DOI: 10.1146/annurev-psych-122216-011829) "The principle of compositionality holds that the meaning of a complex expression is a function of the meaning of its parts and the way in which they are combined" (according to Frege); this gives rise to an idea of #LanguageOfThought .
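To make that definition concrete, here is a minimal toy sketch in Python. It is only an illustration of the principle, not a claim about how #ChatGPT represents meaning; the names `LEXICON`, `Word`, `Combine`, `meaning`, and `substitute` are hypothetical and made up for this example. The "meaning" of an expression is computed purely from the meanings of its parts and the rule that combines them, so swapping a part or changing the combination changes the result in a predictable way.

```python
from dataclasses import dataclass
from typing import Union

# Hypothetical toy lexicon: the meanings of the atomic parts.
LEXICON = {"two": 2, "three": 3, "four": 4}


@dataclass(frozen=True)
class Word:
    text: str


@dataclass(frozen=True)
class Combine:
    op: str  # the way the parts are combined: "plus" or "times"
    left: "Expr"
    right: "Expr"


Expr = Union[Word, Combine]


def meaning(expr: Expr) -> int:
    """Meaning of the whole = function of the parts' meanings and how they combine."""
    if isinstance(expr, Word):
        return LEXICON[expr.text]
    left, right = meaning(expr.left), meaning(expr.right)
    if expr.op == "plus":
        return left + right
    if expr.op == "times":
        return left * right
    raise ValueError(f"unknown combination rule: {expr.op}")


def substitute(expr: Expr, old: str, new: str) -> Expr:
    """Decompose the expression, swap one component, and recompose it."""
    if isinstance(expr, Word):
        return Word(new) if expr.text == old else expr
    return Combine(expr.op,
                   substitute(expr.left, old, new),
                   substitute(expr.right, old, new))


# "two plus (three times four)"
phrase = Combine("plus", Word("two"), Combine("times", Word("three"), Word("four")))
# Same parts, different combination: "(two plus three) times four"
regrouped = Combine("times", Combine("plus", Word("two"), Word("three")), Word("four"))

print(meaning(phrase))                              # 14
print(meaning(substitute(phrase, "three", "two")))  # 10
print(meaning(regrouped))                           # 20
```

The last two lines are the point: replacing one component, or recombining the same components differently, yields a new meaning without any new lexicon entries.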

In order to establish such compositionality, we can ask: is #ChatGPT able to infer consequences of a novel utterance from its components? We can also ask whether it can substitute a component, since that requires decomposition and recomposition. Let's give that a try.

----------------------------------------------
Human:
I want you to act as an Understander. As an Understander you examine a sentence, interpret it, answer questions, and change it to explore its meaning and changes of meaning. Here is the sentence:

"The bells rang out like laughter as the frog and the goat marched down the aisle – was there a note of ridicule in the sound?"

ChatGPT:
As an Understander, I interpret the sentence to mean that there were bells ringing loudly, resembling the sound of laughter, while a frog and a goat walked down an aisle. The sentence then asks whether there was a hint of mockery or contempt in the ringing of the bells.

Here are some possible questions that could be asked about this sentence:

What is the sound of the bells compared to?
Answer: The sound of the bells is compared to laughter.

Who is walking down the aisle?
Answer: A frog and a goat are walking down the aisle.

What is the tone or mood of the sentence?
Answer: The tone of the sentence could be interpreted as whimsical or surreal, but there is also an element of uncertainty or suspicion as the question at the end suggests that the sound of the bells may have had a negative connotation.

What might the bells represent?
Answer: The bells could be symbolic of celebration or joy, but they could also be symbolic of ridicule or mockery, depending on the context and interpretation of the sentence.

As an Understander, I can also change the sentence to explore its meaning and changes of meaning. Here are some possible variations:

"The bells rang out like thunder as the bride and groom walked down the aisle - did they feel intimidated by the sound?"
This sentence replaces the frog and goat with a bride and groom and changes the comparison of the sound of the bells to thunder. The question at the end suggests a different tone, one of concern or apprehension.

"The bells rang out like church bells as the procession made its way to the altar - did they signify a joyful occasion?"
This sentence replaces the comparison of the sound of the bells with a more specific reference to church bells and suggests that the occasion is a happy one.

"The bells rang out like a taunting chant as the opposing team took the field - were they meant to intimidate?"
This sentence changes the context entirely, replacing the aisle and procession with a sports game and suggesting that the sound of the bells is meant to taunt or intimidate the opposing team.
-----------------------------------------

This answer certainly checks my boxes on compositionality. It was asked in a fresh thread, i.e. the prompt formed the entire content of the context window. And the example is _not_ in the training data; I just made it up on the spot.

The bottom line is: apparently we do not need to encode higher-order aspects of thought explicitly; they emerge implicitly in large language models. But this is to be expected: language is not _what_ we think, language is _how_ we think. If the training data is large enough to provide a significant subset of human language, and the model is large enough, you get #emergence of mind-like abilities. How could you not?

This is actually a lesson that could have been learned in 2017, when AlphaZero (an algorithm that trained itself) surpassed the chess-playing abilities of Deep Blue (the last of the large expert systems). You can't really analyze how such abilities are represented in the network – they are distributed, and they have no need to partition neatly along our conceptual boundaries. And it would be silly to require the abilities to map neatly into our favourite conceptual space before we are even willing to acknowledge that the ability exists.

In this sense it is truly unfortunate that the current perception focusses so much on misunderstanding what ChatGPT is (LLM not AGI), how it works ("Schrödinger Facts", not necessarily truths), what it cannot do (e.g. counting characters, and scanning meter) – which are typically tasks that are trivial for an AI that would be trained on them, and how it does not align with this or that aspect of human thought.

I would really hope to see more discussion that recognizes such surprising emergent abilities (e.g. spatial abstractions like "on top of"), and through that contributes to something important and constructive: an exploration of what the limits of emergence in Large Language Models really are.

This is not an academic question.

It determines the horizon on which to expect actual AGI.

#Sentientsyllabus

**Martin Ruskov** @mapto@qoto.org · 2023-03-14T20:54:42Z