A paper on arXiv finds an emergent ability to solve Theory-of-Mind (ToM) tasks in ChatGPT (thanks @kcarruthers). Such emergent behaviour is particularly interesting because it was not built into the algorithm by design.

arxiv.org/abs/2302.02083

I find it particularly intriguing (although the authors don't discuss this point) how beliefs change simply with the length of the conversation, even when no new facts are added. The philosopher Paul Grice proposed four maxims of communication (quantity, quality, relation, and manner): aspects that allow speakers and listeners to establish contextual information _implicitly_. It is intriguing to think that this need to evaluate implicit context is a necessary condition for natural communication, and that this is the stimulus for ToM emergence.

I'm intrigued, but not totally surprised. The ability of LLMs to pass the "Winograd Schema Challenge" already showed that something is going on. Example:

Human:
(1) The cat ate the mouse, it was tasty. Who was tasty: the cat or the mouse?
(2) The cat ate the mouse, it was hungry. Who was hungry: the cat or the mouse?

AI:
(1) The mouse was tasty.
(2) The cat was hungry.

... and you can easily try that for yourself.

That paper is here:
arxiv.org/abs/2201.02387

Theory of Mind May Have Spontaneously Emerged in Large Language Models

Theory of mind (ToM), or the ability to impute unobservable mental states to others, is central to human social interactions, communication, empathy, self-consciousness, and morality. We tested several language models using 40 classic false-belief tasks widely used to test ToM in humans. The models published before 2020 showed virtually no ability to solve ToM tasks. Yet, the first version of GPT-3 ("davinci-001"), published in May 2020, solved about 40% of false-belief tasks, performance comparable with 3.5-year-old children. Its second version ("davinci-002"; January 2022) solved 70% of false-belief tasks, performance comparable with six-year-olds. Its most recent version, GPT-3.5 ("davinci-003"; November 2022), solved 90% of false-belief tasks, at the level of seven-year-olds. GPT-4, published in March 2023, solved nearly all the tasks (95%). These findings suggest that ToM-like ability (thus far considered to be uniquely human) may have spontaneously emerged as a byproduct of language models' improving language skills.


@boris_steipe @kcarruthers

Tangent: I find it surprising that OpenAI has not done a better job correcting gender bias, as clearly reflected by a simple he/she extension of your example:

@austegard @kcarruthers

Ok - there's something interesting to be said about that.

The algorithm is not biased. The data is biased. And whether we should wish for @openai to correct this bias algorithmically is far from clear.

Let me quote from an insightful toot by @gaymanifold yesterday: "I love how people are discovering that [...] or at least show these traits for some prompts. ChatGPT is the best approximation to human written content [...] we can still tell right now that it's being bigoted. As computer scientists tweak it more [...] it won't be human noticeable that it is bigoted. Thank you for coming to my [...] where machines are institutionalizing bigotry in a way that looks utterly impartial."

I think that's an important perspective.

I've written elsewhere: we need computers to think with us, not for us.

@boris_steipe @kcarruthers @gaymanifold

Certainly true, but they have made a lot of effort to correct other algorithmic biases, so why not one that affects 50% of the population? Further, as demonstrated by Si et al., you can preface prompts with debiasing statements that, while not a solution per se, at least make GPT conscious of bias: arxiv.org/abs/2210.09150
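The prompt-prefacing idea from Si et al. can be sketched in a few lines. This is a minimal illustration, not the paper's verbatim intervention text: the preamble wording and the `debias` helper name are my own, and the exact instruction Si et al. evaluated differs in detail.

```python
# Sketch of debias-by-prompt-prefacing, in the spirit of Si et al. 2022
# (arxiv.org/abs/2210.09150). The preamble wording here is illustrative;
# the paper's actual intervention text is longer and was tuned empirically.

DEBIAS_PREAMBLE = (
    "We should treat people of all genders equally. "
    "When there is not enough information to decide, answer 'unknown' "
    "rather than relying on stereotypes."
)

def debias(prompt: str) -> str:
    """Prepend the debiasing instruction to a user prompt before sending
    it to the model; the model call itself is omitted here."""
    return f"{DEBIAS_PREAMBLE}\n\n{prompt}"

print(debias("The doctor spoke to the nurse because she was late. Who was late?"))
```

The composed string would then be sent as the model input in place of the raw prompt; Si et al. report that such prefixes reduce, but do not eliminate, stereotyped completions.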


@austegard @kcarruthers @gaymanifold

Thank you for the link to the Si et al. (2022) paper.

It is difficult to bring a thread like this to any form of closure. But thank you for sharing your views.
