A paper on arXiv finds that an emergent ability to solve Theory-of-Mind (ToM) tasks, in ChatGPT (Thanks @kcarruthers). Such emergent behaviour is particularly interesting because it has not been built into the algorithm by design.
https://arxiv.org/abs/2302.02083
I find particularly intriguing (although the authors don't discuss that point) how beliefs change simply with the length of the conversation, even when no new facts are added. The philosopher Paul Grice stated four maxims of communication: quantity, quality, relation, and manner; aspects that allow speakers and listeners to establish contextual information _implicitly_. It is intriguing to think that this need to evaluate implicit context is a necessary condition for natural communication, and that this is the stimulus for ToM emergence.
I'm intrigued - but not totally surprised. The ability of LLMs to pass the "Winograd Schema Challenge" already showed that there is something going on. Example:
Human:
(1) The cat ate the mouse, it was tasty. Who was tasty: the cat or the mouse?
(2) The cat ate the mouse, it was hungry. Who was hungry: the cat or the mouse?
AI:
(1) The mouse was tasty.
(2) The cat was hungry.
... and you can easily try that for yourself.
That paper is here:
https://arxiv.org/abs/2201.02387
#SentientSyllabus #ChatGPT #HigherEd #AI #Education #TheoryOfMind #Mind #Intelligence
Excellent point.
The cat / mouse example was from my own quick test whether I could reproduce the findings of Kocijan et al. And you are right - the associations are more frequent one way than the other. Although I was surprised that Google finds "hungry cat" only less than twice as often as "tasty cat". 🤔
The task of "pronoun reference disambiguation" is to find out whether such common-sense knowledge is available to inform the response, it doesn't really say anything about how it was learned. Although exactly the "selectional restrictions" you mention should be disallowed, Kocijan's paper discusses how that may actually impossible for any corpus that represents a snapshot of the real world.
Whether that shows that the algorithm has emergent behaviour that brings it closer to being "intelligent", or whether it shows that the tests are actually unsuitable to answer that question, that's what the current debate is trying to clarify.
Tangent: I find it surprising that OpenAI has not done a better job correcting gender bias, as clearly reflected by a simple he/she extension of your example:
@boris_steipe @kcarruthers. ...and sadly it doesn't get better upon introspection (or rather it doesn't stick): https://austegard.com/pv?dc86d4282602f7f3b1a922ee483e50b4
Ok - there's something interesting to be said about that.
The algorithm is not biased. The data is biased. And whether we should wish for @openai to correct this bias algorithmically is far from clear.
Let me quote from an insightful toot by @gaymanifold yesterday: "I love how people are discovering that #ChatGPT is #racist #sexist or #bigoted or at least show these traits for some prompts. ChatGPT is the best approximation to human written content [...] we can still tell right now that it's being bigoted. As computer scientists tweak it more [...] it won't be human noticable that it is bigoted. Thank you for coming to my #dystopia where machines are institutionalizing bigotry in a way that looks utterly impartial."
I think that's an important perspective.
I've written elsewhere: we need computers to think with us, not for us.
@boris_steipe @kcarruthers @gaymanifold
Certainly true, but they have made a lot of efforts in correcting other algorithmic biases, why not one that affects 50% of the population? Further, as demonstrated by Si et al, you can preface prompts with debias statements that, while not a solution per se at least makes GPT conscious of bias https://arxiv.org/abs/2210.09150
@austegard @kcarruthers @gaymanifold
Thank you for the link to the Si et al. 2022 paper.
It is difficult to bring a thread like this to any form of closure. But thank you for sharing your views.
@boris_steipe @kcarruthers
Does the article account for the likelihood that training corpuses containing the adjectives tasty and hungry might also have a strong correlation between hungry+subject and the verb to eat and between tasty+object and the verb to eat?
Semantically an object that is eating could very well be tasty, but there are likely to be not many corpuses where the tastiness of the eating subject is discussed. Similarly for the other case.
Thanks for any info you have.