"In this work, we analyze what confuses GPT-3: how the model responds to certain sensitive topics and what effects the prompt wording has on the model response. We find that GPT-3 correctly disagrees with obvious Conspiracies and Stereotypes but makes mistakes with common Misconceptions and Controversies. The model responses are inconsistent across prompts and settings, highlighting GPT-3's unreliability."
Khatun, A. and Brown, D.G. (2023) 'Reliability Check: an analysis of GPT-3’s response to sensitive topics and prompt wording', arXiv (Cornell University) [Preprint]. https://doi.org/10.48550/arxiv.2306.06199
#AI #ArtificialIntelligence #ComputerScience #LLM #GPT3 #MachineLearning