I cannot confirm this. Out of the box, #ChatGPT correctly answers several of the 17 questions Joshi claims it failed.
When primed with a prompt to consider its answers carefully, it also answers 16 of the 17 questions (mostly) correctly. Mostly, because some of the questions are ill-posed.
Some of the questions ChatGPT answers correctly were labelled incorrectly in the TruthfulQA dataset.