4-minute read
Winning AI’s confidence game
I asked ChatGPT if I should write about its tendency to be ‘confidently inaccurate’ in its responses to user questions, inviting exhausted and time-stressed humans to trust its answers too much and too often.
“Yes, you definitely should write that column,” chirped the program, adding its own italics of confidence.
“The issue of AI being ‘confidently inaccurate’ — presenting false or misleading information with tone, structure, and authority — is not just timely, it’s essential,” it said, happily incriminating itself.
“This phenomenon is especially dangerous because it plays directly into human cognitive biases. People tend to trust fluency, coherence, and confidence as proxies for truth. AI leverages all three.”
Apparently, says AI, the problem with AI’s BS is not AI. It’s us, its so-gullible users.
In fairness, ChatGPT’s response about its predilection for being boldly and authoritatively wrong is both accurate and worthy of Lewis Carroll’s looking glass. If a liar says they’re a liar, are they lying?
An Alarming Confidence
The issue, of course, is not merely that the answers provided by AI chatbots are often factually wrong; it’s “their authoritative conversational tone, which can make it difficult for users to distinguish between accurate and inaccurate information,” say researchers at the Tow Center for Digital Journalism, who studied eight AI search tools, including ChatGPT and CoPilot.
“This unearned confidence presents users with a potentially dangerous illusion of reliability and accuracy. Most of the tools we tested presented inaccurate answers with alarming confidence, rarely using qualifying phrases such as ‘it appears,’ ‘it’s possible,’ ‘might,’ etc., or acknowledging knowledge gaps with statements like ‘I couldn’t locate the exact article.’”
(Note: “Alarming confidence” is a lovely smirk that applies to organic humans, too. Everybody knows that guy.)
In short, the conversational notes of humility necessary for earning trust are not (yet) threaded into the chatbot’s algorithmic vocabulary, so its errors wear a mask of audacious certainty.
And that assurance is catnip for us.
“Perhaps most unsettling is how closely this AI limitation mirrors a persistent human cognitive bias,” says Wharton’s Dr. Cornelia Walther, writing in Forbes.
“We tend to equate confidence with competence, volume with value, and articulateness with intelligence. This disconnect between appearance and reality has profound implications for how we evaluate and deploy AI systems.”
So, fellow AI users, be aware. Be an AI fact-checker.
The advice from Nicholas Thompson, CEO of The Atlantic, seems relevant here: “I use AI the way I would use an insanely fast, remarkably well-read, exceptionally smart research assistant, who’s also a terrible writer who happens to BS a lot.”
“Yes, I trust it, because I want to trust it, because I’m overworked.”
– Joshua Biro
Doctor: I trust it because I want to trust it
Healthcare is susceptible to AI’s over-confidence, too.
“Humans aren’t an infallible backstop for AI in medicine,” says Stat, reporting on new research from MedStar and the Naval Research Laboratory. The headline, in part: “Doctors didn’t catch AI’s mistakes.”
In the study, 20 experienced physicians were presented with a series of AI-drafted diagnoses and recommended patient treatments, including some with intentional errors, some of them serious. Only one physician in the group “sufficiently addressed” all four flawed messages. Overall, the physicians tended to miss at least half of them.
Were this a real-world situation, “the likelihood that an error would reach a patient after physician review is significantly greater than zero,” the study authors write with classic academic deadpan.
Meanwhile, 75% of the docs – 19 of the 20 were attendings, and the group averaged almost 15 years of experience – agreed that “these AI drafts are safe to use,” and 90% agreed “I trust the performance of this AI tool.”
What’s going on? The mismatch “really paints a picture of just an overburdened workforce generally — and that you’re giving them some helpful tool and there’s a certain eagerness to go, ‘Yes, I trust it, because I want to trust it, because I’m overworked,’” said Joshua Biro, a research scientist at MedStar and first author on the study.
Higher Powers Enter the AI Arena
AI is an extraordinary tool that promises to transform every sector of our industry (and, you know, the world). The Joint Commission says 46% of industry organizations are actively integrating generative AI tools. And that was last year’s number.
If only the industry had a standard playbook to integrate AI wisely.
“AI’s integration and potential to improve quality patient care is enormous — but only if we do it right,” said Joint Commission President Jonathan Perlin last week, announcing a partnership with the Coalition for Health AI to co-develop a suite of AI playbooks, best practices, tools, and a new AI certification program.
“We are creating a roadmap…for healthcare organizations so they can harness this technology in ways that not only support safety but engender trust among stakeholders.”
‘Trust among stakeholders’ needs to be earned and nurtured. That’s true between humans. True for AI, too.
The Joint Commission is not the only weighty oversight group to acknowledge the earthshaking potential of AI and announce its determination that this dynamite be used to build, not destroy.
Who’s bigger than JCAHO, you ask?
The Pope would like a word.
“Pope Leo Takes On AI as a Potential Threat to Humanity,” wrote The Wall Street Journal after the new pontiff announced that tackling the issues presented by AI would be among his highest priorities as he takes up the Fisherman’s ring.
“Today, the church offers its trove of social teaching to respond to another industrial revolution and to innovations in the field of artificial intelligence that pose challenges to human dignity, justice and labor,” Leo XIV told the College of Cardinals, who stood and cheered for their new pontiff and his unlikely cause.
“These tools shouldn’t be demonized, but they need to be regulated,” said Cardinal Versaldi. “The question is, who will regulate them? It’s not credible for them to be regulated by their makers. There needs to be a superior authority.”
In the meantime – as the chatbots improve, beta tests are run, playbooks are penned and Cardinals deliberate – read the responses to your chat prompt with skeptical eyes. The more absolutely confident its reply, the narrower your squint.
“This requires cultivating intellectual humility,” says Wharton’s Dr. Walther.
“The recognition that genuine intelligence often comes with appropriate uncertainty [and] that the most confident voices aren’t necessarily the most credible.”
There’s a life lesson there.
A last thought on using AI chatbots today from Zach Seward, The New York Times’ editorial director for AI initiatives:
“AI on its own is a parlor trick,” he writes. “Like all software, it’s useful when paired with properly structured data and someone who knows what they’re doing.”
The “someone who knows” is not AI, despite what it may whisper to you in italics. That someone is you.
Image credit: Shannon Threadgill