There’s an old saying in the journalism business: If your mother tells you she loves you, check it out. The point is that you need to be skeptical even of your most trusted sources. But what if, instead of your mother, it’s a generative AI model like OpenAI’s ChatGPT telling you something? Should you trust the computer?
The takeaway from a talk given by a pair of Carnegie Mellon University computer scientists at South by Southwest this week? No. Check it out.
This week, the Austin, Texas, conference has spotlighted artificial intelligence. Experts discussed the future and the big picture, with talks on trust, the changing workplace and more. CMU assistant professors Sherry Wu and Maarten Sap focused more on the here and now, offering tips on how best to use, and not misuse, the most common generative AI tools out there, like AI chatbots built on large language models.
“They’re actually far from perfect and not actually suited for all the use cases that people want to use them for,” Sap said.
Here are five bits of advice on how to be smarter than the AI.
Be clear about what you want
Anyone who’s had a joke fall flat on a social media site like Twitter or Bluesky will tell you how hard it is to convey sarcasm in text. And the posters on those sites (at least the human ones) know social cues that indicate when you’re not being literal. An LLM doesn’t.
Today’s LLMs take non-literal statements literally more than half of the time, Sap said, and they struggle with social reasoning.
The solution, Wu said, is to be more specific and structured with your prompts. Make sure the model knows what you're asking it to produce. Focus on exactly what you want, and don't assume the LLM will infer what you actually mean.
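For example, compare a vague request with a specific one (these prompts are hypothetical, not from the talk):

    Vague: "Make this email better."
    Specific: "Rewrite this email to be under 100 words, keep a polite but firm tone, and state the meeting date in the first sentence."

The second version spells out what "better" means, so the model doesn't have to guess.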
Bots are confident but not accurate
Perhaps the biggest issue with generative AI tools is that they hallucinate, meaning they make stuff up. Sap said hallucinations can happen up to a quarter of the time, with higher rates in more specialized areas like law and medicine.
The problem goes beyond just getting things wrong. Sap said chatbots can appear confident in an answer while being completely wrong.
“This leaves humans vulnerable to relying on these expressions of certainty when the model is incorrect,” he said.
The solution to this is simple: Check the LLM's answers. You can test the model's consistency with itself, Wu said, by asking the same question several times, or asking variations of it. You may get different outputs each time. "Sometimes you will see that the model doesn't really know what it is saying," she said.
The most important thing is to verify with external sources. That also means you should be careful about asking questions to which you don’t know the answer. Wu said generative AI’s answers are most useful when they’re on a subject you’re familiar with, so you can tell what is real and what isn’t.
“Make conscious decisions about when to rely on a model and when not to,” she said. “Do not trust a model when it tells you it is very confident.”
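The repeat-the-question check Wu describes can even be automated. Here is a minimal sketch in Python, assuming the openai package (version 1 or later) and an API key in the OPENAI_API_KEY environment variable; the model name and question are illustrative:

    from openai import OpenAI

    client = OpenAI()
    question = "Which spacecraft carried the first humans to the Moon's surface?"

    # Ask the same question three times and collect the answers.
    answers = []
    for _ in range(3):
        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content": question}],
            temperature=1.0,  # a nonzero temperature lets inconsistency surface
        )
        answers.append(response.choices[0].message.content)

    # If the answers disagree with one another, don't trust any of them
    # until you've verified against an external source.
    for i, answer in enumerate(answers, 1):
        print(f"Attempt {i}: {answer}")

Disagreement between runs doesn't prove any single answer wrong, but it's a cheap signal that the model is guessing rather than knowing.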
AI can’t keep a secret
The privacy concerns with LLMs are abundant. They go beyond handing information you wouldn't want on the internet to a machine that might regurgitate it to anyone who asks nicely. Sap cited a demonstration in which OpenAI's ChatGPT, asked to help organize a surprise party, told the person who was supposed to be surprised about the party.
“LLMs are not good at reasoning who should know what and when and what information should be private,” he said.
Don’t share sensitive or personal data with an LLM, Wu said.
“Whenever you share anything produced by you to the model, always double-check if there’s anything in that that you don’t want to release to the LLM,” she said.
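One way to make that double-check a habit is to scan a draft for obviously sensitive patterns before pasting it into a chatbot. A minimal sketch, using only Python's standard library; the patterns are illustrative, not exhaustive, and real screening would need a much broader list:

    import re

    # Illustrative patterns for a few common kinds of sensitive data.
    PATTERNS = {
        "email address": r"[\w.+-]+@[\w-]+\.[\w.-]+",
        "phone number": r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b",
        "card-like number": r"\b(?:\d[ -]?){13,16}\b",
    }

    def flag_sensitive(text: str) -> list[str]:
        """Return a warning for anything that looks sensitive."""
        warnings = []
        for label, pattern in PATTERNS.items():
            for match in re.findall(pattern, text):
                warnings.append(f"Possible {label}: {match}")
        return warnings

    draft = "Hi, reach me at jane.doe@example.com or 555-123-4567."
    for warning in flag_sensitive(draft):
        print(warning)

A script like this catches only mechanical patterns; names, addresses and context-dependent secrets still need a human read before you hit send.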
Remember, you’re talking to a machine
Chatbots have caught on partly because of how well they mimic human speech. But it’s all mimicry; it’s not truly human, Sap said. Models say things like “I wonder” and “I imagine” because they’re trained on language that includes those words, not because they have an imagination. “The way that we use language, these words all imply cognition,” Sap said. “It implies that the language model imagines things, that it has an internal world.”
Thinking of AI models as human can be dangerous: It can lead to misplaced trust. LLMs don't operate the same way humans do, and treating them as if they're human can reinforce social stereotypes, Sap said.
“Humans are much more likely to over-attribute human-likeness or consciousness to AI systems,” he said.
Using an LLM may not make sense
Despite claims about LLMs being capable of advanced research and reasoning, they just don't work that well yet, Sap said. Benchmarks suggesting a model can perform at the level of a human with a Ph.D. are just that: benchmarks. They don't mean the model can work at that level on the task you actually want it to do.
“There’s this illusion of the robustness of AI capabilities going around that leads people to make rash decisions in their businesses,” he said.
When deciding whether to use a generative AI model for a task, weigh the benefits and potential harms of using it against the benefits and potential harms of not using it, Wu said.