We’re Using A.I. Chatbots Wrong. Here’s How to Direct Them.

Anyone seduced by A.I.-powered chatbots like ChatGPT and Bard — wow, they can write essays and recipes! — eventually runs into what are known as hallucinations, the tendency for artificial intelligence to fabricate information.

The chatbots, which guess what to say based on information obtained from all over the internet, can’t help but get things wrong. And when they fail — by publishing a cake recipe with wildly inaccurate flour measurements, for instance — it can be a real buzzkill.

Yet as mainstream tech tools continue to integrate A.I., it’s crucial to get a handle on how to use it to serve us. After testing dozens of A.I. products over the last two months, I concluded that most of us are using the technology in a suboptimal way, largely because the tech companies gave us poor directions.

The chatbots are the least beneficial when we ask them questions and then hope whatever answers they come up with on their own are true, which is how they were designed to be used. But when directed to use information from trusted sources, such as credible websites and research papers, A.I. can carry out helpful tasks with a high degree of accuracy.

“If you give them the right information, they can do interesting things with it,” said Sam Heutmaker, the founder of Context, an A.I. start-up. “But on their own, 70 percent of what you get is not going to be accurate.”

With the simple tweak of advising the chatbots to work with specific data, they generated intelligible answers and useful advice. That transformed me over the last few months from a cranky A.I. skeptic into an enthusiastic power user. When I went on a trip using a travel itinerary planned by ChatGPT, it went well because the recommendations came from my favorite travel websites.
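For readers who script their requests, that tweak amounts to naming the trusted sources inside the request itself. Here is a minimal sketch in Python. The helper name `build_grounded_prompt` and the wording of the instructions are assumptions made for illustration, not a format that any chatbot requires, and the code only assembles text without calling a chatbot.

```python
def build_grounded_prompt(task, sources):
    """Return a prompt that tells a chatbot to rely only on the named sources.

    Hypothetical helper: the instruction wording is illustrative.
    """
    if not sources:
        raise ValueError("name at least one trusted source")
    source_lines = "\n".join(f"- {name}" for name in sources)
    return (
        f"{task}\n\n"
        "Use only information from these sources:\n"
        f"{source_lines}\n"
        "Include a link for each recommendation, and say so if the "
        "sources do not cover something."
    )


prompt = build_grounded_prompt(
    "Plan a dog-friendly weekend itinerary for Mendocino County, Calif.",
    ["Thrillist", "The New York Times travel section"],
)
print(prompt)
```

The resulting text can be pasted into a chatbot as is. Asking for links makes each suggestion easier to check against its source.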

Directing the chatbots to specific high-quality sources like websites from well-established media outlets and academic publications can also help reduce the production and spread of misinformation. Let me share some of the approaches I used to get help with cooking, research and travel planning.

Meal Planning

Chatbots like ChatGPT and Bard can write recipes that look good in theory but don’t work in practice. In an experiment by The New York Times’s Food desk in November, an early A.I. model created recipes for a Thanksgiving menu that included an extremely dry turkey and a dense cake.

I also ran into underwhelming results with A.I.-generated seafood recipes. But that changed when I experimented with ChatGPT plug-ins, which are essentially third-party apps that work with the chatbot. (Only subscribers who pay $20 a month for access to GPT-4, the latest model behind the chatbot, can use plug-ins, which can be activated in the settings menu.)

On ChatGPT’s plug-ins menu, I selected Tasty Recipes, which pulls data from the Tasty website owned by BuzzFeed, a well-known media site. I then asked the chatbot to come up with a meal plan including seafood dishes, ground pork and vegetable sides using recipes from the site. The bot presented an inspiring meal plan, including lemongrass pork banh mi, grilled tofu tacos and everything-in-the-fridge pasta; each meal suggestion included a link to a recipe on Tasty.

For recipes from other publications, I used Link Reader, a plug-in that let me paste in a web link to generate meal plans using recipes from other credible sites like Serious Eats. The chatbot pulled data from the sites to create meal plans and told me to visit the websites to read the recipes. That took extra work, but it beat an A.I.-concocted meal plan.
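One way to confirm that a meal plan really draws on the sites you named is to check the links in the chatbot's answer against the links you gave it. The sketch below is a rough illustration under stated assumptions: the function name `unexpected_links` is hypothetical, the URLs are placeholders on example.com and example.org, and the pattern used to spot links is deliberately simple.

```python
import re


def normalize(url):
    """Drop trailing punctuation and slashes so equivalent links compare equal."""
    return url.rstrip(".,;:!?").rstrip("/")


def unexpected_links(answer, allowed_urls):
    """Return links found in the answer that are not among the allowed ones."""
    found = re.findall(r"https?://[^\s)\]>\"']+", answer)
    allowed = {normalize(u) for u in allowed_urls}
    return [normalize(u) for u in found if normalize(u) not in allowed]


# Placeholder answer text and URLs, invented for this example.
answer = (
    "Monday: pork banh mi (https://example.com/recipes/banh-mi). "
    "Tuesday: tacos, see https://example.org/tacos."
)
allowed = ["https://example.com/recipes/banh-mi"]
print(unexpected_links(answer, allowed))
```

Any link the check flags deserves a closer look before you start cooking, because it did not come from the sources you supplied.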

Research

When I did research for an article on a popular video game series, I turned to ChatGPT and Bard to refresh my memory on past games by summarizing their plots. They messed up on important details about the games’ stories and characters.

After testing many other A.I. tools, I concluded that for research, it was crucial to focus on trusted sources and quickly double-check the data for accuracy. I eventually found a tool that delivers that: Humata.AI, a free web app that has become popular among academic researchers and lawyers.

The app lets you upload a document such as a PDF, and from there a chatbot answers your questions about the material alongside a copy of the document, highlighting relevant portions.

In one test, I uploaded a research paper I found on PubMed, a government-run search engine for scientific literature. The tool produced a relevant summary of the lengthy document in minutes, a process that would have taken me hours, and I glanced at the highlights to double-check that the summaries were accurate.
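The double-checking step can be pictured as finding the passages of a document that bear on a question, so a reader knows where to look. The following sketch ranks paragraphs by how many words they share with the question. It is a simplified stand-in chosen for illustration, not a description of how Humata works, and the function names and the sample document are invented.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of words and numbers."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def top_passages(document, question, k=2):
    """Return up to k paragraphs sharing the most words with the question."""
    passages = [p.strip() for p in document.split("\n\n") if p.strip()]
    question_words = tokenize(question)
    scored = [
        (len(question_words & tokenize(p)), index, p)
        for index, p in enumerate(passages)
    ]
    # Keep only paragraphs with some overlap, best matches first.
    scored = [item for item in scored if item[0] > 0]
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [p for _, _, p in scored[:k]]


# Invented sample text standing in for an uploaded paper.
document = (
    "Methods: Participants kept a sleep diary for two weeks.\n\n"
    "Results: Diary entries were summarized by week.\n\n"
    "Funding: The authors report no outside funding."
)
question = "What methods did participants use?"
for passage in top_passages(document, question):
    print(passage)
```

Word overlap is crude, but it shows the idea: the answer is tied to specific passages that a person can read and verify.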

Cyrus Khajvandi, a founder of Humata, which is based in Austin, Texas, developed the app when he was a researcher at Stanford and needed help reading dense scientific articles, he said. The problem with chatbots like ChatGPT, he said, is that they rely on outdated models of the web, so the data may lack relevant context.

Travel Planning

When a Times travel writer recently asked ChatGPT to compose a travel itinerary for Milan, the bot guided her to visit a central part of town that was deserted because it was an Italian holiday, among other snafus.

I had better luck when I requested a vacation itinerary for me, my wife and our dogs in Mendocino County, Calif. As I did when planning a meal, I asked ChatGPT to incorporate suggestions from some of my favorite travel sites, such as Thrillist, which is owned by Vox Media, and The Times’s travel section.

Within minutes, the chatbot generated an itinerary that included dog-friendly restaurants and activities, including a farm with wine and cheese pairings and a train to a popular hiking trail. This spared me several hours of planning, and most important, the dogs had a wonderful time.

Bottom Line

Google and OpenAI, which works closely with Microsoft, say they are working to reduce hallucinations in their chatbots, but we can already reap A.I.’s benefits by taking control of the data that the bots rely on to come up with answers.

To put it another way: The main benefit of training machines with enormous data sets is that they can now use language to simulate human reasoning, said Nathan Benaich, a venture capitalist who invests in A.I. companies. The important step for us, he said, is to pair that ability with high-quality information.

Brian X. Chen is the lead consumer technology writer for The Times. He reviews products and writes Tech Fix, a column about the social implications of the tech we use. Before joining The Times in 2011, he reported on Apple and the wireless industry for Wired.
