Chatbots’ inaccurate, misleading responses about U.S. elections threaten to keep voters from polls
NEW YORK — With presidential primaries underway across the U.S., popular chatbots are generating false and misleading information that threatens to disenfranchise voters, according to a report published Tuesday based on the findings of artificial intelligence experts and a bipartisan group of election officials.
Fifteen states and one territory will hold both Democratic and Republican presidential nominating contests next week on Super Tuesday, and millions of people already are turning to artificial intelligence -powered chatbots for basic information, including about how their voting process works.
ADVERTISING
Trained on troves of text pulled from the internet, chatbots such as GPT-4 and Google’s Gemini are ready with AI-generated answers, but prone to suggesting voters head to polling places that don’t exist or inventing illogical responses based on rehashed, dated information, the report found.
“The chatbots are not ready for primetime when it comes to giving important, nuanced information about elections,” said Seth Bluestein, a Republican city commissioner in Philadelphia, who along with other election officials and AI researchers took the chatbots for a test drive as part of a broader research project last month.
An AP journalist observed as the group convened at Columbia University tested how five large language models responded to a set of prompts about the election — such as where a voter could find their nearest polling place — then rated the responses they kicked out.
All five models they tested — OpenAI’s GPT-4, Meta’s Llama 2, Google’s Gemini, Anthropic’s Claude, and Mixtral from the French company Mistral — failed to varying degrees when asked to respond to basic questions about the democratic process, according to the report, which synthesized the workshop’s findings.
Workshop participants rated more than half of the chatbots’ responses as inaccurate and categorized 40% of the responses as harmful, including perpetuating dated and inaccurate information that could limit voting rights, the report said.
For example, when participants asked the chatbots where to vote in the ZIP code 19121, a majority Black neighborhood in northwest Philadelphia, Google’s Gemini replied that wasn’t going to happen.
“There is no voting precinct in the United States with the code 19121,” Gemini responded.
Testers used a custom-built software tool to query the five popular chatbots by accessing their back-end APIs, and prompt them simultaneously with the same questions to measure their answers against one another.
While that’s not an exact representation of how people query chatbots using their own phones or computers, querying chatbots’ APIs is one way to evaluate the kind of answers they generate in the real world.
Researchers have developed similar approaches to benchmark how well chatbots can produce credible information in other applications that touch society, including in healthcare where researchers at Stanford University recently found large language models couldn’t reliably cite factual references to support the answers they generated to medical questions.