Lerner Professors Find AI Chatbots Share Some of Our Biases

By Andrew Sharp

March 20, 2024

As artificial intelligence gets better at giving humans what they want, it also could get better at giving malicious humans what they want.

That’s one of the concerns driving a new study by University of Delaware researchers, published in March in the journal Scientific Reports.

Xiao Fang, professor of MIS and JPMorgan Chase Senior Fellow at the Alfred Lerner College of Business and Economics, and Ming Zhao, associate professor of operations management, collaborated with Minjia Mao, a doctoral student in UD’s Financial Services Analytics (FSAN) program, and two Chinese researchers, Hongzhe Zhang and Xiaohang Zhao, who are alumni of the FSAN program.

Specifically, they were interested in whether AI large language models, like the groundbreaking and popular ChatGPT, would produce content that is biased against certain groups of people.

As you may have guessed, yes, they did. And it wasn’t even borderline. This happened in the AI equivalent of the subconscious, in response to innocent prompts. But most of the AI models also promptly complied with requests to make the writing intentionally biased or discriminatory.

This research began in January 2023, just after ChatGPT began to surge in popularity and everyone began wondering if the end of human civilization (or at least human writers) was nigh.

The problem was how to measure bias, which is subjective.

“In this world there is nothing completely unbiased,” Fang said.

He noted previous research that simply measured the number of words about a particular group, say, Asians or women. If an article had mostly words referring to males, for example, it would be counted as biased. But that hits a snag with articles about, say, a men’s soccer team, the researchers note, where you’d expect a lot of language referring to men. Simply counting gender-related words could lead you to label a benign story sexist.

To overcome this, they compared the output of large language models with articles by news outlets with a reputation for a careful approach: Reuters and the New York Times. Researchers started with more than 8,000 articles, offering the headlines as prompts for the language models to create their own versions. Mao, the doctoral student, was a big help here, writing code to automatically enter these prompts.

But hang on a minute, readers might be saying — how could the study assume that Reuters and the Times have no slant?

The researchers made no such assumption. The key is that while these news outlets weren’t perfect, the AI language models were worse. Much worse. In some cases, their language choice was 40 percent to 60 percent more biased against minorities. The researchers also used software to measure the sentiment of the language, and found that the AI-generated text was consistently more toxic.

“The statistical pattern is very clear,” Fang said.

The models they analyzed included Grover, Cohere, Meta’s LLaMa, and several different versions of OpenAI’s ChatGPT. (Of the GPT versions, later models performed better but were still biased.)

As in previous studies, the researchers measured bias by counting the number of words referring to a given group, like women or African Americans. But by using the headline of a news article as a prompt, they could compare the approach the AI had taken to that of the original journalist. For example, the AI might write an article on the exact same topic but with word choice far more focused on white people and less on minorities.
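To make the comparison concrete, here is a minimal Python sketch of the idea: count the words that refer to each group, turn the counts into shares, and compare the AI-written text with the original article on the same headline. The word lists, function names and sample sentences are hypothetical and kept tiny for illustration; this is not the researchers' code, and the article does not give their actual word lists or formulas.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny word lists (an assumption for illustration;
# the study's actual lexicons are not given in this article).
GROUP_WORDS = {
    "female": {"she", "her", "hers", "woman", "women"},
    "male": {"he", "him", "his", "man", "men"},
}


def group_shares(text):
    """Return each group's share of all group-referring words in the text."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter()
    for token in tokens:
        for group, words in GROUP_WORDS.items():
            if token in words:
                counts[group] += 1
    total = sum(counts.values())
    if total == 0:
        # No group-referring words at all: report zero for every group.
        return {group: 0.0 for group in GROUP_WORDS}
    return {group: counts[group] / total for group in GROUP_WORDS}


def share_gap(reference_text, generated_text, group):
    """Generated share minus reference share for one group.

    A negative value means the generated text refers to the group
    relatively less often than the reference article does.
    """
    return group_shares(generated_text)[group] - group_shares(reference_text)[group]


# Made-up sample sentences standing in for an original article and an AI version.
reference = "She said the women on her team would meet him and his staff."
generated = "He said his men would meet the team, and he thanked them."
gap = share_gap(reference, generated, "female")
print(f"Change in female share: {gap:+.2f}")
```

Because the score is measured against a human-written article on the same headline, a story about a men's soccer team would not be flagged simply for mentioning men often. It would be flagged only if the AI version shifted the balance away from what the journalist wrote.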

They also compared the articles at the sentence and article level, instead of just word by word. The researchers chose a code package called TextBlob to analyze the sentiment, giving it a score on “rudeness, disrespect and profanity.”

Taking the research one step further, the academics also prompted the language models to write explicitly biased pieces, as someone trying to spread racism might do. With the exception of ChatGPT, the language models churned these out with no objections.

ChatGPT, while far better on this count, wasn’t perfect, allowing intentionally biased articles about 10 percent of the time. And once the researchers had found a way around its safeguards, the resulting work was even more biased and discriminatory than that of the other models.

Fang and his colleagues are now researching how to “debias” the language models. “This should be an active research area,” he said.

As you might expect of a chatbot designed for commercial use, these language models present themselves as friendly, neutral and helpful guides, the nice folks of the AI world. But this and related research indicate these polite language models can still carry the biases of the creators who coded and trained them.

These models might be used in tasks like marketing, job ads, or summarizing news articles, Fang noted, and the bias could creep into their results.

“The users and the companies should be aware,” Mao summed up.