In the past, when a company wanted to evaluate a new or improved product using surveys, it had to recruit participants, administer the surveys, collect responses and finally conduct the analysis.
This process was not only time-consuming but also expensive, often taking weeks or even months to complete.
Now, a new research project by three professors at the University of Delaware’s Alfred Lerner College of Business and Economics could change all of that.
Rather than relying on human surveys, the faculty – Assistant Professor of MIS Yu Zhu, Visiting Assistant Professor of Accounting and MIS Jared Sharpe, and Professor of MIS Xiao Fang – will use large language models (LLMs) to act like customers, saving significant time and money.
The three professors were recently selected to receive a grant from the American Fintech Council and UD’s Data Science Institute to help them get started. The team first submitted a one-page letter of interest as the first round of review, followed by a full proposal in October. After advancing to the final round, they participated in a panel discussion in February. Their project, titled “Leveraging Large Language Models to Simulate Consumer Behavior for Fintech Product Innovation,” was ultimately selected for AFC support.
Zhu, the project’s Principal Investigator, explained that the process begins by initializing LLMs–referred to as agents–with specific user profiles, including attributes such as age, sex and background. Each agent is instructed to play the role of the user.
“Then I would tell this agent, for example, you have two choices,” he said. “Option A is the current product with certain features. Option B is a modified version with improvements (provided below). Which one do you prefer–and more importantly, why?”
Because LLMs can be deployed at scale, thousands of such agents can be initialized simultaneously, each representing a different consumer profile. “We can ask all these agents for their preferences and reasoning in parallel,” Zhu said. “This approach is dramatically faster and cheaper. That’s the whole idea of using LLMs to speed up product development.”
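In code, the workflow Zhu describes might look roughly like the following Python sketch. The profile fields, prompt wording and `query_llm` stub are illustrative assumptions, not the team's actual implementation; a real deployment would replace `query_llm` with a call to an LLM API.

```python
import random
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass

@dataclass
class AgentProfile:
    """A consumer persona used to initialize one LLM agent (fields are illustrative)."""
    age: int
    sex: str
    background: str

def build_prompt(profile: AgentProfile, option_a: str, option_b: str) -> str:
    # Instruct the agent to play the role of this consumer and compare two options.
    return (
        f"You are a {profile.age}-year-old {profile.sex} consumer "
        f"with a background in {profile.background}. "
        f"Option A: {option_a}. Option B: {option_b}. "
        "Which one do you prefer, and more importantly, why?"
    )

def query_llm(prompt: str) -> str:
    # Placeholder for a real LLM call; returns a canned answer so the sketch runs.
    choice = random.choice(["A", "B"])
    return f"I prefer Option {choice} because it better fits my needs."

def simulate(profiles, option_a, option_b, max_workers=8):
    # Query thousands of agents in parallel, one prompt per consumer profile.
    prompts = [build_prompt(p, option_a, option_b) for p in profiles]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(query_llm, prompts))

profiles = [
    AgentProfile(
        age=20 + i % 50,
        sex=random.choice(["male", "female"]),
        background=random.choice(["retail", "finance", "education"]),
    )
    for i in range(1000)
]
responses = simulate(profiles, "the current product with certain features",
                     "a modified version with improvements")
print(len(responses))
```

The thread pool stands in for whatever batching a real API would use; the key point is that each agent's preference and reasoning can be collected independently and in parallel.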
Results can be generated in a matter of hours, shrinking a process that used to take weeks.
Still, Zhu emphasized the importance of validating the AI’s responses against real-world data.
“First, we have to find evidence that LLM behavior aligns with human responses,” he said. “For example, we might take historical customer data following a product change, simulate that change using LLM agents, and compare their responses to the actual customer behavior. If they align, we can be more confident in the approach.”
While LLMs are not a complete replacement for human surveys, Zhu sees them as a powerful complement. “At this stage, we need both simulated and real responses to ensure robustness,” he said.
Funding from the AFC will be critical to the project’s next steps.
“We already have machines capable of running the models,” Zhu said. “But to scale up or access more advanced, closed-source models behind paywalls, this funding is essential. It also supports efforts to validate results with actual human participants.”
Zhu pointed to Early Wage Access (EWA)–a type of fintech product–as a prime example of how large language models can be used to simulate consumer behavior. EWA works as follows: an employee who typically gets paid on the 15th of each month might face an emergency on the 10th and need access to wages already earned. Early Wage Access products allow users to tap into those accrued earnings ahead of payday.
“This is a classic fintech solution, but there’s still room for innovation,” Zhu explained. “For example, companies offering these products may want to include educational messages to promote responsible use. They want to help people manage financial shortfalls but also discourage overreliance, which could harm long-term financial health.”
“The question is: what’s the best content for this education banner? That’s something we can evaluate using LLMs,” he said. Beyond simply testing whether the banner works, Zhu and his team can assess different versions of the same message, varying the tone and language while keeping the core content consistent. “We might test three different phrasings of the same message and ask the LLM: Which one would you prefer? How would you feel about each one? This gives us insight into how consumers might react, without running a full-scale field study.”
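A minimal sketch of that banner comparison, with hypothetical phrasings and a stubbed-in preference function standing in for the real LLM calls:

```python
import random
from collections import Counter

random.seed(0)

# Three hypothetical phrasings of the same educational message (illustrative, not
# actual product copy): same core content, different tone.
BANNERS = {
    "neutral":    "You can access earned wages early; repayment comes from your next paycheck.",
    "cautionary": "Early access can help in emergencies, but frequent use may strain your budget.",
    "supportive": "We're here when you need a hand: access earned wages early, responsibly.",
}

def simulated_preference(profile_age: int) -> str:
    # Placeholder for asking an LLM agent, initialized with this consumer's age,
    # which phrasing it prefers and how it would feel about each one.
    return random.choice(list(BANNERS))

# Tally preferences across a small range of simulated consumer ages.
votes = Counter(simulated_preference(age) for age in range(18, 68))
print(votes.most_common(1)[0][0])
```

In a real study the tallies would come from parsed model responses, and the "why" behind each choice would matter as much as the vote counts themselves.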
He and his colleagues are eager to begin the project, which officially kicks off in June, with initial results expected by year’s end. “I told the Vice President of AFC that while the project starts in June, we’re happy to begin conversations with companies right away to help refine our research focus,” Zhu said.
He also emphasized that this project is part of a broader and rapidly growing field: using agentic AI in social science research. “In our case, we’re simulating consumers with AI,” he said. “In economics, psychology, and finance, researchers are increasingly using LLMs to run virtual behavioral experiments. Traditionally, this would involve recruiting participants—often college students—and running controlled lab studies. Now, LLMs can serve as a scalable, low-cost alternative.”
“These developments are happening incredibly fast,” Zhu added. “And we’re excited to be at the forefront.”