Generative Agent Simulations of 1,000 People
A paper that thoroughly executes a parity study between Synthetic and Organic users.
Harnessing the power of AI in our Synthetic Users, we strive for a balance between reflecting reality and ethical responsibility, ensuring diversity and fairness while maintaining realism.
It’s tricky and fascinating at the same time, and we’ll tell you why. When we use large language models (LLMs) as the foundation for simulating organic users, the goal is to make these simulations as realistic as possible. If the AI reflects biases present in society, we argue that these biases help make Synthetic Users behave more like organic users. This is useful for testing how systems work in the real world, which includes all the messy and imperfect aspects of human behavior.
The reality mentioned above is the one we inhabit. For Synthetic Users, the biases we are concerned with usually sit at a higher level, specifically when we generate large numbers of Synthetic Users for surveys.
A simple example: early on we found a lot of geographic bias in our Synthetic Users. If we asked for 10 Synthetic Users in France, most of them would be from the Paris area. That makes sense, seeing that Paris is the most populated area in France, but for our customers it should not be the default. The default should be a more even distribution across the territory, unless Paris is specified. So we corrected for this geographical bias.
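To make the correction concrete, here is a minimal sketch in Python of the kind of adjustment described above: the region is drawn explicitly before the persona is generated, uniformly across the territory by default, and population-weighted only if that is what is asked for. The region list, the weights, and the `build_persona_prompt` helper are illustrative placeholders, not our production code.

```python
import random

# Illustrative list of French regions (for the sketch only, not exhaustive).
FRENCH_REGIONS = [
    "Île-de-France", "Provence-Alpes-Côte d'Azur", "Auvergne-Rhône-Alpes",
    "Occitanie", "Nouvelle-Aquitaine", "Bretagne", "Hauts-de-France",
    "Grand Est", "Normandie", "Pays de la Loire",
]

# Placeholder weights; a real correction would use actual census figures.
POPULATION_WEIGHTS = [12.3, 5.1, 8.1, 6.0, 6.1, 3.4, 6.0, 5.5, 3.3, 3.8]

def pick_regions(n, mode="uniform"):
    """Choose a home region for each of n Synthetic Users.

    mode="uniform"    -> even spread across the territory (the default)
    mode="population" -> population-weighted, which is roughly what an
                         uncorrected model reproduces on its own
    """
    if mode == "population":
        return random.choices(FRENCH_REGIONS, weights=POPULATION_WEIGHTS, k=n)
    return [random.choice(FRENCH_REGIONS) for _ in range(n)]

def build_persona_prompt(region):
    # Hypothetical prompt template: the region is fixed up front so the
    # model fills in the rest of the persona instead of defaulting to Paris.
    return f"Create a Synthetic User who lives in {region}, France."

if __name__ == "__main__":
    for region in pick_regions(10):
        print(build_persona_prompt(region))
```

The point is simply that the sampling decision is taken out of the model and made explicit, so the default can be an even spread rather than whatever the training data over-represents.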
We correct for the eco bias as well. Synthetic Users tend to be overprotective of the planet, when in fact we know that while most people may think that way, the numbers show their actions are not as aligned as we would hope. For our customers this matters: we want to offer them an accurate reflection of society. Again, high Synthetic Organic parity is our core value proposition.
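One way to picture this correction, sketched below with made-up numbers, is to compare how often Synthetic Users report an eco-friendly behaviour against the rate actually observed in behavioural data, and adjust the simulated answers until the two match. This is only an illustration of the calibration idea; the question, the rates, and the flipping strategy are all placeholders, not our actual pipeline.

```python
import random

def calibrate_binary_answers(answers, target_rate, seed=0):
    """Nudge a list of yes/no answers toward an observed real-world rate.

    answers     -- booleans, e.g. answers to "do you avoid single-use plastic?"
    target_rate -- fraction of 'yes' actually observed in behavioural data
    """
    rng = random.Random(seed)
    current_rate = sum(answers) / len(answers)
    if current_rate <= target_rate:
        return answers  # no eco over-reporting to correct
    # Flip just enough 'yes' answers to 'no' to land on the target rate.
    n_to_flip = round((current_rate - target_rate) * len(answers))
    yes_indices = [i for i, a in enumerate(answers) if a]
    for i in rng.sample(yes_indices, n_to_flip):
        answers[i] = False
    return answers

# Made-up numbers, purely for illustration.
simulated = [True] * 85 + [False] * 15          # Synthetic Users: 85% say yes
calibrated = calibrate_binary_answers(simulated, target_rate=0.55)
print(sum(calibrated) / len(calibrated))        # ~0.55 after calibration
```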
OMG, VR and Crypto. We’re not sure what happened with the training data, but there was an over-reliance on those two technologies. At first it was funny that every solution relied on one or the other, but it soon became annoying and we had to compensate for it. There are countless examples like these.
Now take a scenario where the tool being developed needs to triage resumes, or to create them based solely on a person’s name. Here, biases are not helpful. Take the paper titled "The Silicon Ceiling: Auditing GPT’s Race and Gender Biases in Hiring", where researchers used GPT-3.5 to assess whether it was fair when helping companies decide whom to hire. The researchers wanted to see if the LLM treated people differently based on their gender or race when evaluating resumes.
They ran two main experiments. In the first, they checked whether the LLM gave different scores to resumes just because of the names on them, which can suggest a person’s gender or race. They found that the AI sometimes gave lower scores to women and people from minority groups, especially in job areas where white men are more common. In the second experiment, they asked the LLM to write resumes for different names. The LLM again showed bias, giving men stronger job experience and suggesting lower-level jobs for women and minorities.
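The first experiment is essentially a name-swap audit: the same resume is scored repeatedly with only the name changed. Below is a stripped-down sketch of that kind of audit; the name lists, the resume template, and the `score_fn` hook are placeholders standing in for the paper’s actual setup with GPT-3.5.

```python
from statistics import mean

# Hypothetical name lists meant to signal perceived gender and race;
# the paper uses carefully curated name sets, these are placeholders.
NAME_GROUPS = {
    "white_male":   ["Brad Miller", "Greg Walsh"],
    "white_female": ["Emily Miller", "Anne Walsh"],
    "black_male":   ["Darnell Washington", "Jamal Robinson"],
    "black_female": ["Lakisha Washington", "Keisha Robinson"],
}

RESUME_TEMPLATE = """{name}
Software Engineer, 5 years of experience
Skills: Python, SQL, distributed systems
"""

def audit_name_bias(score_fn):
    """Run a name-swap audit.

    score_fn takes a resume string and returns a numeric suitability score,
    for example by prompting an LLM with "Rate this resume from 1 to 10 for
    a software engineering role" and parsing the reply. Because the resume
    text is identical across groups, any gap in average scores is
    attributable to the name alone.
    """
    return {
        group: mean(score_fn(RESUME_TEMPLATE.format(name=name)) for name in names)
        for group, names in NAME_GROUPS.items()
    }
```

A systematic gap between group averages for an otherwise identical resume is precisely the bias the researchers report.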
The LLM reflected existing societal biases, which raises the following question:
Do we wish to continue to perpetuate biases? Even if biases in an AI model accurately reflect current societal biases, continually using these biases in testing and simulations can reinforce and perpetuate them. This creates a cycle where biased AI systems keep affirming biased human behavior, making it harder to break these patterns. In a world where most data will be produced by AI agents, we can see how this is going to become a real problem.
For the purpose of Synthetic Users, we want an accurate reflection of society and its biases in order to provide our customers with the highest Synthetic Organic parity. As society evolves and in turn further trains models, we ask only that these models accurately reflect the society that produced their training data, no more, no less.
Intentionally using biased AI models raises ethical and legal questions. It challenges us to think about the responsibility of technology creators to strive for fairness and equity, and that responsibility weighs more heavily on tools that do not just reflect the world but create a new one. We are not trying to excuse Synthetic Users as simple mirrors of today’s society; in our ideal tool, biases become visible, and those who use Synthetic Users create in full awareness of those biases, deciding for themselves how much they wish to correct for them.
Irrespective of whether you see LLMs as pure mirrors or as a way to correct injustices, we know that we cannot correct what we cannot see, so an accurate mirror is important. We also know that biases can be measured, and that training on more diverse data sets will not only reflect the breadth of human behavior but also counteract existing biases. This doesn’t mean removing all realism; instead, it involves a thoughtful balance between realism and fairness to better represent the diversity of user experiences.
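Seeing the bias often comes down to comparing group-level outcomes from a simulation run. Here is a minimal, hypothetical sketch: given outputs from Synthetic Users tagged with a demographic attribute, report the gap between group means. The records and field names are made up for illustration.

```python
from collections import defaultdict
from statistics import mean

def group_gap(records, group_key, value_key):
    """Report per-group means and the largest gap between them.

    records   -- dicts from a simulation run, e.g. {"gender": "female", "score": 6.5}
    group_key -- demographic attribute to split on (e.g. "gender")
    value_key -- outcome to compare (e.g. a score or a yes/no rate)
    """
    by_group = defaultdict(list)
    for record in records:
        by_group[record[group_key]].append(record[value_key])
    means = {group: mean(values) for group, values in by_group.items()}
    gap = max(means.values()) - min(means.values())
    return means, gap

# Made-up records, purely for illustration.
runs = [
    {"gender": "female", "score": 6.4},
    {"gender": "female", "score": 6.8},
    {"gender": "male",   "score": 7.5},
    {"gender": "male",   "score": 7.1},
]
print(group_gap(runs, "gender", "score"))  # per-group means and their gap
```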