A new experiment sees AI characters exhibit ‘believable human behaviour’ as they go about their day-to-day lives – here’s everything you need to know
A world entirely populated with AI might sound like a nightmare, the aftermath of a Terminator or Blade Runner-type scenario where the baddies actually win, eliminating the human race to claim Earth for themselves. But... would it actually be that bad? Look around you: we’re not doing such a good job ourselves. Could AI actually do any worse? What if the post-apocalypse was actually kinda cute and wholesome?
I’m being facetious, of course. As a human being (promise) I’m genetically predisposed to not love the prospect of going extinct and surrendering our planet to artificially intelligent overlords. Then again, recent research suggests that we could have a lot to learn from a society dominated by AI.
Set up by a bunch of researchers from Stanford University and Google, the experiment essentially involved letting 25 generative AI “agents” loose in a virtual sandbox, visualised by the researchers as the kind of pixel art village that you’d find in Pokemon or Stardew Valley. Each had its own identity, goals, and role to play in “Smallville”, and researchers examined how they interacted with each other and their environments as they went about their respective days.
The result? A surprisingly heartwarming picture of a (pretty much) functioning village, with the generative agents producing “believable individual and emergent social behaviours” – albeit with a decent amount of manual prompting.
We’ve summed up the experiment, detailed in the paper Generative Agents: Interactive Simulacra of Human Behavior, below.
SETTING THE SCENE FOR A WHOLESOME AI EXPERIENCE
I’m just going to say it: I’m jealous of the residents of Smallville. Taking the concept of the 15-minute city a step further, the small community of 25 sprites has access to a library, a cafe, a bar, a park, a college (with dorms), a couple of shops, several houses, and a co-living space, all within a few minutes’ walk along tree-lined paths. Each of these also contains interactable objects such as tables and bookshelves.
Unfortunately, the images of Smallville and its inhabitants aren’t actually how the sandbox appeared to researchers. They’re just visual representations of the fictional framework that the researchers set up to conduct conversations between various instances of the AI chatbot ChatGPT. The agents “moved” around this virtual environment according to direct prompts, or in pursuit of pre-programmed goals.
AND WHAT IS A ‘GENERATIVE AGENT’ EXACTLY?
Each of the 25 individual Smallville agents drew on generative models to “simulate believable human behaviour”, something you’ll already be familiar with if you’ve booted up a conversation with ChatGPT, or similar chatbots. To encode their unique personalities, each also had a paragraph of backstory – including their occupation and relationships with other agents – encoded as “seed memories”. It’s probably easier to include an example, cited in the paper:
“John Lin is a pharmacy shopkeeper at the Willow Market and Pharmacy who loves to help people. He is always looking for ways to make the process of getting medication easier for his customers; John Lin is living with his wife, Mei Lin, who is a college professor, and son, Eddy Lin, who is a student studying music theory; John Lin loves his family very much; John Lin has known the old couple next door, Sam Moore and Jennifer Moore, for a few years; John Lin thinks Sam Moore is a kind and nice man; John Lin knows his neighbour, Yuriko Yamamoto, well; John Lin knows of his neighbours, Tamara Taylor and Carmen Ortiz, but has not met them before; John Lin and Tom Moreno are colleagues at The Willows Market and Pharmacy; John Lin and Tom Moreno are friends and like to discuss local politics together; John Lin knows the Moreno family somewhat well – the husband Tom Moreno and the wife Jane Moreno.”
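According to the paper, each semicolon-delimited phrase of a backstory like this one is stored as a separate “seed memory” in the agent’s memory at initialisation. Here’s a minimal sketch of that step – the function name and structure are illustrative, not the authors’ actual code:

```python
# Sketch: turning a backstory paragraph into individual "seed memories".
# The paper describes splitting on semicolons so each phrase becomes one
# memory record; everything else here is illustrative.

john_backstory = (
    "John Lin is a pharmacy shopkeeper at the Willow Market and Pharmacy "
    "who loves to help people; John Lin is living with his wife, Mei Lin, "
    "who is a college professor, and son, Eddy Lin, who is a student "
    "studying music theory; John Lin loves his family very much"
)

def seed_memories(backstory: str) -> list[str]:
    """Split a backstory into one memory per semicolon-delimited phrase."""
    return [phrase.strip() for phrase in backstory.split(";") if phrase.strip()]

memories = seed_memories(john_backstory)
for m in memories:
    print(m)
```

Those individual memories are then what the agent draws on (and adds to) as it goes about its day.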
If that sounds a bit like The Sims, that’s because the experiment takes direct inspiration from the video game. Unlike in The Sims, though, the agents in Smallville can interact with each other in real language, exhibit what appears to be spontaneous behaviour, and make high-level inferences from information they’ve learned.
WHAT WAS THE AIM OF THE AI SANDBOX?
As stated in the introduction to the research paper, the experiment asks the question: “How might we craft an interactive artificial society that reflects believable human behaviour?” This is a question that’s been around for decades, the researchers admit, but new technology such as large language models (the thing that underpins ChatGPT, alongside other generative software) has unlocked the ability to simulate more dynamic personalities that learn based on their experiences, and react in real-time to one another.
WHAT DID THE AI AGENTS GET UP TO?
Oh, to live a day in the life of a Smallville resident, focused on your goals and oblivious to the wider world. In one example detailed in the paper, John Lin got up at 7am, brushed his teeth, showered, got dressed, ate breakfast, and checked the news at the dining table. When his son, Eddy, got up an hour later, he “noticed” his dad at the table and spontaneously began a conversation about working on a music composition for class – an example of an “emergent” rather than “pre-programmed” social behaviour. Even better, after Eddy left for school, John was joined by his wife, Mei, and recalled the conversation he’d just had with Eddy. “He’s working on a music composition for his class... I think he’s really enjoying it!” he told her, and she replied: “That’s great! I’m so proud of him.” Adorable, all around.
Elsewhere in the paper, agents are shown to have responded to new situations, introduced as external commands by the user. When the user wrote that an agent named Isabella’s breakfast was burning, for example, she went to turn off the stove, then remade her burned breakfast. If the user said her shower was leaking water, she’d gather tools from the living room and try to fix the leak.
These kinds of goals were even accomplished when they involved multiple agents working together, with some surprising by-products. In one case, a user programmed a single agent with the desire to throw a Valentine’s Day party, and over the next two days invitations spread among the other agents in the village, who coordinated to arrive on time and even asked their “secret crushes” on dates.
DID ANYTHING GO WRONG?
Unsurprisingly, yes, there were a few flaws in the experiment. At times, the agents “hallucinated embellishments to their knowledge” or failed to remember certain events that they’d witnessed (whomst among us hasn’t, tbh). In another case, an agent named Yuriko described a neighbour named Adam Smith as the author of The Wealth of Nations, the magnum opus of the 18th-century economist of the same name. An easy mistake to make!
Of course, these are all reminders that the agents doing their little errands around Smallville aren’t actually that autonomous, or that intelligent... yet. Basically all they need to do is give off a “believable” sense of humanity, and the longer and more complex the simulations get, the less believable that’s likely to become.
WHERE DOES THIS LEAVE US?
“Generative agents, while offering new possibilities for human-computer interaction, also raise important ethical concerns that must be addressed,” reads the paper summarising the Smallville experiment.
One risk, it suggests, is the possibility of human beings forming parasocial relationships with generative agents, even if they’re aware that they’re “computational entities”. To mitigate this risk, it suggests following two principles: one, that agents should have to explicitly disclose their nature as computational entities, and two, that developers must ensure they’re “value-aligned so that they do not engage in behaviours that would be inappropriate given the context, e.g., to reciprocate confessions of love”.
Among other risks, the researchers state that generative agents might be used to replace human input in the design process, create harmful errors when deployed in the wild, or exacerbate existing AI risks related to deepfakes or other forms of misinformation.
On the plus side, they suggest that generative agents have “vast potential applications” beyond the sandbox demonstration. As they improve, for example, they could be deployed en masse to model human behaviour and test new social systems or theories. In the future, they could also be used to model an individual’s needs ahead of time, fusing with other technologies to make their lives more convenient and comfortable. At the very least, gaming could be on the verge of getting very, very immersive, with NPCs that actually converse and interact with the player in real time. Who needs real people anyway?