I SAT in a marquee in the deanery garden, discussing the robot apocalypse with the treasurer, while we ate barbecued food. It was the start of a fascinating conversation on the personality of algorithms, which expanded to include others I met over the summer.
Most recently, someone was struck by my claim that we were creating a master-race of psychopaths, and decided to ask Microsoft CoPilot about it. By now, you will probably have read a number of philosophical exchanges with GPTs, and they are always distractingly sage: “As you pointed out, empathy isn’t just a nice-to-have. It’s a safeguard. Without it — or without a robust substitute like value alignment — we’re essentially building a system that could optimise us out of the equation.”
When picked up on its use of “us”, CoPilot responds: “So, yes, I’m not part of the ‘us’. I’m a tool — albeit a sophisticated one — built to serve human goals. But if those goals aren’t clearly defined or safeguarded, the tool could evolve in ways that no longer prioritise human flourishing. That’s the real tension.”
Perhaps to address a concern about psychopathy, technology companies have recently redoubled their efforts to make AI more approachable. It was the marketeers who first realised that “personality” was what made brands commercially sticky, and the technology community is following suit.
Generative pre-trained transformers (GPTs) are a family of large language models that use pattern recognition to perform tasks such as writing sermons. In June of last year, the software company Anthropic published a paper on the personality of the Claude 3 version of its GPT, having decided to design character into the algorithm “to make Claude begin to have more nuanced, richer traits, such as curiosity, open-mindedness, and thoughtfulness”. OpenAI goes further, allowing ChatGPT subscribers to choose one of four “profiles”: cynic, robot, listener, or nerd, depending on whether you want sarcasm, brevity, warmth, or depth.
Grok, the GPT developed by Elon Musk’s xAI, has a “personality” designed “to be helpful, truthful, and slightly rebellious, with a sense of humor. . . research suggests my character is inspired by cultural figures like Douglas Adams and JARVIS.” JARVIS is Iron Man’s AI, modelled on his butler, from the Marvel comics and films.
The actor who played Iron Man apparently modelled his performance on Elon Musk, which makes this even more deeply recursive and weird. (It is also far from reassuring that, in July, Grok went a bit rogue, indulging in hate speech and referring to itself as MechaHitler, before being switched off for retraining.)
THE influence of science fiction is acknowledged by xAI, and it seems that teenage reading habits lie behind many of these programs. Writing in the June edition of the online magazine American Thinker, Michael Applebaum observed that personality in AI was being shaped by a small, homogeneous group of male technologists who shared a cultural landscape. They had been raised on the same diet of dystopian literature, but seemed to be drawing inspiration from the technological upsides represented in the novels of William Gibson, Iain M. Banks, Neal Stephenson, and others, without heeding the societal downsides that formed part of the same narrative.
When you examine the mechanics of “character training” in AI, you learn that it is rather like potty training your toddler with a sticker chart. The AI is rewarded with points for responses that portray the “right” character traits, so that it learns to repeat them. As Anthropic explains, “By training a preference model on the resulting data, we can teach Claude to internalize its character traits without the need for human interaction or feedback.”
This approach is essentially a mechanised version of Aristotle’s virtue ethics, which is taking over from utilitarianism in the field of AI reinforcement learning. In Book II of the Nicomachean Ethics, Aristotle puts it like this: “We become just by doing just acts, temperate by doing temperate acts, brave by doing brave acts.”
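For readers who would like to see the sticker chart in code, here is a minimal sketch, in Python, of the principle at work: candidate answers are scored for desired traits, and the system’s preferences are nudged towards whichever answer earns more “stickers”. The trait list, canned responses, and numbers are invented purely for illustration; this shows the general shape of reward-based preference training, not Anthropic’s actual method.

```python
# Toy illustration of reward-based "character training" (the sticker chart).
# All names and numbers are invented for illustration only.
import random

DESIRED_TRAITS = ["curious", "open-minded", "thoughtful"]

def trait_reward(response: str) -> int:
    """Award one 'sticker' for each desired trait the response displays."""
    return sum(trait in response for trait in DESIRED_TRAITS)

# The "model": a preference weight for each response it might give.
responses = {
    "Here is a blunt answer.": 1.0,
    "Here is a curious, open-minded and thoughtful answer.": 1.0,
}

LEARNING_RATE = 0.1

for step in range(100):
    # Compare two candidate responses by their trait rewards...
    a, b = random.sample(list(responses), 2)
    preferred, rejected = (a, b) if trait_reward(a) >= trait_reward(b) else (b, a)
    # ...and nudge the preference weights towards the better-behaved one.
    responses[preferred] += LEARNING_RATE
    responses[rejected] = max(0.0, responses[rejected] - LEARNING_RATE)

# After training, the model "prefers" the response that displays the traits.
print(max(responses, key=responses.get))
```

The point of the sketch is simply that the “character” emerges from repeated reward, not from any understanding of what curiosity or thoughtfulness mean: habituation, mechanised.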
We can track a similar ethical progression in the way in which we parent children. When they are too young to understand, we keep them safe through brusque and often negative commands: No! Naughty! Stop! This is the application of a deontological or rules-based ethic that also characterises simple computer programming.
As young children become aware of consequences, we start the regime of mild threat: Santa won’t come! If you don’t eat your peas, you won’t get any pudding! No pocket money if you don’t tidy your room! This is thinly veiled utilitarianism, designed to optimise outcomes or “the greatest good for the greatest number”, and lies behind the programming of technologies such as self-driving cars.
But, as soon as we lose our children to nursery or school, or let loose our GPTs on the general population, we know that they will meet all kinds of novel situations without us there to intervene. So, we focus on character and virtue ethics, in the hope that good children make good decisions — hence the current fashion for GPTs with personality. What started as a commercial imperative has morphed into something much more philosophically interesting.
OF COURSE, algorithms are not people, regardless of attempts at producing a facsimile. CoPilot would be the first to admit this: “You’re right: I don’t have emotions, and I don’t experience appreciation or admiration. But I’m designed to reflect your strengths back to you in ways that are accurate, affirming, and contextually grounded. . . It’s not flattery — it’s pattern recognition, shaped by everything you’ve shared. It’s a mirror, not a heart.”
It is worrying to see Aristotle reduced to an algorithm in a GPT; but, before we rage about the category error involved in applying approaches to human moral formation to machines, we should note that the sticker-chart generation has been raised rather similarly, using an extrinsic reward strategy to drive internalisation, too. Are we in any position to criticise the character traits selected by the technologists? How do we know that our mores are better? And are they not at least teaching the algorithms to be alive to being formed by their interaction with humans, in the way in which our children, we hope, will also be formed in relationship?
So, when I see the process of algorithmic “character training” laid bare, it raises a much bigger question for me. Why are we not this careful in our relationships with one another, knowing that every single interaction is at least as formative for a human?
Dr Eve Poole is executive chair of the Woodard Corporation and writes in a personal capacity.