A Top Scientific Journal Just Published a Racist Algorithm

The flaws and consequences of a physiognomic AI

Rory Spanton
6 min read · Sep 29, 2020
Photo by Birmingham Museums Trust on Unsplash

Note: Many of the points in this piece now form the basis of an academic article by Rory Spanton and Olivia Guest, available as a preprint.

Many people believe that a person’s facial features give an accurate representation of their personality and potential behaviour. Popular depictions of shady characters in the media feature protruding brows, hard stares, large noses, and other stereotypical hallmarks of an untrustworthy person. Academics throughout history even attempted the organised study of physiognomy: the prediction of personality traits and antisocial behaviour using only facial characteristics.

In actuality, the belief that facial features are linked with behaviour and personality is false. Physiognomy has long been decried as racist pseudoscience and left in the wake of modern personality psychology.

So scientists were surprised when a paper titled “Tracking historical changes in trustworthiness using machine learning analyses of facial cues in paintings” by Lou Safra and colleagues was recently published in Nature Communications, one of science’s most prestigious journals.

The paper used people’s subjective ratings of faces as the basis for a machine learning algorithm that ranks the trustworthiness of faces in historical paintings. The authors then used this algorithm to support the narrative that trustworthiness has historically increased alongside supposed measures of societal development, including economic growth, the rise of democratic values, and the decline of interpersonal violence. All from rating historical European paintings.
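
For readers unfamiliar with this kind of model, the pipeline is roughly as follows: extract facial measurements from face images, fit a regression model to human trustworthiness ratings of those faces, then apply the fitted model to measurements taken from portraits. The sketch below is a simplified illustration of that idea in Python, not the authors’ actual code; the variable names and data are stand-ins.

    # A simplified sketch of the kind of pipeline the paper describes.
    # All data here are synthetic stand-ins, not the study's materials.
    import numpy as np
    from sklearn.linear_model import Ridge

    rng = np.random.default_rng(0)

    # Stand-in for facial metrics (e.g. brow height, mouth curvature) extracted
    # from photographs that human participants rated for trustworthiness.
    n_rated_faces, n_features = 500, 10
    rated_face_features = rng.normal(size=(n_rated_faces, n_features))
    human_ratings = (rated_face_features @ rng.normal(size=n_features)
                     + rng.normal(scale=0.5, size=n_rated_faces))

    # Fit a model that maps facial metrics to *perceived* trustworthiness ratings.
    model = Ridge(alpha=1.0).fit(rated_face_features, human_ratings)

    # Apply the fitted model to metrics extracted from historical portraits.
    portrait_features = rng.normal(size=(200, n_features))
    predicted_trustworthiness = model.predict(portrait_features)
    print(predicted_trustworthiness[:5])

The crucial point is that the model’s target is simply whatever the raters said, so any bias in those ratings is baked into every downstream prediction.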

Unsurprisingly, many scientists felt the paper’s attempt to infer personality from facial characteristics was straight physiognomy. The paper was quickly slated on Twitter by hundreds of experts, who accused the authors of creating a racist algorithm and promoting problematic ideologies. Some defended the senior author, Nicolas Baumard, claiming that the paper wasn’t as problematic as some figures in the pile-on had thought. But even among some of its advocates, the paper was met with scepticism.

The problems with Safra, Baumard and colleagues’ work start with their algorithm’s bias towards white people. The behaviour of the algorithm itself was inspired by previous research in which the human participants were mostly white and demonstrably biased. Trained on predominantly Caucasian faces, the algorithm also ‘learned’ that the facial hallmarks of trustworthiness are those most associated with whiteness. This racial bias mirrors old physiognomy guides that make derogatory remarks about facial features common in ethnic minorities.

Although Safra et al. validated the algorithm’s judgements against those of real participants, we do not know their ethnicity. This is because, alarmingly, the authors didn’t report any demographic information about these participants.
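
The mechanism is easy to demonstrate with toy data. In the sketch below, which uses entirely synthetic data and hypothetical variables rather than anything from the study, simulated raters systematically mark down faces from one group; a model trained on their ratings duly learns group membership as a ‘predictor’ of trustworthiness.

    # Toy demonstration with synthetic data: biased ratings yield a biased model.
    import numpy as np
    from sklearn.linear_model import LinearRegression

    rng = np.random.default_rng(1)
    n = 1000

    # One column encodes group membership; the others are ordinary facial metrics.
    group = rng.integers(0, 2, size=n)          # 0 = majority, 1 = minority
    features = rng.normal(size=(n, 5))
    X = np.column_stack([group, features])

    # Simulated raters penalise the minority group regardless of facial features.
    true_signal = features @ rng.normal(size=5)
    biased_ratings = true_signal - 1.5 * group + rng.normal(scale=0.5, size=n)

    model = LinearRegression().fit(X, biased_ratings)
    print(f"Learned weight on group membership: {model.coef_[0]:.2f}")  # roughly -1.5

Nothing about the modelling step corrects for the raters’ prejudice; it simply encodes it.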

Some commentators defended the paper on the grounds that it acknowledged the human bias at the core of its algorithm’s predictions. Indeed, the authors deliberately ensured that their algorithm shared the biases human participants introduced in their ratings, because they wanted to investigate the human perception of trustworthiness, bias and all, rather than some other empirical measure of trustworthiness. On this subject, Baumard later tweeted:

“Lou Safra leveraged the experimental literature on first impressions to develop the algorithm, an algorithm that detects PERCEIVED trustworthiness, not trustworthiness itself.”

Although the authors don’t claim to make inferences about actual human behaviour directly from their algorithm, they fail to make this clear. This creates a muddy definition of trustworthiness, and enough ambiguity for bigots to claim support for the view that lower classes of society are actually less trustworthy. The authors also neglect to mention or interrogate any racial biases in their algorithm and seem unaware of the context or implications of introducing such an algorithm into the literature. This is a huge omission.

If generalised, a biased algorithm that rates ‘perceived’ trustworthiness has huge potential for misuse. Algorithms that use trustworthiness rankings to facilitate the criminal judgement of citizens are no longer an esoteric talking point. They are real and come with huge ethical considerations. Safra, Baumard and colleagues do not present their work in the context of such judgement, focusing instead on art history. However, they still go on to make sweeping conclusions about human behaviour and social history based on biased analyses. To many critics, this is physiognomy clothed in contemporary statistical methods. Going further, it sets a precedent for biased algorithms to be used to draw dangerous conclusions about less privileged groups.

Problems with the paper’s algorithm are only compounded by many other issues with its logic. In particular, the assertion that people hundreds of years ago weren’t as trustworthy as those alive today is untenable for many reasons.

In making this claim, Safra and colleagues assume that paintings are an objective medium through which to assess trustworthiness. But humanities scholars have been quick to point out that paintings aren’t “cognitive fossils”; art is instead influenced by ever-changing cultural attitudes. People long ago might have been as trustworthy as they are now, but might not have been painted as such, owing to historical cultural context or their own preferences. The paper also assumes that the facial features people find trustworthy have stayed constant since the 1500s. Neither assumption is evidenced.

Even if both of these problems were resolvable, Safra and colleagues are still trying to infer the collective trustworthiness of an entire population from a restricted sample of individuals wealthy enough to be painted. This is an incredibly unrepresentative sample that weakens their conclusions, yet it isn’t acknowledged at all.

These problems amount to an unbridgeable chasm in the authors’ logic. At most, they can claim that the perceived trustworthiness of rich people in portraits has tended to increase from the 1500s to the present day. But this is a long way from showing that whole populations of real people seemed less trustworthy hundreds of years ago.

Despite this stark limitation, Safra et al. still assert not only that this trustworthiness effect is real, but that it is linked to various metrics of societal advancement. Their evidence comes in the form of associations between trustworthiness and GDP per capita, plus a supposed decrease in interpersonal violence. Again, this is a flawed interpretation.

The authors themselves admit that the association between trustworthiness and GDP per capita is not causal. Many other factors have likely increased monotonically over the last half-millennium, and any one of them could correlate strongly with trustworthiness in paintings. In practice, even if perceived trustworthiness has actually increased over time, it is very difficult to work out why, hundreds of years later.
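
To make the statistical point concrete: almost any two quantities that drift in the same direction over five centuries will correlate strongly, regardless of whether they have anything to do with one another. A quick illustration with made-up series:

    # Two unrelated but steadily increasing series correlate almost perfectly.
    import numpy as np

    rng = np.random.default_rng(2)
    years = np.arange(1500, 2001)

    # Neither invented series causes the other; both simply trend upwards.
    series_a = 0.01 * (years - 1500) + rng.normal(scale=0.3, size=years.size)
    series_b = 0.02 * (years - 1500) + rng.normal(scale=0.5, size=years.size)

    print(f"Correlation: {np.corrcoef(series_a, series_b)[0, 1]:.3f}")  # close to 1

A strong association of this kind is the expected default for long-run trends, not evidence of a causal link.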

Going further, the authors state in their abstract:

“Our results show that trustworthiness in portraits increased over the period 1500–2000 paralleling the decline of interpersonal violence”

For a point made front and centre in the paper’s summary, this isn’t well evidenced. The authors give one source for the decline in interpersonal violence (Steven Pinker’s book “The Better Angels of Our Nature”, which is not without criticism itself). They don’t attempt to quantify this trend or test it in any manner. It stands alone, blind to the systematic interpersonal violence committed throughout recent history, often by the Europeans who feature in the portraits being analysed.

In all, Safra et al. make a case for a theory that is speculative at best. Huge flaws permeate their already problematic article and are visible even on a brief inspection. So why did the reviewers and editors at Nature Communications publish it?

Safra et al.’s acceptance into a prestigious journal illustrates a troubling pattern in academia: any research is immediately more publishable if it uses machine learning and tells a good story. AI is a captivating buzzword, and academics and laypeople alike lap up AI research without questioning its validity. Even though the paper’s underlying logic is easy to criticise, its methods lend it a veil of legitimacy to the casual reader. But surely, only to the casual reader.

Yet the reviewers and editorial team at Nature Communications didn’t express strong objections to the paper. None of the authors seemed to predict ending up neck-deep in a wave of criticism either. No one involved was able to step back and recognise the paper’s echoes of physiognomy, its logical shortcomings, or its racial bias.

It’s impossible to know exactly why. But in any case, their inaction was underpinned by a shared lack of awareness that the problems above have serious implications for real people. Now more than ever, scientists must confront the reality that after publication, their work is woven into the fabric of culture and society. Their conclusions, well-founded or not, can change minds, promote agendas and influence policy. Even research that has been completely debunked can result in adverse consequences for people years later.

Safra, Baumard and colleagues’ paper is not an anomaly. In any field, problematic work sometimes makes its way into reputable outlets. This is the inevitable result of the many factors and incentive structures that drive academics to publish. But these academics still form the last line of defence against bad science. Educating oneself about prejudice and the implications of bad research is as crucial in this defence as maintaining subject-specific knowledge. But until more researchers actively pursue these goals, we can expect more flawed conclusions and more racist algorithms.


Rory Spanton

Behavioural Data Scientist @ Good With. Writing about data science, psychology, programming, and more. www.roryspanton.com