I Replicated a Famous Political Psychology Experiment. The AI Did Better Than Humans.
A modern multimodal AI classified a balanced set of congressional portraits with 76% accuracy using grayscale face crops alone, substantially exceeding the approximately 57% accuracy reported for human participants in Rule and Ambady (2010).
Part I: The Original Experiment
In 2010, psychologists Nicholas Rule and Nalini Ambady published a paper with a provocative claim: Americans could distinguish Democrats from Republicans simply by looking at their faces. The study, published in PLOS ONE under the title Democrats and Republicans Can Be Differentiated from Their Faces, explored a question that sits at the intersection of psychology, politics, and human perception.
Rule and Ambady assembled photographs of candidates who had competed in United States Senate races and presented those images to participants without names, party labels, or other identifying information. Participants were asked a simple question: was the person in the photograph a Democrat or a Republican? Human observers classified political affiliation correctly at rates of approximately 57%, modestly but significantly above the 50 percent accuracy expected from random guessing.
The researchers then sought to understand why such judgments were possible. Additional analyses suggested that observers associated certain facial impressions with particular political identities. Faces perceived as more powerful, dominant, or authoritative tended to be judged Republican. Faces perceived as warmer and more approachable tended to be judged Democratic. The authors argued that participants may have been relying on political stereotypes that had become linked to visual impressions rather than consciously identifying party affiliation itself.
The study attracted attention because it suggested that political judgments begin long before voters evaluate policies, speeches, or voting records. Human beings appear capable of forming rapid impressions from faces alone, and those impressions may influence perceptions of leadership, competence, trustworthiness, and ideology. Yet the study left an important question unresolved. Were observers detecting genuine differences among politicians, or were they projecting cultural stereotypes onto unfamiliar faces?
Fifteen years later, advances in artificial intelligence provide a new way to revisit that question.

Congressional portrait dataset used in the AI replication of Rule and Ambady (2010). One hundred official congressional portraits, balanced between Democratic caucus members and Republicans, were collected, processed, and converted into grayscale face crops for classification. Claude Sonnet 4.5 achieved 76 percent accuracy on the resulting dataset. Composite image assembled by the author from official congressional portraits.
Part II: My AI Replication
Rather than relying on human participants, I conducted a modern replication using a multimodal AI model. The goal was not to determine whether political affiliation can be inferred from facial structure alone. A more modest objective guided the experiment: could a contemporary AI system reproduce or exceed the performance observed in the original human study?
Unlike the original study, which relied on human participants recruited to evaluate photographs, the replication required building a complete image processing and evaluation pipeline. I assembled a balanced dataset of one hundred congressional portraits consisting of fifty members of the Democratic caucus and fifty Republicans. Publicly available congressional portraits were identified, cataloged, and downloaded using Python scripts that automatically retrieved images and organized them into a structured dataset.
Image preparation proved more challenging than expected. Portraits varied widely in composition, resolution, cropping, lighting, and orientation. Automated face detection initially failed for several images, requiring additional quality control, validation, and replacement of a small number of portraits. Missing records were identified and corrected, face extraction routines were rerun, and the resulting dataset was manually reviewed to ensure that each image contained a single identifiable face. The final dataset contained one hundred usable portraits evenly divided between the two political groups.
To approximate the conditions of the original Rule and Ambady study, the images underwent substantial preprocessing. Each portrait was cropped to isolate the face and converted to grayscale. Names, state affiliations, party labels, background details, clothing cues, and color information were removed. The objective was not to maximize classification performance but to determine whether a detectable political signal remained after much of the surrounding contextual information had been stripped away.
The resulting dataset consisted of one hundred anonymous grayscale face crops. A second Python script presented each image individually to Claude Sonnet 4.5 through the Anthropic API. The model was instructed to classify each portrait as either Democratic caucus or Republican and return a confidence estimate. Responses were stored automatically and compared against the known labels. Additional code generated the confusion matrix, accuracy statistics, precision, recall, and F1 scores used in the analysis.
The experiment therefore involved several distinct stages: dataset construction, automated image acquisition, computer vision based face extraction, grayscale transformation, multimodal AI classification, and statistical evaluation. Modern AI systems make such replications possible at a scale and speed that would have been difficult to imagine when the original study was published in 2010.
The results were striking.
The model achieved an overall accuracy of 76%, substantially exceeding the approximately 57% accuracy reported in the original human experiment. Out of one hundred portraits, seventy six were classified correctly.
A closer examination of the confusion matrix revealed an important asymmetry:
| Actual | Predicted Democratic Caucus | Predicted Republican |
|---|---|---|
| Democratic Caucus | 30 | 20 |
| Republican | 4 | 46 |
Republicans proved considerably easier for the model to identify. Forty six of fifty Republicans were classified correctly, producing a recall rate of 92%. Democratic caucus members proved more challenging. Only thirty of fifty were correctly identified, yielding a recall rate of 60%.
The asymmetry suggests that the model may possess a stronger internal representation of what a Republican politician looks like than what a Democratic politician looks like. When uncertainty arose, the model frequently assigned Democratic caucus members to the Republican category. Twenty Democrats were classified as Republicans, whereas only four Republicans were classified as Democrats.
Such findings should be interpreted cautiously. The experiment does not demonstrate that Republicans and Democrats possess inherently different facial structures. Numerous alternative explanations remain plausible. The model may be responding to expression, grooming conventions, age distributions, portrait composition, candidate selection processes, or visual stereotypes embedded within its training data. Political professionals often cultivate distinct public images, and those presentation choices may create patterns that an AI system can detect even after substantial preprocessing.
Nevertheless, the findings are difficult to dismiss. Color information was removed. Names were removed. Backgrounds were removed. Party labels were removed. Despite those restrictions, the model identified political affiliation at rates well above chance and substantially above the performance reported for human participants in the original study.
The findings are also broadly consistent with later work by Michal Kosinski, who reported that machine learning systems could infer political orientation from facial photographs at rates substantially above chance. Kosinski relied on much larger datasets and different machine learning approaches, making the studies difficult to compare directly. Nevertheless, both investigations point toward the same conclusion: modern AI systems appear capable of detecting political signals in facial images more effectively than human observers.
The experiment therefore raises a deeper question than the one posed by Rule and Ambady. Their study asked whether humans could perceive politics in faces. A modern AI system suggests that the answer may be yes, but it also shifts attention toward a more challenging problem. What exactly is being detected?
Future experiments may provide clues. Classification accuracy could be compared across grayscale faces, color face crops, and full portraits. If accuracy remains high after progressively removing contextual information, the argument for facial morphology becomes stronger. If performance rises dramatically when color and presentation cues are restored, then the signal may reside less in faces themselves than in the cultural conventions surrounding political self presentation.
Either outcome would be informative. Fifteen years after Rule and Ambady first reported that humans could glimpse political affiliation in a face, artificial intelligence has reopened the question and produced an answer that is both more powerful and more mysterious.
Conclusion
The original Rule and Ambady study suggested that ordinary people could distinguish Democrats from Republicans from facial photographs at rates modestly above chance. Fifteen years later, a modern multimodal AI appears capable of making the same distinction with substantially greater accuracy. Using a balanced dataset of one hundred congressional portraits, grayscale face crops, and no identifying information, Claude Sonnet achieved 76% accuracy compared with the approximately 57% reported for human participants in the original experiment.
Accuracy alone, however, does not resolve the central question. The experiment demonstrates the existence of a detectable signal, but it does not reveal the source of that signal. Facial structure, expression, grooming, age, portrait conventions, and learned political stereotypes remain plausible explanations. The asymmetry observed in the results, particularly the tendency to classify many Democratic caucus members as Republicans, suggests that the model may have learned visual associations that extend beyond simple facial morphology.
Perhaps the most important lesson is methodological rather than political. Artificial intelligence now provides researchers with a powerful new instrument for revisiting classic findings in psychology. Questions that once required large groups of human participants can be explored rapidly, systematically, and at scale. In that sense, the experiment serves not only as a replication of an influential study but also as an illustration of how AI may become a valuable tool for testing and extending decades of social science research.
The next challenge is to determine exactly what the model sees when it looks at a political face.
Further Reading
- Rule, Nicholas O., and Nalini Ambady. "Democrats and Republicans Can Be Differentiated from Their Faces." PLOS ONE 5, no. 1 (2010): e8733.
- Todorov, Alexander, Anesu N. Mandisodza, Amir Goren, and Crystal C. Hall. "Inferences of Competence from Faces Predict Election Outcomes." Science 308, no. 5728 (2005): 1623–1626.
- Olivola, Christopher Y., and Alexander Todorov. "Elected in 100 Milliseconds: Appearance Based Trait Inferences and Voting." Journal of Nonverbal Behavior 34, no. 2 (2010): 83–110.
- Kosinski, Michal. "Facial Recognition Technology Can Expose Political Orientation from Naturalistic Facial Images." Scientific Reports 11 (2021): 100.