It Turns Out There’s Not a Lot of Science Linking Testosterone to Violence
When Bad Studies Become Pro-War Political Tools
In a military courtroom at Ft. Benning, Georgia, on March 15th, 1971, prosecutors made their closing arguments against Lt. William L. Calley for the most notorious massacre of the Vietnam War. Three years earlier, dozens of US soldiers had murdered, raped and mutilated scores of unarmed civilians of all ages in the Vietnamese hamlet of My Lai. Though more than a hundred men participated in the atrocities, Calley was the only person ever convicted.
The brutality and scale of the violence at My Lai were beyond belief to most Americans at the time. But an explanation of sorts could be found on the front page of the Washington Post: nestled alongside an article reporting Calley’s trial was a second article proclaiming, “Army psychiatrists are studying the relationship between male sex hormones and aggression to find a way to keep irrational killers out of the military.”
No one was directly saying that this particular war crime could be chalked up to a ferocious case of testosterone (T) poisoning, but the juxtaposition of the articles, and the quotes from research psychiatrists about their T studies, undoubtedly alluded to a connection between T and the massacre.
“We’re trying to weed out people who can’t handle their aggressions—people who are so aggressive that they haven’t learned how to control it,” said Dr. Robert Rose, one of the psychiatrists whose work was featured in the report.
One of the most familiar and enduring stories in testosterone’s authorized biography is that it drives violent aggression. This association didn’t start with the My Lai massacre. In their examination of concepts of race in US medical and scientific thinking, the historians of science Evelynn Hammonds and Rebecca Herzig show that mid-20th-century experts considered endocrine disorders, also known as “endocrinopathies,” to be an important cause of crime.
Nevertheless, the Washington Post’s pairing of Calley’s trial and Rose’s research is a stark example of how scientific facts are, to quote the anthropologist Amade M’charek, “less about discovery and more about the making of reality by assembling heterogeneous material.” It’s also a great example of how scientific facts, once established, are so difficult to dislodge.
The notion that T drives violent crime is like a zombie, a fact that seemingly can’t be killed with new research or even new models that would make old research irrelevant or subject to new interpretations. Because it is so widely accepted, this zombie fact shapes understandings of criminal violence as a matter of individual or group biologies, constraining the remedies we can pursue or even imagine.
From the beginning, the science and scientific narratives that link T to aggression have been about both individual bodies and broad social problems or trends: Why are men violently aggressive more often than women? How to weed out the overly aggressive soldier? How to know who is destined to be a violent recidivist, versus a person who could be rehabilitated? These aren’t mere matters of scientific interest; they are political as well.
At the time his T studies were paired in the Washington Post with the story of My Lai, Rose was a young civilian psychiatrist at Walter Reed Army Medical Center. He would go on to become a superstar in psychosomatic medicine, a new interdisciplinary field that concerned the complicated and reciprocal relationships between physiology and psychology.
Though most of Rose’s T research on human aggression was conducted together with Major Leo Kreuz, a psychiatrist at Walter Reed Army Hospital, Rose had an especially high profile because of the extensive research on T in primates that he conducted before and during his human studies. While he hasn’t worked on T in decades, his work with soldiers and prisoners helped to shape the field and still anchors the idea that high-T men are more violent.
When the Washington Post previewed Rose’s research, he and Kreuz had just completed a landmark study on prisoners at the Patuxent Institution in Jessup, Maryland. Right around the time the study was published, it was also covered in the New York Times, where the question of how to measure aggression took center stage. Because many of the ultimate findings would counter existing research on T and aggression, both researchers and journalists wondered if the discrepancy might point to the problem of how to measure aggression.
As the Times reporter put it, “Is it observable outward acts . . . or is it negative aggressive feelings?,” noting that one researcher had identified at least nine different kinds of aggressive behavior, including fights over territory, fear-based aggression, and males fighting over females. Jostling over which measure was best for studies of humans was more than an abstract debate; saying that the researchers must have just used a bad measure was also a way to salvage the idea that T is linked with aggression even when studies failed to support that idea.
In retrospect, given these measurement concerns, Kreuz and Rose’s choice of the maximum-security prison at Patuxent may seem ideal. Studies of prisoners had always held a special place in the literature on T and aggression, as being convicted for violent crime is often seen as an especially valid measure of real-world aggression, both more meaningful and more objective than measures based on self-report, personality assessments, or behavior in contrived laboratory situations.
When Kreuz and Rose did their study, inmates were sentenced to Patuxent under the “Defective Delinquent Statute,” a law that was unique to Maryland and allowed indefinite confinement of “habitual criminals considered to be a clear danger to society for psychiatric treatment at this institution.”
Kreuz and Rose set out to test the idea that aggressive prisoners would have higher T levels than nonaggressive prisoners. They selected 21 inmates and divided them into two groups, “fighters” and “nonfighters.” “Nonfighters” were reported to have been in one or no prison fights; “fighters” had to have been in two or more. Three different aggression scores were calculated for each inmate.
The first score was culled from prison records and included items you might expect, such as physical fights, making threats, or destroying property, but also some items that you might not expect in an inventory of violent behavior, such as cursing or refusing to obey officers. The second set of data came from three standardized, written psychological tests to assess things like subjective feelings of aggressiveness. Finally, they looked at past criminal offenses, including the type of offense, frequency of offending, and age at each offense. All told, they examined close to two dozen behavior measures in the sample of just 21 men.
And then there was the testosterone. To make sure they got reliable T measures, they drew each man’s blood multiple times, always first thing in the morning to control for T’s diurnal rhythm. This was considered top-notch research at the time: they were being very thorough, and if there was a relationship between T and aggression, they were bound to find it.
Here’s the kicker: despite hundreds of citations that characterize this study as proving a link between T and aggression, the study actually undermines that link. How is this possible? Unsurprisingly, the number of reported fights a given prisoner had correlated to a broad pattern of aggressive behavior while in prison. But T remained stubbornly innocuous. T levels did not predict whether someone was in the “fighter” category, displayed additional aggressive behavior in prison, or scored higher on psychological scales of aggression.
When their original plan left them empty-handed, Kreuz and Rose examined the men’s records of past convictions for crimes of physical violence like assault and murder. Again, they found no relationship with T. They then cut the data yet another way, examining the kinds of crimes for which men had been convicted before age 19, some of which had likely occurred over two decades earlier.
With that last move their persistence paid off. The men with what they called “more violent and aggressive offenses during adolescence” had significantly higher T levels than men without such adolescent offenses. They also came at this link from the other direction, starting with T levels. The five men with the highest T levels were all categorized as having committed “violent and aggressive” crimes in adolescence. None of the five men with the lowest T scores had this kind of adolescent offense.
To drive the contrast home, the researchers showed that all the men with the most extreme T scores, whether high or low, had adolescent convictions for the (implicitly nonviolent) crimes of larceny or burglary; it was only the “more aggressive or violent crimes” during adolescence that were associated with T.
How did Rose and Kreuz get to the link between aggression and T? Hardly a straightforward test of a hypothesis, this was more a chronicle of how to repeatedly massage data until it yields the desired result. They wanted to believe. We call this “the Mulder effect.”
Fox Mulder is an FBI agent on the long-running and newly re-made TV program The X-Files. “I want to believe” is the phrase on the now-iconic UFO poster behind his desk. In this science fiction drama, Mulder and his partner, Dana Scully, investigate unsolved cases with paranormal elements, with a heavy emphasis on aliens. In the series, the role of FBI investigator is more like that of a scientist who searches meticulously through evidence with the aim of uncovering the truth.
But for Mulder, like all scientists, objectivity is elusive. As media critic Laura Bradley describes,
Wanting to believe is Mulder’s core vulnerability. And for Mulder, wanting to believe is different from blindly believing. He is still an FBI investigator, of course, always trying to examine the facts—even if he occasionally searches for facts to support his theories instead of theories to support the facts.
As we read the studies on aggression and other characteristics that are supposedly linked to T, it seems that existing beliefs about T are so strong and well elaborated that it might be difficult for both scientists themselves and their readers to see the machinations it sometimes takes to get the data to fit that belief.
Recall that Kreuz and Rose abandoned their initial hypothesis when it didn’t pan out. They then came up with a new hypothesis, using the only item in their data that did correlate with T: having been convicted as an adolescent for a crime they characterized as “aggressive or violent.” This type of post hoc analysis is currently recognized as a major and widespread problem, especially in behavioral sciences.
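A back-of-the-envelope sketch shows why searching through many measures is risky. It is an illustration only, not a reanalysis of the Patuxent data: it assumes every comparison is statistically independent and judged at the conventional 0.05 threshold, and it uses 24 as a stand-in for “close to two dozen.” The function name is hypothetical.

```python
def familywise_error_rate(num_tests: int, alpha: float = 0.05) -> float:
    """Chance of at least one false positive across independent tests.

    Assumes the tests are independent and that no real effect exists,
    so each test has probability `alpha` of a spurious "hit".
    """
    return 1.0 - (1.0 - alpha) ** num_tests


if __name__ == "__main__":
    # 24 is an assumed stand-in for "close to two dozen" measures.
    for num_tests in (1, 5, 12, 24):
        rate = familywise_error_rate(num_tests)
        print(f"{num_tests:>2} comparisons: {rate:.0%} chance of a chance finding")
```

Under those assumptions, the chance that at least one of 24 comparisons looks significant purely by accident is roughly 70 percent. Real behavior measures overlap with one another, so the true figure would differ, but the direction of the problem is the same.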
As it was reported in the Washington Post, however, the study took on a more ominous tone. Painting the men as having uniformly and extremely violent histories by naming only assault and attempted murder as examples, the paper reported that “‘young violents’. . .secrete more testosterone than other prisoners.”
In addition to mischaracterizing their juvenile offenses, the wording seems to imply that high T was correlated in real time with committing violent crime. And what was the connection to the My Lai massacre and the conviction of Lt. William Calley? The Post’s reporter explained that “the Army researchers are trying to take these results and apply them, eventually, to the selection of soldiers.”
To wit, soldiers should be aggressive, but not too aggressive. In Rose’s words, “A good soldier puts his energy to a task. He uses his aggression that way. . . . We don’t want young violents. It’s important for a soldier to function in a group, not to go off and act aggressively on his own.”
At the height of protests against the war in Vietnam, this was more than a scientific statement; it was a political intervention that seemed to undermine activists’ claims that it was an immoral war characterized by organized murder. The data from the Patuxent study are weak, but this specific enactment of T supports the story that the Department of Defense and the Nixon administration favored during Calley’s trial: My Lai wasn’t a gross failure of the armed forces, but the tragic and criminal result of an individual “young violent” who couldn’t harness his aggression constructively.
Using the term “young violents” to describe both the prisoners with higher T and out-of-control soldiers, Rose sweeps over the enormous gulf that exists between escaping from a juvenile detention center and the mass murder, rape, and mutilations committed at My Lai. He does so by linking them both with T, though neither Calley nor any of the other soldiers who committed the crimes at My Lai were actually known to have high T. It’s not a scientific link—it’s a narrative one.
Excerpted from Testosterone by Rebecca M. Jordan-Young and Katrina Karkazis. Copyright © Rebecca M. Jordan-Young and Katrina Karkazis 2019. Reprinted with permission from Harvard University Press.