A major correction has been issued by the American Journal of Psychiatry. The authors and editors of an October 2019 study, titled “Reduction in mental health treatment utilization among transgender individuals after gender-affirming surgeries: a total population study,” have retracted its primary conclusion. Letters to the editor by twelve authors, including ourselves, led to a reanalysis of the data and a corrected conclusion stating that the data in fact showed no improvement after surgical treatment. The following is the background to our published letter and a summary of the key points of our critical analysis of the study.
… It has been an open secret for some time that there is a crisis of irreproducibility of scientific studies in medicine and other fields. No less a figure than the Director of the NIH, Dr. Francis Collins, wrote that “the checks and balances that once ensured scientific fidelity have been hobbled. This has compromised the ability of today’s researchers to reproduce others’ findings.” For example, the National Association of Scholars reports, “In 2012 the biotechnology firm Amgen tried to reproduce 53 ‘landmark’ studies in hematology and oncology, but could only replicate 6 (11%).” In 2015, Science published an attempt to replicate 100 studies that had appeared in three well-known psychology journals in 2008. Nearly all of the original studies had produced statistically significant results, whereas only a little over a third of the replications did.
Perhaps nowhere in medicine and psychology is this problem of irreproducibility worse than in studies of people who claim to have a mismatch between their sex and their internal sense of being male or female.
ANDRE VAN MOL, MICHAEL K. LAIDLAW, MIRIAM GROSSMAN AND PAUL MCHUGH, Public Discourse, September 13, 2020.
Invalid handling of statistical data in medicine is nothing new. In the ’70s our top-grade endocrine biochemist showed us in a lecture a graph in which fairly homogeneously distributed raw data had been split into four quadrants. The high-x, low-y results were called “outliers” by the researchers. The high-y, low-x results were taken out to be the subjects of a separate experiment. Then, surprise surprise, the rest of the data were analysed to yield a positive correlation between x and y. And this wasn’t even in psychiatry!
I hope you understand all this – I can’t show the graph on the Heidelblog!
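Since the graph can’t be shown here, the manipulation described above can be sketched in a short simulation. This is a minimal illustration, not the original lecture data: it draws two independent (uncorrelated) variables, then discards the high-x/low-y points as “outliers” and sets aside the high-y/low-x points for a “separate experiment,” exactly as described, and recomputes the correlation on what remains.

```python
import random
import statistics

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sx = sum((a - mx) ** 2 for a in xs) ** 0.5
    sy = sum((b - my) ** 2 for b in ys) ** 0.5
    return cov / (sx * sy)

random.seed(1)
# x and y drawn independently: no real relationship at all
x = [random.gauss(0, 1) for _ in range(1000)]
y = [random.gauss(0, 1) for _ in range(1000)]
print(f"full data:        r = {pearson(x, y):+.2f}")  # near zero

mx, my = statistics.median(x), statistics.median(y)
# Drop the high-x/low-y quadrant as "outliers" and the
# high-y/low-x quadrant as a "separate experiment"
kept = [(a, b) for a, b in zip(x, y)
        if not (a > mx and b < my) and not (b > my and a < mx)]
kx, ky = zip(*kept)
print(f"after exclusions: r = {pearson(kx, ky):+.2f}")  # strongly positive
```

Only the concordant quadrants (low-x/low-y and high-x/high-y) survive the exclusions, so a strong positive correlation appears out of pure noise.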
I seem to recall learning back in the early ’70s that fiddling with input data in order to get the outcome one desired was considered an ethics violation and was frowned upon. Have things changed over the past 50 years? (I know… dumb question)
In the ’70s, people who were doing this sort of thing didn’t realise that that was what they were doing. It took a clear-thinking, mathematically trained person with a good grasp of what statistics actually is, and of probability, to tell them. And the founder of Wycliffe Associates UK, a Professor of Applied Statistics (now emeritus) at Newcastle, told me that statisticians themselves generally aren’t easy people to communicate with.