That much seems to be well agreed on. There are a lot of contributing factors — the pyramid-scheme-like training and employment setup, the recent proliferation of soft-money positions (universities hiring without setting aside money for salaries, expecting salaries to come out of research grants instead), a reduction in real money available for research at the same time as more people are applying for funding, and then the myriad issues arising from journal policies. Those include an emphasis on unexpected results, reluctance to publish replications, and a willingness to publish material with scanty experimental details and, worse, heavily processed data in the form of graphs and charts rather than the raw data itself.
Which, if you’re interested in that sort of thing, probably the best place to start would be last year’s paper from Alberts et al., “Rescuing US biomedical research from its systemic flaws.” The authors provide a thorough, accessible introduction to many of these problems.
Today, I’m just going to focus on one issue: the difficulty of identifying scientific misconduct. As in, principal investigators publishing work that they know should not be published, at least not in the way they’re presenting it.
Which, sure, you could use the word “fraud” for this, but that’s a very harsh word that I don’t think is entirely appropriate to encompass the range of issues we ought to consider. Like work that’s presented misleadingly to make it seem more exciting than it really is. Or work that’s found out to be inaccurate (or partially inaccurate) after it’s sent for publication, but is never retracted. Classifying those types of cases with the same word used to describe fabricated data seems overly harsh, even though none of that is okay.
For starters: yes, this issue has been in the news a lot more lately than it used to be. Part of the problem is directly related to issues addressed in the Alberts article. Biomedical research seems more competitive now than it used to be, and the people involved are objectively more replaceable; the number of trained people per available professorship has increased, largely because each professor needs a team of people to conduct experiments, and those teams are currently assembled primarily from the ranks of trainees.
The other main driver for an increase in apparent fraud is that it can pass undetected more easily now. Modern experiments are hard — the techniques require a lot of training just to understand the underlying physical principles, let alone to correctly interpret data. Even researchers who've reached very different conclusions than I have about what ought to be done acknowledge that experimental difficulty is a huge issue for the reproducibility of modern work; you could read Mina Bissell's article “The risks of the replication drive,” for instance, where she writes:
“So why am I concerned? Isn’t reproducibility the bedrock of the scientific process? Yes, up to a point. But it is sometimes much easier not to replicate than to replicate studies, because the techniques and reagents are sophisticated, time-consuming and difficult to master. In the past ten years, every paper published on which I have been senior author has taken between four and six years to complete, and at times much longer. People in my lab often need months–if not a year–to replicate some of the experiments we have done on the roles of the microenvironment and extracellular matrix in cancer, and that includes consulting with other lab members, as well as the original authors.”
So, let’s start with that. Experiments are sufficiently involved that they might be difficult to reproduce even if the results were correct. And let’s set aside the issue of whether results are robust; in her article, Bissell points out that tiny variations in a cell line can have dramatic impacts on its response to assays. So it might be fair to wonder in those cases whether observed results matter, but not whether they’re correct.
But there’s a confounding factor, because some results presumably are not correct. Both John Ioannidis’s article “Why Most Published Research Findings Are False” and Chabris et al.’s article “Most Reported Genetic Associations with General Intelligence Are Probably False Positives” consider the math behind p-testing, the typical ways that researchers collect and process data, and journal publishing policies to show that our current system is likely to yield many inaccurate results. Indeed, there seem to be enough incorrect results published that many drug companies no longer trust academic results, as described in Begley and Ellis’s article “Raise standards for preclinical cancer research” (although it’s worth noting here that Begley and Ellis are basing their conclusions on numerous instances in which famous results could not be reproduced. Bissell’s objections, as mentioned above, could still apply).
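The core of that argument can be sketched with simple arithmetic. The function below computes the fraction of “statistically significant” findings that are actually true, given how often tested hypotheses are true to begin with; the specific numbers plugged in are purely illustrative assumptions on my part, not figures taken from Ioannidis’s or Chabris’s papers.

```python
# A sketch of the false-positive arithmetic behind the Ioannidis-style
# argument. All numeric inputs below are hypothetical, for illustration.

def positive_predictive_value(prior, power, alpha):
    """Fraction of 'significant' results that reflect a real effect.

    prior: fraction of tested hypotheses that are actually true
    power: probability a true effect produces a significant result
    alpha: false-positive rate (the significance threshold, e.g. 0.05)
    """
    true_positives = prior * power          # real effects, detected
    false_positives = (1 - prior) * alpha   # null effects, flagged anyway
    return true_positives / (true_positives + false_positives)

# Suppose 1 in 10 tested hypotheses is true, with 80% power and p < 0.05:
print(round(positive_predictive_value(prior=0.1, power=0.8, alpha=0.05), 2))

# In a more exploratory field (1 in 100 hypotheses true, 20% power),
# almost everything that reaches "significance" is a false positive:
print(round(positive_predictive_value(prior=0.01, power=0.2, alpha=0.05), 2))
```

The point of the exercise is that nothing has to go wrong procedurally: with low prior odds and modest power, the literature fills up with false positives even when every individual test is run honestly.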
Okay. That’s probably a lot to take in, especially since this is all just preamble to the point I wanted to make today. Let me give a quick summary: experiments are hard. That makes testing reproducibility hard. But some of what’s out there is in fact not true (almost assuredly, based on the numbers, although this can only be proven on a case-by-case basis).
So, how do we separate the cases of things being not true because, yes, science is hard, and it’s easy to reach incorrect conclusions, and it’s a journey, a journey in which no one expects to unravel all of nature’s mysteries right away… cases where a published result is innocently not true for those reasons, and cases of fraud?
Because there is fraud in our current system. Papers knowingly not retracted. Inaccurate data published. Data work-ups that obscure questionable aspects of a group’s results. And it’s very difficult to draw attention to: when people do find out about it, they tend to be powerless underlings. An ethical stance against fraud can easily result in a destroyed career: consider Peter Whoriskey’s article about a Johns Hopkins researcher who drew attention to some problems with research he was working on and was summarily fired, then harassed. Or you could consider Yudhijit Bhattacharjee’s article about Diederik Stapel, a social-psychology professor who simply made up his experimental results for years. Because of his past successes, his students were extremely reluctant to confront him about suspected fraud, allowing the scam to persist far longer than you might imagine it should’ve. And, and this is an excellent point that Bhattacharjee made, “Fraud like Stapel’s — brazen and careless in hindsight — might represent a lesser threat to the integrity of science than the massaging of data and selective reporting of experiments.”
And, right, the reason why I’m writing about this today? There is an anonymous platform for commenting on scientific papers called “PubPeer.” And, look, I know that anonymous bullying is a problem, that there are lots of horrible issues with systems like Yik Yak allowing people to write vituperative comments online with no accountability. But PubPeer seemed like an important service simply because it can be so difficult for outsiders to know when there are flaws in scientific research, and because the insiders who do know, and who might have an incentive to report the truth, could suffer grave consequences for doing so.
And yet… PubPeer was ruled against recently. A judge said that PubPeer must turn over identifying information for an online commenter who supposedly cost Fazlul Sarkar a job at the University of Mississippi. To me, this is bad, because under that precedent any suspicion of fraud not validated by a university investigative committee could be considered defamation — but university investigative committees are often slow to act and do not necessarily protect the careers of informants. It seems bad to take away an anonymous venue for potentially spreading scientific truth, especially since this is an arena where anonymity really is important for the accuser, and should not be important for the accused: the accused has objective science to defend themselves with if they are in the right. If Sarkar had solid data behind all his published experiments, he could have shown that data to the relevant people at the University of Mississippi to quell their concerns. Whereas a commenting graduate student or post-doc has no guarantee that leveling an accusation won’t render him or her permanently unemployed in the field.