Adam Mastroianni on Peer Review and the Academic Kitchen
Feb 13 2023

Psychologist Adam Mastroianni says peer review has failed. Papers with major errors make it through the process. The ones without errors often fail to replicate. One approach to improve the process is better incentives. But Mastroianni argues that peer review isn't fixable. It's a failed experiment. Listen as he makes the case to EconTalk host Russ Roberts for a new approach to science and academic research.


READER COMMENTS

David M
Feb 13 2023 at 2:09pm

Great episode. Anyone who works in the peer review kitchen (and keeps their eyes open) will be familiar with its flaws. Adam presents those flaws convincingly and argues that the current system is a “failed experiment.” I’m sympathetic with that viewpoint!

There’s a big machine learning research conference called NeurIPS. In past years the organizers have run some interesting experiments on their own review process for proceedings papers. The results show that acceptance/rejection is incredibly random. I suspect that’s true of other venues in other fields.

Fast-moving fields like machine learning are heavily driven by arXiv preprints. This seems reminiscent of earlier days in science, where much of the communication was informal correspondence between scientists. Machine learning is already in Adam’s “wild west,” to some degree.

I’m really excited about innovations in scientific publishing. In the life sciences, the journal eLife is making very bold moves: manuscripts will be “selected for review”; those selected will be made available as preprints; and all of their reviews (positive and negative!) will be published alongside the manuscripts. This effectively replaces the journal’s “accept/reject” decision with a more textured assignment of credibility.
https://elifesciences.org/inside-elife/54d63486/elife-s-new-model-changing-the-way-you-share-your-research

Gregory McIsaac
Feb 15 2023 at 10:15am

The journal “Hydrology and Earth System Sciences” has a similar system of interactive and public peer review.

HESS – Home (hydrology-and-earth-system-sciences.net)

Thomas Schaefer
Feb 13 2023 at 2:49pm

Since you brought up the story of Einstein’s interaction with the peer review process: Einstein was indeed annoyed, and he ended up publishing the paper somewhere else. But he also did read the review, which pointed to a serious flaw in his argument. Einstein realized that the anonymous (to Einstein — now we know who it was) reviewer was right, and when he resubmitted the paper the conclusion was exactly the opposite of what it had been before.

While the original paper argued that gravitational waves could not exist, the final version showed that they did. Gravitational waves were finally discovered in 2015, long after Einstein’s death.

The episode is described here https://physicstoday.scitation.org/doi/10.1063/1.2117822, and in the physics community it is viewed as a prime example of how peer review should work.

Maureen Wood
Feb 14 2023 at 3:57am

Hi Russ,

Thanks for doing another Econtalk on statistics.  I just love these episodes and this one rang very true for me.

I was a bit dumbfounded that a Nobel prize winner had such faith in the numbers that that was enough for him. As I tell my friends and family (since I work with statistics all the time), numbers are not enough. It has to be plausible and, as much as possible, there also has to be a mechanism. Years ago you did a podcast on food studies with Julia Belluz that got to the heart of this point. I feel sometimes that various researchers are so in love with their work that they miss, or deliberately ignore, the weakness of the connections and that the noise that can come between their independent variable and dependent variable is too great to produce anything conclusive.

I contend that journals need to ask their referees to go beyond the numbers and look at issues of plausibility, correlation and data noise, and context. Specifically, journals should be asking their referees if their result is plausible from a logic standpoint. If not, they need to do the experiment again and again to make sure they did not get a random result. Another thing they should do is consider the “noise test”, i.e., do they have a direct enough line to the starting point and the end point to make firm conclusions. Finally, researchers should also make sure they have evaluated the context. One of my favorite examples of a failed context test is the contribution of hospitals to climate change. https://climatedata.ca/case-study/the-effects-of-climate-change-on-hospitals/ Why hospitals? What about 8 jillion other industry sectors (that are not mentioned or evaluated) but just lumped together as “industry”? Is this really significant compared to, say, the contribution of airplanes? It probably isn’t. But here it is being trumpeted by policymakers as an important finding.

The “pollution causes dementia” studies are also badly misrepresented for this reason. The finding passes the plausibility test, as some researchers have shown that there may be a mechanism, but the epidemiological studies don’t pass the noise test. I am also not sure the result passes the context test, given that there are far greater concerns that can come from chronic pollution and living in an urban area generally. Yet again, the media loves the story and for all intents and purposes, most laypersons think it’s done and dusted.

I am a researcher in chemical accident risk.  I can say that the safety perspective tends to drive more statistical honesty and self-searching than other fields because you actually can quantify what you might have to lose if you’re wrong.    Perhaps this is why I get so frustrated with the skim-the-surface approach to publication of findings.  Findings presented imperfectly can have negative impacts and for this reason, scientists ought to give all their proposed outcomes a 3D analysis before they publish them.  I also applaud Dr. Mastroianni’s recommendation to evaluate what are the critical statistics in your study and make sure they are thoroughly  validated.  Bravo.  Everyone should be doing this. It’s scientific.

Doug Iliff
Feb 14 2023 at 10:41am

“I contend that journals need to ask their referees to go beyond the numbers and look at issues of plausibility, correlation and data noise, and context.”

And one more thing: in 1986 a young emergency room physician published a letter in the New England Journal of Medicine under the title “The Significance of Statistical Significance.” One paragraph read as follows:

“Almost any investigation can be made to yield statistically significant results, no matter how trivial clinically, by including enough subjects.  That is to say, a blood-pressure difference of one point, or a difference in death rate of one per thousand, between two groups can be made to yield a P of 0.05 or 0.01 if a large enough sample is chosen (an increasingly common situation in this era of multicenter trials).  Significance testing serves as a check on sampling error, which is only one of many threats to the validity of a given set of data, and not as a substitute for common sense…. With respect to small differences between large groups, I would suggest that the crucial question is not Is the difference statistically significant? but rather, Is the difference practically important?”

The good news is that Russ and Adam (along with other interviewees over the years) are continuing to beat this drum.  The bad news is that nothing has changed for a long time.

Shalom Freedman
Feb 14 2023 at 4:06am

Peer review, we are told at the beginning of this conversation, is not the gold-standard ideal process most laymen like myself have long thought it to be. It also has not been around, as some of us assume, since the very beginning of modern science. Russ’s especially articulate guest Adam Mastroianni artfully describes the shortcomings of the process in advocating a more open and publicly accessible means of scientific publication. A truly fascinating conversation.

Niklas Jakobsson
Feb 14 2023 at 8:17am

Liked the episode! I think you should interview Abel Brodeur, chair of the Institute for Replication. I would believe that he shares some of the ideas of Adam, but instead of opting out of the peer review process, his idea is to try to improve the situation by doing lots of replications. More on the I4R can be found here: i4replication.org

Doug Anderson
Feb 14 2023 at 10:30am

I loved this episode as well.

It reminds me of a recent observation that we never know as much as we think we do– that there is probably much more uncertainty in the things we believe to be true than it is practical for us to believe.

There is an analog to this in how new software systems are rolled out. They are almost always better than what preceded them, but often in some ways they are worse. But we assume they are not worse in any way and are surprised when some new processes need to be changed to accommodate the new software.

So it is with new knowledge we have deemed to be true.

ian
Feb 15 2023 at 5:40am

Lots to love about this episode. I work in the humanities, where the dynamics are slightly different but the broad issues are shared.

* I suspect that institutions need peer review more than our knowledge does – decisions like who to hire, who to give tenure to, who to promote all make use of the imprimatur of passing peer review, and those are the incentives that really matter in the modern system. Russ’s comments/questions at the end reflect this, I think.

* I’m a huge fan of the idea of everyone just publishing their research and then we have some reddit style upvote/downvote system, or allow discussion & review below the article. This absolutely seems like a system more conducive to the size of C21 academia and more likely to generate discussion and debate. Relatedly, the book review system is terrible for really providing discussion.

* I also agree strongly with Adam’s early point that, when looking for new readings or background to my latest project, I couldn’t care less where an article is published – I fire up google scholar or my uni’s library database and search for titles. My personal publishing strategy reflects that: I don’t want to wait 2-3 years to publish something in the best journal in my discipline, even if they’d have me. I want to publish somewhere reputable enough as soon as possible and let people get reading. Maybe that’s a career mistake, who knows.

* I’m suspicious of those 30%, 50%, 80% of research leaves no trace stats – I’m no star by any means and not all of my work is great, but everything I’ve written has been cited, or caused someone to get in touch with me by email, or generated some form of response. I don’t think this would be the case if high proportions of research were essentially being unread, at least in my niche.

* I also think that the ‘half my CV is worthless’ comment is not necessarily correctly interpreted: we don’t know in advance what work will be valued or prove important & often the not so good work leads to the great stuff, so even if it looks less valuable in hindsight, that’s not a fair judgement. It’s a necessary consequence of a “publish everything and see what sticks” model that much will not stick.

David Gossett
Feb 16 2023 at 6:23pm

Publish or perish is a trainwreck. In the physical sciences, you can run an experiment 150 times. The second you get a good result, you photograph it and start writing your paper. I know Ph.D. candidates who followed graduates and could not replicate their research only weeks after it was completed. I think Mastroianni is understating the problem. It does not work on a good day. Throw in an H-index, and the incentives work against any good research. Federal labs need to stop publishing in journals immediately—post drafts for comment and feedback.

That’s the physical sciences. Social, including economics, fails before the hypothesis is even typed up. On the social side, emergence and complexity constantly change the underlying data (pattern degradation over time). An economist is sitting in a boat. The water is moving the economist all over the place. How do you write a paper under these conditions? Social research cannot be done in a vacuum because the world constantly shifts under the researcher, IMO.

David Reinstein
Feb 17 2023 at 10:24am

It’s a big undertaking to actually check the results of a paper, which is why it’s virtually never done. Although that is, of course, maybe the single most important thing that this process could do, rather than provide some kind of aesthetic judgment.

The Institute for Replication  is doing this — paying people to actually check and stress-test the results of a paper. This can be done. We just need to put incentives into this, and reduce the pressure to ‘vomit out a million published papers.’

David Reinstein
Feb 17 2023 at 10:27am

Russ Roberts: And, you’re talking about empirical work. There’s theoretical work as well, where there’s a mathematical proof, say, or an intellectual, analytical set of postulates and analysis. …  the referees don’t actually read the paper. They kind of eyeball it. They say–I think what we say to ourselves is, ‘Well, if this person is at such and such university, I’m sure they got the equation–I’m sure the math is right. … I’m not going to literally check their equation. That would be tedious. Take hours.’

At The Unjournal we are making all evaluations public, and paying the evaluators (reviewers). We’ve had at least one reviewer do exactly this, and discover an important error in some of the math. They helped clarify this (the proposition stands anyways but with a much more intricate proof), and this has been confirmed by the authors … and should benefit future readers.

Gregory McIsaac
Feb 17 2023 at 3:00pm

I doubt that any system of vetting or peer review can guarantee that all published articles will be 100% true. More thorough and rigorous reviews probably increase the likelihood that published information is mostly true and broadly interesting, but mistakes will be made.  It is important that mistakes are acknowledged and corrected in a timely manner, and the current peer review system has demonstrated some capacity to do so.  Indeed, many peer reviewed articles present data and arguments that are critical of other peer reviewed articles. Perhaps peer review should be seen as a price of admission for entering a discussion/debate among specialists and scholars.

There is much variation in the quality of peer reviews, which makes generalization difficult. Some journals accept almost anything and others accept almost nothing. Ditto for reviewers.  Having some knowledge of the review process allows for weighing the information that is published in different journals.

I’ve been involved in peer review since about 1986 as a peer reviewer, editor and coauthor. Although cumbersome, I have found it useful for getting different perspectives, sometimes detecting errors and suggesting better ways to conduct analyses or express ideas. I have been one of the reviewers who looks at the data, and this has become easier as some journals require that the data used be made available.  I’ve never been directly compensated for serving as a referee or editor. I don’t consider this time wasted because I often learn something from reviewing manuscripts. It is a way of keeping track of changes in methods and theories.  Serving voluntarily as an editor is an indication that your peers respect your judgement, which can be useful in academic promotion and tenure, and thereby may provide some indirect compensation.

A large percentage of peer reviewed academic articles are of limited scope and interest because they are essentially student projects, in which the students are learning the practice of science and learning how to communicate their work to an audience of relevant specialists. While the content value of these manuscripts may be minimal to the broader society, there is often considerable value to the student in going through the process.

In my experience, there are usually more than two referees. Generally, there have been three, except for high profile journals like Science or Nature, which may have five. Additionally, an associate editor also serves as a referee of the manuscript as well as a referee of the other referees. The associate editors are usually people who have broad knowledge of a specialized field, and who try to recruit appropriate referees for each manuscript. An associate editor’s recommendation to publish or reject is not just a matter of counting the votes of the referees, but a matter of judging the manuscript itself and the quality of the referees’ comments and recommendations. All this is reviewed by one or more senior editors, all of which takes time, but electronic submission has streamlined the transfer of documents and generally reduced the time from submission to decision.  And yet mistakes still occur.

The peer review system has difficulty with highly innovative, interdisciplinary, and “out of the box” ideas because of the difficulty of identifying peers who can effectively evaluate such ideas.  Fortunately, there are other venues for sharing those ideas.

David Zetland
Feb 19 2023 at 5:40am

Love this discussion, which touches on an important (but vulnerable) part of academic culture: Intrinsic motivations.

Peer review “works” when everyone has the best of intentions. It doesn’t work when people, explicitly or not, give too little attention to (positive or negative) results.

Around 10 years ago, I published “An auction market for journal articles” that realigns some of the extrinsic motivations (e.g., allowing editors to bid for papers — and get rewards for good papers; giving referees a stake in improving papers, etc.) as well as “fixing” evaluation of papers.

I remember vividly how Mike Munger (asst editor at the time) accepted this paper: “It’s crazy but definitely worth discussion.”

So, discuss 🙂

Prufer, Jens and David Zetland (2010). “An auction market for journal articles.” Public Choice 145(3-4): 379-403. [open access]

Abstract:
We recommend that an auction market replace the current system for submitting academic papers and show a strict Pareto-improvement in equilibrium. Besides the benefit of speed, this mechanism increases the average quality of articles and journals and rewards editors and referees for their effort. The “academic dollar” proceeds from papers sold at auction go to authors, editors and referees of cited articles. Nonpecuniary income indicates the academic impact of an article — facilitating decisions on tenure and promotion. This auction market does not require more work of editors.

Jeff
Feb 21 2023 at 12:44am

I did some research and wrote a paper on a similar topic a while back. I believe he’s wrong with regards to the peer-review process. The system has been around for 250+ years (started in 1731 and then more widespread by 1752) – see: Spier, R. (2002). The history of the peer-review process. TRENDS in Biotechnology 20(8), 357-358.

“1752 the Society took over the editorial responsibility for the production of the Philosophical Transactions, at which time it adopted a review procedure that had been used previously by the Royal Society of Edinburgh as early as 1731. Materials sent to the Society for publication were now subject to inspection by a select group of members who were knowledgeable in such matters, and whose recommendation to the editor was influential in the future progress of that manuscript. This type of review is sometimes regarded as the beginning of the peer-review process, and many other societies, including the Literary and Philosophical Society of Manchester, adopted similar procedures whilst publishing a disclaimer as to the accuracy of the published material”

I’ll grant that a) there may have been other journals and systems that didn’t use this and b) it doesn’t mean that this system is perfect….but I had to write my first comment here because I hate when I start a podcast and the first thing the interviewee says is debatable at best.

Jeff
Feb 21 2023 at 12:52am

Just wanted to add that the article that Adam cites (Csiszar, A. (2016). Peer review: Troubled from the start. Nature 532(7599), 306-308) – also suggests this and notes that at least by 1845 papers were being rejected: “1892 A paper surfaces that was rejected by a Royal Society referee in 1845, outlining the kinetic theory of gases more than a decade before James Clerk Maxwell’s famous paper. Might referee systems be fundamentally flawed?”

Tim Vickers
Feb 26 2023 at 8:53pm

Had I, like the peer reviewers, actually taken the time to read all the comments, I’d probably know the answer to my question, which is thus: is AI being used to analyze data sets and replicate assertions? If so, what have we gleaned from its use and, if not, why not?





AUDIO TRANSCRIPT
Time | Podcast Episode Highlights
0:37

Intro. [Recording date: January 17, 2023.]

Russ Roberts: Today is January 17th, 2023 and my guest is psychologist Adam Mastroianni. He is a postdoc research scholar at Columbia University's Business School. His Substack newsletter is Experimental History. Adam, welcome to EconTalk.

Adam Mastroianni: Thanks so much for having me.

0:55

Russ Roberts: Our topic for today, Adam, is a shocking and exhilarating essay that you wrote on peer review. It is not often that 'peer review' and 'exhilarating' appear in the same sentence, but I loved your piece. It blew my mind for reasons I think will become clear as we talk.

Let's start with the idea behind peer review. If you asked normal people--people not like you and me--who are what I would call believers in the system, what would they say is the whole--how is this supposed to work?

Adam Mastroianni: I think probably most people haven't really thought about it, but if you asked them to, they would go, 'Well, I assume that when a scientist publishes a paper, it goes out to some experts who check the paper thoroughly and make sure the paper is right.' Maybe if you really push them to think about it, they would say, 'Well, they probably maybe reproduce the results or something like that, just to make sure that everything is ship-shape; and then the paper comes out. And this is why we can generally trust the things that get published in journals.' Of course, we know in any system, obviously, sometimes things slip through.

And, all of that is a totally reasonable assumption about how the system works; and it is not at all how the system works. And I think that's part of the problem.

Russ Roberts: You could argue it's kind of like how the king might have a taster.

Adam Mastroianni: Yes.

Russ Roberts: Or two--even better. I mean, if the taster has got some idiosyncratic defense mechanism against toxins, having two people taste the food, it's making sure neither die--it's just a good system.

One of the things I learned from your paper--I didn't really learn it, but I often emphasize how there are a lot of things we know that we don't really remember to think about. One of the things that your paper reminds me to think about is that this system--which of course I grew up in over the last 40 years as a Ph.D. [Doctorate of Philosophy]--this system is kind of new in the History of Science. It hasn't really stood the test of time. It's an experiment, you call it.

Adam Mastroianni: Yeah. I think this is something that a lot of people don't understand because--I think this is true across the board of human experience, we assume that whatever world we were born into unless told otherwise, this is just kind of the way it's been forever.

And so, there's sort of this cartoon story I think in a lot of people's heads that somewhere in the 1600s or 1700s, we started doing peer review. We had journals; and before that, it was people writing manuscripts in the wilderness or whatever. Before that it was Newton publishing his stuff. But then we developed modern science, and it's been that way since.

And, that cartoon story just isn't true: that it is true that around the 1600s and 1700s we have the first things that look like almost they could be scientific journals that we have today, but they work very differently. A lot of times they're affiliated with some kind of association and their incentives are different. They want to protect the integrity of the association. And, they're just one part of a really diverse ecosystem of the way that scientists communicate their ideas.

So, they're also writing letters to one another. There are basically magazines, or for a long time scientific communication looks much more like journalism looks today: that they cover scientific developments as if they are news stories.

So, you have a bunch of different people doing a bunch of different things, and it really isn't until the middle of the 20th century that we start centralizing and developing the system that we assume today has always existed. Which is: if you, quote-unquote, "do science," you send your paper off to a scientific journal. It is subjected to peer review and then it comes out. And all of that is very new.

4:38

Russ Roberts: Well, you kind of made an unintentional leap there. You said, 'And then it comes out.' That's if it's accepted.

Adam Mastroianni: Yes, exactly.

Russ Roberts: And, for listeners who are not in the kitchen of journal submission, rejection, or acceptance--sometimes revise and resubmit, it's called--or some flags are raised and questions are raised, flags of things that might be wrong and you have a chance to try to make the people who reviewed it happy. The people who review, by the way, are called referees in most situations, and there's usually two. So, that is the modern world.

The other thing that you haven't mentioned is it takes a really long time. It's kind of, again, I think shocking for people who aren't in this world.

What happens is you submit your paper and you--there's a tendency, especially when you're younger, as you are, Adam, relative to me, to sit by your inbox. In the old days it was a mailbox, but now it's an email inbox--kind of like: Any day now, because I sent it, what, three hours ago, I'll be getting a rave review from my two referees, and the editor will say, 'I am thrilled to publish this in its own supplemental celebratory edition of our journal because it's so spectacular and life-changing for the people in the field.' But in fact it takes a very long time.

Sometimes people are sent a paper to referee and they decide they don't want to, but they don't tell the journal editor right away--eventually--because they think, 'Maybe I'll do it.' Then they eventually tell the editor, 'You know, I just don't have time.' The editor sends it to someone else. And, even when the two referees agree to review it, they don't review it quickly. There's no real--sometimes there's a sort of a deadline, but it's a very frustrating experience for a young scholar. Right?

Adam Mastroianni: Yeah. My experience so far has been that if there's only a year in between when you first submit the paper and when it comes out, you're doing pretty good.

Russ Roberts: Shocking.

Adam Mastroianni: And, that's assuming that you get it into the first place that you submit it, which is not the average outcome. Other places it could take years; and certainly if you are rejected from one journal or a few journals, it could take multiple years. And this is part of why I think so many people I know come to despise the things that they publish by the time that they get published.

Russ Roberts: We should add that--and again, this is only for the cooks in the kitchen--there are a lot of papers that are rejected even if they are true, because they are not worthy or considered worthy of the journal. Your [?] are sort of top tier and then there's second tier, then there's third tier journals. So, you might aim high. The referees might say, 'Oh, this paper is fine. There's nothing really objectionable in it. But, the results are not that interesting. I don't think it merits publication in the Journal of Fascinating Results.' And so, you're going to have to send it to the Journal of Somewhat Interesting Findings. Right? That's a common phenomenon.

Adam Mastroianni: Yes. And, the funny thing from the user standpoint of science--like, when I'm working on a project and I want to know what has been done that's relevant to this, I truly do not care which journal it was in. And so, all of this work that was done to figure out, like, 'Okay: should this go out to a mailing list of--' I don't know how many people Nature or Science emails. Say, it's a hundred thousand, versus it should go out to 20,000 people, or whoever. It doesn't matter to me because now I just want to know: what did people do? And, the letterhead on the top of the paper doesn't matter.

So, all that work when someone is actually trying to use the thing turns out to be unimportant. This is done mainly for purposes of figuring out who should have high status.

Russ Roberts: Ooh, definitely kitchen, inside-kitchen remark. One other thing, again, for people, not in this world, at least in economics--and I don't know about other fields as much, but I think it's often true, at least in economics--the person who is reviewing the paper, the referee, knows who wrote it. Not always, but even when you don't know, you can usually figure it out because of what the topic is. Or you can read the bibliography and see which author got cited the most times--often a hint.

But, the person who wrote the article often almost always does not explicitly know the reviewer. So, it's called a blind review. It's not double blind, but it's a blind review from the perspective of the author. Often authors will thank, quote, "an anonymous referee" for a helpful comment.

The only other thing I would add, again, is that most of the time papers are not rejected because they're not true. They're rejected because they're not interesting, or they're not profound, or the results are not sufficiently important. Or they're not completely convinced. There might be things left out.

So, the revise-and-resubmit comment from a referee is: 'You know, you didn't deal with this. Deal with this and maybe we'll take it.' And that just adds another layer of delay and uncertainty about the final publication result.

Adam Mastroianni: Yeah. And this is where I think a lot of people misunderstand what the process is doing. They think what's mainly happening when a paper is under review is that it's being checked. And so, someone looks at the data, someone looks at the analysis.

But, most often, nobody is looking at the data. Nobody is looking at the analysis. It actually takes a ton of time to vet a paper to that level. You'd have to open up their data sets--which, by the way, often they're not provided. You don't have to. And, sometimes you do, but a lot of times you don't. You'd have to redo all of their analyses.

It's a big undertaking to actually check the results of a paper, which is why it's virtually never done. Although that is, of course, maybe the single most important thing that this process could do, rather than provide some kind of aesthetic judgment.

When I encounter a paper, I'd love to know, 'Well, did anybody just rerun the code and see if there's some kind of glaring issue? Or if the code actually works? Or if the data actually exists?' Whatever aesthetic judgment the reviewers applied, I mean, I am also, like, an expert consumer. I can look at it, too, and go, 'Oh, I'm not completely convinced.' But, maybe I'm getting ahead of myself here. But also, I don't even get to see what the reviewers said. Most times, most places don't publish the reviews.

So, all that I know is the reviewers said--they didn't say enough disqualifying things to prevent it from being published in this journal. But, I don't know if they said, 'I'm really convinced by this point, but not that point.' Or, 'Here's another alternative explanation that I think warrants inclusion.' I don't get to see any of that as a consumer, because generally the reviews disappear forever once the paper is published.

11:41

Russ Roberts: And, you're talking about empirical work. There's theoretical work as well, where there's a mathematical proof, say, or an intellectual, analytical set of postulates and analysis. And it's--I think--well, you claim and I'm afraid you're right, at least often, that the referees don't actually read the paper. They kind of eyeball it. They say--I think what we say to ourselves is, 'Well, if this person is at such and such university, I'm sure they got the equation--I'm sure the math is right. I mean, they wouldn't make, like, an algebraic error. So, I'm not going to literally check their equation. That would be tedious. Take hours.'

The only question I'm going to generally answer as a referee is: Is this result interesting? Is it consistent with the claims, or are the claims consistent with each other? Does the person deal with previous literature that's been written on this? Is this novel?

But, it becomes the real question--which your essay tells [?] quite frankly, which is--I mean, it's an interesting idea. It sounds plausible. Does it work?

Adam Mastroianni: Yeah. Does peer review work?

I mean, it really depends on what you hope to get out of it. My position would be, no. In part because I think what we would all like to get out of it is some kind of checking. We'd like to know if the papers that we're reading are true or not.

The system obviously doesn't do that.

And, it doesn't do that, but it comes at extreme costs. So, we've talked about how long it takes the paper to get through the process, but there's also the time spent by people reviewing it, which one paper estimates at 15,000 person-years per year. Which is a lot of years, especially when these are scientists. These are people who are supposed to be working on the most pressing problems of humanity, and instead they're spending a lot of time sort of glancing at papers and going, 'Eh, not interesting. This one is interesting.'

And a lot of those papers will never be cited by anybody. It's really hard to get a precise estimate of the number of papers that are never looked at by anybody ever again. But, we know that it's not zero. And, I think a reasonable estimate in the Social Sciences is something like 30%. And, that would probably go up if you exclude papers that are only ever cited by the people who wrote them. And so, that's a lot of time spent on a paper that didn't even matter in the first place.

Russ Roberts: Yeah. The number I saw recently was 80%--that basically 80% of papers are never looked at again. A bit harsh. Could be true. You have to be[?] a referee to see whether that's a true statement.

14:26

Russ Roberts: To be fair to listeners out there who are in this world, some of them are sitting here listening and saying, 'This is the most cynical bunch of nonsense I've ever heard. I've reviewed dozens and dozens of papers in my time. I take my responsibilities extremely seriously.' You get paid by the way, often. Not always, but often--a modest amount. And, sometimes--there's been a big innovation in recent years--you get paid more if you do it in a timely fashion, which is pleasant. I mean, it's nice for the submitter, the author.

But, how do you answer that? Come on. You're claiming people don't read the paper? You have no evidence for that. That's just a cultural armchair thesis. And: 'I'm a serious reviewer. I make sure the papers are right; I read them carefully; I vet them. And I am confident that the papers I have published are more or less true.'

Adam Mastroianni: To that reviewer, I'd say, 'Thank you for your service. And, you are a lone hero on the battlefield.' Because there have been studies done where they look at, well, on average what reviewers do. The British Medical Journal, when it was led by Richard Smith, did a lot of this research where they would deliberately put errors into papers--some major errors, some minor errors--send them out to the standard reviewers that the journal had, get the reviews back, and just see what percentage of these errors did they catch.

On average across the three studies that they did on this, it was about 25%.

And, these were really important and major errors. For instance, the way that we randomized the supposedly randomized controlled trial wasn't really random. Which is really important. That's, like, a very key error to find. If you're doing a randomized controlled trial, it needs to be randomized.

And for that particular error, only about half of people found it. And, that's a very, like, standard one to look for. That should be very forward in your mind when you are looking at a paper.

And so--and I've heard from them as well, people who take their job really seriously. And I think they are the minority. What's most important about the system is how it works on average. I think on average it doesn't work very well--certainly, at catching major errors.

You can see this--another piece of evidence is: When we discover the papers are fraudulent, where does that happen? And, you would think that if it was happening--if people were vetting the papers, it would happen at the review stage. And it's hard to find the dog that didn't bark, but I've never heard a single story of a fraudulent paper being caught at the review stage. It's always caught after publication.

So, the paper comes out; and someone looks at it and they go, 'That doesn't seem right.' And, purely of their own volition--and, these people are the true heroes--they just decide to dig deeper. And find out, 'Oh, it's all made up,' or 'the data isn't there.' Often this is someone from within the world that the paper was published, so it's someone in the same lab, who goes, 'I just know that there's something creepy going on with these results.'

There was a big case in psychology last year, where a paper came out 10 years ago. This paper about signing at the top versus at the bottom: If you sign a form at the top--ooh, this is a good story. The paper was all about if you sign your name at the top of a paper where you have to attest to something--in this case it was how many miles you drove a car. So, obviously there's some incentive to lie on this because the fewer miles you drive the less you have to pay. And so, if you sign at the top, you should be more honest and you should report more miles than if you sign at the bottom. It's like a very cutesy kind of--

Russ Roberts: Why? What's the logic?

Adam Mastroianni: It's because of psychology. I don't know. This is kind of what we do. 'Oh, you're reminded of--you're not anonymous,' and--sorry, the thing you're signing is specifically like, 'I'm going to be honest.' And so, if you do that at the beginning, you're going to be more honest than if you do that at the end.

And so, they found that this is true in some real world data. I mean, this data turns out to not be real world because the data was obviously made up.

That paper comes out. It's put in PNAS [Proceedings of the National Academy of Sciences], which is a very prestigious journal.

And, ten years go by. And, someone tries to replicate the results and they can't do it. And so, they publish their failure to replicate. That's all great.

As part of publishing that failure to replicate, they also publish for the first time the raw data from the original study, which had never been published before.

And, someone takes a look at it and notices that there are some weird things. For instance, it's an Excel spreadsheet and half of the data is in a different font than the other half of the data. Or, you also notice that if you plot the distribution of the miles that people claim to drive, it's totally uniform--which is really weird because when people report their miles, they almost certainly report--you know, they don't report 3,657. They report 3,600 or 3,650.

But, people were just as likely in this data to report 57 as they were to report 50.
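The rounding check described here can be sketched in a few lines. This is only a minimal illustration, not the analysis the investigators actually ran: the function names and both lists of mileage values below are hypothetical, made up for the example. The idea is that honest self-reports tend to pile up on round numbers, so the share of values that are exact multiples of 50 should sit well above the roughly 2% you would expect if the last two digits were spread evenly.

```python
from collections import Counter


def round_number_share(values, base=50):
    """Return the fraction of values that are exact multiples of `base`."""
    values = list(values)
    if not values:
        raise ValueError("need at least one value")
    return sum(1 for v in values if v % base == 0) / len(values)


def last_two_digit_counts(values):
    """Count how often each pair of final digits (00-99) appears."""
    return Counter(v % 100 for v in values)


# Hypothetical self-reported mileages: most people round to 50 or 100.
rounded_reports = [3600, 3650, 12000, 8500, 4130, 7250, 9950, 15000]

# Hypothetical suspicious mileages: no visible rounding at all.
unrounded_reports = [3657, 12031, 8519, 4102, 7263, 9948, 15077, 6611]

print(round_number_share(rounded_reports))
print(round_number_share(unrounded_reports))
print(last_two_digit_counts(unrounded_reports))
```

A low share of round numbers is not proof of anything on its own; it is a flag that someone should look more closely at where the numbers came from.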

And so, if you basically look a little closer, you realize that, like, this data is obviously fabricated, the effect that they tried to show. They just added some numbers to the original data. There's a great blog post on Data Colada, who are some psychologists who do a lot of work on replication.

So, all of that happened 10 years after the original paper was published and all the detective work couldn't even have happened at the beginning because the data was never made available to anybody.

So, if we're not catching it at the review stage, what exactly are we doing?

20:02

Russ Roberts: Now, listeners may remember that back in 2012, I interviewed Brian Nosek, who is also a psychologist and has been a very powerful voice for replication. And, again, if you're not in the kitchen, you wouldn't realize this: Replicating someone else's paper is almost worthless historically, over the last 50 years of this process. And, if you have suspicions and a result might be true, you think, 'Well, I'll go find out. I'll do it again.'

Well, if you find out that it is true, nobody wants to publish it. There's nothing new there.

You find out it's not true: maybe it isn't, maybe it is, but it's not a prestigious pursuit to verify past papers.

So, what Brian and others have done in this project is to try to bring resources to bear, to encourage people to do this kind of checking. And, results have been deeply disturbing--how few results replicate. Particularly in behavioral psychology, but that's just because that's where they started.

I think it'll end up coming to economics. We know it's also true in medicine. Certainly true in epidemiology. And, Brian and his co-authors, Jeffrey Spies and Matt Motyl, had an early version of your essay summed up in one beautiful phrase: Published and true are not synonyms.

Adam Mastroianni: Yes.

Russ Roberts: Now, most people would say, 'Oh yes, well, more or less,' or 'kind of,' or 'yes, of course there's some things that don't quite work.' But, your point really is, I think, more thorough. And, Andrew Gelman has also been on the program and Brian back in 2014[?]. Talk about all these issues: why fraud is an extreme case. There are many, many cases that are not fraudulent, where, due to various biases toward publication and other incentives that academics face, many of the results that are published in academic journals are not replicable. Or--excuse me--do not replicate when actually tested.

Why doesn't it? Why is this project that we--okay, we were making fun of it a little bit at the beginning--but on the surface, the idea of having an esteemed expert in the field verify or at least check or vouch for an unpublished result that it merits publication: that would seem to be a pretty good system. Why isn't it working? What kind of intuition do you have? I have my own. I'll share it, but what's yours?

Adam Mastroianni: Yeah. I think part of it is that to really vet a paper requires a ton of work. I think it's much more work than we expect referees to do, and it's certainly much more work than they actually do. If you have to crack open a data set and load it into your statistical software and run the analyses, almost certainly it's going to be really hard to do that because--well, maybe the data isn't labeled properly. Maybe they haven't provided the data for you. Maybe they haven't annotated their code in such a way that it's going to make it--or even given you their code.

So, all those things take a ton of time. I think that's one reason why you have to pay a ton of attention.

I mean, the other is: if you want to replicate their study--I mean some studies, yeah, you can just throw them on an online platform where you get participants like MTurk [Amazon Mechanical Turk] or something like that. And, the Data Colada guys, the Open Science Framework guys, this is mainly what they do. They take studies where you can do that.

Now that alone, it costs you at least several hundred dollars to run that study and maybe an afternoon. Anything more than that--so, say you wanted to replicate that study where they got actual data, supposedly from a rental car company where people were reporting how much they drove. I mean, that's going to take you months, if not years, to replicate that.

And so, it's really hard to check whether these results are actually true.

And I would add something additional, which is: I think in my home field of social psychology, for most results, it does not matter whether they're true or not. And, this, I think, is something that's often missing when we talk about replication: we don't have a good sense of: What are the things worth replicating? We know that these things got published, but do they actually matter? I think in most cases the answer is No.

We've spent years, now, in my field trying to figure out a phenomenon called ego depletion, which is just this idea--I mean, I can't think of a more charitable way to describe it other than: When you do boring stuff, you get tired. And there's all this controversy over, like, 'Did that really happen? If you do this task where you have to cross out every E on a page, does that make you more likely to eat a cookie than to eat a radish afterward?'

And, I just can't get to the bottom of why we care. It's obviously true that you can get tired when you exert effort. I don't know what we could find that would make us say that, like, you don't get tired. All we're really figuring out is whether this particular task creates this particular result. And I don't see any evidence that particular task and that particular result are all that important.

This is maybe unique to psychology, where we don't really have a paradigm. So, it's really hard to know what matters and what doesn't. We don't have a periodic table where we go, 'Well, we're missing number 67, so we really need to go find number 67.' We're really scrounging around in the dark.

25:31

Russ Roberts: So, as an economist, my summary of your insight about the time it takes is that: just the incentives aren't there. There's a certain principle and conscientiousness that's expected for reviewers, for referees. You don't get paid very much. Sometimes not at all. And, there's very little professional gain. You do ingratiate yourself sometimes with an editor, which is pleasant. They will maybe look favorably, you might hope, on your future submissions when you're on the other side of the fence, but there's just not much return to it, so people don't take it terribly seriously.

If you have a rarefied and incorrect view of academic life, you would say, 'Oh, but you would care about the truth.' And of course, most scholars do. But, if it means spending a year replicating the results and then corresponding with the author about--via, by the way, via the editor, keeping the process anonymous--about: 'What did you guys do with the zeros? Did you code it?' If they didn't answer, 'Did you code that as--you never told me,' right? Because there's a thousand decisions that you make in empirical work that are like that. And so, it's--it just doesn't happen. And, that's sad.

But this replication crisis--which again, I think its home base is in psychology, but it's spreading much more widely--is, it's kind of dramatic. It's not a very good system, the idea of peer review.

So, one more point, and I think listeners are saying, maybe, out there: Did they call this exhilarating? I mean, it's kind of interesting. But, for those of you who are still listening, here comes the exhilarating part. There's really two parts to it and the first is: You say something quite profound, although I'm biased to love it, so I confess that.

You say that this system is worse than nothing. It's one thing to say, 'Okay. So, sometimes it doesn't do its job. Sometimes it lets papers through that are maybe imperfect or rejects papers that are true, but not, quote, "interesting".' But, you say it's worse than that. What's your argument?

Adam Mastroianni: So, I think if a system claims to be doing something, but it doesn't actually do it--if it gains trust that isn't warranted--I think it leaves people worse off because they've been misled into trusting something that they shouldn't have trusted.

And so, the analogy that I use is if the FDA [Food and Drug Administration]--if you found out that the way the FDA figures out whether beef is tainted or safe enough to sell on the market, they just send some guy around to sniff the beef and if it smells bad, he goes, 'Oh, don't sell it.'

If that's all they're doing, I think you'd be really upset. Because of course he's going to catch some things, but most of the tainted beef is not going to smell bad enough for this guy to catch it.

And so, if you had known that that's actually what was going on, you might have done something else. You might have not purchased beef in the first place. You might have been willing to pay a company that will show you how well they vet the beef, to vet all your beef for you. Maybe you go in with a bunch of consumers. Maybe private enterprise steps in.

But, what's really bad is to think that a system is trustworthy when it isn't.

And so, an example of this is this whole thing about vaccines causing autism was in large part fueled by a paper in The Lancet--which is an extremely prestigious medical journal--with an N of 18 being like, 'Hey, there's some kids who have autism and they also had vaccines.' It was sort of like the standard of evidence. It's stamped with the imprimatur of The Lancet. And so, people take it really seriously. When, like, that paper wasn't ready for prime--that claim wasn't ready for the imprimatur of all the medical establishment going, 'Now we know vaccines cause autism.'

I think--what I've heard is--at the time, actually it was worth looking into. And that's what the state of the evidence was. That's not the way that people take the evidence when we think that peer review gives things the glow of truth. That's part of why I think it's worse than nothing.

Plus, it also costs a ton of money and time. And so, if it gives us an unclear benefit, but the cost is very obvious, that I think is also evidence that it's worse than nothing.

30:01

Russ Roberts: Yeah. I have to--I'm not going to name any names, but there's a long list of books that claim to give us the latest science on various aspects of life. And, that science is always peer-reviewed journal articles, many of which are false. And yet people are reading them thinking they're true, because, after all, 'It was in The Lancet. It was in the Journal of--whatever.' And so, that's the first point.

The second point is given the idea of how widespread this is, this illusion--because someone might say, reasonably, 'Oh, Adam, come on. Everybody knows it's a somewhat flawed process. Nobody really--'. I don't think that's true.

I'm going to give you the example of Daniel Kahneman, Nobel Prize winner in economics, who has a chapter in his book Thinking, Fast and Slow on the phenomenon called priming. And, priming is this idea that you say words related to, say, old people, and then you find out they walk more slowly. And--I think we've talked about this on the program before--I've always found it not to pass the sniff test. But it turns out it doesn't pass the replication test. So, my nose is not really the criterion by which we decide whether it's true or not.

But, I think what's interesting about that is that what Kahneman wrote about it--and I salute his honesty. I'm going to read you a very short quote. He said, quote, "I placed too much faith in underpowered studies," referring to these priming results that were done with very small samples. And, when larger samples came along, they did not replicate.

Here's what he wrote--and this is the interesting part. He said:

My position when I wrote Thinking Fast and Slow was that if a large body of evidence published in reputable journals supports an initially implausible conclusion, then scientific norms require us to believe that conclusion. Implausibility is not sufficient to justify disbelief, and belief in well-supported scientific conclusions is not optional....

Meaning: Okay, it doesn't pass the sniff test, but it was peer reviewed.

So, your intuition might be, 'That's ridiculous. There's no way people do that.' But: it was peer reviewed, so you have to accept it. It's science.

So, here's a Nobel Prize winner confessing that he has fallen prey to the very phenomenon you're talking about. This isn't just people writing books purporting to convey science.

Adam Mastroianni: Yeah. Which I think is really unfortunate: that, like, you should aspire to have the appropriate level of trust in a system given its level of performance. That's something that I really try to cultivate in my own mind: that, I know that when I see a paper in a published journal, that really all the vetting that's been done is someone took a glance at it and thought it was interesting and didn't notice on a very light read any obvious errors.

And so, when I read that paper, my initial reaction is always: Could be. This could be true.

And, if I really want to know whether it's true or not, I unfortunately have to apply additional effort to it.

And, the amount of ideas that I'm actually willing to do that for is actually very small. And, this is why I think, like, most people are very worried about whether studies replicate.

I worry about whether it mattered whether they were done in the first place. Because, when I look back at the papers that I published and I look at, 'Okay, of all the citations in here, what percentage were critical to my paper?' Like: What percentage of citations, if they turned out to be false, would affect my paper negatively? Rather than just I'd have to delete a sentence?

And, on average my answer was, like, about 12 to 15%.

And, most of those are citations to the statistical packages that I use: That, if those aren't processing the numbers correctly under the hood, that is very bad for me. My results are unlikely to be true.

And, there's only a handful in any paper of: This is an empirical result that if it didn't happen, I'd be in big trouble.

And so, for those, I'm willing to expend the effort to see whether they're true or not and really look at them closely.

For anything else, when I look at it, I go, 'Well, could be. I don't know. It's an interesting exercise to think about whether it might be true or not.' But, I'm not going to bank on it.

34:14

Russ Roberts: Yeah. And this is where we get to the first conclusion that I think is quite profound. But, again, I confess it plays so much to my biases. I'm open to worrying that I'm overconfident in it. But it is amazing to me, that: most people believe that more information is better. I have a book out, Wild Problems, where I talk about the challenge of making decisions when we don't have reliable information--whether to get married, whether to have children, how many, whether to move to a new job, etc. You don't have a reliable empirical source. You don't have a bunch of information at all.

And, a lot of people respond to that book by saying, 'Well, okay. Yeah, it's not perfect, but you should try to get more information. And, the more you have, the better because you can make a more informed decision.'

And, that requires an assumption that I think is not true. And, your point, I think illustrates this beautifully, Kahneman's point illustrates it beautifully: More information is better only if you can weigh it properly. Only if you can assess it properly. If you overreact to it, if you are overly confident in the information, the imperfect information you get, you make a different kind of error.

And, I think this is so unintuitive and so unpleasant. I've gone the other way. I find this really pleasant--because, again, it plays to what I like to think about the world and the reliability of much empirical work. But I think it raises a question for people on the other side that maybe you're being a little too overconfident about our ability as so-called rational people and the ability of our reason to consume information well.

And, when I, quote, when I complain about, you know, empirical findings, people say, 'Well, what's the alternative? Using your gut?' And, the answer is, 'No. It's understanding that the information you have is not always reliable.'

It's amazing how hard that is to remember.

Adam Mastroianni: Yeah. I think behind this feeling of 'I need to get more information and that'll help me make a better decision,' especially in science, is this idea that we are at the end of science. That, in the past these were people who were just groping around in the darkness. They had no idea what they were doing. But, fortunately we were born into an era where we have it all figured out. Now, obviously, there's a few mysteries left and we're doing some stuff on the margins. But, like, really we're pretty much just crossing the t's and dotting the i's.

I just think that that is so remarkably wrong that history is going to continue--I mean God willing, history will continue after our lifetimes and humanity will continue to produce science and uncover knowledge.

And we will look ignorant very soon. And I think the thing that we'll look most ignorant for is thinking that we weren't ignorant.

And I see myself as one person in a very long history of humans. And, I think a really noble and beautiful history of trying to make us a little bit less ignorant. But, there isn't some threshold that we cross when we go, 'We have come out of the dark ages and now we understand.'

We maybe understand a little bit more than we did before. I think when you feel that way, you look around at the information coming out and you go: All of this could wash away tomorrow. Any of this could be completely untrue. Our understanding of the world could be revolutionized in the next 10 years. And so, don't get too attached to anything.

You need to make decisions and you need to figure out what to do, but this level of certainty that I think that we want is impossible to get. So, you need to come to terms with the fact that mainly we operate in the darkness.

And, I think it's actually very exciting because there's so much left to do and so much left to discover.

Russ Roberts: Yeah. I encourage listeners to go back--we'll link to it--to the episodes with Robert Burton on certainty. We have a real urge for it. It's very comforting. We don't like darkness. We like the light. We like to look where the light is for our keys under the lamp post; and often that's not where the keys were lost. I also would mention the Chuck Klosterman episode, But What If We're Wrong? I think a lot of people believe: 'Well, that was a few years ago we had to worry about that, but not now. Now we know a lot more.' It could be true. As you say, could be.

38:37

Russ Roberts: This brings us to, to me, the deepest insight of your essay, which I think applies to many, many things besides peer review, which is the following. I've been in this area now for over a decade at least, going back to my interview with Brian Nosek. I've thought about it a lot, taken a lot of complaints from my peers in economics for my skepticism about empirical work. It's colored the way I see essays like yours. I've confessed to that.

But, deep down, if you ask me, 'Well, what's to be done about it?' And, my natural thought has always been, 'Well, we just have to make it better. I mean, we need better incentives.' The Open Science Project with Brian Nosek is a perfect example. He has funding to give people incentives to do checking and replication, which doesn't pay much in prestige, but now we can compensate people. We'd have journals about it. They can add to their resume, and so on and so forth. And, we just need a better system.

But, you jumped the shark, Adam. You actually suggested it's over. We tested it for the last 60, 80 years or so. It doesn't work. Get rid of it. Is that a correct summary? I mean, it's a shockingly bold, outside-the-box idea. I loved it because I realized it never crossed my mind; and it opens up the possibility there are many other areas where my inability to think outside the box is hampering my ability to come up with interesting alternatives. Is that a good summary?

Adam Mastroianni: Yeah. I mean, I would certainly say it doesn't work; and that is what I really would like people to have in their heads. And then: what do we do next?

I think there's many answers to that question. Some of the people that I heard from feel the same way that you did: that, really, we have to make it work. We have to tweak it and we have to build new things. And, I say, let a thousand flowers bloom. Clearly some people are invested in that, so go for it. I think I am uniquely uninterested in trying to control other people's behavior and to say, 'Nobody should do this and everybody should do that.'

What I'd like to do is make a historical claim, which is 'This is new and weird,' an empirical claim, which is 'This doesn't seem to do the thing that we intend for it to do,' and then leave it to the diversity of humanity to figure out what to do about that.

I have my own answer, which is I feel like I know the way that I do science the best, which is: to write it in the words that I think I should use, to write it for a general audience so anyone can understand it, to include all the data and code and materials so that the very few people who want to open up that code and data and see exactly what I did and how it worked can do that, and then put it out there for anybody to see. And to trust that if what I have to say is interesting and useful to people, that they will tell me. And, they'll tell me how they think it could be better.

And this is exactly what I did. About a month and a half ago, I was writing this paper with a collaborator of mine named Ethan who--and we were trying to write this paper about sort of this weird and speculative idea about imagination. We had all these studies. And, we were trying to write it up for a journal, and we felt like we couldn't do it without lying--that just at some point we had to imply that we knew how it worked. Or: Study Eight, we forgot why we ran it and we have to figure out some way of maneuvering around the fact that we ran this study a few months ago and it was very clear to us why we did it then. Now the results are interesting, but we can't revive the reason that we had to do it in the first place.

And we realized, like, we can't write this paper in the way that a journal would force us to write it. We can't pretend that it's related to all this literature that it's not actually related to.

And so, instead, he and I wrote it in regular words. And so, when we got to Study Eight, we said, 'Hey, we forgot why we ran this study. That's our bad. We should have kept better notes. If you know why we ran this study, we'd love to hear from you. But here are the results and we think the results are actually interesting regardless of the reason we ran it.' I mean, all those studies were pre-registered, by the way. So, it wasn't like we were making up the analysis as we went, but we forgot why.

Russ Roberts: Explain what that is, pre-registered.

Adam Mastroianni: So, just this idea that you should state publicly, or at least in some timestamped document, the analysis that you're going to do before you do it. And, people do this to prevent themselves from being able to basically cherry-pick their analyses or go, 'Oh, okay, well this analysis didn't come out. But, actually it really would be better if you did it a little bit this way.' And, now all of a sudden it gives you some results or the results that you wanted. So, it's to combat that.

And so, we wrote that paper so that anybody can understand it and just put it on the Internet. And, what could have happened is nobody could have cared. They could have said, 'I don't want to listen to this because it hasn't been peer-reviewed. It isn't written in the way a normal paper is written. There are jokes in it. I don't like that.'

And, instead, I'd put it on this site called PsyArXiv, which is just a place you can put PDFs [portable document format] of psychology papers. Normally, people put things there before they submit it to a journal, or once they submit it to a journal to stake their claim and say, 'Here's my paper. Just so you know it's timestamped and no one can scoop me because I know it's going to be a year before it comes out in a journal.' We just put it there as its final resting place.

I woke up the next morning to find that thousands of people had read it and were talking about it and responding to it. And they were sending me reviews. People were writing in to say, 'I think I know why you ran Study 8. Here's why.' One of the few people that we cited-- one of the few people who had done research that was relevant to what we did--wrote to us and was, like, 'Here's how I got involved in doing that research that you cited, and here's how I think you might be able to turn your effect off.'

I mean, it was great. It felt like the way that science should work.

And, again, it could have worked that everybody would've just ignored it. And, that would have been a useful signal too, because we would have realized, like, 'Okay, this isn't useful to anybody,' or, 'It isn't useful today.' That would be fine. I think actually that should be the modal response to scientific claims--is neglect--because most of them don't matter or we haven't figured out whether they matter or we can't use them yet or at all.

But instead, I got what are peer reviews. Like, one person sent me an annotated PDF of the paper, saying, 'Here is what my review would be.' And, I said, 'Thank you.'

And, it felt so much better to engage in the process like that where I'm open to feedback. Like, I edited--we changed the paper and uploaded a new version based on the feedback that we got.

What I don't want is feedback at the end of a gun, where the options are, 'Change your paper or else we're not going to publish it.' I just think that you should be able to say, 'I see your point, but I don't agree and I'm going to keep the paper the way it is because I think that's the right way to do it.'

And people can argue about it. You can suffer for that choice. Maybe your paper is worse and fewer people will use it.

But, this is the way that I intend to do science going forward. It's not the way I think that everybody should do it if they don't like doing it that way. I heard from people who said, 'I like submitting my papers and getting them reviewed in this way.' And, I say, 'Great, keep doing it.'

I just think that we should have a more diverse ecosystem where people can do research in different ways, communicate in different ways. And, I know what part of that ecosystem that I want to live in.

45:47

Russ Roberts: Yeah. Let's take a step back for a minute and think about how science has changed in terms of--as a profession. So, you point out--a beautiful point--that only one of Einstein's papers was peer reviewed, and it annoyed him so much he published it somewhere else when he found out the editor had sent it out. And I think again: one of the great insights of this short and accessible essay--for those of you at home who want to read it, we'll link to it--is that it wasn't always this way.

And, once you realize that, then you could imagine an alternative. It's a really important, I think, tool for thinking generally--much beyond these very narrow issues.

But, when we think about the world before peer review--and before journals of the kind that exist today--there are a couple of things that strike me as obvious. I don't know if they're important, but they are obvious.

There are a lot fewer people engaged in this enterprise. The ones who were engaged were really good at it. They were the crème de la crème. The ones who had fallen by the wayside weren't worthy of staying in the game.

The returns to the game were very small. You didn't get tenure, usually. You didn't get a big grant. So, there was prestige, but it was a different kind of prestige than we have today.

And an academic life today is publication. For those--many of you listening who aren't academics have heard the phrase 'Publish or perish'--there's a huge incentive to publish. And, that quote changes everything. I mean, it changes what you think about, how you think about it. It pushes you towards smaller thoughts and smaller questions because the return is more--safer. And, most people in academics are not risk-takers.

So, in the old days, you had bolder risk-takers. They were a smaller portion of the population. There weren't many journals. And, to the extent that there were journals, they were very different. As you say, they were likely from the Royal Society. They had a strong incentive to preserve their reputation.

Truth emerged from the interactions of these exceptional people either accepting or not accepting the work of their peers. So, it was peer reviewed, but not in the way it is today.

And, I don't know if we can go back and put that genie back in the bottle, given the modern world of academic life. So, what are your thoughts on that?

Adam Mastroianni: Yeah. I guess it's funny. Like, I have very little interest in trying to fix academia because it feels impossible, and it feels like you could waste the rest of your life doing it. I'm glad that there are people who are interested in it.

I'd rather build something else. Like, I'd rather contribute to the ecosystem. I mean, my dream is to build an alternative research institution that can do things differently. Not because I think everybody should do them that way, but somebody should do them that way. And, I want to be one of the people doing them that way.

The other thing that I would say is, and this is one of the points I've made in the piece, is that I think science is a strong-link problem rather than a weak-link problem.

So, in a strong-link problem, you care about your best, basically. In a weak-link problem, you care about your worst. So, I mean, a literal physical chain is a weak-link problem. A chain is only as strong as its weakest link. But science, I think, isn't a chain. And, basically the worst work that we do doesn't matter. We can just ignore it. We proceed at the rate of our best work.

I think a good example of this is Isaac Newton revolutionized a lot of things. Mechanics, we still use. He also came up with a recipe for a Philosopher's Stone. That didn't go anywhere. But it also didn't really hold us back. I mean, maybe it wasted some people's time trying to replicate his recipe for a Philosopher's Stone. But really what mattered was the best work that he did. And we wouldn't gain all that much if we had stopped Newton from publishing his Philosopher's Stone recipe.

I feel the same way today: that we actually don't gain all that much from preventing the publication of really low-quality work, because basically it just doesn't go anywhere. At worst, it distracts us.

We do lose a lot from preventing the publication of very high-quality work.

I think this is a unique set of trade-offs in science: that, I'm not willing to give up the best in order to prevent some of the worst.

I feel differently about my doctor, for instance. I care a lot about going to a doctor that isn't going to harm me. And so, I'm willing to give up some of the best doctors if it means I get to prevent some of the worst doctors.

I just think science works differently, and that in the long term, the truth wins out. I think that's been true historically. And so, what we want to do, I think, is actually increase the variance of the work that we do, because the bad stuff basically ends up not mattering in the long run and the good stuff changes the world.

50:51

Russ Roberts: That's well said. This is part of a larger debate that the world's been having for the last--oh, all of seven years or so--about whether social media should be filtered, moderated? Should we allow untrue claims as if we could identify them with a truth meter? People have tried to do such things: to try to have committees and advisory boards to assess accuracy of scientific medical--fill in the blank--information on Twitter.

And, it seems like on Twitter, Facebook--social media generally--that it might be a weak-link problem. Someone who is poking around making wild conspiracy claims that are believed by a lot of people is maybe not so healthy.

But, science does not seem to work that way, or at least it didn't in the past. And, I think your point about, say, alchemy or other efforts: they failed, and the marketplace of ideas judged them harshly. The marketplace of ideas, I would argue, on social media is not quite as resilient, doesn't work quite as well. The incentives aren't there.

But, I think the beauty of so-called science is that the experts who might vet or opine on scientific reliability have reputational things at stake that are not present when they are anonymous reviewers for an obscure journal. And, the public, kind of unfiltered sharing of information you're talking about might actually work.

Adam Mastroianni: Yeah. I mean, one thing I would say about the experiment of social media is it's still pretty new. And, it could still be, in the long run, that this is a period of time where we all tried something that didn't work. We all got in a room together and shouted, and that's what Twitter was. We realized: That's not the way to do things. And, maybe 10 years from now, we look back on this time as, like, a wild experiment that we're glad is over. Which is why I feel much less certain about that than I feel about peer review, where I think we actually do have, maybe, 50 years or so of data; we could see like: 'Okay, that has a pretty big cost. The benefit should be pretty obvious. It's hard to find that benefit. And so, maybe we should move on or at least try something else.'

But, I think it really depends on the scale of your thinking. That, it could be really true that in the short term, the best work and the worst work both have a similar impact. And, it's only as you go out further that the best work is really what sticks around and the worst work falls away.

53:24

Russ Roberts: Of course, the other part of that is that a journal article on your resume enhances it. I mean, there's a point where you have so many that the next journal article doesn't enhance it very much. But there's a strong bias for publication on the part of young scholars. And, a non-peer reviewed system or a non-journal--I'm going to call it a journal system, a more open source system for publishing like you did, where you just say, 'Here's what I found. I hope somebody finds it interesting.' That encourages you to work on important things, because, if it's not important--if it's minutiae--people are not going to spread it around or pay attention to it.

But, a journal does pay attention to minutiae. And, as you work down the tiers of quality, too, especially--and I think the other thing that's overwhelmingly true about academic life in the United States and other countries--Western countries, at least, I don't know about the rest of the world--it incentivizes you to work on small things that are safe. And they probably get to a result that is statistically significant. And, whether it's important for the world--I think you've alluded to in passing two or three times in our conversation.

I think that it's worth breaking it out in more intensity. Most scholars work on tiny things that are relatively unimportant. And, that's a tragedy. That's worse than the 15,000 hours on reviewing.

Adam Mastroianni: Yes. I once heard from a very prominent scholar that you could delete half of her CV [curriculum vitae] and you would lose nothing. And, I thought that was the most tragic thing I'd ever heard. Because, those papers, I mean, took years to write. They took maybe millions of dollars in grant money, probably came from the public--and it all just didn't matter? I mean, what if you got to the end of your life, and you're like, you could get rid of half of my years and you wouldn't have lost anything about my life? I just don't want to do work like that.

And I hate the idea of encouraging anyone to do work like that.

So, I think there are people who work in the system right now who don't feel that way, and who feel like, 'Each thing I produce, I feel--' And, I say, 'Great.'

But, I think that there should be something for the people who want to swing for the fences. Or who want everything that they try to be worthwhile, whether it works or not.

And, I'd like to be one of those people. So, that's why I'm doing it this way.

Russ Roberts: Adam, may I ask how old you are?

Adam Mastroianni: I'm 31 years old.

Russ Roberts: Right--so you're young, even for an academic. You're at the beginning of your career. Clearly a thoughtful person.

But, writing an essay like this is high risk. Saying the things you are saying on this program--they're a little bit heretical for the church of academic life. What kind of reaction have you gotten to this essay, which--let me back up. Why did you think of writing this? And, did you have any trepidation? You should have. Maybe. And, what kind of reaction did you get? It's a gutsy thing to do.

Adam Mastroianni: Yeah. I guess I wrote it because I felt like it was true and I didn't see anyone see it or I didn't see anyone saying it.

And it felt, like, I mean--to the earlier part of our conversation--that people have this model, this idea of how these things work. And that idea is wrong. I thought that it would just be helpful to basically give permission to people to stop believing in it and to say that the emperor has no clothes. So, that's why I wrote it.

The--it's funny. When I originally posted it--I write about a lot of things related to science and psychology on my blog, and I thought, 'Ah, this is probably going to be one that not a lot of people read. It's kind of inside baseball.'

And, now it's been viewed by, like, 215,000 people; and it's the most popular thing that I've written. Which goes to show I have no idea what is going to be useful to people.

Which is also why my meta strategy has never been to try to figure out what is useful to people or what's going to work on the Internet, but instead to ask: What gets me fired up the most? Like, what sticks in my head the most? And to trust that I'm not that weird of a person. And, what's true for me is going to be true for a lot of other people as well.

In terms of the reaction that I've gotten--so, a lot of people commented or wrote to me saying, you know, like, 'Thank you for saying this.' It's like, 'I've had such bad experiences,' and shared terrible experiences that they had had with peer review. One comment I got recently was, 'I wrote this training grant.' So, often graduate students will apply to the National Science Foundation to get a grant to support their graduate education. And, this person had written one of those grants with a trainee and it says at the top, like, 'Training Grant.'

And, it got rejected. And, the one-line response that they got was, 'This is a good idea if only it had been submitted as a training grant.'

I mean, truly just pour one out for that poor graduate student who now doesn't get this funding because someone didn't read it.

Anyway, so I got a lot of that, and I got a lot of pushback.

So, literally, there was a tenured professor in my comments saying, like, 'Adam, are you still,'--so I was a resident advisor when I was in graduate school at Harvard and I think this person had Googled me and was like, 'Are you still a resident advisor there? I have severe concerns about your ability to mentor undergraduates. I'm a Harvard alum. This is very serious.' And so, which was sort of 'Golly, this whole threat--' I mean, fortunately it's not my job anymore, but threatened my job on the Internet, based on writing this article.

And so, I wrote a whole follow-up piece responding to some of these. I mean, not to the threats--there's nothing really to say to those--but to some of the things that people have said. So, a lot of people were like, 'Well, could we tweak it in this way? Could we tweak it in that way? What if we did that?'

To which I said, 'Great. If you believe in it, you should do it.' I want to live in an ecosystem of people who are trying to solve problems that they believe are problems and using solutions that they believe in. I don't think that those things will work, but I'm just some dude. Like, 'If you think they'll work, go for them. I'm going to do this thing that I think will work.'

Other people talked about, you know, 'If everyone did what you did, we would live in a world of chaos. This is just people saying stuff.'

Russ Roberts: The Wild West. The Wild West.

Adam Mastroianni: Yeah. And I thought, man, this is--the next essay that I'm working on is called the radical idea that people aren't stupid. Because I think it is a radical idea.

And I think it is historically very strange to think that we live in a world where people are capable of processing information rationally--to the best of our abilities. We don't think that the people are omniscient. But, this is, I think, a very long historical struggle.

I was reading a history essay recently about how a key part of the American Revolution--a key radical idea--is that people in general can benefit from education.

And, before that, there was this feeling that, 'No, a few people are born who are capable of understanding the world properly and everyone else must be controlled by those people.' And, part of the American Revolution was--I mean, this took a while. This wasn't at the very beginning--but, to say that: No, actually if you offer everyone a free and appropriate public education, that they can be made better. I think there's just something beautiful and noble about that idea.

And, I feel similarly about producing knowledge. That, I know that not everybody will agree with it and it won't be useful to every single person, but I believe that people in general can benefit from it, and that I can benefit from their reading it and then their commenting on it.

This is why I write in regular words on my blog rather than writing scientific legalese in journals, because I think I'm better for it and they're better for it, too. I think it's a bit of a radical idea, but it's one that I'm willing to die for.

1:01:25

Russ Roberts: The Wild, Wild West thing is--as a Hayekian believer in emergent order, I think if more people felt the way you did, other institutions, norms, and other things would emerge that--from the bottom up, which is really what you're suggesting--it's an ecosystem of discovery and exploration that existed in the past. Maybe it'll work better in our time. Maybe not. There are things that are different. As I said, there's way too many Ph.D.s, but maybe it'll work. But, are you claiming--not 'claiming'--are you saying that from now on your research will not be submitted to an academic journal? And, if so, do you plan to remain in academic life or you're hoping that it'll turn out well? Are you independently wealthy, Adam, that [inaudible 01:02:16]--

Adam Mastroianni: No--

Russ Roberts: What's going on here?

Adam Mastroianni: No. I have one more paper in the pipeline that's under review right now that I submitted a while ago. My intention after that paper comes out is to never submit a paper to a scientific journal again.

Now, look, it might happen in the future that, like, I need an academic job, and in order to do that, I need to play the game. I'd rather not. My hope is that--I mean, I write about this all on my Substack--my hope is that enough people value it, that I can make a living doing it there. And, I think that's possible. I think it'll still take a while.

So, I've never thought about myself as, like, 'Oh, I'm leaving academia.' That: I'm totally willing to teach classes to be part of an ecosystem and an institution, but not at the cost of writing papers that I don't believe in. That's just too great a cost. I think we take on a lot of costs to be scientists and to be academics. We could all make more money doing something else. We have prestige in our own world, but we have other options.

And, you're supposed to get something for the trade-offs that you make. And, what you're supposed to get is the freedom to say the things that you think are true. And, if you don't have that freedom, this life isn't worth it. So, yeah, I'm going to say the things I think are true and see what happens.

Russ Roberts: But, why can't you do that via the peer review process? Why not just submit papers that you think are true? I mean, why is playing the journal game a violation of your philosophical values?

Adam Mastroianni: I think because in order to succeed at that game, I inevitably have to say something that I think isn't true, or that I don't believe in, or write a paper in a way that I don't think is the best way to write it.

So, I spent much of the past year writing long responses to reviewers and reading their long responses to me that end up with, like, 'Okay, this is too speculative. I need to pull it back,' or 'I need to run this study that I think is not a good use of time and doesn't tell us anything, but it will satisfy this reviewer.'

And, this all takes a long time. All those months that I spend writing those comments are months that I don't spend actually producing truth.

I also--there was a point where I published a paper last year where I spent a day arguing with a journal as to whether I was allowed to use the word 'years' instead of always shortening it to the letter Y? And, they just insisted that I had to shorten it to the letter Y, and I could not say the word 'years'. And I had to make this case to them that I want my paper to be readable, and to be readable to people who might not know that Y stands for years, especially when this is just in the introduction.

And so, look, I try not to be crazy. I'm willing to make some sacrifices, but I'm not willing to say words that I don't mean. And I find when I write these papers and I put them on the Internet, I can do so much of a better job explaining them to people and actually representing what I did than when I have to produce what can pass peer review, which is basically a legal contract.

And I've heard from people who read the last paper that I posted that were like, 'I read this out loud to my eight-year-old daughter and she got it.' And, I thought that's what I want to do. Why wouldn't you want to produce something that an eight-year-old could understand? Maybe that eight-year-old could be the next great scientist.

Why would you want to paywall it so that you can't access it in the first place and then write it in words that you can't understand?

So, yeah: I want to live a life that I think is true and better, and I'm willing to take on the risk to do it.

Russ Roberts: My guest today has been the very brave Adam Mastroianni. Adam, thanks for being part of EconTalk. I salute you.

Adam Mastroianni: Thank you. Thanks for having me.