Posted by timr 1 day ago

A flawed paper in management science has been cited more than 6k times (statmodeling.stat.columbia.edu)
699 points | 360 comments
SeanLuke 1 day ago|
I developed and maintain a large and very widely used open-source agent-based modeling toolkit. It's designed to be highly efficient: that's its calling card. But it's old: I released its first version around 2003 and have been updating it ever since.

Recently I was made aware by colleagues of a publication by authors of a new agent-based modeling toolkit in a different, hipper programming language. They compared their system to others, including mine, and made kind of a big checklist of who's better in what, and no surprise, theirs came out on top. But digging deeper, it quickly became clear that they didn't understand how to run my software correctly; and in many other places they bent over backwards to cherry-pick, and made a lot of bold and completely wrong claims. Correcting the record would place their software far below mine.

Mind you, I'm VERY happy to see newer toolkits which are better than mine -- I wrote this thing over 20 years ago after all, and have since moved on. I wasn't going to bother complaining, but several colleagues demanded I do so, so I took it up with the journal. After a lot of back-and-forth, however, it became clear that the journal's editor was too embarrassed and didn't want to require a retraction or revision. And the authors kept coming up with excuses for their errors. So the journal quietly dropped the complaint.

I'm afraid that this is very common.

mnw21cam 1 day ago||
A while back I wrote a piece of (academic) software. A couple of years ago I was asked to review a paper prior to publication, and it was about a piece of software that did the same-ish thing as mine, where they had benchmarked against a set of older software, including mine, and of course they found that theirs was the best. However, their testing methodology was fundamentally flawed, not least because there is no "true" answer that the software's output can be compared to. So they had used a different process to produce a "truth", then trained their software (machine learning, of course) to produce results that match this (very flawed) "truth", and then of course their software was the best because it was the one that produced results closest to the "truth", whereas the other software might have been closer to the actual truth.

I recommended that the journal not publish the paper, and gave them a solid list of improvements to pass on to the authors to be made before re-submitting. The journal agreed with me and rejected the paper.

A couple of months later, I saw it had been published unchanged in a different journal. It wasn't even a lower-quality journal; if I recall correctly, the impact factor was actually higher than the original journal's.

I despair of the scientific process.

timr 1 day ago|||
If it makes you feel any better, the problem you’re describing is as old as peer review. The authors of a paper only have to get accepted once, and they have a lot more incentive to do so than you do to reject their work as an editor or reviewer.

This is one of the reasons you should never accept a single publication at face value. But this isn’t a bug — it’s part of the algorithm. It’s just that most muggles don’t know how science actually works. Once you read enough papers in an area, you have a good sense of what’s in the norm of the distribution of knowledge, and if some flashy new result comes over the transom, you might be curious, but you’re not going to accept it without a lot more evidence.

This situation is different, because it’s a case where an extremely popular bit of accepted wisdom is both wrong, and the system itself appears to be unwilling to acknowledge the error.

FeloniousHam 11 hours ago||
Back when I listened to NPR, I shook my fist at the radio every time Shankar Vedantam came on to explain the latest scientific paper. Whatever was being celebrated, it was surely brand new. Its presentation on Morning Edition gave it the imprimatur of "Proofed Science", and I imagined it getting repeated at every office lunch and cocktail party. I never heard a retraction.
BLKNSLVR 1 day ago||||
It seems that the failure of the scientific process is 'profit'.

Schools should be using these kinds of examples in order to teach critical thinking. Unfortunately the other side of the lesson is how easy it is to push an agenda when you've got a little bit of private backing.

a123b456c 1 day ago|||
Many people do not know that Impact Factor is gameable. Unethical publications have gamed it. Therefore a higher IF may or may not indicate higher prominence. Use Scimago journal rankings for non-gameable scores.
PaulHoule 1 day ago||
Science and Nature are mol-bio journals that publish the occasional physics paper with a title you'd expect on the front page of The Weekly World News.
bargle0 1 day ago|||
If you’re the same Sean Luke I’m thinking of:

I was an undergraduate at the University of Maryland when you were a graduate student there in the mid nineties. A lot of what you had to say shaped the way I think about computer science. Thank you.

domoregood 1 day ago|||
Comments like this are the best part of HN.
sizzle 21 hours ago|||
Imagine if you did a bootcamp instead
oawiejrlij 1 day ago|||
When I was a grad student I contacted a journal to tell them my PI had falsified their data. The journal never responded. I also contacted my university's legal department. They invited me in for an hour, said they would talk to me again soon, and never spoke to me or responded to my calls again after that. This was in a Top-10-in-the-USA CS program. I have close to zero trust in academia. This is why we have a "reproducibility crisis".
neilv 1 day ago|||
PSA for any grad student in this situation: get a lawyer, ASAP, to protect your own career.

Universities care about money and reputation. Individuals at universities care about their careers.

With exceptions of some saintly individual faculty members, a university is like a big for-profit corporation, only with less accountability.

Faculty bring in money, are strongly linked to reputation (scandal news articles may even say the university name in headlines rather than the person's name), and faculty are hard to get rid of.

Students are completely disposable, there will always be undamaged replacements standing by, and turnover means that soon hardly anyone at the university will even have heard of the student or internal scandal.

Unless you're really lucky, the university's position will be to suppress the messenger.

But if you go in with a lawyer, the lawyer may help your whistleblowing to be taken more seriously, and may also help you negotiate a deal to save your career. (For example of help, you need the university's/department's help in switching advisors gracefully, with funding, even as the uni/dept is trying to minimize the number of people who know about the scandal.)

lancewiggs 1 day ago|||
I found mistakes in the spreadsheet backing up 2 published articles (corporate governance). The (tenured Ivy) professor responded by paying me (after I’d graduated) to write a comprehensive working paper that relied on a fixed spreadsheet and rebutted the articles.

Integrity is hard, but reputations are lifelong.

lotsofpulp 11 hours ago|||
>PSA for any grad student in this situation: get a lawyer, ASAP, to protect your own career.

Back in my day, grad students generally couldn't afford lawyers.

sizzle 21 hours ago||||
Name and shame?
bflesch 1 day ago|||
Name and shame these frauds. Let me guess, was it Stanford?
consp 1 day ago|||
This reminds me of my former colleague who asked me to check some code from a study (which I did not know had been published), and I told him I hoped he had not written it, since it likely produced the wrong results. They claimed some process was too complicated to do because it was supposedly worse than O(2^n) in complexity, decided to do a major simplification of the problem, and took that as the truth in their answer. In the end, the original algorithm was just quadratic, not worse, the data set was easily doable in minutes at most (not days as claimed), and the result did not support their conclusions one tiny bit.

Our conclusion was to never trust psychology majors with computer code. As with any other field of expertise, they should at least have shown their idea and/or code to some CS majors before publishing.

trogdor 1 day ago|||
> it became clear that the journal's editor was too embarrassed

How sad. Admitting and correcting a mistake may feel difficult, but it makes you credible.

As a reader, I would have much greater trust in a journal that solicited criticism and readily published corrections and retractions when warranted.

steveklabnik 1 day ago||
Unfortunately, academia is subject to the same sorts of social things that anything else is. I regularly see people still bring up a hoax article sent to a journal in 1996 as a reason to dismiss the entire field that one journal publishes in.

Personally, I would agree with you. That's how these things are supposed to work. In practice, people are still people.

ameligrana 1 day ago|||
I'll take the occasion to say that I helped make/rewrite a comparison between various agent-based modelling frameworks at https://github.com/JuliaDynamics/ABMFrameworksComparison. I'm not sure it represents all of them fairly, but if anyone wants to chime in to improve the code of any of the frameworks involved, I would be really happy to accept the improvement.
ameligrana 1 day ago||
SeanLuke, I tried to fix an issue about Mason that I opened when I was looking into this two years ago (https://github.com/JuliaDynamics/ABMFrameworksComparison/iss...) and tried to notify people about it with https://github.com/JuliaDynamics/ABMFrameworksComparison/pul.... Hopefully the methodology is correct; I know very little about Java... In general, I don't think there is any very good performance comparison in this field at the moment, but if someone is interested in trying to make a correct one, I will be happy to contribute.
achillean 1 day ago|||
I had a similar experience where a competitor released an academic paper rife with mistakes and misunderstandings of how my software worked. Instead of reaching out and trying to understand how their system was different from mine, they used their incorrect data to draw their conclusions. I became rather disillusioned with academic papers as a result of how they were able to get away with publishing verifiably wrong data.
orochimaaru 1 day ago|||
I think the publish-or-perish academic culture makes the field extremely susceptible to glossing over things like this - especially for statistical analysis. Sharing data, algorithms, code, and methods for scientific publications will help. For papers above a certain citation count, which makes them seem "significant", I'm hoping Google Scholar can provide an annotation of whether the paper is reproducible and to what degree. While it won't avoid situations like what the author is talking about, it may force journal editors to take rebuttals and revisions more seriously.

From the perspective of the academic community, there will be less incentive to publish incorrect results if data and code are shared.

cannonpalms 1 day ago|||
Is this the kind of thing that retractions are typically issued for, or would it simply be your responsibility to submit a new paper correcting the record? I don't know how these things work. Thanks.
contrarian1234 1 day ago|||
Maybe naive, but isn't this what "comments" in journals are for?

They're usually published with a response by the authors.

pseudohadamard 21 hours ago||
I reviewed for Management Science years ago, once. Once. They had a ridiculously baroque review process with multiple layers of reviewing and loops within them where a paper gets re-reviewed over and over. I couldn't see any indication that it improved the quality over the standard three-people-review-then-vote process. The papers I was given were pure numerology: long equations involving a dozen or more terms multiplied out, where changing any one of them would throw the results in a completely different direction. And the weightings in some of the equations seemed pretty arbitrary, "we'll put a 0.4 in here because it makes the result look about right". It really didn't inspire confidence in the quality of the stuff they were publishing.

Now I'm not saying that everything in M-S is junk, but the small subset I was exposed to was.

nairboon 1 day ago||
Nowadays, high citation counts don't mean what they used to. I've seen too many highly cited papers with issues that keep getting referenced, probably because people don't really read the sources anymore and just copy-paste the citations.

On my side-project todo list, I have an idea for a scientific service that overlays a "trust" network on the citation graph. Papers that uncritically cite other work with well-known issues should get tagged as "potentially tainted". Authors and institutions that accumulate too many such sketchy works should be labeled accordingly. Over time this would provide a useful additional signal beyond raw citation numbers. You could also look for citation rings and tag them. I think that could be quite useful, but it requires a bit of work.
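
A rough sketch of the propagation step I have in mind (the function, the toy graph, and the "critical citation" exception below are all made up for illustration; a real system would need context-aware judgement of each citation):

    # Hypothetical sketch: propagate "potentially tainted" labels through a
    # citation graph. Papers, edges, and the seed set of known-bad papers
    # are invented for illustration.
    from collections import deque

    def propagate_taint(cites, known_tainted, critical_cites=frozenset()):
        """cites: dict paper -> set of papers it cites.
        known_tainted: papers with well-documented issues.
        critical_cites: (citing, cited) pairs where the citation is critical
        (e.g. a failed replication), so taint should not propagate."""
        # Reverse the graph once so we can walk from a tainted paper to its citers.
        cited_by = {}
        for paper, refs in cites.items():
            for ref in refs:
                cited_by.setdefault(ref, set()).add(paper)
        tainted = set(known_tainted)
        queue = deque(known_tainted)
        while queue:
            bad = queue.popleft()
            for citer in cited_by.get(bad, ()):
                if citer not in tainted and (citer, bad) not in critical_cites:
                    tainted.add(citer)  # cites tainted work uncritically
                    queue.append(citer)
        return tainted

    # Toy example: C builds on B, which builds on the known-flawed A;
    # D cites A only to criticize it, so D stays clean.
    graph = {"B": {"A"}, "C": {"B"}, "D": {"A"}}
    print(propagate_taint(graph, {"A"}, critical_cites={("D", "A")}))  # {'A', 'B', 'C'}

The hard part, of course, is classifying each citation as critical or uncritical in the first place; that's where most of the work mentioned above would go.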

mike_hearn 1 day ago||
I explored this question a bit a few years ago when GPT-3 was brand new. It's tempting to look for technological solutions to social problems. It was during COVID, so public health papers were the focus.

The idea failed a simple sanity check: just going to Google Scholar, doing a generic search, and reading randomly selected papers from the past 15 years or so. It turned out most of them were bogus in some obvious way. A lot of ideas for science reform take it as axiomatic that the bad stuff is rare and just needs to be filtered out. Once you engage with a field's literature in a systematic way, it becomes clear that it's more like searching for diamonds in the rough than filtering out occasional corruption.

But at that point you wonder, why bother? There is no alchemical algorithm that can convert intellectual lead into gold. If a field is 90% bogus then it just shouldn't be engaged with at all.

MarkusQ 1 day ago|||
There is in fact a method, and it got us quite far until we abandoned it for the peer review plus publish or perish death spiral in the mid 1900s. It's quite simple:

1) Anyone publishes anything they want, whenever they want, as much or as little as they want. Publishing does not say anything about your quality as a researcher, since anyone can do it.

2) Being published doesn't mean it's right, or even credible. No one is filtering the stream, so there's no cachet to being published.

We then let memetic evolution run its course. This is the system that got us Newton, Einstein, Darwin, Mendeleev, Euler, etc. It works, but it's slow, sometimes ugly to watch, and hard to game so some people would much rather use the "Approved by A Council of Peers" nonsense we're presently mired in.

seec 13 hours ago||
Yeah, the gatekeepers just want their political power, and that's it. Also, education/academia is a big industry nowadays; it feeds many people who have a big incentive to perpetuate the broken system.

We are just back to the universities under the religious control system that we had before the Enlightenment. Any change would require separating academia from political government power.

Academia is just the propaganda machine for the government, just like the church was the tool for justifying god-gifted powers to kings.

lo0dot0 1 day ago|||
I think the solution is very simple: remove the citation metric. Citations don't mean correctness. What we want is correctness.
raddan 1 day ago|||
Interesting idea. How do you distinguish between critical and uncritical citation? It’s also a little thorny—if your related work section is just describing published work (which is a common form of reviewer-proofing), is that a critical or uncritical citation? It seems a little harsh to ding a paper for that.
nairboon 1 day ago|||
That's one of the issues that causes a bit of work. Citations would need to be judged with context. Let's say paper X is nowadays known to be tainted. If a tainted work is cited just for completeness, it's not an issue, e.g. "the method has been used in [a,b,c,d,x]" If the tainted work is cited critically, even better: e.g. "X claimed to show that..., but y and z could not replicate the results". But if it is just taken for granted at face value, then the taint-label should propagate: e.g. ".. has been previously proved by x and thus our results are very important...".
wasabi991011 1 day ago|||
"Uncritically" might be the wrong criteria, but you should definitely understand the related work you are citing to a decent extent.
boelboel 1 day ago|||
Going to conferences and seeing researchers who've built a career doing subpar (sometimes blatantly 'fake') work has made me grow increasingly wary of experts. Worst of all, lots of people just seem to go along with it.

Still I'm skeptical about any sort of system trying to figure out 'trust'. There's too much on the line for researchers/students/... to the point where anything will eventually be gamed. Just too many people trying to get into the system (and getting in is the most important part).

mezyt 1 day ago||
The current system is already getting gamed. There's already too much on the line for researchers/students, so they don't admit any wrongdoing or retract anything. What's the worst that could happen by adding a layer of trust to the h-index?
boelboel 1 day ago||
I think it could end up helping a bit in the short term. But in the end an even more complicated system (even if in principle better) will reward those spending time gaming it even more.

The system ends up promoting an even more conservative culture. What might start great will end up with groups and institutions being even more protective of 'their truths' to avoid getting tainted.

I don't think there's any system which can avoid these sorts of things; people were talking about this before WW1, and globalisation just put it in overdrive.

elzbardico 1 day ago|||
Those citation rings are becoming rampant in my country, along with the author count inflation.
portly 1 day ago|||
Maybe there should be a different way to calculate the h-index, where for an h-index of n you also need n replications.
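
One way to read that, sketched below with made-up numbers (this is just my interpretation of the proposal, not an established metric): a paper only counts toward the index if it has been independently replicated, so an h-index of n automatically implies at least n replications.

    # Illustrative only: an h-index variant where a paper counts only if it
    # has been independently replicated at least once.
    def replication_h_index(papers):
        """papers: list of (citation_count, replicated) tuples."""
        counts = sorted((c for c, replicated in papers if replicated), reverse=True)
        h = 0
        for i, c in enumerate(counts, start=1):
            if c >= i:
                h = i
            else:
                break
        return h

    # Toy record: five papers, three of them replicated.
    record = [(120, True), (45, False), (30, True), (12, True), (3, False)]
    print(replication_h_index(record))  # 3: three replicated papers with >= 3 citations each
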
pseudohadamard 12 hours ago||
>people don't really read the sources anymore and just copy-paste the citations.

That's reference-stealing: "some other paper I read cited this, so it should be OK, I'll steal their reference". I always make sure I read the cited paper before citing it myself, and it's scary how often it says something rather different from what the citation implies. That's not necessarily bad research; more often the author of the citing paper was looking for effect A in the cited reference and I'm looking for effect B, so their reason for citing differs from mine, and it's a valid reference in their paper but wouldn't be in mine.

chmod775 1 day ago||
Pretty much all fields have shit papers, but if you ever feel the need to develop a superiority complex, take a vacation from your STEM field and have a look at what your university offers under the "business"-anything label. If anyone in those fields manages to produce anything of quality, they're defying the odds and should be considered one of the greats along the line of Euclid, Galileo Galilei, or Isaac Newton - because they surely didn't have many shoulders to stand on either.
lordnacho 1 day ago||
This is exactly how I felt when studying management as part of what was ostensibly an Engineering / Econ / Management degree.

When you added it up, most of the hard parts were Engineering, and a bit of Econ. You would really struggle to work through tough questions in engineering, spend a lot of time on economic theory, and then read the management stuff like you were reading a newspaper.

Management you could spot a mile away as being soft. There are certainly some interesting ideas, but even as students we could smell that it was lacking something. It's just a bit too much like a History Channel documentary. Entertaining, certainly, but it felt like false enlightenment.

seec 13 hours ago||
Econ is the only social science that isn't completely bogus. The replication rate isn't too bad, even though it is still worse than STEM, of course. Everything else is basically like rolling dice, or even worse. Special mention to "pedagogy," which manages to be systematically worse than random; in other words, they produce bullshit and not much else.
HPsquared 1 day ago||
I suppose it's to be expected: the business department is built around the art of generating profit from cheap inputs. It's business thinking in action!
fnord123 1 day ago||
> Stop citing single studies as definitive. They are not. Check if the ones you are reading or citing have been replicated.

And from the comments:

> From my experience in social science, including some experience in management studies specifically, researchers regularly believe things – and will even give policy advice based on those beliefs – that have not even been seriously tested, or have straight up been refuted.

Sometimes people don't even rely on a single non-replicable study. They invent a study and use that! An example is the "Harvard Goal Study" that is often trotted out at self-review time at companies. The supposed study suggests that people who write down their goals are more likely to achieve them than people who do not. However, Harvard itself cannot find any such study:

https://ask.library.harvard.edu/faq/82314

ChrisMarshallNY 1 day ago||
Check out the “Jick Study,” mentioned in Dopesick.

https://en.wikipedia.org/wiki/Addiction_Rare_in_Patients_Tre...

KingMob 1 day ago||
Definitely ignore single studies, no matter how prestigious the journal or numerous the citations.

Straight-up replications are rare, but if a finding is real, other PIs will partially replicate and build upon it, typically as a smaller step in a related study. (E.g., a new finding about memory comes out, my field is emotion, I might do a new study looking at how emotion and your memory finding interact.)

If the effect is replicable, it will end up used in other studies (subject to randomness and the file drawer effect, anyway). But if an effect is rarely mentioned in the literature afterwards...run far, FAR away, and don't base your research off it.

A good advisor will be able to warn you off lost causes like this.

tgv 1 day ago||
The root of the problem is referred to implicitly: publish or perish. To get tenure, you need publications, preferably highly cited, and money, which comes from grants that your peers (mostly from other institutions) decide on. So the mutual back-scratching begins, and the publication mill keeps churning out papers whose main value is to the career of the author and, through citation, of influential peers, truth be damned.
strangattractor 1 day ago||
Citations being the only metric is one problem. Maybe an improved rating/ranking system would be helpful.

Ranking from 1 to 3, with 1 being the best and 3 the bare minimum for publication (a small sketch of this follows the list):

3. Citations only

2. Citations + full disclosure of data.

1. Citations + full disclosure of data + replicated
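
A minimal sketch of how that rubric might be encoded, assuming just two per-paper flags (purely illustrative; the rubric above doesn't say what to do with a replicated paper that never shared its data, so here it falls back to 3):

    # Purely illustrative encoding of the proposed 1-3 ranking.
    def paper_rank(shares_data: bool, replicated: bool) -> int:
        if shares_data and replicated:
            return 1  # citations + full disclosure of data + replicated
        if shares_data:
            return 2  # citations + full disclosure of data
        return 3      # citations only: the bare minimum for publication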

nick486 19 hours ago||
This will arguably be worse.

You'll just get replication rings in addition to citation rings.

People who cheat in their papers will have no issue cheating in their replication studies too. All this does is give them a new tool to attack papers they don't like by faking a failed replication.

bicepjai 1 day ago|||
The same dynamics from school carry over into adulthood: early on it's about grades and whether you get into a "good" school; later it becomes the adult version of that treadmill: publish or perish.
jbreckmckye 1 day ago||
something something Goodhart's Law
te7447 1 day ago||
Something "systems that are attacked by entities that adapt often need to be defended by entities that adapt".
dev_l1x_be 1 day ago||
There is a surprisingly large amount of bad science out there. And we know it. One of my favourite writeups on the subject is John P. A. Ioannidis: Why Most Published Research Findings Are False

https://pmc.ncbi.nlm.nih.gov/articles/PMC1182327/pdf/pmed.00...

Cornbilly 1 day ago||
This is a great paper but, in my experience, most people in tech love this paper because it allows them to say "To hell with pursuing reality. Here is MY reality".
raphman 18 hours ago|||
FWIW, Ioannidis never demonstrated that a certain number of findings (or most) in a specific discipline are actually false - he calculated estimates based on assumptions. While Ioannidis' work is important, and his claims may be true for many disciplines, a more nuanced view is helpful.

For example, here's an article that argues (with data) that there is actually little publication bias in medical studies in the Cochrane database:

https://replicationindex.com/2020/12/24/ioannidis-is-wrong/

FabHK 1 day ago||
John Ioannidis is a weird case. His work on the replication crisis across many domains was seminal and important. His contrarian, even conspiratorial take on COVID-19 not so much.
raddan 1 day ago|||
Ugh, wow, somehow I missed all this. I guess he joins the ranks of the scientists who made important contributions and then leveraged that recognition into a platform for unhinged diatribes.
timr 1 day ago|||
Please don't lazily conclude that he's gone crazy because it doesn't align with your prior beliefs. His work on Covid was just as rigorous as anything else he's done, but it's been unfairly villainized by the political left in the USA. If you disagree with his conclusions on a topic, you'd do well to have better reasoning than "the experts said the opposite".

Ioannidis' work during Covid raised him in my esteem. It's rare to see someone in academics who is willing to set their own reputation on fire in search of truth.

kelipso 1 day ago||||
What’s happening here?

“Most Published Research Findings Are False” -> “Most Published COVID-19 Research Findings Are False” -> “Uh oh, I did a wrongthink, let’s backtrack a bit”.

Is that it?

mike_hearn 1 day ago||
Yes, sort of. Ioannidis published a serosurvey during COVID that computed a lower fatality rate than the prior estimates. Serosurveys are a better way to compute this value because they capture a lot of cases which were so mild people didn't know they were infected, or thought it wasn't COVID. The public health establishment wanted to use an IFR as high as possible; e.g., the ridiculous Verity et al. estimates from Jan 2020 of a 1% IFR were still in use more than a year later despite there being almost no data in Jan 2020, because high IFR = COVID is more important = more power for public health.

If IFR is low then a lot of the assumptions that justified lockdowns are invalidated (the models and assumptions were wrong anyway for other reasons, but IFR is just another). So Ioannidis was a bit of a class traitor in that regard and got hammered a lot.

The claim he's a conspiracy theorist isn't supported, it's just the usual ad hominem nonsense (not that there's anything wrong with pointing out genuine conspiracies against the public! That's usually called journalism!). Wikipedia gives four citations for this claim and none of them show him proposing a conspiracy, just arguing that when used properly data showed COVID was less serious than others were claiming. One of the citations is actually of an article written by Ioannidis himself. So Wikipedia is corrupt as per usual. Grokipedia's article is significantly less biased and more accurate.

tripletao 1 day ago|||
He published a serosurvey that claimed to have found a signal in a positivity rate that was within the 95% CI of the false-positive rate of the test (and thus indistinguishable from zero to within the usual p < 5%). He wasn't necessarily wrong in all his conclusions, but neither were the other researchers that he rightly criticized for their own statistical gymnastics earlier.

https://statmodeling.stat.columbia.edu/2020/04/19/fatal-flaw...

That said, I'd put both his serosurvey and the conduct he criticized in "Most Published Research Findings Are False" in a different category from the management science paper discussed here. Those seem mostly explainable by good-faith wishful thinking and motivated reasoning to me, while that paper seems hard to explain except as a knowing fraud.

zahlman 1 day ago|||
> He wasn't necessarily wrong in all his conclusions, but neither were the other researchers that he rightly criticized for their own statistical gymnastics earlier.

In hindsight, I can't see any plausible argument for an IFR actually anywhere near 1%. So how were the other researchers "not necessarily wrong"? Perhaps their results were justified by the evidence available at the time, but that still doesn't validate the conclusion.

tripletao 1 day ago||
I mean that in the context of "Most Published Research Findings Are False", he criticized work (unrelated to COVID, since that didn't exist yet) that used incorrect statistical methods even if its final conclusions happened to be correct. He was right to do so, just as Gelman was right to criticize his serosurvey--it's nice when you get the right answer by luck, but that doesn't help you or anyone else get the right answer next time.

It's also hard to determine whether that serosurvey (or any other study) got the right answer. The IFR is typically observed to decrease over the course of a pandemic. For example, the IFR for COVID is much lower now than in 2020 even among unvaccinated patients, since they almost certainly acquired natural immunity in prior infections. So high-quality later surveys showing lower IFR don't say much about the IFR back in 2020.

mike_hearn 16 hours ago||
There were people saying right at the time in 2020 that the 1% IFR was nonsense and far too high. It wasn't something that only became visible in hindsight.

Epidemiology tends to conflate IFR and CFR; that's one of the issues Ioannidis was highlighting in his work. IFR estimates do decline over time, but they decline even in the absence of natural immunity buildup, because doctors start becoming aware of more mild cases where the patient recovered without being detected. That leads to a higher number of infections with the same number of fatalities, hence a lower IFR computed even retroactively, but there's no biological change happening. It's just a case of data-collection limits.

That problem is what motivated the serosurvey. A theoretically perfect serosurvey doesn't have such issues. So, one would expect it to calculate a lower IFR and be a valuable type of study to do well. Part of the background of that work and why it was controversial is large parts of the public health community didn't actually want to know the true IFR because they knew it would be much lower than their initial back-of-the-envelope calculations based on e.g. news reports from China. Surveys like that should have been commissioned by governments at scale, with enough data to resolve any possible complaint, but weren't because public health bodies are just not incentivized that way. Ioannidis didn't play ball and the pro lockdown camp gave him a public beating. I think he was much closer to reality than they were, though. The whole saga spoke to the very warped incentives that come into play the moment you put the word "public" in front of something.

mike_hearn 1 day ago|||
Yeah, I remember reading that article at the time. Agree they're in different categories. I think Gelman's summary wasn't really supportable. It's far too harsh - he's demanding an apology because the data set used for measuring test accuracy wasn't large enough to rule out the possibility that there were no COVID cases in the entire sample, and because he doesn't personally think some explanations were clear enough. But this argument relies heavily on a worst-case assumption about the FP rate of the test, one which is ruled out by prior evidence (we know there were indeed people infected with SARS-CoV-2 in that region at that time).

There's the other angle of selective outrage. The case for lockdowns was being promoted based on, amongst other things, the idea that PCR tests have a false positive rate of exactly zero, always, under all conditions. This belief is nonsense although I've encountered wet lab researchers who believe it - apparently this is how they are trained. In one case I argued with the researcher for a bit and discovered he didn't know what Ct threshold COVID labs were using; after I told him he went white and admitted that it was far too high, and that he hadn't known they were doing that.

Gelman's demands for an apology seem very different in this light. Ioannidis et al. not only took test FP rates into account in their calculations but directly measured them to cross-check the manufacturer's claims. Nearly every other COVID paper I read simply assumed FPs don't exist at all, or used bizarre circular reasoning like "we know this test has an FP rate of zero because it detects every case perfectly when we define a case as a positive test result". I wrote about it at the time because this problem was so prevalent:

https://medium.com/mike-hearn/pseudo-epidemics-part-ii-61cb0...

I think Gelman realized after the fact that he was being over the top in his assessment, because the article has since been amended with numerous "P.S." paragraphs which walk back some of his own rhetoric. He's not a bad writer, but in this case I think the overwhelming peer pressure inside academia to conform to the public health narratives got to even him. If the cost of pointing out problems in your field is that every paper you write has to be considered perfect by every possible critic from that point on, it's just another way to stop people flagging problems.

tripletao 1 day ago||
Ioannidis corrected for false positives with a point estimate rather than the confidence interval. That's better than not correcting, but not defensible when that's the biggest source of statistical uncertainty in the whole calculation. Obviously true zero can be excluded by other information (people had already tested positive by PCR), but if we want p < 5% in any meaningful sense then his serosurvey provided no new information. I think it was still an interesting and publishable result, but the correct interpretation is something like Figure 1 from Gelman's

https://sites.stat.columbia.edu/gelman/research/unpublished/...

I don't think Gelman walked anything back in his P.S. paragraphs. The only part I see that could be mistaken for that is his statement that "'not statistically significant' is not the same thing as 'no effect'", but that's trivially obvious to anyone with training in statistics. I read that as a clarification for people without that background.

We'd already discussed PCR specificity ad nauseam, at

https://news.ycombinator.com/item?id=36714034

These test accuracies mattered a lot while trying to forecast the pandemic, but in retrospect one can simply look at the excess mortality, no tests required. So it's odd to still be arguing about that after all the overrun hospitals, morgues, etc.

mike_hearn 15 hours ago||
By walked back, what I meant is his conclusion starts by demanding an apology, saying reading the paper was a waste of time and that Ioannidis "screwed up", that he didn't "look too carefully", that Stanford has "paid a price" for being associated with him, etc.

But then in the P.P.P.S sections he's saying things like "I’m not saying that the claims in the above-linked paper are wrong." (then he has to repeat that twice because in fact that's exactly what it sounds like he's saying), and "When I wrote that the authors of the article owe us all an apology, I didn’t mean they owed us an apology for doing the study" but given he wrote extensively about how he would not have published the study, I think he did mean that.

Also bear in mind there was a follow-up where Ioannidis's team went the extra mile to satisfy people like Gelman, and:

> They added more tests of known samples. Before, their reported specificity was 399/401; now it’s 3308/3324. If you’re willing to treat these as independent samples with a common probability, then this is good evidence that the specificity is more than 99.2%. I can do the full Bayesian analysis to be sure, but, roughly, under the assumption of independent sampling, we can now say with confidence that the true infection rate was more than 0.5%.

After taking into account the revised paper, which raised the standard from high to very high, there's not much of Gelman's critique left, tbh. I would respect this kind of critique more if he had mentioned the garbage-tier quality of the rest of the literature. Ioannidis' standards were still much higher than everyone else's at that time.
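
For what it's worth, the binomial arithmetic behind that quote is easy to check. A minimal sketch (my own illustration, not the analysis from the revised paper or from Gelman's post), computing an exact one-sided lower confidence bound on specificity from the two validation counts quoted above:

    # Back-of-the-envelope check, not the paper's or Gelman's actual analysis:
    # one-sided 95% Clopper-Pearson lower bound on specificity.
    from scipy.stats import beta

    def specificity_lower_bound(correct, total, alpha=0.05):
        """One-sided 100*(1-alpha)% exact lower bound on a binomial proportion."""
        failures = total - correct
        # Clopper-Pearson: lower limit is the alpha quantile of Beta(x, n - x + 1).
        return beta.ppf(alpha, correct, failures + 1)

    # Original validation set: 399 of 401 known negatives test negative.
    print(specificity_lower_bound(399, 401))    # ~0.984, i.e. an FP rate of up to ~1.6%
    # Expanded validation set: 3308 of 3324.
    print(specificity_lower_bound(3308, 3324))  # ~0.993, an FP rate of up to ~0.7%

Under the independent-sampling assumption in the quote, the expanded validation set is what pushes the specificity bound above 99.2% and makes the revised prevalence estimate defensible.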

Nezteb 1 day ago||||
> So Wikipedia is corrupt as per usual. Grokipedia's article is significantly less biased and more accurate.

I hope this was sarcasm.

throw310822 1 day ago||
I would hope the same. But knowing Wikipedia I'm afraid it isn't.
doctorpangloss 1 day ago|||
Does the IFR matter? The public thinks lives are infinitely valuable - at least, the lives the public pays attention to. 0.1% or 1%, it doesn't really matter, right? It gets multiplied by infinity in an ROI calculation, or whatever so-called "objective" criteria people try to concoct for policymaking. I like Ioannidis's work, and his results about serotypes (or whatever) were good, but it was being co-opted to make a mostly political policy (some Republicans: compulsory public interaction during a pandemic and, uncharitably, compulsory transmission of a disease) look "objective."

I don't think the general idea of co-opting is hard to understand; it's quite easy to understand. But there is a certain personality type, common among people who earn a living by telling Claude what to do, with a compulsive need to "prove" people on the Internet "wrong," and these people are constantly, blithely mobilized to further the political cause of someone who truly doesn't give a fuck about them. Ioannidis is such a personality type, and as you can see, a victim.

zahlman 1 day ago||
> The public thinks lives are infinitely valuable.

In rhetoric, yes. (At least, except when people are given the opportunity to appear virtuous by claiming that they would sacrifice themselves for others.)

In actions and revealed preferences, not so much.

It would be rather difficult to be a functional human being if one took that principle completely seriously, to its logical conclusion.

I can't recall ever hearing any calls for compulsory public interaction, only calls to stop forbidding various forms of public interaction.

doctorpangloss 22 hours ago||
The SHOW UP Act was congressional Republicans forcing an end to telework for federal workers, without any rational basis. Teachers and staff in Texas and Florida, where Republicans run things, were faced with showing up in person (no remote learning) or quitting.
giardini 1 day ago|||
Yeah, and lucky you! You gain all this insight b/c you logged into Hacker News on the very day someone posted the truth! What a coincidence!
sampo 1 day ago|||
He made a famous career, rising to professor and a director at Stanford University, out of meta-research on the quality of other people's research and critiques of the methodology of other people's studies. Then during Covid he tried to do a bit of original empirical research of his own, and his own methods and statistical data analysis were even worse than what he had critiqued in other people's work.
throwaway150 1 day ago||
> I’ve been in the car with some drunk drivers, some dangerous drivers, who could easily have killed people: that’s a bad thing to do, but I wouldn’t say these were bad people.

If this isn't bad people, then who can ever be called bad people? The word "bad" loses its meaning if you explain away every bad deed by such people as something else. Putting other people's lives at risk by deciding to drive when you are drunk sounds like very bad people to me.

> They’re living in a world in which doing the bad thing–covering up error, refusing to admit they don’t have the evidence to back up their conclusions–is easy, whereas doing the good thing is hard.

I don't understand this line of reasoning. So if people do bad things because they know they can get away with it, they aren't bad people? How does this make sense?

> As researchers they’ve been trained to never back down, to dodge all criticism.

Exactly the opposite is taught. These people are deciding of their own accord not to back down or admit wrongdoing. Not because of some "training".

fjsocjdjdcisj 1 day ago||
As writers often say: there’s no such thing as a synonym.

“That’s a bad thing to do…”

Maybe should be: “That’s a stupid thing to do…”

Or: reckless, irresponsible, selfish, etc.

In other words, maybe it has nothing to do with morals and ethics. Bad is kind of a lame word with limited impact.

Jach 1 day ago||
It's a broad and simple word but it's also a useful word because of its generality. It's nice to have such a word that can apply to so many kinds and degrees of actions, and saves so many pointless arguments about whether something is more narrowly evil, for example. Applied empirically to people, it has predictive power and can eliminate surprise because the actions of bad people are correlated with bad actions in many different ways. A bad person does something very stupid today, very irresponsible tomorrow, and will unsurprisingly continue to do bad things of all sorts of kinds even if they stay clear of some kinds.
spongebobstoes 1 day ago|||
labelling a person as "bad" is usually black and white thinking. it's too reductive, most people are both good and bad

> because they know they can get away with it

the point is that the paved paths lead to bad behavior

well designed systems make it easy to do good

> Exactly the opposite is taught.

"trained" doesn't mean "taught". most things are learned but not taught

brabel 1 day ago||
When everyone else does it, it's extremely hard to be righteous. I did it long ago... everyone did it back then. We knew the danger and thought we were different; we thought we could drive safely no matter our state. Lots of tragedies happen because people disastrously misjudge their own abilities, and when alcohol is involved, doubly so. They are not bad people; they're people who live in a flawed culture where alcohol is seen as acceptable and who cannot avoid falling for the many human fallacies... in this case caused by the Dunning-Kruger effect. If you think people who fall for fallacies are bad, then being human is inherently bad in your opinion.
throwaway150 1 day ago||
I don't think being human is inherently bad. But you have to draw the line to consider someone as "bad" somewhere, right? If you don't draw a line, then nobody in the world is a bad person. So my question is where exactly is that line?

You guys are saying that drink driving does not make someone a bad person. Ok. Let's say I grant you that. Where do you draw the line for someone being a bad person?

I mean, with this line of reasoning you can "explain away" every bad deed, and then nobody is a bad person. So do you guys consider anyone to actually be a bad person, and what did they have to do to cross the line where you can't explain away their bad deeds anymore and you really consider them to be bad?

ordu 1 day ago||
> If you don't draw a line, then nobody in the world is a bad person. So my question is where exactly is that line?

I don't think that that line can be drawn exactly. There are many factors to consider and I'm not sure that even considering them will allow you to draw this line and not come to claims like '99% of people are bad' or '99% of people are not bad'.

'Bad' is not an innate property of a person. 'Bad' is a label that exists only in an observer's model of the world. A spherical person in vacuum cannot be 'bad', but if we add an observer of the person, then they may become bad.

To my mind, the decision to label a person as bad or not is a decision reflecting how the labeling subject cares about the one on the receiving side. So it goes like this: first you decide what to do with someone's bad behavior; if you decide to go about it with punishment, then you call them 'bad'; if you decide to help them somehow to stop their bad behavior, then you don't call them bad.

It works like this: when observing some bad behavior, I decide what to do about it. If I decide to punish the person, I declare them to be bad. If I decide to help them stop their behavior, I declare them to be not bad, but 'confused' or circumstantially forced, or whatever. Y'see: you cannot change the personal traits of others, so if you declare that the reason for the bad behavior is the personal trait 'bad', then you cannot do anything about it. If you want to change things, you need to find a cause of the bad behavior that can be controlled.

ChrisMarshallNY 1 day ago||
Sounds like the Watergate Scandal. The crime was one thing, but it was the cover-up that caused the most damage.

Once something enters The Canon, it becomes “untouchable,” and no one wants to question it. Fairly classic human nature.

> "The most erroneous stories are those we think we know best -and therefore never scrutinize or question."

-Stephen Jay Gould

thewanderer1983 1 day ago||
> did “not impact the main text, analyses, or findings.”

That made me think of the black plastic spoon error, which was off by a factor of 10 and where the author also said it didn't impact the main findings.

https://statmodeling.stat.columbia.edu/2024/12/13/how-a-simp...

gus_massa 1 day ago|
The journal's webpage [1] only reports 109 citations of the original article; this counts only "indexed" journals, which are not guaranteed to be ultra high quality but at least filter out the worst "pay us to publish crap" journals.

ResearchGate says 3936 citations [2]. I'm not sure what they are counting, probably all the PDFs uploaded to ResearchGate.

I'm not sure how they count 6000 citations, but I guess they are counting everything, including quotes by the vice president. Probably 6001 after my comment.

Quoted in the article:

>> 1. Journals should disclose comments, complaints, corrections, and retraction requests. Universities should report research integrity complaints and outcomes.

All comments, complaints, corrections, and retraction requests? Unmoderated? Einstein articles will be full of comments explaining why he is wrong, from racists to people who can't spell Minkowski to save their lives. In /newest there is like one post per week from someone who has discovered a new physics theory with the help of ChatGPT. Sometimes it's the same guy, sometimes it's a new one.

[1] https://papers.ssrn.com/sol3/papers.cfm?abstract_id=1964011

[2] https://www.researchgate.net/publication/279944386_The_Impac...

optionalsquid 1 day ago||
> I'm not sure how they count 6000 citations, but I guess they are counting everything, including quotes by the vicepresident. Probably 6001 after my comment.

The number appears to be from Google Scholar, which currently reports 6269 citations for the paper

Calavar 1 day ago||
> All comments, complaints, corrections, and retraction requests? Unmoderated? Einstein articles will be full of comments explaining why he is wrong, from racists to people who can't spell Minkowski to save their lives. In /newest there is like one post per week from someone who has discovered a new physics theory with the help of ChatGPT. Sometimes it's the same guy, sometimes it's a new one.

Judging from PubPeer, which allows people to post all of the above anonymously and with minimal moderation, this is not an issue in practice.

bee_rider 1 day ago|||
They mentioned a famous work, which will naturally attract cranks to comment on it. I’d also expect to get weird comments on works with high political relevance.
gus_massa 1 day ago|||
Link to PubPeer: https://pubpeer.com/publications/F9538AA8AC2ECC7511800234CC4...

It has 0 comments, for an article that forgot a "not" in "the result is *** statistically significant".

Calavar 1 day ago||
Isn't a lack of comments the opposite of the problem you were previously claiming?