Monday, February 7, 2011

Further studies of the decline effect find decline of the decline effect

“The Truth Wears Off: Is something wrong with the scientific method?”

The Decline Effect, explored in an article by Jonah Lehrer in the New Yorker, refers to a temporal decline in the size of an observed effect: for example, the therapeutic value of antidepressants appears to have declined threefold since the original trials. Based on the cases presented, this effect is not limited to medical and psychological studies. One example in evolutionary biology is the relationship between physical symmetry and female choice: initial studies consistently found strong selection for symmetry in mates by females, but as time passed, the evidence grew progressively weaker.

This may be a result of selective reporting – scientists focus on results that are novel and interesting, even if they are in fact simply statistical outliers, or worse, the result of unconscious human bias. This possibility is troubling; humans – scientists or not – are proficient pattern finders, but our subconscious (or conscious) beliefs influence what we search for. Lehrer argues that replication – the process of carrying out additional, comparable but independent studies – isn’t an effective part of the scientific method. After all, if study results are biased, and replications don’t agree, how can we know what to trust?

I don’t disagree with most of the article’s points: that scientists can produce biased results, PhD notwithstanding; that more effort and time should be invested in data collection and experimental methodology; and that the focus on 5% statistical significance is problematic. However, I wonder whether Lehrer, like the scientists he’s reporting on, has selected specific, interesting data points while ignoring the general trend of the research. For one, it’s not clear from the article how prevalent the decline effect is. In 2001, Jennions and Moller published evidence of a small negative trend in effect size over time across 200+ studies. They suggest, though, that this is due to a bias toward high statistical significance, which requires either large effect sizes (the early studies published) or small effect sizes in combination with large sample sizes (a scenario which takes more time).
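
To make that mechanism concrete, here is a minimal simulation sketch. It is my own illustration, not Jennions and Moller's analysis, and every number in it is an assumption: a constant true effect of 0.2 (in standard deviation units), "early" studies with 20 samples, "late" studies with 400, and a filter that publishes only results passing a one-sided z > 1.96 threshold. The function name published_effects is hypothetical.

```python
import math
import random
import statistics


def published_effects(true_effect, n, studies, rng, z_crit=1.96):
    """Simulate `studies` experiments of size n; return the effects that get published.

    Each observed effect is drawn from the sampling distribution of a mean
    with unit variance, i.e. Normal(true_effect, 1 / sqrt(n)).
    """
    se = 1.0 / math.sqrt(n)  # standard error of the estimated effect
    published = []
    for _ in range(studies):
        observed = rng.gauss(true_effect, se)
        # Assumed publication filter: only significant positive results appear.
        if observed / se > z_crit:
            published.append(observed)
    return published


rng = random.Random(1)  # fixed seed so the run is repeatable
TRUE_EFFECT = 0.2       # assumed, and identical for early and late studies

early = published_effects(TRUE_EFFECT, n=20, studies=5000, rng=rng)
late = published_effects(TRUE_EFFECT, n=400, studies=5000, rng=rng)

early_mean = statistics.mean(early)
late_mean = statistics.mean(late)

print(f"true effect:              {TRUE_EFFECT}")
print(f"mean published, n = 20:   {early_mean:.2f} ({len(early)} of 5000 published)")
print(f"mean published, n = 400:  {late_mean:.2f} ({len(late)} of 5000 published)")
```

Under these assumptions, a small study can only clear the significance bar by overestimating the effect (with n = 20, anything below about 0.44 is filtered out), while large studies clear it at values close to the truth. The published effect size shrinks as sample sizes grow even though the underlying effect never changes.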

Even if the decline effect is rampant, does it represent a failure of replicability? Lehrer states that replication is flawed because “it appears that nature often gives us different answers”. As ecologists, though, we know that nature doesn’t give different answers; we ask it different questions (or the same question in different contexts). Ecology is complex and context-dependent, and replication is about investigating the general role of a mechanism that may have been studied only in a specific system, organism, or process. Additional studies will likely produce slightly or greatly different results, and ideally a comprehensive understanding of the effect emerges. The real danger is that scientists, the media, and journals over-emphasize the significance of initial, novel results, which haven’t been (and may never be) replicated.

Is there something wrong with the scientific method (which is curiously never defined in the article)? The decline effect hardly seems like evidence that we’re all wasting our time as scientists. For one, the fact that “unfashionable” results are still publishable suggests that replication is doing what it’s supposed to: correcting for unusual outcomes and producing something close to the average effect size. True, scientists are not infallible, but the strength of the scientific process today is that it doesn’t operate on the individual level: it relies on a scientific community made of peers, reviewers, editors, and co-authors, and hopefully this encourages greater accuracy in our conclusions.


Anonymous said...

They claimed that the decline effect is too large to be explained by the central limit theorem (the more you sample, the closer to reality you get). I think they failed to consider the fact that, when performing experiments, one usually only continues along a path that appears to be showing something. This could just be a spurious result or an outlier. As more researchers pursue a similar line of experimentation, they may find that there is indeed not much of an effect. This is then neither the original researcher's fault, nor that of the later researchers, and not some sort of 'decline effect'.

Jackie said...