Monday, April 21, 2014

Null models matter, but what should they look like?

Neutral Biogeography and the Evolution of Climatic Niches. Florian C. Boucher, Wilfried Thuiller, T. Jonathan Davies, and Sébastien Lavergne. The American Naturalist, Vol. 183, No. 5 (May 2014), pp. 573-584

Null models have become a fundamental part of community ecology. For the most part, this is an improvement over our null-model-free days: patterns are now interpreted with reference to patterns that might arise through chance and in the absence of the ecological processes of interest. Null models today are ubiquitous in tests of phylogenetic signal, patterns of species co-occurrence, and models of species distribution-climate relationships. But even though null models are a success in that they are widespread and commonly used, there are problems. In particular, there is a disconnect between how null models are chosen and interpreted and what information they actually provide. Simple and easily applied null models tend to be favoured, but they are often interpreted as though they were complicated, mechanism-explicit models.

The new paper “Neutral Biogeography and the Evolution of Climatic Niches” from Boucher et al. provides a good example of this problem. The premise of the paper is straightforward: studies of phylogenetic niche conservation tend to rely on simple null models, and as a result may misinterpret what their data show. The study of phylogenetic niche conservation and niche evolution is becoming increasingly popular, particularly studies on how species' climatic niches evolve and how climate niches relate to patterns of diversity. In a time of changing climates, there are also important applications looking at how species respond to climatic shifts. Studies of changes in climate niches through evolutionary time usually rely on a definition of the climate niche based on empirical data, more specifically, the mean position of a given species along a continuous abiotic gradient. Because this is not directly tied to physiological measurements, climate niche data may also capture the effect of dispersal limitations or biotic interactions. Hence the need for null models. However, the null models used in these studies primarily flag changes in climate niche that result from random drift or selection in a varying environment. These types of null models use Brownian motion (a "random walk") to answer questions about whether niches are more or less similar than expected due to chance, or else whether a particular model of niche evolution is a better fit to the data than a model of Brownian motion.
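To make the Brownian motion null concrete, here is a minimal sketch in Python. It is not the procedure used by Boucher et al. or by any particular package: the function names, the step size `sigma`, and the choice of comparing two sister lineages are all assumptions made for illustration. Each lineage's niche position takes independent Gaussian steps, and an observed niche difference is then compared with the differences produced by many simulated pairs.

```python
import random


def brownian_walk(n_steps, sigma=1.0, start=0.0, rng=None):
    """Return a trait trajectory evolving by Brownian motion (a random walk)."""
    if rng is None:
        rng = random.Random()
    value = start
    path = [value]
    for _ in range(n_steps):
        # Each step is an independent Gaussian increment with mean zero.
        value += rng.gauss(0.0, sigma)
        path.append(value)
    return path


def null_sister_differences(n_pairs, n_steps, sigma=1.0, seed=0):
    """Null distribution of niche differences between two sister lineages.

    Both lineages start from the same ancestral value (0.0) and then
    evolve independently for n_steps (hypothetical setup for illustration).
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_pairs):
        a = brownian_walk(n_steps, sigma, rng=rng)[-1]
        b = brownian_walk(n_steps, sigma, rng=rng)[-1]
        diffs.append(abs(a - b))
    return diffs


def fraction_at_most(null_values, observed):
    """Share of null values that are no larger than the observed value."""
    return sum(v <= observed for v in null_values) / len(null_values)


if __name__ == "__main__":
    null = null_sister_differences(n_pairs=1000, n_steps=100)
    # A small fraction means the observed pair is more similar than
    # most pairs simulated under Brownian motion.
    print(fraction_at_most(null, observed=2.0))
```

A small fraction here only says that the observed pair is more similar than Brownian motion would usually produce. It does not say which process made them similar.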

The authors suggest that the reliance on Brownian motion is problematic, since these simple null models cannot actually distinguish between patterns of climate niches that arise simply from speciation and migration with no selection on climate niches, and those that are the result of true niche evolution. If this is true, conclusions about niche evolution may be suspect, since they depend on the null model used. The authors used a neutral, spatially explicit model (known as an "alternative neutral biogeographic model") that simulates dynamics driven only by speciation and migration, with species being neutral in their dynamics. This provides an alternative model of patterns that may arise in climate niches among species, despite the absence of direct selection on the trait. The paper then looks at whether climatic niches exhibit phylogenetic signal when they arise via neutral spatial dynamics; whether gradualism is a reasonable neutral expectation for the evolution of climatic niches on geological timescales; and whether constraints on climatic niche diversification can arise simply through bounded geographic space.

Simulations of the neutral biogeographic model used a gridded “continent” with variable climate conditions: each cell has a carrying capacity, and species move via migration and split into two species either by point mutation or by vicariance (a geographic barrier appears, leading to divergence of two populations). Not surprisingly, their results show that even in the absence of any selection on species’ climate niches, patterns can result that differ greatly from a simple Brownian motion-based null model. In other words, tests against the simple null model (Brownian motion) often concluded that results from the more complex null model were different from the random/null expectation. This isn't a problem per se. The problem is that the Brownian motion model is currently often interpreted as though anything different from the null is a signal of niche evolution (or conservation). Obviously that is not correct.
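For readers who want a feel for how a neutral spatial model can generate climate niche patterns without any selection, here is a heavily simplified toy in Python. It is not the authors' model: it uses a one-dimensional row of cells rather than a gridded continent, includes only point-mutation speciation (no vicariance), and every parameter name and default value is an assumption chosen for illustration.

```python
import random
from collections import defaultdict


def simulate_neutral(n_cells=20, capacity=10, n_events=20000,
                     p_migrate=0.1, p_speciate=0.001, seed=0):
    """Toy neutral dynamics on a 1-D row of cells with a climate gradient."""
    rng = random.Random(seed)
    # Hypothetical climate: cell i simply has "temperature" i.
    climate = [float(i) for i in range(n_cells)]
    # Every cell starts filled to carrying capacity with species 0.
    grid = [[0] * capacity for _ in range(n_cells)]
    next_species = 1
    for _ in range(n_events):
        cell = rng.randrange(n_cells)
        slot = rng.randrange(capacity)  # the individual that dies
        if rng.random() < p_speciate:
            # Point-mutation speciation: the replacement founds a new species.
            grid[cell][slot] = next_species
            next_species += 1
            continue
        source = cell
        if rng.random() < p_migrate:
            # Migration from a neighbouring cell; edges are reflecting,
            # so geographic space is bounded.
            source = min(n_cells - 1, max(0, cell + rng.choice((-1, 1))))
        # The replacement copies a random individual in the source cell,
        # regardless of species identity (this is the neutrality assumption).
        grid[cell][slot] = rng.choice(grid[source])
    return climate, grid


def climatic_niches(climate, grid):
    """Mean climate position of each surviving species across its individuals."""
    temps = defaultdict(list)
    for temp, cell in zip(climate, grid):
        for species in cell:
            temps[species].append(temp)
    return {sp: sum(t) / len(t) for sp, t in temps.items()}


if __name__ == "__main__":
    climate, grid = simulate_neutral()
    for species, niche in sorted(climatic_niches(climate, grid).items()):
        print(species, round(niche, 2))
```

Even in this toy, species end up with different mean climate positions purely because of where they arise and how far they spread, and those positions can never leave the range of the gradient. Nothing in the code selects on climate.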

This paper is focused on the issue of choosing null models for studies of climate niche evolution, but it fits into a current of thought about the problems with how ecologists are using null models. It is one thing to know that you need and want to use a null model, but it is much more difficult to construct an appropriate null model, and interpret the output correctly. Null models (such as the Brownian motion null model) are often so simplistic that they are straw man arguments – if ecology isn't the result of only randomness, your null model is pretty likely to be a poor fit to the data. On the other hand, the more specific and complex the null model is, the easier it is to throw the baby out with the bathwater. Given how much data is interpreted in the light of null models, it seems that choosing and interpreting null models needs to be more of a priority.


Florian Hartig said...

Hi Caroline,

good points! Jeremy had a post a while ago with similar points.

One could add that the story can also go the other way - not rejecting the null model is no proof that the null model is correct, especially not when the summary statistics that are used are not sufficient for the inferential problem at hand (I'm thinking of neutral theory, for example, where one can show that patterns arising from a niche-structured community can be produced perfectly well by a neutral model).

That being said, I want to reemphasize your point that null-models are clearly a big improvement over simple correlative analysis, because they account for all kinds of artifacts that mess up regression analyses. Their results are often over-interpreted, but so are all other statistical methods. If I compare to other statistical approaches (p-values, regression, ...), I would say: the situation is normal.

Jeremy Fox said...

Spot on Caroline. And unlike Florian, I don't take much comfort from the fact that any analytical approach can be misinterpreted...

Caroline Tucker said...

Thanks, sorry I missed your earlier post on the topic! The problem for me is deciding what is the normal course of progress (we have some null models, it will take a while for more appropriate null models to develop), and what is stasis. Any analytical approach can be misinterpreted, but how long until someone should make a fuss about interpreting it correctly?

florianhartig said...

Jeremy, I'm not advocating taking comfort in it, I just wanted to point out that the problem is not necessarily in using null models, but rather in our apparent inability to correctly interpret a hypothesis test.

P-values have been misinterpreted pretty much for as long as they have been around, and people have been pointing this out for decades to no effect. It's just so convenient to say that p(D<0.05|H0) = p(!H0).

It seems to me that we do exactly the same thing with null models: if I call my null model N0, and the alternative "my favorite theory", we say that p(D<0.05|N0) ==> "my favorite theory"

Hans Castorp said...

The discussion about null models is becoming a little boring. When you discard a null model you simply say that your system has more structure than the null model. It is always possible to devise a more complicated model, to discard this model, and then to devise a new more complicated model and so on, ad infinitum. Null models set a lower limit on the complexity of the problem. Probably it would be more rewarding to look at null models that are NOT discarded, since they set an UPPER limit to the complexity of your model.