In various ways these outcomes feed back on each other – description can inform explanatory models, and explanatory models can be evaluated based on their predictions. In a recent paper in Oikos, Houlahan et al. discuss the tendency of many ecological fields to under-emphasize predictive approaches and instead focus on explanatory statistical models. They note that prediction is rarely at the centre of ecological research and that this may be limiting ecological progress. There are lots of interesting questions that ecologists should be asking, including: what are the predictive horizons (spatial and temporal scales) over which predictive accuracy decays? Currently, we don't even know what a typical upper limit on model predictive ability is in ecology.
Although the authors argue for the primacy of prediction ["Prediction is the only way to demonstrate scientific understanding", and "any potentially useful model must make predictions about some unknown state of the natural world"], I think there is some nuance to be gained by recognizing that understanding and prediction are separate outcomes and that their relationship is not always straightforward (for a thorough discussion see Shmueli 2010). Ideally, a mutually informative feedback between explanation and prediction should exist, but it is also true that prediction can be useful and worthy for reasons that are not dependent on explanation and vice versa. Further, to understand why and where prediction is limited or difficult, and what is required to correct this, it is useful to consider it separately from explanation.
Understanding/explanation can be valuable and inspire further research, even if prediction is impossible. The goal of explanatory models is to have the model [e.g., f(x)] match as closely as possible the actual mechanism [F(x)]. A divergence between understanding and prediction can naturally occur when there is a difference between concepts or theoretical constructs and our ability to measure them. In physics, theories explaining phenomena may arise many years before they can actually be tested (e.g. gravitational waves). Even if useful causal models are available, limitations on prediction can be present: in particle physics, the Heisenberg uncertainty principle identifies limits on the precision with which you can know both the position of a particle and its momentum. In ecology, a major limitation to prediction may simply be data availability. In a similar field (meteorology) in which many processes are important and nonlinearities common, predictions require massive data inputs (frequently collected over near continuous time) and models that can be evaluated only via supercomputers. We rarely collect biotic data at those scales in ecology. We can still gain understanding if predictions are impossible, and hopefully the desire to make predictions will eventually motivate the development of new methods or data collection. In many ecological fields, it might be worth thinking about what can be done in the future to enable predictions, even if they aren't really possible right now.
Approaches that emphasize prediction frequently improve understanding, but this is not necessarily true either. Statistically, understanding can come at the cost of predictive ability. Further, a predictive model may provide accurate predictions, but do so using collinear or synthetic variables that are hard to interpret. For example, a macroecological relationship between temperature and diversity may effectively predict diversity in a new habitat, and yet do little on its own to identify specific mechanisms. Prediction does not require interpretability or explanatory ability, as is clear from papers such as "Model-free forecasting outperforms the correct mechanistic model for simulated and experimental data". So it's worth being wary of the idea that a predictive model is necessarily 'better'.
With this difference between prediction and understanding in mind, it is perhaps easier to understand why ecologists have lagged in prediction. For a long time, statistical approaches used in ecology were biased toward those meant to improve understanding, such as regression models, where parameters estimate the strength and direction of a relationship. This is partially responsible for our obsession with p-values and R^2 terms. What Houlahan et al. do a great job of emphasizing is that by ignoring prediction as a goal, researchers are often limiting their ability to confirm their understanding: predictions that are derived from explanatory models are one way to evaluate those models. Some approaches in ecology have already moved naturally towards emphasizing prediction, especially SDMs/ecological niche models. Researchers in these areas recognized that it was not enough to describe species-environment relationships; testing predictions allowed them to determine how universal and mechanistic these relationships actually were. A number of macroecological models fit nicely with predictive statistical approaches, and could adopt Houlahan’s suggestions quite readily (e.g. reporting measures of predictive ability and testing models on withheld data). But for some approaches, the search for mechanism is so deeply integrated into how they approach science that it will take longer and be more difficult (but not impossible)*. Even for these areas, prediction is a worthy goal, just not necessarily an easy one.
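To make "reporting measures of predictive ability and testing models on withheld data" a bit more concrete, here is a minimal sketch in Python. Everything in it is my own illustration rather than anything from Houlahan et al.: the simulated temperature-richness data, the function names (`simulate_data`, `fit_line`, `holdout_evaluation`), the 30% test fraction, and the choice of RMSE and an out-of-sample R^2 as the reported measures are all assumptions. It fits a simple regression on one part of the data and then reports how well it predicts the part it never saw.

```python
import math
import random


def simulate_data(n=100, seed=42):
    """Hypothetical data: species richness rising linearly with temperature plus noise."""
    rng = random.Random(seed)
    temps = [rng.uniform(0.0, 30.0) for _ in range(n)]
    richness = [5.0 + 1.5 * t + rng.gauss(0.0, 4.0) for t in temps]
    return temps, richness


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def rmse(ys, preds):
    """Root mean squared prediction error."""
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys))


def r_squared(ys, preds):
    """1 - SS_res / SS_tot, with SS_tot taken around the mean of the ys supplied."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


def holdout_evaluation(xs, ys, test_fraction=0.3, seed=0):
    """Fit on a random training subset; report fit on both subsets."""
    rng = random.Random(seed)
    idx = list(range(len(xs)))
    rng.shuffle(idx)
    n_test = max(1, int(round(len(idx) * test_fraction)))
    test_idx, train_idx = idx[:n_test], idx[n_test:]

    train_x = [xs[i] for i in train_idx]
    train_y = [ys[i] for i in train_idx]
    test_x = [xs[i] for i in test_idx]
    test_y = [ys[i] for i in test_idx]

    intercept, slope = fit_line(train_x, train_y)
    train_pred = [intercept + slope * x for x in train_x]
    test_pred = [intercept + slope * x for x in test_x]

    return {
        "intercept": intercept,
        "slope": slope,
        "train_r2": r_squared(train_y, train_pred),  # the usual in-sample R^2
        "test_r2": r_squared(test_y, test_pred),     # predictive R^2 on withheld data
        "test_rmse": rmse(test_y, test_pred),        # error in the units of the response
    }


if __name__ == "__main__":
    temps, richness = simulate_data()
    for name, value in holdout_evaluation(temps, richness).items():
        print(f"{name}: {value:.3f}")
```

The in-sample R^2 is the number we usually report; the two test-set numbers are the ones that speak to prediction. In this toy case they will tend to be similar because the fitted model has the same form as the process that generated the data. With real data, and especially with withheld sites or years that differ from the training set, the out-of-sample values can be much worse, and the test R^2 as defined here can even be negative.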
*I was asked for examples of 'unpredictable' areas of ecology. This may be pessimistic, but I think that something like accurately predicting the composition (both species' abundance and identity) of diverse communities at small spatial scales might always be difficult, especially given the temporal dynamics. But I could be wrong!
**This has been edited to correctly spell the author's name.
...if the Simpsons could predict Trump, I suppose there's hope for ecologists too...