The EEB & Flow

Monday, February 2, 2015

Reproductive Character Displacement or Alternative Explanations?

*Guest post by Santiago J. Sánchez-Pacheco

Closely related animal species are often so similar that it is hard to distinguish them. This immediately leads to the question of how the individuals of such species, when in sympatry, can recognize their conspecifics. Usually, the species differ in traits (i.e., species recognition signals; e.g., visual and sound signals) that are detectable by sensory mechanisms. Less is known, however, about how these phenotypic differences evolve. A common view is that hybrids suffer reduced fitness or cannot be produced at all, and therefore selection should favor individuals with traits that help them avoid interspecific matings. By diverging in such traits, females and males of closely related species are less likely to waste energy on failed matings. This widely accepted assumption is usually referred to as “reproductive character displacement” (Losos, 2013).

From Evolution (Third edition; Futuyma, 2013).

When Brown and Wilson (1956) described character displacement, they proposed the following process: populations of two closely related species, after first coming into contact with each other, interact “in such a way as to diverge further from one another where they occur together”. Such divergence minimizes the chances of both competition and hybridization between the species, and therefore favors coexistence over exclusion.

While it is generally accepted that natural selection is the force increasing the frequency of the divergent traits, whether or not the resulting divergence is driven by the interaction between the two species (e.g., competition) remains uncertain. If a pattern of differences is consistently detected between populations of two closely related species when they are compared in allopatry versus sympatry, it seems reasonable to attribute this pattern to the interaction between the two species. However, a number of processes other than a response to interspecific interaction may result in a “displacement-like” pattern: substantial environmental differences between allopatric and sympatric sites, phenotypic plasticity, or even random processes can all trigger differentiation (Kamath, 2014).

Schluter & McPhail (1992) established six criteria (Box 1) as general indicators for ruling out alternative processes that might lead to a displacement-like pattern. Applying these criteria, Stuart & Losos (2013) recently pointed out that only a small portion (9 of 144 cases) of recent studies claiming evidence for ecological character displacement can conclude with a high degree of certainty that the interspecific interaction led to the observed divergence. According to Stuart & Losos, falsification of only one of these six criteria is enough evidence to determine that such divergence did not result from character displacement. Consequently, their findings suggest that most documented cases of ecological character displacement are equally consistent with other evolutionary and ecological phenomena. Although these two studies focus only on ecological character displacement, it is worth noting that the same eco-evolutionary principles underlie reproductive character displacement, so that alternative processes could also explain phenotypic differentiation presumably derived from interspecific interaction.

Despite the concept of character displacement having remained in the evolutionary literature for decades, this assumption has seldom been subjected to critical scrutiny. Indeed, it was not until recently that significant progress in designing thorough studies to rigorously test this adaptive hypothesis was achieved (e.g., Stuart et al. [2014]).

Box 1: Modified from Stuart and Losos (2013). The six criteria for Ecological Character Displacement (ECD).

References

Brown Jr., W. L. and E. O. Wilson. 1956. Character displacement. Systematic Zoology 5(2): 49–64.

Kamath, A. 2014. http://www.anoleannals.org/2014/10/25/rapid-evolution-in-anolis-carolinensis-following-the-invasion-of-anolis-sagrei/

Losos, J. 2013. http://www.anoleannals.org/2013/08/12/reproductive-character-displacement-and-dewlap-color-in-haitian-anoles/

Schluter, D. and J. D. McPhail. 1992. Ecological character displacement and speciation in sticklebacks. The American Naturalist 140: 85–108.

Stuart, Y. E. and J. B. Losos. 2013. Ecological character displacement: glass half full or half empty? Trends in Ecology and Evolution 28(7): 402–408.

Stuart, Y. E., Campbell, T. S., Hohenlohe, P. A., Reynolds, R. G., Revell, L. J. and J. B. Losos. 2014. Rapid evolution of a native species following invasion by a congener. Science 346: 463–466.

A blog post reviewing Stuart and Losos (2013) from a different perspective:

http://evol-eco.blogspot.ca/2013/05/why-pattern-based-hypotheses-fail.html

Tuesday, January 27, 2015

50 years of applying theory to ecological problems: where are we now?

Fifty years ago, the seminal volume ‘The Genetics of Colonizing Species’, edited by Herbert G. Baker and G. Ledyard Stebbins, was published. It marked a new phase for the nascent sciences of ecology and evolutionary biology: applying theories and concepts to understanding applied issues. Despite the name, this book was not really about genetics, though there were several excellent genetics chapters. What it was really about was the collective flexing of the post-modern-synthesis intellectual muscles. Let’s back up for a minute.

The modern synthesis, largely overlooked and forgotten by modern course syllabi, is the single most important event in ecology and evolution since the publication of Darwin’s Origin of Species. Darwin’s concepts of evolution stand as dogma today, but after publishing his book, Darwin and others recognized that he lacked a crucial mechanism: how organismal characteristics were passed on from parent to offspring. He assumed that, whatever the mechanism, offspring varied in small ways from parents and that there was continuous variation across a population.

For roughly 30 years, from about 1900 to 1930, evolution via natural selection was thought disproven. With the rediscovery of Mendel’s garden pea breeding experiments in 1900, many influential biologists of the day believed that genetic variation was discontinuous, occurring in ‘either-or’ states, and that abrupt changes typified the appearance of new forms. Famously, this thinking led to the belief that ‘hopeful monsters’ were produced, with some becoming new species instantaneously. This model of speciation was referred to as ‘saltationism’.

Of course there were heretics, most notably the statisticians who worked with continuous variation (e.g., Karl Pearson, and Ronald Fisher) who refuted the claims made by saltationists in the 1920s. Some notable geneticists changed their position on saltationism because their experiments and observations provided evidence that natural selection was important (most notably T.H. Morgan). However, it wasn't until WWII that the war was won. A group of scientists working on disparate phenomena published a series of books from 1937-1950 that showed how genetics was completely compatible with Darwinian natural selection and could explain a wide variety of observations from populations to biogeography to paleontology. These ‘architects’ and their books were: Theodosius Dobzhansky (Genetics and the Origin of Species); Ernst Mayr (Systematics and the Origin of Species); E. B. Ford (Mendelism and Evolution); George Gaylord Simpson (Tempo and Mode in Evolution); and G. Ledyard Stebbins (Variation and Evolution in Plants). With this, they unified biology and thus the modern synthesis was born.

Now back to the edited volume. With such a powerful theory, it made sense that there should be a theoretical underpinning to applied ecological problems. The book grew out of a symposium held in Asilomar, California, Feb. 12-16, 1964[1], organized by C. H. Waddington, who originally saw an opportunity to bring together thinkers on population genetics. But the book became so much more. According to Baker and Stebbins:

“…the symposium … had as its object the bringing together of geneticists, ecologists, taxonomists and scientists working in some of the more applied phases of ecology –such as wildlife conservation, weed control, and biological control of insect pests.”

Thus the goal was really about modern science and the ability to inform ecological management. The invitees included a few of the ‘architects’ (Dobzhansky, Mayr, and Stebbins) and their academic or intellectual progeny, including many of the most important thinkers in ecology and evolution in the 1960s and 70s (Wilson, Lewontin, Sakai, Birch, Harper, etc.).

Given the importance of The Genetics of Colonizing Species in establishing the role that theory might play for applied ecology, it is worth reflecting on two questions: 1) how much have our basic theories advanced in the last 50 years; and, perhaps more importantly, 2) has theory provided key insights to solving applied problems?

This book is the fodder for a graduate seminar course I am teaching, and these two questions are the focus of our comparing the chapters to modern papers. Over the next couple of months, students in this course will be contributing blog posts that examine the relationship between the classic chapters and modern work, and they will muse on these two questions. Hopefully by the end of this ongoing dialogue, we will have a better feeling of whether basic theory has advanced our ability to solve applied problems.

[1] See 50th anniversary symposium

Friday, January 23, 2015

Equalizing and stabilizing traits?

Plant functional traits and the multidimensional nature of species coexistence. 2015. Nathan J. B. Kraft, Oscar Godoy, and Jonathan M. Levine. PNAS.

(This isn’t a brand new paper, but somehow I’m already behind on reading papers in the new year...)

A recent paper from Kraft et al. in PNAS does a really nice job of filling a gap that has been in the literature for a while, which is to extend the influential theoretical work on coexistence from Chesson (and extended more recently by Jonathan Levine et al.) to explicitly incorporate functional traits and trait-based approaches to ecology. Chesson’s work (particularly ARES 2000) lays out a framework for understanding coexistence and competitive interactions, which focuses on the importance of stabilizing effects (niche differences) and equalizing effects (fitness differences) between competing species. This theory makes strong predictions of when and how coexistence is expected (for example, when species have strong enough niche differences). However, accurate application of the theory is somewhat difficult, perhaps because identifying and calculating niche and fitness differences requires heavy use of mathematical models and careful experimental design.

In contrast, the value of the focus on functional traits in ecology is that they are readily measured, easily conceptualized, and databases of values already exist. In common with equalizing/stabilizing effects, traits are meant to inform our understanding of species' niches, but in contrast, traits are empirically friendly. One of the more common critiques of the Chesson framework was that empirical measures, particularly traits, couldn't be shoehorned into it. After all, traits likely contribute to both equalizing and stabilizing forces in complicated ways that may well shift during a species' life.

What Nathan Kraft and coauthors have done is show that this is not a limitation - traits can contribute to both equalizing and stabilizing forces, and mathematical models can tease these effects apart. They relate detailed measurements of leaf, root, seed and whole plant traits for 18 California annual plants with the results of mathematical models of competition and coexistence between these species. The authors found strong and exciting relationships between the theoretically motivated measures of competitive processes and species' traits. Average fitness differences had significant correlations with functional traits, particularly maximum height, leaf [N], leaf area, rooting depth, and phenology.

From Kraft et al. 2015: Correlations between species traits and A) Stabilizing niche differences, and B) Average fitness differences.
The key to interpreting these plots is to understand that where the coloured line overlaps with the grey shading, the correlation is not different from the null model. Where the line is between the null and the center of the figure, the correlation is significant and negative; where it is between the null and the external edge, the correlation is significant and positive.

No individual traits correlated with niche differences, but models including multiple traits considered together do correlate with niche differences. A rather nice bit of support for the multidimensional view of the niche.

This paper does a nice job of expanding Chesson's framework a little bit further towards empirical applications. Further, it reinforces the value of trait approaches. There are still some important limitations. The first is that this particular system of annual plants has been studied in great detail, and it seems unlikely that the traits identified in this paper can necessarily be generalized as "equalizing" traits. A trait with an equalizing effect in a California grassland may well contribute less to fitness in a desert system, for example. Perennial species are altogether less integrated into experimental applications of Chesson's framework (lifetime fitness, among other things, being much easier to capture in annual plants). But this paper is a suggestion of a useful way forward, albeit a way that requires much more data and careful experimentation. The authors acknowledge that more study is due, but also the potential: “These complex relationships argue against the simple use of single traits to infer community assembly processes but lay the foundation for a theoretically robust trait-based community ecology.”

Monday, January 12, 2015

#ibs2015 – Confronting uncertainty, biases and the unknown

The 2015 meeting of the International Biogeography Society just came to an end, and even for someone who wouldn’t traditionally consider themselves a ‘biogeographer’ there were many interesting topics and talks to see.

The focus of most talks was on biological patterns over space and/or time (or ‘deep time’, which is a fun phrase to throw around), and the talks emphasized how sophisticated statistical methodologies for such questions have become. The extent and complexity of approaches for making inferences from limited existing information, be it phylogenetic, distributional, or fossil and pollen records, is pretty amazing.

Such complicated inference needn't and shouldn't come at the cost of careful scientific work, and must include recognition of uncertainty and biases. The final sessions of the conference acted as an excellent (and at times provocative) reminder of this. For example, Joaquin Hortal advocated the development of ‘maps of ignorance’, which instead of showing the typical distributions of known information, highlight where information is missing and new sampling should be emphasized. Not only is information sometimes missing, but its value degrades over space and time. The value of a sample declines the further away you get from that site or the more different the spatial scale; samples over 50 years old may not represent current conditions any more. Predictions should consider or even incorporate this uncertainty.

Catherine Graham, David Nipperess, and Jon Chase all gave talks similarly emphasizing how fundamental consideration of scale and extent is. This is as true for phylogenetic community analysis (Graham: what extent or size of tree should be considered for analyses of community phylogenetics?) as for rarefaction of phylogenetic diversity (Nipperess) or for measures of beta-diversity (Chase). Without this context, we are likely to misunderstand our results.

Finally, David Currie gave a damning critique of macroecology. Unfortunately, he said, macroecology seems to be a field where hypothesis testing is rare and conclusions are drawn from spurious correlations with little explanatory and even less predictive ability. For example, why has the study of latitudinal gradients in richness progressed little beyond a list of possible correlates after more than 30 years of attention? Though Currie was focused on his own field, his comments were relevant to many ecological approaches. Currie expressed concerns about areas where scientific methods were being given short shrift. In particular, he noted a lack of appropriate hypothesis testing and strong inference. Instead there is a tendency for studies to look for evidence in support of a hypothesis of interest, rather than attempting to falsify a hypothesis. Supporting evidence, sadly, does not actually increase the probability that a hypothesis is true, since the evidence could equally support some other, currently unconsidered, hypothesis. Further, correlations between variables of interest are at best a weak test of a hypothesis. His most important suggestion was that macroecologists and others should test the predictive ability of their hypotheses on new data sets: model fitting, in his opinion, is too often confused with model testing.
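To make the fitting-versus-testing distinction concrete, here is a minimal R sketch. It is not from Currie's talk: the data are simulated, and the variable names (`temperature`, `richness`), the two "regions", and the effect sizes are all assumptions chosen for illustration. The model is fit to one data set and then asked to predict a second data set in which the relationship is assumed to be much weaker.

```r
# Hypothetical illustration: all data are simulated, not taken from any study.
set.seed(42)

# Simulate sites where richness depends linearly on temperature plus noise.
simulate_sites <- function(n, slope) {
  temperature <- runif(n, 0, 30)
  richness <- 10 + slope * temperature + rnorm(n, sd = 15)
  data.frame(temperature = temperature, richness = richness)
}

# Region A is used to fit the model; region B plays the role of new data.
region_a <- simulate_sites(50, slope = 1.5)
region_b <- simulate_sites(50, slope = 0.2)  # assumed: a much weaker relationship

# Model fitting: how well does the model describe the data it was built on?
fit <- lm(richness ~ temperature, data = region_a)
r2_fit <- summary(fit)$r.squared

# Model testing: how well does it predict data it has never seen?
predicted <- predict(fit, newdata = region_b)
ss_res <- sum((region_b$richness - predicted)^2)
ss_tot <- sum((region_b$richness - mean(region_b$richness))^2)
r2_test <- 1 - ss_res / ss_tot  # can be negative if predictions are poor

cat("R^2 on the fitted data:     ", round(r2_fit, 2), "\n")
cat("Predictive R^2 on new data: ", round(r2_test, 2), "\n")
```

A respectable R² on the fitted data says nothing by itself about predictive ability; it is the second number that actually tests the hypothesis.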

Monday, December 15, 2014

Holiday caRd 2014: Snowflakes

Apparently it's that time of year again! The R circlize package plays a prominent role in this caRd.
Like snowflakes, no two cards are likely to be identical, so try it a few times :)

Lots of options for viewing the R code:

1) Run it automatically by just using the following few lines of R code. Probably the easiest way, provided you've installed RCurl: it allows you to directly run the github code from its url.

install.packages("RCurl")

library(RCurl)
options(RCurlOptions = list(verbose = FALSE, capath = system.file("CurlSSL", "cacert.pem", package = "RCurl"), ssl.verifypeer = FALSE))
#this seems necessary for the Windows people only?
#
eval(expr = parse(text = getURL("https://gist.githubusercontent.com/cmtucker/c591e868c76de1ac81e6/raw/ea3581a2d7f10810023529c7046edb40f099cbb3/snowflakeCode")))

2) Go to https://gist.github.com/cmtucker/c591e868c76de1ac81e6 and access directly. You can download the file directly ("download gist") or hit "raw" and copy/paste.

3) Copy and paste the code below.

Friday, December 12, 2014

A changing world: Themes from the 2014 BES-SFE meeting in Lille #BESSfe

I attended the joint British Ecological Society/Société Française d’Ecologie (BES/SFE) meeting held in Lille, France, Dec. 9-12. I quite enjoy BES meetings, but this one felt just a little more dynamic and exciting. The meeting did a great job of bringing together people who otherwise might not attend the same meetings. The overall quality of talks was excellent and the impression was that labs were presenting their best, most exciting results. One thing that always fascinates me about meetings is that themes emerge that reflect what people are currently excited about. Over the three days of talks, four themes seemed particularly strong among the talks I attended:

1) Pollinators in a changing world

Photo by Marc Cadotte

There were a surprising number of talks focusing on how human-caused changes to landscapes affect pollinator abundance and diversity. I am an Editor of a British journal (Journal of Applied Ecology) and work on pollinator diversity has always been stronger in the UK, but there were just so many talks that it is obvious that this is an important issue for many people in the UK and Europe. Nick Isaac examined whether butterfly abundance was related to the abundance of host plants, which should be a measure of habitat quality. Plants that serve as hosts for caterpillars were more important than those that supply nectar to adults, presumably because the adults can better find resources. And specialist species were especially sensitive to host plant diversity.

Adriana De Palma gave a great talk on reanalyzing global patterns of bee responses to land-use and showed that biases in where research is done are influencing generalities. Bee communities in some well-studied regions appear more sensitive to land-use change, and those regions with many bumblebees mask effects on other types of bees. Bill Kunin examined patterns at a regional scale (UK) where a pollinator crisis was identified in the late 2000s and causes have been attributed to everything from land-use change to pesticide use to cell phones to the second coming of Jesus. Habitat quality and floral resources do not seem to be that important at large scales, but there seems to be a strong effect of pesticide use. But at a smaller landscape scale, Florence Hecq showed that habitat heterogeneity within agricultural landscapes and the size of semi-natural grasslands were important for maintaining pollinator diversity. Changes in pollinator diversity have consequences for crop yield, as shown nicely by Colin Fontaine.

Photo by Marc Cadotte

In a really interesting study, Olivia Norfolk showed that traditional agricultural practices by Bedouin minorities in Egypt enhanced pollinator abundance. Because their agricultural practices support high plant diversity, of both wild and domestic plant species, pollinators fare better than in intensive agriculture. Moreover, one of the most important crops, almonds, sees higher yield with higher plant diversity, though this effect is lost when there are a lot of introduced honeybees.

2) Effects of land-use on biodiversity

A number of other talks examined how human-caused changes influence biodiversity patterns and resulting functions across a number of taxa. Jonathan Tonkin examined a number of different types of species (plants, beetles, spiders, etc.) that occur along riparian habitats and showed that there weren’t concordant changes in richness, but there were simultaneous shifts in composition. Human stressors caused multiple communities to shift to very nonrandom community types. In agricultural systems, Colette Bertrand showed that agriculture that changed frequently (e.g., crop rotation) supported more beetle species than systems where the same crops are planted year after year.

Human deforestation greatly changes many biodiversity patterns and we need to better understand these changes to make sound conservation decisions. Cecile Albert examined land-use change and fragmentation in southern Quebec and showed that we can determine the importance of forest patches in human-dominated landscapes for the ability of species to move between large forested areas. Using her model she can identify where conservation and habitat protection should be focused. Nicolas Labriere studied how different forest changes influenced the delivery of ecosystem services, including carbon storage, diversity and soil retention. He showed that only intact forests were able to maximally deliver all ecosystem services.

From WWF

3) Species differences and dynamics at different scales

A major theme is how species differences are important for ecological processes, ecosystem function and conservation. I’ve argued elsewhere that we are heading into a paradigm shift in ecology, where we've moved from counting species to accounting for species. Wilfried Thuiller asked how well European reserves conserve different forms of biodiversity, namely functional and phylogenetic diversity. He prioritized species by their distinctiveness and range size so that the most important were functionally or phylogenetically unique and had a small range. Distinct mammals tend not to be well protected and the modern reserve system does not maximally protect biodiversity. This is most acute in eastern Europe, where there is an order of magnitude less protected area than in western Europe.

Georges Kunstler argued that trait approaches to understanding competition are valuable because they can reduce the dimensionality of studies, from all pairwise species interactions to relatively simple measures of trait differences. He showed, using an impressive global forest dataset, that competition appears stronger when neighbour trees are more similar in their traits.

A number of talks examined whether measures of species differences can explain biodiversity patterns. At very large scales, Kyle Dexter showed that phylogenetic diversity does not explain where species are across the neotropics. In some places species are in the same habitat as a close relative and in others with a distant relative. At smaller scales, talks explored trait or phylogenetic patterns. Andros Gianuca, Anne Pilière and Lars Götzenberger all assessed the relative contributions of trait and phylogenetic differences to explaining community patterns, and all showed that phylogeny may be a stronger explanation than the traits they measured.

4) Species dynamics, coexistence and ecosystem function

Understanding tree growth and dispersal is key to predicting how forests will respond to environmental change and to successfully managing and conserving them. Sean MacMahon showed that the seasonality of tree growth is critical to modelling carbon flux in forests. He developed an ingenious set of modelling approaches to analyze daily tree diameter change and showed that growth is highly concentrated in the middle of the growing season, which is at odds with traditional conceptual models where tree growth is constant from spring to fall. Noelle Beckman examined tree dispersal and the consequence of losing vertebrate seed dispersers. She showed that reducing the number of seed dispersers results in low seedling survival because seedlings are locally very dense, instead of being dispersed, and seed predators and other enemies have an easier time finding them.

The mechanism most often cited by plant community ecologists is competition, but Christian Damgaard stated that this simple mechanism is almost never tested. Further, models of competition are often based on numbers of individuals, but plants make such counts notoriously difficult. Instead he developed a very elegant model showing how plant height and horizontal cover feed back to competition. What he calls vertical density is a predictor of the following season’s horizontal cover. Competition is also key to observing a relationship between species richness and ecosystem function. Rudolf Rohr showed, using a series of Lotka-Volterra models, that randomly assembling communities always results in a positive relationship between richness and function, which is why experiments often support this pattern. In natural communities, this relationship often disappears, and he showed that simulations with competitive sorting break this relationship.

Finally, Florian Altermatt examined whether the physical structure of stream networks influences the distribution of diversity in streams, using protozoan and bacterial communities in a series of connected tubes arranged like a branching network, and compared these to linear tubes. He found that diversity is highest in the interior branches, much like real rivers, and the linear system had no such pattern of diversity. He attributed part of this diversity gradient to competitive differences among species and differences in movement of the organisms.

Monday, December 8, 2014

Identifying the correct spatial scale, a work in progress

Are ecologists conducting research at the optimal scale? Jackson, H.B. and Fahrig, L. 2014. Global Ecology and Biogeography. 24, 52–63.

It’s a truism in ecology that there is a spatial scale at which each ecological process and interaction occurs. Competition often occurs at local scales; speciation generally occurs at biogeographic scales. Empirical evidence seems to support this: the relationship between, for example, forest cover and beetle abundance changes from strong to nonexistent as the spatial scale of analysis increases, suggesting small scales are most meaningful for the relationship (Holland et al. 2004).

But do most ecological studies choose the appropriate scale for data collection and analysis? A new meta-analysis from Heather Bird Jackson and Lenore Fahrig suggests that ecological studies, even multi-scale studies, rarely do. Multi-scale studies can show how a relationship changes in strength with spatial scale, and so should be ideal for identifying the “intrinsic scale” or the “scale of effect” – the spatial extent that best predicts a particular ecological process. (Figure below)

From Jackson and Fahrig 2014. The scale of effect illustrated using a multi-scale study: the strength of the relationship between abundance and spatial scale is strongest at 4 km (radius).
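The logic of identifying a scale of effect from a multi-scale study can be sketched in a few lines of R. This is an illustration on simulated data, not the analysis from Jackson and Fahrig: the radii, the `cover` matrix, the effect sizes, and the use of absolute Pearson correlation as the measure of "strength" are all assumptions made for the example.

```r
# Hypothetical illustration: simulated sites, not data from any real study.
set.seed(1)
n_sites <- 300
radii_km <- c(0.5, 1, 2, 4, 8, 16)  # assumed set of nested scales

# Simulate habitat cover around each site at each radius. Cover at each
# scale is partly inherited from the next smaller scale, so adjacent
# scales are correlated, as nested landscape measures usually are.
cover <- matrix(NA_real_, nrow = n_sites, ncol = length(radii_km))
cover[, 1] <- runif(n_sites)
for (k in 2:length(radii_km)) {
  cover[, k] <- 0.5 * cover[, k - 1] + 0.5 * runif(n_sites)
}

# Assumption: abundance truly responds to cover within 4 km.
true_scale <- 4
abundance <- rpois(n_sites,
                   lambda = exp(0.5 + 3 * cover[, radii_km == true_scale]))

# Strength of the abundance-cover relationship at each scale.
strength <- apply(cover, 2, function(x) abs(cor(x, abundance)))
names(strength) <- paste0(radii_km, " km")
print(round(strength, 2))

# The observed scale of effect is the scale with the strongest relationship.
scale_of_effect <- radii_km[which.max(strength)]

# Sanity check: is the estimate sitting at the edge of the range examined?
at_boundary <- scale_of_effect %in% range(radii_km)

cat("Observed scale of effect (km):", scale_of_effect, "\n")
cat("At smallest or largest scale considered:", at_boundary, "\n")
```

If `at_boundary` is `TRUE`, the strongest relationship may lie outside the range of scales that was examined, and the estimate should be read with caution.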

Identifying the appropriate spatial scale for a question and system is of course ideal for a researcher. Researchers can then collect appropriate data, choose to focus on interactions with processes occurring at the same scale, and correctly analyze data. However, the appropriate spatial scale may not be easy to identify: it can only be detected if it is among the scales included in a multi-scale study. If a study includes spatial scales that cover too small or too large an extent, or divides the extent into too few sub-scales, simply having a study with multiple scales may still be insufficient.

Theory suggests that species' traits (e.g. dispersal distances and reproductive rates) should be strongly related to the scale of effect, but empirical evidence is surprisingly inconclusive. If studies are already successfully identifying the scale of effect, the authors hypothesized that the observed scale of effect (the scale of prediction at which results are strongest) should vary with taxonomic group, body size, species’ mobility, reproductive rates, and species function. On the other hand, if the scale of effect is being inappropriately determined, perhaps due to decisions about the number of scales to include, and the minimum and maximum spatial scale considered, then these may be the primary correlates of the scale of effect.

To determine whether multi-scale studies were successfully identifying the scale of effect, the authors performed a literature review and meta-analysis. They identified studies that featured abundance or occurrence, which was measured at multiple nested spatial scales, for multiple taxonomic groups. The scales considered in these studies ranged from 10 m to 100 km.

The results were rather disappointing. By far the strongest predictors of the scale of effect in a study were the largest or smallest scales they considered. This suggests that the true scale of effect might be outside the scales considered by such studies. Worse, differences between taxonomic groups disappeared when you controlled for the minimum and maximum spatial scales used in a study. Where the same species appeared in several different studies, their scales of effect from each study were no more similar than if you had chosen any random species in the same taxonomic group.

From Jackson and Fahrig 2014. The observed scale of effect did not differ significantly among taxonomic groups. Instead, the largest and smallest spatial scales evaluated in the study drove the conclusion about the scale of effect.

The good news is that the more scales that were considered in a study, the less likely it was that the minimum or maximum scales considered appeared to be driving the results.

From Jackson and Fahrig 2014. The more scales evaluated, the less likely that choice of minimum or maximum scale was driving the result.

In addition, the authors found that the relationship between observed scale of effect and species traits was weak to non-existent in most studies. This is particularly unfortunate given the recent focus on species traits as useful predictors of ecological relationships. The inability to correctly identify the scale of effect certainly may limit our ability to associate spatial scale and traits. It is also likely that context modifies the effect of traits (for example, body size may have different effects on dispersal depending on the type of matrix and the environment), further weakening the observed relationship.

One of the largest issues Jackson and Fahrig identified is that in many of the papers, the choice of scales was driven by methodological issues (data availability, precedent, etc.) rather than biological justifications. Questions about trait-scale relationships may well be unanswerable until studies have data for the necessary range of spatial scales. Until then, Jackson and Fahrig recommend that studies be more forthright about their limitations, something this paper will hopefully draw attention to.

Wednesday, December 3, 2014

#ESA100 : Statistical Steps to Ecological Leaps

By Marc Cadotte and Caroline Tucker

For its centennial, ESA is asking its members to identify the ecological milestones of the last 100 years. They’ve asked the EEB & Flow to consider this question as a blog post. And there are many – ecology has grown from an amateur mix of natural history and physiology to a relevant and mature discipline. Part of this growth rests on major theoretical developments from great ecologists like Clements, Gleason, MacArthur, Whittaker, Wilson, Levins, Tilman and Hubbell. These people provided the ideas needed to move ecology to new territory. But ideas on their own aren’t enough in the absence of the necessary tools and methods. We argue that modern ecology would not exist without statistics.

The most cited paper in ecology and evolutionary biology is a methodological one (Felsenstein’s 1985 paper on phylogenetic confidence limits in Evolution – cited over 26,000 times) (Felsenstein, 1985). Statistics is the backbone that ecology develops around. Every new statistical method potentially opens the door to new ways of analyzing data and perhaps new hypotheses. To this end, we show how seven statistical methods changed ecology.

1. P-values and Hypothesis Testing – Setting standards for evidence.

Ecological papers in the early 1900s tended to be data-focused. And that data was analyzed in statistically rudimentary ways. Data was displayed graphically, perhaps with a simple model (e.g. regression) overlaid on the plot. Scientists sometimes argued that statistical tests offered no more than confirmation of the obvious.

At the same time, statistics were undergoing a revolution focused on hypothesis testing. Karl Pearson started it, but Ronald Fisher (Fisher 1925) and Pearson’s son Egon and Jerzy Neyman (Neyman & Pearson 1933) produced the theories that would change ecology. These men gave us the p-value – ‘the probability to obtain an effect equal to or more extreme than the one observed presuming the null hypothesis of no effect is true’ and gave us a modern view of hypothesis testing – i.e. that a scientist should attempt to reject a null hypothesis in favour of some alternative hypothesis.

It’s amazing to think that these concepts are now rote memorization for first year students, having become so ingrained into modern science. Hypothesis testing using some pre-specified level of significance is now the default method for looking for evidence. The questions asked, choices about sample size, experimental design and the evidence necessary to answer questions were all framed in the shadow of these new methods. p-values are no longer the only approach to hypothesis testing, but it is incontestable that Pearson and Fisher laid the foundations for modern ecology. (See Biau et al 2010 for a nice introduction).
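To make the p-value definition above concrete, here is a minimal sketch in Python. It uses a permutation test, which is a swapped-in technique chosen because it needs no distributional tables; it is not the parametric machinery of Fisher or Neyman and Pearson. The function name, the seed and the plant-height numbers are all made up for illustration.

```python
import random


def permutation_p_value(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation test for a difference in group means.

    Returns the share of random relabellings whose absolute difference in
    means is at least as large as the observed one, i.e. an estimate of the
    probability of an effect 'equal to or more extreme' under the null
    hypothesis that group labels do not matter.
    """
    rng = random.Random(seed)
    n_a = len(group_a)
    pooled = list(group_a) + list(group_b)
    n_b = len(pooled) - n_a

    observed = abs(sum(group_a) / n_a - sum(group_b) / n_b)

    as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # relabel individuals at random
        diff = abs(sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b)
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            as_extreme += 1

    # Add-one correction so the estimate is never exactly zero.
    return (as_extreme + 1) / (n_permutations + 1)


# Hypothetical plant heights (cm) under two treatments.
fertilized = [14.1, 15.3, 13.8, 16.0, 15.5]
control = [12.2, 13.0, 12.7, 13.9, 12.4]
print(permutation_p_value(fertilized, control))
```

A small value says the observed difference would be unusual if treatment labels were arbitrary; whether it is "significant" still depends on the level chosen before looking at the data.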

2. Multivariate statistics: Beginning to capture ecological complexity.

Because the first statistical tests arose from agricultural studies, they were designed to test for differences among treatments or from known distributions. They applied powerfully to experiments manipulating relatively few factors and measuring relatively few variables. However, these types of analyses did not easily permit investigations of complex patterns and mechanisms observed in natural communities.

Often what community ecologists have in hand are multiple datasets about communities including species composition and abundance, environmental measurements (e.g. soil nutrients, water chemistry, elevation, light, temperature, etc.), and perhaps distances between communities. And what researchers want to know is how compositional (multi-species) change among communities is determined by environmental variables. We shouldn’t understate the importance of this type of analysis on communities. In one tradition of community ecology, we would simply analyze changes in richness or diversity. But communities can show a lack of variation in diversity even when they are being actively structured: diversity is simply the wrong currency.

Many of the first forays into multivariate statistics were through measuring the compositional dissimilarity or distances between communities. For example, Jaccard (Jaccard, 1901) and Bray and Curtis (Bray & Curtis, 1957) were early ecologists who invented distance-based measures. Correlating compositional dissimilarity with environmental differences required ordination techniques. Principal Component Analysis (PCA) was actually invented by Karl Pearson around 1900 but computational limitations constrained its use until the 1980s. Around this time, other methods began to emerge which ecologists started to employ (Hill, 1979; Mantel, 1967). The development of new methods continues today (e.g. Peres-Neto & Jackson, 2001), and the use of multivariate analysis is a community ecology staple.
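For readers who have never computed one of these distance-based measures, the following Python sketch shows the usual textbook forms of the Jaccard and Bray-Curtis dissimilarities for a pair of sites. The function names and the abundance vectors are hypothetical; in practice most ecologists would call an existing package rather than write these by hand.

```python
def jaccard_dissimilarity(site_a, site_b):
    """1 - (shared species / species found at either site).

    Uses presence/absence only: any abundance > 0 counts as present.
    """
    present_a = {i for i, n in enumerate(site_a) if n > 0}
    present_b = {i for i, n in enumerate(site_b) if n > 0}
    either = present_a | present_b
    if not either:  # two empty sites: treat as identical
        return 0.0
    return 1 - len(present_a & present_b) / len(either)


def bray_curtis_dissimilarity(site_a, site_b):
    """Sum of absolute abundance differences over total abundance."""
    total = sum(site_a) + sum(site_b)
    if total == 0:
        return 0.0
    return sum(abs(a - b) for a, b in zip(site_a, site_b)) / total


# Hypothetical abundances of four species at two sites
# (same species order in both lists).
site_1 = [10, 0, 5, 0]
site_2 = [5, 5, 0, 0]
print(jaccard_dissimilarity(site_1, site_2))      # 2 of 3 species not shared
print(bray_curtis_dissimilarity(site_1, site_2))  # 15 / 25
```

Both values run from 0 (identical) to 1 (nothing in common), but only Bray-Curtis responds to changes in abundance, which is why the two can rank the same pairs of communities differently.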

There are now full texts dedicated to the implementation of multivariate statistical tests with ecological data (e.g., Legendre & Legendre, 1998). Further, there are excellent resources available in R (more on this later) and especially in the package vegan (Oksanen et al., 2008), which implements most major multivariate methods. Going forward it is clear that multivariate techniques will continue to be reassessed and improved (e.g. Guillot & Rousset, 2013), and there will be a greater emphasis on the need to articulate multivariate hypotheses and perhaps use multivariate techniques to predict communities (Laughlin, 2014) – not just explain variation.

3. Null models: Disentangling patterns and processes.

Ecology occurs over large spatial and temporal scales, and so it is always reliant on observational data. Gathering observational data is often much easier than doing experimental work at the same spatial or temporal scale, but it is also complicated to analyze. Variation from a huge number of unmeasured variables could well weaken patterns or create unexpected ones. Still, the search for patterns drove the analysis of observational data: including patterns along environmental gradients, patterns in species co-occurrences, patterns in traits. The question of what represented a meaningful pattern was harder to answer.

It seems that ecology could not go on looking at patterns forever. But it took some heated arguments to finally change this. The ‘null model wars’ revolved around Jared Diamond’s putative assembly rules for birds on islands (Diamond 1975), which relied on a “checkerboard” pattern of species co-occurrences. The argument for null models was led by Connor and Simberloff (Connor & Simberloff 1979), later joined by Nicholas Gotelli (e.g. Gotelli & Graves 1996). A null model, they pointed out, was necessary to determine whether observed patterns of bird distribution were actually different from random patterns of apparent non-independence between species pairs. Further, other ecological mechanisms (different habitat requirements, speciation, dispersal limitations) could also produce non-independence between species pairs. The arguments about how to appropriately formulate null models have never completely ended (e.g., 1, 2, 3), but null models now drive ecological analyses. Tests of species-area relationships, phylogenetic diversity within communities, limiting similarity of body sizes or traits, global patterns of diversity, species co-occurrences, niche overlaps, and nestedness in networks likely all include a null model of some sort.
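The logic of a co-occurrence null model can be sketched in a few lines of Python. The example below is an illustration under loud assumptions: it uses the checkerboard score (C-score) as the test statistic and a very simple null that shuffles each species' occurrences among sites, keeping only the number of sites each species occupies. Neither choice is taken from the papers cited above, and which constraints a null should keep is exactly what the debate was about. The matrix and function names are made up.

```python
import random
from itertools import combinations


def c_score(matrix):
    """Mean number of checkerboard units per species pair.

    matrix: list of rows, one per species, of 0/1 occurrences across sites.
    For a pair with r_i and r_j occupied sites and S shared sites, the
    number of checkerboard units is (r_i - S) * (r_j - S).
    """
    pairs = list(combinations(matrix, 2))
    total = 0
    for row_i, row_j in pairs:
        shared = sum(a and b for a, b in zip(row_i, row_j))
        total += (sum(row_i) - shared) * (sum(row_j) - shared)
    return total / len(pairs)


def null_model_p_value(matrix, n_random=2000, seed=0):
    """Compare the observed C-score with randomized matrices.

    Null (an assumption of this sketch): each species keeps its number of
    occurrences, but those occurrences are placed on sites at random.
    """
    rng = random.Random(seed)
    observed = c_score(matrix)
    as_extreme = 0
    for _ in range(n_random):
        shuffled = [rng.sample(row, len(row)) for row in matrix]
        if c_score(shuffled) >= observed:
            as_extreme += 1
    return observed, (as_extreme + 1) / (n_random + 1)


# Hypothetical two-species, four-island matrix with a perfect checkerboard.
islands = [
    [1, 0, 1, 0],
    [0, 1, 0, 1],
]
print(null_model_p_value(islands))
```

Even a perfect checkerboard between two species on four islands turns up by chance about one time in six under this null, which is the kind of result that made the original pattern-based arguments hard to sustain.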

The null model wars have been referred to as a difficult and contentious time for ecology. Published work (representing significant amounts of time and funding) perhaps needed to be re-evaluated to differentiate between true and null ecological patterns. But despite these growing pains, null models have forced ecology to mature beyond pattern-based analyses to more mechanistic ones.

4. Spatial statistics: Adding distance and connectivity.

Spatially-explicit statistics and models seem like an obvious necessity for ecology. After all, the movement of species through space is an immensely important part of their life history, and further, most ecologically relevant characteristics of the landscapes vary through space, e.g. resources, climate, and habitat. Despite this, until quite recently ecological models tended to assume a uniform distribution of species and processes through space, and that species’ movement was uniform or random through space. The truism that points close in space, all else being equal, should be more similar than distant points, while obvious, also involved a degree of statistical complexity and computing requirements difficult to achieve.

Fortunately for ecology, the late 1980s and early 1990s were a time of rapid computing developments that enabled the incorporation of increasing spatial complexity into ecological models (Fortin & Dale 2005). Existing methods – some ecological, some borrowed from geography – were finally possible with available technology, including nearest neighbour distances, Ripley’s K, variograms, and the Mantel test (Fortin, Dale & ver Hoef 2002). Ideas now fundamental to ecology such as connectivity, edge effects, spatial scale (“local” vs. “regional”), spatial autocorrelation, and spatial pattern (non-random, non-uniform spatial distributions) are the inheritance of this development. Many fields of ecology have incorporated spatial methods or even owe their development to spatial ecology, including meta-communities, landscape ecology, conservation and management, invasive species, disease ecology, population ecology, and population genetics. Pierre Legendre asked in his seminal paper (Legendre 1993) on the topic whether space was trouble, or a new paradigm. It is clear that space was an important addition to ecological analyses.
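Spatial autocorrelation, the idea that nearby points tend to be more alike, is easy to quantify once the computing is cheap. As a minimal sketch, here is Moran's I in Python, a standard autocorrelation index that is not among the methods listed above and is used here only as an illustration. The binary "neighbours within max_distance" weighting, the function name and the transect data are assumptions of the example.

```python
import math


def morans_i(values, coords, max_distance):
    """Moran's I with binary weights: w_ij = 1 if two distinct points lie
    within max_distance of each other, else 0.

    Values near +1 mean neighbours are similar, values near -1 mean
    neighbours are dissimilar, values near 0 mean no spatial structure.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]

    cross_products = 0.0
    weight_sum = 0
    for i in range(n):
        for j in range(n):
            if i != j and math.dist(coords[i], coords[j]) <= max_distance:
                cross_products += dev[i] * dev[j]
                weight_sum += 1

    variance_term = sum(d * d for d in dev)
    if weight_sum == 0 or variance_term == 0:
        raise ValueError("need at least one neighbour pair and some variation")
    return (n / weight_sum) * (cross_products / variance_term)


# Hypothetical transect of six plots, 1 unit apart.
transect = [(0,), (1,), (2,), (3,), (4,), (5,)]
gradient = [1, 2, 3, 4, 5, 6]     # smooth gradient along the transect
alternating = [1, 6, 1, 6, 1, 6]  # neighbours as different as possible
print(morans_i(gradient, transect, max_distance=1))
print(morans_i(alternating, transect, max_distance=1))
```

The gradient gives a positive value and the alternating pattern a negative one, even though both transects have the same mean; a model that ignores space would treat the two as equally "independent" samples.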

5. Measuring diversity: rarefaction and diversity estimators.

How many species are there in a community? This is a question that inspires many biologists, and is something that is actually very difficult to measure. Cryptic, dormant, rare and microscopic organisms are often undersampled, and accurate estimates of community diversity need to deal with these undersampled species.

Communities may seem to have different numbers of species simply because some have been sampled more thoroughly. Unequal sampling effort can distort real differences or similarities in the numbers of species. For example, in some recent analyses of plant diversity using the freely available species occurrence data from GBIF, we found that Missouri seems to have the highest plant diversity – a likely outcome of the fact that the Missouri Botanical Garden routinely samples local vegetation and makes the data available. Estimating diversity from equalized sampling effort was developed by a number of ecologists (Howard Sanders, Stuart Hurlbert, Dan Simberloff, and Ken Heck) in the 1960s and 1970s, resulting in modern rarefaction techniques.
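The core of individual-based rarefaction fits in one formula: the expected number of species in a random subsample of n individuals is the sum, over species, of the probability that the species appears at least once. A minimal Python sketch follows; the function name and the two sets of counts are hypothetical.

```python
from math import comb


def rarefied_richness(abundances, sample_size):
    """Expected number of species in a random subsample of individuals.

    For each species with N_i individuals out of N in total, the chance it
    is missed entirely by a subsample of n individuals drawn without
    replacement is C(N - N_i, n) / C(N, n).
    """
    total = sum(abundances)
    if not 0 <= sample_size <= total:
        raise ValueError("sample_size must be between 0 and the total count")
    all_draws = comb(total, sample_size)
    return sum(
        1 - comb(total - n_i, sample_size) / all_draws
        for n_i in abundances
        if n_i > 0
    )


# Hypothetical counts: a heavily sampled site and a lightly sampled one.
well_sampled = [120, 80, 40, 20, 10, 5, 3, 1, 1]  # 280 individuals, 9 species
lightly_sampled = [12, 9, 6, 3]                    # 30 individuals, 4 species

# Compare both at the smaller sample size rather than at raw richness.
n = sum(lightly_sampled)
print(rarefied_richness(well_sampled, n))
print(rarefied_richness(lightly_sampled, n))
```

Comparing the two sites at 30 individuals each removes the advantage the first site gets purely from having been sampled more.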

Sampling effort was one problem, and ecologists also recognized that even with equivalent sampling effort, we are likely missing rare and cryptic species. Most notably Anne Chao and Ramon Margalef developed a series of diversity estimators in the 1980s-1990s. These types of estimators place emphasis on the numbers of rare species, because these give insight into the unobserved species. All things being equal, the community with more rare species likely has more unobserved species. These types of estimators are particularly important when we need to estimate the ‘true’ diversity from a limited number of samples. For example, researchers at Sun Yat-sen University in Guangzhou, China, recently performed metagenomic sampling of almost 2000 soil samples from a 500x1500 m forest plot. From these samples they used all known diversity estimators and have come to the conclusion that there are about 40,000 species of bacteria and 16,000 species of fungi in this forest plot! This level of diversity is truly astounding, and without genetic sampling and the suite of diversity estimators, we would have no way of knowing that there is this amazing, complex world beneath our feet.
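To show how rare species carry the information about unseen ones, here is a minimal Python sketch of the Chao1 estimator, one well-known member of this family. It uses the classic form when doubletons are present and the bias-corrected form when there are none; the function name and the counts are made up, and this is an illustration rather than the procedure used in the soil study mentioned above.

```python
def chao1(abundances):
    """Chao1 estimate of total species richness from abundance counts.

    f1 = number of species seen exactly once (singletons)
    f2 = number of species seen exactly twice (doubletons)
    Classic form: S_obs + f1**2 / (2 * f2); when f2 == 0 the bias-corrected
    form S_obs + f1 * (f1 - 1) / 2 avoids dividing by zero.
    """
    observed = sum(1 for n in abundances if n > 0)
    f1 = sum(1 for n in abundances if n == 1)
    f2 = sum(1 for n in abundances if n == 2)
    if f2 > 0:
        return observed + f1 ** 2 / (2 * f2)
    return observed + f1 * (f1 - 1) / 2


# Hypothetical sample: 7 observed species, 3 singletons, 2 doubletons.
sample = [10, 5, 1, 1, 1, 2, 2]
print(chao1(sample))  # 7 + 9 / 4 = 9.25
```

Many singletons relative to doubletons pushes the estimate up, which matches the intuition that a sample full of species seen only once has probably missed others altogether.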

As we move forward, researchers are measuring diversity in new ways, by quantifying phylogenetic and functional diversity, and we will need new methods to estimate these for entire communities and habitats. Anne Chao and colleagues have recently published a method to estimate true phylogenetic diversity (Chao et al., 2014).

6. Hierarchical and Bayesian modelling: Understanding complex living systems.

Each previous section reinforces the fact that ecology has embraced statistical methods that allow it to incorporate complexity. Accurately fitting models to observational data might require large numbers of parameters with different distributions and complicated interconnections. Hierarchical models offer a bridge between theoretical models and observational data: they can account for missing or biased data, latent (unmeasured) variables, and model uncertainty. In short, they are ideal for the probabilistic nature of ecological questions and predictions (Royle and Dorazio, 2008). The computational and conceptual tools have greatly advanced over the past decade, with a number of good computer programs (e.g., BUGS) available and several useful texts (e.g., Bolker 2008).

The usage of these types of models has been closely (but not exclusively) tied to Bayesian approaches to statistics. Bayesian statistics have had much written about them, and not a little controversy beyond the scope of this post (but see these blogs for lots of interesting discussion). The focus is on assigning a probability distribution to a hypothesis (the prior distribution) which can be updated sequentially as more information is obtained. Such an approach may have natural similarities to management and applied practices in ecology, where expert or existing knowledge is already incorporated into decision making and predictions informally. Often though, hierarchical models can be tailored to better fit our hypotheses than traditional univariate statistics. For example, species occupancy or abundance can be modelled as probabilities based on detection error, environmental fit and dispersal likelihood.
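Sequential updating of a prior is easiest to see in the simplest possible case. The Python sketch below assumes a Beta prior on the probability that a site is occupied and binomial survey data, so the update has a closed form. It deliberately ignores detection error and everything else a real hierarchical occupancy model would include; the function names and survey numbers are hypothetical.

```python
def update_beta(prior, occupied, surveyed):
    """Posterior Beta(alpha, beta) after seeing `occupied` of `surveyed` sites.

    With a Beta(alpha, beta) prior and binomial data, the posterior is
    Beta(alpha + occupied, beta + surveyed - occupied).
    """
    alpha, beta = prior
    return (alpha + occupied, beta + (surveyed - occupied))


def beta_mean(params):
    """Mean of a Beta(alpha, beta) distribution."""
    alpha, beta = params
    return alpha / (alpha + beta)


# Flat prior: no expert knowledge about occupancy.
belief = (1, 1)

# Hypothetical surveys in two consecutive years.
belief = update_beta(belief, occupied=3, surveyed=10)
print(belief, beta_mean(belief))
belief = update_beta(belief, occupied=6, surveyed=10)
print(belief, beta_mean(belief))

# Updating year by year gives the same answer as pooling both years.
assert belief == update_beta((1, 1), occupied=9, surveyed=20)
```

Replacing the flat prior with, say, an expert's belief that the species is rare only changes the starting pair of numbers, which is the sense in which existing knowledge enters the analysis formally instead of informally.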

There is much more that can be said about hierarchical and Bayesian statistical models, and their incorporation into ecology is still in progress. The promise of these methods, that the complexity inherent in ecological processes can be more closely captured by statistical models and that model predictions are improving, is one of the most important developments in recent years.

7. The availability, community development and open sharing of statistical methods.

The availability of and access to statistical methods today is unparalleled in human history, and it is because of the program R. Not long ago, a researcher might have had to purchase a new piece of software to perform a specific analysis, or wait years for new analyses to become available. The reasons for this rise in availability are threefold. First, R is freely available, with no fees limiting access. Second, the community of users contributes to it, meaning that the specific analyses required for different questions are available, and often formulated to handle the most common types of data. Finally, new methods appear in R as they are developed. Cutting-edge techniques are immediately available, further fostering their use and scientific advancement.

References

Bolker, B. M. (2008). Ecological models and data in R. Princeton University Press.

Bray, J. R., & Curtis, J. T. (1957). An Ordination of the Upland Forest Communities of Southern Wisconsin. Ecological Monographs, 27(4), 325–349. doi:10.2307/1942268

Chao, A., Chiu, C.-H., Hsieh, T. C., Davis, T., Nipperess, D. A., & Faith, D. P. (2014). Rarefaction and extrapolation of phylogenetic diversity. Methods in Ecology and Evolution, n/a–n/a. doi:10.1111/2041-210X.12247

Connor, E.F. & Simberloff, D. (1979) The assembly of species communities: chance or competition? Ecology, 60, 1132-1140.

Diamond, J.M. (1975) Assembly of species communities. Ecology and evolution of communities (eds M.L. Cody & J.M. Diamond), pp. 324-444. Harvard University Press, Massachusetts.

Felsenstein, J. (1985). Confidence limits on phylogenies: An approach using the bootstrap. Evolution, 39, 783–791.

Fisher, R.A. (1925) Statistical methods for research workers. Oliver and Boyd, Edinburgh.

Fortin, M.-J. & Dale, M. (2005) Spatial Analysis: A guide for ecologists. Cambridge University Press, Cambridge.

Fortin, M.-J., Dale, M. & ver Hoef, J. (2002) Spatial analysis in ecology. Encyclopedia of Environmetrics (eds A.H. El-Shaawari & W.W. Piegorsch). John Wiley & Sons.

Gotelli, N.J. & Graves, G.R. (1996) Null models in ecology. Smithsonian Institution Press, Washington, DC.

Guillot, G., & Rousset, F. (2013). Dismantling the Mantel tests. Methods in Ecology and Evolution, 4(4), 336–344. doi:10.1111/2041-210x.12018

Hill, M. O. (1979). DECORANA - A FORTRAN program for Detrended Correspondence Analysis and Reciprocal Averaging.

Jaccard, P. (1901). Etude comparative de la distribution florale dans une portion des Alpes et du Jura. Bulletin de La Societe Vaudoise Des Sciences Naturelle, 37, 547–579.

Laughlin, D. C. (2014). Applying trait-based models to achieve functional targets for theory-driven ecological restoration. Ecology Letters, 17(7), 771–784. doi:10.1111/ele.12288

Legendre, P. (1993) Spatial autocorrelation: trouble or new paradigm? Ecology, 74.

Legendre, P., & Legendre, L. (1998). Numerical Ecology. Amsterdam: Elsevier Science B. V.

Mantel, N. (1967). The detection of disease clustering and a generalized regression approach. Cancer Research, 27, 209–220.

Neyman, J. & Pearson, E.S. (1933) On the problem of the most efficient tests of statistical hypotheses. Philosophical Transactions of the Royal Society A, CCXXXI.

Oksanen, J., Kindt, R., Legendre, P., O’Hara, R., Simpson, G. L., Stevens, M. H. H., & Wagner, H. (2008). Vegan: Community Ecology Package. Retrieved from http://vegan.r-forge.r-project.org/

Peres-Neto, P. R., & Jackson, D. A. (2001). How well do multivariate data sets match? The advantages of a Procrustean superimposition approach over the Mantel test. Oecologia, 129, 169–178.

Royle and Dorazio. (2008). Hierarchical Modeling and Inference in Ecology.