
Wednesday, December 3, 2014

#ESA100: Statistical Steps to Ecological Leaps

By Marc Cadotte and Caroline Tucker

For their centennial, ESA is asking members to identify the ecological milestones of the last 100 years, and they’ve asked the EEB & Flow to consider this question as a blog post. There are many candidates – ecology has grown from an amateur mix of natural history and physiology into a relevant and mature discipline. Part of this growth rests on major theoretical developments from great ecologists like Clements, Gleason, MacArthur, Whittaker, Wilson, Levins, Tilman and Hubbell. These people provided the ideas needed to move ecology into new territory. But ideas on their own aren’t enough in the absence of the necessary tools and methods. We argue that modern ecology would not exist without statistics.

The most cited paper in ecology and evolutionary biology is a methodological one: Felsenstein’s 1985 paper on phylogenetic confidence limits in Evolution, cited over 26,000 times (Felsenstein, 1985). Statistics is the backbone around which ecology develops. Every new statistical method potentially opens the door to new ways of analyzing data, and perhaps to new hypotheses. To this end, we show how seven statistical methods changed ecology.

1. P-values and Hypothesis Testing – Setting standards for evidence.

Ecological papers in the early 1900s tended to be data-focused, and that data was analyzed in statistically rudimentary ways: data were displayed graphically, perhaps with a simple model (e.g. a regression line) overlaid on the plot. Scientists sometimes argued that statistical tests offered no more than confirmation of the obvious.

At the same time, statistics were undergoing a revolution focused on hypothesis testing. Karl Pearson started it, but Ronald Fisher (Fisher 1925) and Pearson’s son Egon and Jerzy Neyman (Neyman & Pearson 1933) produced the theories that would change ecology. These men gave us the p-value – ‘the probability to obtain an effect equal to or more extreme than the one observed presuming the null hypothesis of no effect is true’ and gave us a modern view of hypothesis testing – i.e. that a scientist should attempt to reject a null hypothesis in favour of some alternative hypothesis.

It’s amazing to think that these concepts are now rote memorization for first year students, having become so ingrained into modern science. Hypothesis testing using some pre-specified level of significance is now the default method for looking for evidence. The questions asked, choices about sample size, experimental design and the evidence necessary to answer questions were all framed in the shadow of these new methods. p-values are no longer the only approach to hypothesis testing, but it is incontestable that Pearson and Fisher laid the foundations for modern ecology. (See Biau et al 2010 for a nice introduction).
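
To make the recipe concrete, here is a minimal sketch in R of a null-hypothesis test on simulated data from a hypothetical two-treatment experiment (all numbers invented):

```r
# Hypothetical experiment: does fertilization change plant biomass?
set.seed(123)
control   <- rnorm(20, mean = 10, sd = 2)  # biomass in 20 control plots
treatment <- rnorm(20, mean = 12, sd = 2)  # biomass in 20 fertilized plots

# p-value: the probability of a difference at least this extreme
# if the null hypothesis of no treatment effect were true
p <- t.test(treatment, control)$p.value

# Neyman-Pearson decision: reject the null at alpha = 0.05 if TRUE
p < 0.05
```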

2. Multivariate statistics: Beginning to capture ecological complexity.

Because the first statistical tests arose from agricultural studies, they were designed to test for differences among treatments or for departures from known distributions. They applied powerfully to experiments manipulating relatively few factors and measuring relatively few variables. However, these types of analyses did not easily permit investigation of the complex patterns and mechanisms observed in natural communities.

Often what community ecologists have in hand are multiple datasets about communities, including species composition and abundance, environmental measurements (e.g. soil nutrients, water chemistry, elevation, light, temperature), and perhaps distances between communities. And what researchers want to know is how compositional (multi-species) change among communities is determined by environmental variables. The importance of this type of analysis shouldn’t be understated: in an older tradition of community ecology, we would simply analyze changes in richness or diversity. But communities can show a lack of variation in diversity even when they are being actively structured: diversity is simply the wrong currency.

Many of the first forays into multivariate statistics came through measuring the compositional dissimilarity, or distance, between communities; Jaccard (1901) and Bray and Curtis (1957) were early inventors of such distance-based measures. Correlating compositional dissimilarity with environmental differences required ordination techniques. Principal Component Analysis (PCA) was actually invented by Karl Pearson around 1900, but computational limitations constrained its use until the 1980s. Around this time, other methods began to emerge which ecologists started to employ (Hill, 1979; Mantel, 1967). The development of new methods continues today (e.g. Peres-Neto & Jackson, 2001), and multivariate analysis is now a community ecology staple.

There are now full texts dedicated to the implementation of multivariate statistical tests with ecological data (e.g., Legendre & Legendre, 1998). Further, there are excellent resources available in R (more on this later) and especially in the package vegan (Oksanen et al., 2008), which implements most major multivariate methods. Going forward it is clear that multivariate techniques will continue to be reassessed and improved (e.g. Guillot & Rousset, 2013), and there will be a greater emphasis on the need to articulate multivariate hypotheses and perhaps use multivariate techniques to predict communities (Laughlin, 2014) – not just explain variation.
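
A minimal sketch of this workflow in R, using vegan's built-in lichen pasture datasets (community composition plus soil chemistry); the particular environmental variables here are chosen arbitrarily:

```r
library(vegan)
data(varespec)  # community matrix: sites x species abundances
data(varechem)  # environmental matrix: sites x soil variables

# Bray-Curtis compositional dissimilarity between communities
d <- vegdist(varespec, method = "bray")

# Unconstrained ordination (non-metric multidimensional scaling)
ord <- metaMDS(varespec, trace = FALSE)

# Is compositional distance correlated with environmental distance?
mantel(d, dist(scale(varechem)))

# Partition compositional variation among environmental variables
adonis2(varespec ~ N + P + K, data = varechem)
```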

3. Null models: Disentangling patterns and processes.

Ecology occurs over large spatial and temporal scales, and so it will always rely on observational data. Gathering observational data is often much easier than doing experimental work at the same spatial or temporal scale, but such data are also complicated to analyze: variation from a huge number of unmeasured variables may weaken patterns or create unexpected ones. Still, the search for patterns drove the analysis of observational data, including patterns along environmental gradients, patterns in species co-occurrences, and patterns in traits. The question of what represented a meaningful pattern was harder to answer.

It seems that ecology could not go on looking at patterns forever, but it took some heated arguments to finally change this. The ‘null model wars’ revolved around Jared Diamond’s putative assembly rules for birds on islands (Diamond 1975), which relied on a “checkerboard” pattern of species co-occurrences. The argument for null models was led by Connor and Simberloff (Connor & Simberloff 1979) and later joined by Nicholas Gotelli (e.g. Gotelli & Graves 1996). A null model, they pointed out, was necessary to determine whether observed patterns of bird distribution actually differed from random patterns of apparent non-independence between species pairs. Further, other ecological mechanisms (different habitat requirements, speciation, dispersal limitation) could also produce non-independence between species pairs. The arguments about how to appropriately formulate null models have never completely ended, but null models now drive ecological analyses. Tests of species-area relationships, phylogenetic diversity within communities, limiting similarity of body sizes or traits, global patterns of diversity, species co-occurrences, niche overlaps, and nestedness in networks likely all include a null model of some sort.
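
The logic is simple to implement. Below is a minimal sketch (not Connor and Simberloff's exact algorithm, and with invented data): compute a co-occurrence statistic on an observed site-by-species matrix, recompute it on many randomized matrices, and ask whether the observed value is extreme:

```r
# Invented presence/absence matrix: 10 sites x 10 species
set.seed(42)
comm <- matrix(rbinom(100, 1, 0.4), nrow = 10,
               dimnames = list(paste0("site", 1:10), paste0("sp", 1:10)))

# Statistic: proportion of species pairs that never co-occur ("checkerboards")
cooccur_stat <- function(m) {
  pairs <- combn(ncol(m), 2)
  mean(apply(pairs, 2, function(p) all(m[, p[1]] * m[, p[2]] == 0)))
}
obs <- cooccur_stat(comm)

# Null distribution: shuffle each species among sites, preserving its
# frequency but destroying any association between species
null <- replicate(999, cooccur_stat(apply(comm, 2, sample)))

# One-tailed p-value: are species more segregated than chance expects?
mean(c(null, obs) >= obs)
```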

The null model wars have been referred to as a difficult and contentious time for ecology. Published work (representing significant amounts of time and funding) perhaps needed to be re-evaluated to differentiate between true and null ecological patterns. But despite these growing pains, null models have forced ecology to mature beyond pattern-based analyses to more mechanistic ones.

4. Spatial statistics: Adding distance and connectivity.

Spatially explicit statistics and models seem like an obvious necessity for ecology. After all, the movement of species through space is an immensely important part of their life history, and most ecologically relevant characteristics of the landscape – resources, climate, habitat – vary through space. Despite this, until quite recently ecological models tended to assume that species and processes were distributed uniformly through space, and that species’ movement was uniform or random. The truism that points close in space should, all else being equal, be more similar than distant points, while obvious, involved a degree of statistical complexity and computing power that was long difficult to achieve.

Fortunately for ecology, the late 1980s and early 1990s were a time of rapid computing developments that enabled the incorporation of increasing spatial complexity into ecological models (Fortin & Dale 2005). Existing methods – some ecological, some borrowed from geography – finally became practical with available technology, including nearest neighbour distances, Ripley’s K, variograms, and the Mantel test (Fortin, Dale & ver Hoef 2002). Ideas now fundamental to ecology, such as connectivity, edge effects, spatial scale (“local” vs. “regional”), spatial autocorrelation, and spatial pattern (non-random, non-uniform spatial distributions), are the inheritance of this development. Many fields of ecology have incorporated spatial methods or even owe their development to spatial ecology, including metacommunities, landscape ecology, conservation and management, invasive species, disease ecology, population ecology, and population genetics. Pierre Legendre asked in his seminal paper on the topic (Legendre 1993) whether space was trouble, or a new paradigm. It is clear that space was an important addition to ecological analyses.
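
As an illustration, here is a hand-rolled sketch of one foundational spatial statistic, Moran's I, on invented point data (dedicated packages such as spdep provide tested implementations and significance tests):

```r
set.seed(1)
xy <- cbind(runif(50), runif(50))    # 50 random site coordinates
z  <- xy[, 1] + rnorm(50, sd = 0.3)  # a variable with an east-west trend

# Inverse-distance spatial weights, zero on the diagonal
w <- 1 / as.matrix(dist(xy))
diag(w) <- 0

# Moran's I: a spatially weighted correlation of deviations from the mean
zc <- z - mean(z)
I  <- (length(z) / sum(w)) * sum(w * outer(zc, zc)) / sum(zc^2)
I  # values above 0 mean nearby sites are more similar than expected
```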

5. Measuring diversity: rarefaction and diversity estimators.

How many species are there in a community? This is a question that inspires many biologists, and is something that is actually very difficult to measure. Cryptic, dormant, rare and microscopic organisms are often undersampled, and accurate estimates of community diversity need to deal with these undersampled species.

Communities may seem to have different numbers of species simply because some have been sampled more thoroughly. Unequal sampling effort can distort real differences or similarities in the numbers of species. For example, in some recent analyses of plant diversity using the freely available species occurrence data from GBIF, we found that Missouri seems to have the highest plant diversity – a likely outcome of the fact that the Missouri Botanical Garden routinely samples local vegetation and makes the data available. Methods for estimating diversity from equalized sampling effort were developed by a number of ecologists (Howard Sanders, Stuart Hurlbert, Dan Simberloff, and Ken Heck) in the 1960s and 1970s, resulting in modern rarefaction techniques.

Sampling effort was one problem, but ecologists also recognized that even with equivalent sampling effort, we are likely missing rare and cryptic species. Most notably, Anne Chao and Ramon Margalef developed a series of diversity estimators in the 1980s-1990s. These estimators place emphasis on the numbers of rare species, because rare species give insight into the unobserved ones: all things being equal, the community with more rare species likely has more unobserved species. Such estimators are particularly important when we need to estimate the ‘true’ diversity from a limited number of samples. For example, researchers at Sun Yat-sen University in Guangzhou, China, recently performed metagenomic sampling of almost 2000 soil samples from a 500x1500 m forest plot. From these samples they used all known diversity estimators and came to the conclusion that there are about 40,000 species of bacteria and 16,000 species of fungi in this forest plot! This level of diversity is truly astounding, and without genetic sampling and the suite of diversity estimators, we would have no way of knowing that this amazing, complex world exists beneath our feet.
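
The logic of these estimators is visible in the widely used Chao1 formula, which infers unseen species from the counts of species observed exactly once or twice. A minimal sketch with invented counts (vegan's estimateR() returns Chao1 and related estimators from the same input):

```r
# Bias-corrected Chao1: S_obs + f1(f1 - 1) / (2(f2 + 1))
chao1 <- function(abund) {
  s_obs <- sum(abund > 0)  # observed species richness
  f1 <- sum(abund == 1)    # singletons: species seen exactly once
  f2 <- sum(abund == 2)    # doubletons: species seen exactly twice
  s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))
}

# Hypothetical sample: many rare species suggest many unseen species
counts <- c(10, 6, 4, 2, 2, 1, 1, 1, 1, 1)
chao1(counts)  # ~13.3: the estimate exceeds the 10 species observed
```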

As we move forward, researchers are measuring diversity in new ways by quantifying phylogenetic and functional diversity, and we will need new methods to estimate these for entire communities and habitats. Anne Chao and colleagues have recently published a method to estimate true phylogenetic diversity (Chao et al., 2014).

6. Hierarchical and Bayesian modelling: Understanding complex living systems.

Each previous section reinforces the fact that ecology has embraced statistical methods that allow it to incorporate complexity. Accurately fitting models to observational data might require large numbers of parameters with different distributions and complicated interconnections. Hierarchical models offer a bridge between theoretical models and observational data: they can account for missing or biased data, latent (unmeasured) variables, and model uncertainty. In short, they are ideal for the probabilistic nature of ecological questions and predictions (Royle & Dorazio, 2008). The computational and conceptual tools have greatly advanced over the past decade, with a number of good computer programs (e.g., BUGS) available and several useful texts (e.g., Bolker 2008).

The usage of these models has been closely (but not exclusively) tied to Bayesian approaches to statistics. Much has been written about Bayesian statistics, including no small amount of controversy that is beyond the scope of this post. The focus is on assigning a probability distribution to a hypothesis (the prior distribution), which can be updated sequentially as more information is obtained. Such an approach has natural similarities to management and applied practices in ecology, where expert or existing knowledge is already incorporated informally into decision making and predictions. Moreover, hierarchical models can often be tailored to fit our hypotheses better than traditional univariate statistics can. For example, species occupancy or abundance can be modelled as probabilities based on detection error, environmental fit and dispersal likelihood.
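
As a toy illustration of that sequential updating, consider a conjugate Beta-Binomial estimate of an occupancy probability (not a full hierarchical model; the survey numbers are invented and detection error is ignored here, though tools like BUGS/JAGS or the unmarked package handle it):

```r
# Beta(1, 1) prior: no initial information about occupancy probability
prior <- c(a = 1, b = 1)

# First survey season: 3 of 10 sites occupied
post1 <- prior + c(3, 10 - 3)  # posterior is Beta(4, 8)

# Second season: 7 of 20 sites occupied; update the previous posterior
post2 <- post1 + c(7, 20 - 7)  # posterior is Beta(11, 21)

post2["a"] / sum(post2)  # posterior mean occupancy ~ 0.34
```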

There is much more that could be said about hierarchical and Bayesian statistical models, and their incorporation into ecology is still in progress. Their promise – that statistical models can more closely capture the complexity inherent in ecological processes, and that model predictions will improve as a result – is one of the most important developments in recent years.

7. The availability, community development and open sharing of statistical methods.

The availability of and access to statistical methods today is unparalleled in history, and this is largely because of the program R. Not long ago, a researcher might have had to purchase a new piece of software to perform a specific analysis, or wait years for new analyses to become available. The reasons for this explosion in the availability of statistical methods are threefold. First, R is freely available, with no fees limiting access. Second, the community of users contributes to it, meaning that the specific analyses required for different questions are available, and often formulated to handle the most common types of data. Finally, new methods appear in R as they are developed, so cutting-edge techniques are immediately available, further fostering their use and scientific advancement.
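
Because contributed methods ship as packages, adopting a new technique is a single command (a sketch):

```r
install.packages("vegan")  # fetch a community-contributed package from CRAN
library(vegan)             # its methods are immediately available
citation("vegan")          # and the developers can be credited properly
```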

References

Bolker, B. M. (2008). Ecological models and data in R. Princeton University Press.

Bray, J. R., & Curtis, J. T. (1957). An Ordination of the Upland Forest Communities of Southern Wisconsin. Ecological Monographs, 27(4), 325–349. doi:10.2307/1942268

Chao, A., Chiu, C.-H., Hsieh, T. C., Davis, T., Nipperess, D. A., & Faith, D. P. (2014). Rarefaction and extrapolation of phylogenetic diversity. Methods in Ecology and Evolution, n/a–n/a. doi:10.1111/2041-210X.12247

Connor, E.F. & Simberloff, D. (1979) The assembly of species communities: chance or competition? Ecology, 60, 1132-1140.

Diamond, J.M. (1975) Assembly of species communities. Ecology and evolution of communities (eds M.L. Cody & J.M. Diamond), pp. 324-444. Harvard University Press, Massachusetts.

Felsenstein, J. (1985). Confidence limits on phylogenies: An approach using the bootstrap. Evolution, 39, 783–791.

Fisher, R.A. (1925) Statistical methods for research workers. Oliver and Boyd, Edinburgh.

Fortin, M.-J. & Dale, M. (2005) Spatial Analysis: A guide for ecologists. Cambridge University Press, Cambridge.

Fortin, M.-J., Dale, M. & ver Hoef, J. (2002) Spatial analysis in ecology. Encyclopedia of Environmetrics (eds A.H. El-Shaawari & W.W. Piegorsch). John Wiley & Sons.

Gotelli, N.J. & Graves, G.R. (1996) Null models in ecology. Smithsonian Institution Press, Washington, DC.

Guillot, G., & Rousset, F. (2013). Dismantling the Mantel tests. Methods in Ecology and Evolution, 4(4), 336–344. doi:10.1111/2041-210x.12018

Hill, M. O. (1979). DECORANA — A FORTRAN program for Detrended Correspondence Analysis and Reciprocal Averaging. Cornell University, Ithaca, New York.

Jaccard, P. (1901). Étude comparative de la distribution florale dans une portion des Alpes et du Jura. Bulletin de la Société Vaudoise des Sciences Naturelles, 37, 547–579.

Laughlin, D. C. (2014). Applying trait-based models to achieve functional targets for theory-driven ecological restoration. Ecology Letters, 17(7), 771–784. doi:10.1111/ele.12288

Legendre, P. (1993) Spatial autocorrelation: trouble or new paradigm? Ecology, 74, 1659-1673.

Legendre, P., & Legendre, L. (1998). Numerical Ecology. Amsterdam: Elsevier Science B. V.

Mantel, N. (1967). The detection of disease clustering and a generalized regression approach. Cancer Research, 27, 209–220.

Neyman, J. & Pearson, E.S. (1933) On the problem of the most efficient tests of statistical hypotheses. Philosophical Transactions of the Royal Society A, 231, 289–337.

Oksanen, J., Kindt, R., Legendre, P., O’Hara, R., Simpson, G. L., Stevens, M. H. H., & Wagner, H. (2008). Vegan: Community Ecology Package. Retrieved from http://vegan.r-forge.r-project.org/

Peres-Neto, P. R., & Jackson, D. A. (2001). How well do multivariate data sets match? The advantages of a Procrustean superimposition approach over the Mantel test. Oecologia, 129, 169–178.

Royle, J. A. & Dorazio, R. M. (2008). Hierarchical Modeling and Inference in Ecology: The Analysis of Data from Populations, Metapopulations and Communities. Academic Press.

Monday, October 6, 2014

What is ecology’s billion dollar brain?

(*The topic of the billion dollar proposal came up with Florian Hartig (@florianhartig), with whom I had an interesting conversation on the idea*)

Last year, the European Commission awarded 1 billion dollars to a hugely ambitious project to recreate the human brain using supercomputers. If successful, the Human Brain Project would revolutionize neuroscience (although skepticism remains as to whether this project is more of a pipe dream than a reasonable goal). For ecology and evolution, where infrastructure costs are relatively low (compared to, say, a Large Hadron Collider), 1 billion dollars means that there is essentially no financial limitation on your proposal: nearly any project, experiment, analysis, dataset, or workforce is within the realm of possibility. The European Commission call was for research to occur over 10 years, meaning that the usual constraints on project length (driven by grant terms and graduate student theses) are low. So if you could write a proposal upon which there are essentially no constraints at all, what would it be for? (*If you think that 10 years is too limiting for a proper long-term study, feel free to assume you can set up the infrastructure in 10 years and run it for as long as you want.)

The first thing I recognized was that in proposing the 'ultimate' ecological project, you're implicitly stating how you think ecology should be done. For example, you could focus on the most general questions and start from the bottom. If so, it might be most effective to ask a single fundamental question. It would not be unreasonable to propose measuring metabolic rates under standardized conditions for every extant species, and developing a database of parameter values for them. This would be the most complete ecological database ever; that certainly seems like an achievement.

But perhaps you choose something that is still of general importance but less simplistic, and run a standardized experiment in multiple systems. This has been effective for the NutNet project. Propose to run replicate experiments with top-of-the-line warming arrays on plant communities in every major ecosystem. Done for 10 years, over a reasonably large scale, with data recorded on physiology and important life history events, this might provide some ability to predict how warming temperatures are affecting ecosystems. 

The alternative is to embrace ecological complexity (and the ability to deal with complexity that 1 billion dollars offers). Given the analytic power, equipment, and man hours that 1 billion dollars can buy, you could record every single variable--biotic, abiotic, weather--in a particular system (say, a wetland) for every second of every day. If you don’t simply drown in the data you’ve gathered, maybe you can reconstruct that wetland and predict its every property from the details. While that may seem a bit extreme, if you are a complexity-fatalist you start to recognize that even the general experiments are quickly muddied by complexity. Even that simple, general list of species' metabolic parameters quickly spirals into complexity. Does it make sense to use only one set of standardized conditions? After all, conditions that are reasonable for a rainforest tree are meaningless for an ocean shark or a tundra shrub. Do you use the mean condition for each ecosystem as the standard, knowing that species may only interact with the variance or extremes of those conditions (such as desert annuals that bloom after rains, or bacteria that use cyst stages to avoid harsh environments)? What about ontogenetic or plastic differences? Intraspecific differences?

It's probably best then to realize that there is no perfect ecological experiment. The interesting thing about the Human Brain Project is that neuroscience is more like ecology than many scientific fields: it deals with complex organic systems with emergent properties and great variability. What ecology needs, ever so simplistically, is more data and better models. Maybe, like neuroscience, we should request a supercomputer that could locate and incorporate all ecological data ever collected, across fields (natural history, forestry, agronomy, etc.), and recognize the connections between those data based on geography, species, or scale. This could both give us the most sophisticated possible data map, showing where data gaps exist and where areas are data-rich and ready for model development. Further, it could (like the Human Brain Project) begin to develop models for the interconnections between data.

Without too many billion dollar calls going on, this is only a thought experiment, but I have yet to find someone who had an easy answer for what they would propose to do (ecologically) with 1 billion dollars. Why is it so difficult?

Monday, March 17, 2014

How are we defining prediction in ecology?

There is an ongoing debate about the role of wolves in altering ecosystem dynamics in Yellowstone, which has stimulated a number of recent papers, and apparently inspired an editorial in Nature. Entitled “An Elegant Chaos”, the editorial reads a bit like an apology for ecology’s failure at prediction, suggesting that we should embrace ecology’s lack of universal laws and recognize that “Ecological complexity, which may seem like an impenetrable thicket of nuance, is also the source of much of our pleasure in nature”.

Most of the time, I also fall squarely into the pessimistic “ecological complexity limits predictability” camp. And concerns about prediction in ecology are widespread and understandable. But there is also something frustrating about the way we so often approach ecological prediction. Statements such as “It would be useful to have broad patterns and commonalities in ecology” feel incomplete. Is it that we really lack “broad patterns and commonalities in ecology”, or has ecology adopted a rather precise and self-excoriating definition for “prediction”? 

We are fixated on achieving particular forms of prediction (either robust universal relationships, or else precise and specific numerical outputs), and perhaps we are failing at achieving these. But on the other hand, ecology is relatively successful in understanding and predicting qualitative relationships, especially at large spatial and temporal scales. At the broadest scales, ecologists can predict the relationships between species numbers and area, between precipitation, temperature and habitat type, between habitat types and the traits of species found within, between productivity and the general number of trophic levels supported. Not only do we ignore this foundation of large-scale predictable relationships, but we ignore the fact that prediction is full of tradeoffs. As a paper with the excellent title, “The good, the bad, and the ugly of predictive science” states, any predictive model is still limited by tradeoffs between: “robustness-to-uncertainty, fidelity-to-data, and confidence-in-prediction…. [H]igh-fidelity models cannot…be made robust to uncertainty and lack-of-knowledge. Similarly, equally robust models do not provide consistent predictions, hence reducing confidence-in-prediction. The conclusion of the theoretical investigation is that, in assessing the predictive accuracy of numerical models, one should never focus on a single aspect.” Different types of predictions have different limitations. But sometimes it seems that ecologists want to make predictions in the purest, trade-off free sense - robustness-to-uncertainty, fidelity-to-data, and confidence-in-prediction - all at once. 

In relation to this, ecological processes tend to be easier to represent in a probabilistic fashion, something that we seem rather uncomfortable with. Ecology is predictive in the way medicine is predictive – we understand the important cause and effect relationships, many of the interactions that can occur, and we can even estimate the likelihood of particular outcomes (of smoking causing lung cancer, of warming climate decreasing diversity), but predicting how a human body or ecosystem will change is always inexact. The complexity of multiple independent species, populations, genes, traits, all interacting with similarly changing abiotic conditions makes precise quantitative predictions at small scales of space or time pretty intractable. So maybe that shouldn’t be our bar for success. The analogous problem for an evolutionary biologist would be to predict not only a change in population genetic structure but also the resulting phenotypes, accounting for epigenetics and plasticity too. I think that would be considered unrealistic, so why is that where we place the bar for ecology? 

In part the bar for prediction is set so high because the demand for ecological knowledge, given habitat destruction, climate change, extinction, and a myriad of other changes, is so great. But in attempting to fulfill that need, it may be worth acknowledging that predictions in ecology occur on a hierarchy from those relationships at the broadest scale that we can be most certain about, moving down to the finest scale of interactions and traits and genes where we may be less certain. If we see events as occurring with different probabilities, and our knowledge of those probability distributions declining the farther down that hierarchy we travel, then our predictive ability will decline as well. New and additional research adds to the missing or poor relationships, but at the finest scales, prediction may always be limited.

Wednesday, January 29, 2014

Guest post: One way to quantify ecological communities

This is a guest post by Aspen Reese, a graduate student at Duke University, who in addition to studying the role of trophic interactions in driving secondary succession is interested in how ecological communities are defined. Below she explains one possible way to explicitly define communities, although it's important to note that communities must be explicitly represented as networks for the calculations below.

Because there are so many different ways of defining “community”, it can be hard to know what, exactly, we’re talking about when we use the term. It’s clear, though, that we need to take a close look at our terminology. In her recent post, Caroline Tucker offers a great overview of why this is such an important conversation to have. As she points out, we aren’t always completely forthright in laying out the assumptions underlying the definition used in any given study or subdiscipline. The question remains then: how to function—how to do and how to communicate good research—in the midst of such a terminological muddle?

We don’t need a single, objective definition of community (could we ever agree? And why should we?). What we do need, though, are ways to offer transparent, rigorous definitions of the communities we study. Moreover, we need a transferable system for quantifying these definitions.

One way we might address this need is to borrow a concept from the philosophy of biology, called entification. Entification is a way of quantifying thingness. It allows us to answer the question: how much does my study subject resemble an independent entity? And, more generally, what makes something an entity at all?

Stanley Salthe (1985) gives us a helpful definition: Entities can be defined by their boundaries, degree of integration, and continuity (Salthe also includes scale, but in a very abstract way, so I’ll leave that out for now). What we need, then, is some way to quantify the boundedness, integration, and continuity of any given community. By conceptualizing the community as an ecological network*—with a population of organisms (nodes) and their interactions (edges)—that kind of quantification becomes possible.

Consider the following framework: 

Boundedness
Communities are discontinuous from the environment around them, but how discrete that boundary is varies widely. We can quantify this discreteness by measuring the number of nodes that don’t have interactions outside the system relative to the total number of nodes in the system (Fig. 1a). 

Boundedness = (Total nodes without external edges)/(Total nodes)

Integration
Communities exhibit the interdependence and connections of their parts—i.e. integration. For any given level of complexity (which we can define as the number of constitutive part types, i.e. nodes (McShea 1996)), a system becomes more integrated as the networks and feedback loops between the constitutive part types become denser and the average path length decreases. Therefore, degree of integration can be measured as one minus the average path length (or average distance) between two parts relative to the total number of parts (Fig. 1b).

Integration = 1 - (Average path length)/(Total nodes)

Continuity
All entities endure, if only for a time. And all entities change, if only due to entropy. The more similar a community is to its historical self, the more continuous it is. Using networks from two time points, a degree of continuity is calculated with a Jaccard index as the total number of interactions unchanged between both times relative to the total number of interactions at both times (Fig. 1c).

Continuity = (Total edges-changed edges)/(Total edges)
Fig 1. The three proposed metrics for describing entities—(A) boundedness, (B) integration, and (C) continuity—and how to calculate them. 
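
Once a community is represented as a network, all three metrics reduce to counting nodes and edges. Below is a minimal sketch in R using the igraph package, with an invented five-member web plus one external consumer; it follows the formulas above, computing continuity as the Jaccard overlap of two edge lists:

```r
library(igraph)

# Invented food web: five community members plus one external node
edges <- data.frame(
  from = c("alga",   "alga",     "grazer",   "grazer",    "predator"),
  to   = c("grazer", "shredder", "predator", "predator2", "waterfowl")
)
g <- graph_from_data_frame(edges, directed = FALSE)
internal <- setdiff(V(g)$name, "waterfowl")

# Boundedness: fraction of community nodes with no external interactions
touches_external <- sapply(internal, function(v)
  any(!neighbors(g, v)$name %in% internal))
boundedness <- mean(!touches_external)

# Integration: 1 - (average path length / total nodes), internal web only
gi <- induced_subgraph(g, internal)
integration <- 1 - mean_distance(gi) / vcount(gi)

# Continuity: Jaccard overlap of the edge sets at two time points
edges_t1 <- c("alga-grazer", "grazer-predator", "grazer-predator2")
edges_t2 <- c("alga-grazer", "grazer-predator", "alga-shredder")
continuity <- length(intersect(edges_t1, edges_t2)) /
  length(union(edges_t1, edges_t2))

c(boundedness = boundedness, integration = integration,
  continuity = continuity)
```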

Let’s try this method out on an arctic stream food web (Parker and Huryn 2006). The stream was measured for trophic interactions in June and August of 2002 (Fig. 2). If we exclude detritus and consider the waterfowl as outside the community, we calculate that the stream has a degree of boundedness of 0.79 (i.e. ~80% of its members interact only with species included in the community), a degree of integration of 0.98 (i.e. the average path length is very close to 1), and a degree of continuity of 0.73 (i.e. almost 3/4 of the interactions are constant over the course of the two months). It’s as easy as counting nodes and edges – not too bad! But what does it mean?
Fig. 2: The food web community in an arctic stream over summer 2002. Derived from Parker and Huryn (2006). 

Well, compare the arctic stream to a molecular example. Using a simplified network (Burnell et al. 2005), we can calculate the entification of the cellular respiration pathway (Fig. 3). We find that for the total respiration system, including both the aerobic and anaerobic pathways, boundedness is 0.52 and integration is 0.84. The continuity of the system is likely equal to 1 at most times because both pathways are active, and their makeup is highly conserved. However, if one were to test for the continuity of the system when it switches between the aerobic and the anaerobic pathway, the degree of continuity drops to 0.6.
Fig. 3: The anaerobic and aerobic elements of cellular respiration, one part of a cell’s metabolic pathway. Derived from Burnell et al. (2005)
Contrary to what you might expect, the ecological entity showed greater integration than the molecular pathway. This makes sense, however, since molecular pathways are more linear, which increases the average shortest distance between parts, thereby decreasing integration. In contrast, the continuity of molecular pathways can be much higher when considered in aggregate. In general, we would expect the boundedness score for ecological entities to be fairly low, but with large variation between systems. The low boundedness score of the molecular pathway reflects the fact that we are only exploring a small part of the metabolic pathway and including ubiquitous molecules (e.g. NADH and ATP).

Here are three ways such a system could improve community ecology: First, the process can highlight interesting ecological aspects of the system that aren’t immediately obvious. For example, food webs display much higher integration when parasites are included, and a recent call (Lafferty et al. 2008) to include these organisms highlights how a closer attention to under-recognized parts of a network can drastically change our understanding of a community. Or consider how the recognition that islands, which have clear physical boundaries, may have low boundedness due to their reliance on marine nutrient subsidies (Polis and Hurd 1996) revolutionized how we study them. Second, this methodology can help a researcher find a research-appropriate, cost-effective definition of the study community that also maximizes its degree of entification. A researcher could use sensitivity analyses to determine what effect changing the definition of her community would have on its characterization. Then, when confronted with the criticism that a certain player or interaction was left out of her study design, she could respond with an informed assessment of whether the inclusion of further parts or processes would actually change the character of the system in a quantifiable way. Finally, the formalized process of defining a study system will facilitate useful conversation between researchers, especially those who have used different definitions of communities. It will allow for more informed comparisons between systems that are similar in these parameters or help indicate a priori when systems are expected to differ strongly in their behavior and controls.

Communities, or ecosystems for that matter, aren’t homogeneous; they don’t have clear boundaries; they change drastically over time; we don’t know when they begin or end; and no two are exactly the same (see Gleason 1926). Not only are communities unlike organisms, but it is often unclear whether or not communities or ecosystems are units of existence at all (van Valen 1991). We may never find a single objective definition for what they are. Nevertheless, we work with them every day, and it would certainly be helpful if we could come to terms with their continuous nature. Whatever definition you choose to use in your own research—make it explicit and make it quantifiable. And be willing to discuss it with your peers. It will make your, their, and my research that much better.

Tuesday, January 21, 2014

A multiplicity of communities for community ecology

Community ecologists have struggled with some fundamental issues for their discipline. A longstanding example is that we have failed to formally and consistently define our study unit – the ecological community. Textbook definitions are often broad and imprecise: for example, according to Wikipedia "a community...is an assemblage or associations of populations of two or more different species occupying the same geographical area". The topic of how to define the ecological community is periodically revived in the literature (for example, Lawton 1999; Ricklefs 2008), but in practice, papers rely on implicit but rarely stated assumptions about "the community". And even if every paper spent page space attempting to elucidate what it is we mean by “community”, little consistency would be achieved: every subdiscipline relies on its own communally understood working definition.

In their 1994 piece on ecological communities, Palmer and White suggested “that community ecologists define community operationally, with as little conceptual baggage as possible…”. It seems that ecological subdisciplines have operationalized some definition of "the community", but one of the weaknesses of doing so is that the conceptual basis for these communities is often obscured. Even if a community is simply where you lay your quadrat, you are making particular assumptions about what a community is. And making assumptions to delimit a community is not problematic: the problem is when results are interpreted without keeping your conceptual assumptions in mind. And certainly understanding what assumptions each subfield is making is far more important than simply fighting, unrealistically, for consistent definitions across every study and field.
 
Defining ecological communities.
Most definitions of the ecological community vary in terms of only a few basic characteristics (figure above) required to delimit *their* community. Communities can be defined to require that a group of species co-occur in space and/or time, and this group of species may or may not be required to interact. For example, a particular subfield might define communities simply in terms of co-occurrence in space and time, without requiring that interactions be explicitly considered or measured. This is not to say they don't believe that such interactions occur, just that they are not important for the research. Microbial "communities" tend to be defined as groups of co-occurring microbes, but interspecific interactions are rarely measured explicitly (for practical reasons). Similarly, a community defined as "neutral" might be studied in terms of characteristics other than species interactions. Studies of succession or restoration might require that species interact in a given space, but since species composition has changed or is changing through time, temporal co-occurrence is less important as an assumption. Subdisciplines that include all three characteristics include theoretical approaches, which tend to be very explicit in defining communities, and studies of food webs, which similarly require that species coexist and interact in space and time. On the other hand, a definition such as “[i]t is easy to define local communities wherein species interact by affecting each other’s demographic rates” (Leibold et al. 2004) does not include any explicit relationship of those species with space – making it possible to consider regionally coexisting species.

How you define the scale of interest is perhaps more important in distinguishing communities than the particulars of space, time, and interactions. Even if two communities are defined as having the same components, a community studied at the spatial or temporal scale of zooplankton is far different than one studied in the same locale and under the same particulars, but with interest in freshwater fish communities. The scale of interactions considered by a researcher interested in a plant community might include a single trophic level, while a food web ecologist would expand that scale of interactions to consider all the trophic levels. 

The final consideration relates to the historical debate over whether communities are closed, discrete entities, as they are often modelled in theoretical exercises, or porous, overlapping ones. The assumption in many studies tends to be that communities are discrete and closed, as it is difficult to model communities or food webs without such simplifying assumptions about what enters and leaves the system. On the other hand, some subdisciplines must explicitly assume that their communities are open to invasion and inputs from external communities. Robert Ricklefs, in his 2008 Sewall Wright Address, made one of the more recent calls for a move from unrealistic closed communities to the acceptance that communities are really composed of the overlapping regional distributions of multiple organisms, and are not local or closed in any meaningful way.

These differences matter most when comparing or integrating results which used different working definitions of "the community". It seems more important to note possible incompatibilities in working definitions than to force some one-size-fits-all definition on everything. In contrast to Palmer and White, the focus should not be on ignoring the conceptual, but rather on recognizing the relationship between practice and concept. For example, microbial communities are generally defined as species co-occurring in space and time, but explicit interactions don't have to be shown. While this is sensible from a practical perspective, the problem comes when theory and literature from other areas that assume interactions are occurring is directly applied to microbial communities. Only by embracing this multiplicity of definitions can we piece together existing data and evidence across subdisciplines to more fully understand “community ecology” in general.

Thursday, November 14, 2013

How many traits make a plant? How dimensionality simplifies plant community ecology.

Daniel C. Laughlin. 2013. The intrinsic dimensionality of plant traits and its relevance to community assembly. Journal of Ecology. Accepted manuscript online: 4 NOV. DOI: 10.1111/1365-2745.12187

Community ecology is difficult in part because it is so multi-dimensional: communities may include hundreds of species, and the niches of each of those species are themselves multi-dimensional. Functional or trait-based approaches have been presented as a solution to this problem, since fewer traits (compared to the number of species) may be needed to capture or predict a community’s dynamics. But even functional ecology is multi-dimensional, and many traits are necessary to truly understand a given species or community. The question, when measuring traits to delineate a community, is: how many traits are necessary to capture species’ responses to their biotic and/or abiotic environment? Too few and you limit your understanding; too many and your workload becomes unfeasible.

Plant communities in particular have been approached using a functional framework (they don't move, so trait measurements aren't so difficult), but the number and types of traits that are usually measured vary from study to study. Plant ecologists have defined functional groups for plants which are ecologically similar, identified particular (“functional”) traits as being important, including SLA, seed mass, or height, or taken a "more is more" approach to measurements. There are even approaches that capture several dimensions by identifying important axes (leaf-height-seed strategy, etc.). Which of these approaches is best is not clear. In a new review, Daniel Laughlin rather ambitiously attempts to answer how many (and which) traits plant ecologists should consider. He asks whether the multi-dimensional nature of ecological systems is a curse (there is too much complexity for us to ever capture), or a blessing (is there a limit on how much complexity actually matters for understanding these systems)? Can dimensionality help plant ecologists determine the number of traits they need to measure? 
From Laughlin 2013. The various trait axes (related to plant organs) important for plant function.
Laughlin suggests that an optimal approach to dimensionality should consider each plant organ (roots, leaves, height; figure above). Many of the traits regularly measured are correlated (for example, specific leaf area, leaf dry matter content, lifespan, mass-based maximum rate of photosynthesis, dark respiration rate, leaf nitrogen concentration and leaf phosphorus concentration are all interrelated), and so are potentially redundant sources of information. However, measurements from the same organ may still provide additional information – leaf surface area provides different information than measures of the leaf economic spectrum – so the solution is not simply measuring fewer traits per organ. Despite redundancy in the traits plant ecologists measure, it is important to recognize that dimensionality is very high in plant communities. Statistical methods are useful for reducing dimensionality (for example, principal coordinates analysis), but even when applied, Laughlin implies that authors often over-reduce trait data by retaining only a few axes of information.

Using three very large plant species-trait datasets (with 16-67(!) trait measures), Laughlin applies a variety of statistical methods to explore effective dimensionality reduction. He then estimates the intrinsic dimensionality (i.e. the number of dimensions necessary to capture the majority of the information in community structure) for the three datasets (figure below). The results were surprisingly consistent for each data set – even when 67 plant traits were available, the median intrinsic number of dimensions was only 4-6. While this is a reasonably low number, it's worth noting that the number of dimensions analyzed in the original papers using those datasets was even lower (only 2-3).
From Laughlin 2013. The intrinsic number of traits/dimensions
necessary to capture variation in community structure.
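
One way to see the logic of intrinsic dimensionality (a simple variance-explained criterion, not Laughlin's exact estimators) is to ask how many principal components are needed to capture most of the variation in a correlated trait matrix; a sketch with simulated data:

```r
# Simulate 20 correlated traits generated by 4 underlying dimensions
set.seed(7)
n <- 200                                  # number of species
latent <- matrix(rnorm(n * 4), n, 4)      # 4 "true" trait dimensions
loadings <- matrix(rnorm(4 * 20), 4, 20)  # map dimensions to 20 traits
traits <- latent %*% loadings + matrix(rnorm(n * 20, sd = 0.5), n, 20)

pca <- prcomp(traits, scale. = TRUE)
varexp <- cumsum(pca$sdev^2) / sum(pca$sdev^2)
which(varexp >= 0.9)[1]  # components for 90% of variance: close to 4
```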
For Laughlin, this result shows that dimensionality is a blessing, not a curse. After all, it should allow ecologists to limit the number of trait measures they need to make, provided they choose those traits wisely: once the number of traits measured exceeds about 8, there appear to be diminishing returns. The caveat is that the traits that are important to measure might differ between ecosystems – what matters in a desert is different from what matters in a rainforest. As always, knowing your system is incredibly important. Regardless, the review ends on a highly optimistic note: the complexity and multi-dimensionality of plant communities might not limit us as much as we fear. And perhaps less work is necessary for your next experiment.

Thursday, November 7, 2013

Managing uncertain restoration outcomes*

Human activity has impacted ecosystems around the globe, and the value of intact, functioning habitats is increasingly appreciated. One of the most important management options to maintain or increase the amount of functioning habitat is to restore destroyed, disturbed and degraded habitats. However, there is much concern about how predictable restoration efforts are, and about which management strategies will maximize success. The reality that systems may reach very different, alternative ecosystem states is a problem for managers who desire well-defined outcomes. Thus the ability to understand and predict how different factors affect restoration outcomes would be an important development.

In the current issue of the Journal of Applied Ecology, Grman and colleagues examine how different factors influence prairie restoration outcomes –specifically the diversity and composition of the restored habitat. They considered several management, historical and environmental factors. For management, they compiled information on the type of planting, the diversity and density of sown seeds and fire manipulation. For local environmental variables, they considered different soil characteristics, shade levels, and site area. The historical influences included land-use history, rainfall during seed sowing and site age. Finally, they also considered the landscape context; specifically what habitats surrounded the restoration site.

Grman and colleagues show that restoration outcomes are most influenced by management decisions and site history. The density, composition and diversity of sown seeds had the greatest impact on restoration outcomes. Species richness was highest in sites sown with high diversity. High sowing density resulted in high beta diversity among sites. Site history had significant effects on non-sown diversity, but did not influence the diversity of sown species. Site characteristics failed to predict local diversity, but they were important for among site beta-diversity.


If success is measured in terms of species diversity, then this work clearly shows that management decisions directly influence success. Surprisingly, site characteristics had a minor influence on success, despite conceptual and theoretical models that predict system sensitivity to abiotic influences. This work reinforces the need to develop the best management options for prairie restoration and that the influences of site history and local conditions can be overcome by sowing decisions and site management.

Grman E., Bassett T. & Brudvig L.A. (2013). Confronting contingency in restoration: management and site history determine outcomes of assembling prairies, but site characteristics and landscape context have little effect. Journal of Applied Ecology, 50, 1234-1243.

Wednesday, November 6, 2013

Community structure - what are we missing?

Some of the most frequently used ecological concepts can be difficult to define, and sometimes this lack of clarity leads to poor understanding and a weak base for further research. A great example is “community structure”, a concept frequently mentioned, rarely defined, and probably meaning something different from use to use. The phrase “we’re interested in how communities are structured” is tossed around a lot, and I suppose an understood definition is that community structure encompasses the species that are present in a community and their abundances. Community structure may refer to both a very simple concept (the abundances of species present in a community) and a very complicated one, connecting as it does mechanisms and models, observational data, and statistical measures. As a result, the precise way that ecologists delineate community structure and quantify it is both varied and vague.

The connection between models, community structure and metrics.
In the literature, it seems that there are two ways of approaching “community structure”: bottom-up, in which community structure is a predicted outcome of theoretical models of different mechanisms, and top-down, in which community structure is measured in a relatively statistical or descriptive fashion. Both are valuable approaches: while statistical metrics often are interpreted as providing evidence for particular models or mechanisms, the reverse logic – that a model predicts particular results for a given metric – is rarely explicitly considered. Making connections between the model results and the descriptive metrics might actually be fairly difficult. Model predictions are often complex and multidimensional, predicting changes through time, growth rates, the combinations of species that can or cannot coexist (but only if assumptions hold), or particular relationships between measures like diversity, abundances, and range sizes. Metrics are necessarily confined to a few dimensions (or perhaps are ordination approaches), focus on straightforward observational measures like abundance and presence, and further include observational error (sampling, etc). Because community structure means something different to these two approaches, the connections between metrics and models are poorly explored. A theoretician might find it difficult to relate ordinations of communities with the typical predictions from a mathematical model (which might be something like growth rates in relation to changes in abundance), while someone collecting field data might feel that the data they can collect is difficult to relate to the predictions of models.

Part of the problem is that for a long time, the default focus was on what types of interactions structured communities (environment, competition, predation, mutualisms), and niches were assumed to be necessarily driving community structure. The measurements and metrics used reflected this search for niches (e.g. comparing environmental gradients with community structure). Many quantitative metrics can tell you something about how community structure relates to different variables (spatial, environmental, biotic) and how much variation is still unexplained. The consideration that niches might not always be important eventually led ecologists to compare patterns in community structure to random, null, or neutral expectations. As a result, in the simplest cases the answers to questions about community structure and niches are binary: different from random (niches matter), or not. Looking for the more complex patterns predicted by models (for example, the relative contributions of niche-based and neutral processes to community structure) is difficult using common metrics of community structure (although there are some papers that do a good job of this).

It is interesting that this problem of disconnection between theoretical models of community structure and community structure metrics received the most attention through criticisms of phylogenetic metrics of diversity. There, patterns of over- and under-dispersion were criticized for not being the necessary outcome of models of competition or environmental filtering (e.g. Mayfield and Levine 2010). While those criticisms were mostly fair, they are equally deserved in most studies of species diversity, where patterns in ordinations or beta-diversity are frequently used to infer mechanisms. In contrast, one of the best approaches thus far to integrating model predictions with metrics of community structure is the null model. Though null models differ greatly in ecological realism and complexity, they suggest the expected community structure or metric values if none of the hypothesized processes are structuring a community.

One of the greatest failings of the top-down approach is that recognizing patterns outside of the expected, such as those that include stochasticity or a combination of different processes, or the effects of history, is nearly impossible. Models that can incorporate these complexities provide little suggestion of how the patterns we can easily record in communities might reflect complex structuring processes. Ecological research is limited by the poor connection between both top-down and bottom-up approaches and its vague definition of community structure. Patterns more complicated than those that the top-down approach searches for are likely being missed, while relations between models and metrics (or development of new metrics) aren’t considered often enough. One solution might be to more meaningfully define community structure, perhaps as the association (or lack thereof) between the combination of species present in a community and the combination of abiotic and/or biotic processes present. This association is generally compared to an association between species and processes that might arise from random effects alone. The difference is that structure shouldn’t be considered separately from the processes that produce it, and the connections should be explicitly rather than implicitly made.

Monday, October 28, 2013

Waste not, want not? How human food waste changes ecosystems

Daniel Oro, Meritxell Genovart, Giacomo Tavecchia, Mike S. Fowler and Alejandro Martínez-Abraín. 2013. Ecological and evolutionary implications of food subsidies from humans. Ecology Letters. Volume 16, Issue 12, pp 1501–1514. DOI: 10.1111/ele.12187

Humans have always been connected to their environment, directly and indirectly. Ecologists in particular, and people in general, have been thinking about the causes and consequences of these connections for hundreds of years. One form of interaction results when human food resources become available to other animals – for example, through waste dumps, crop waste, fishery by-catches, bird feeders, or road kill. Starting with middens and waste piles in early human settlements, our food waste has always passed to other species. And while rarely considered compared to human impacts like habitat destruction and climate change, a new review by Daniel Oro and colleagues argues that these subsidies have shaped ecosystems around the globe.
Garbage dump in India.

Human food waste (aka subsidies) may come from a variety of human activities, the three most prominent being crop residuals (remnants of harvest remaining on fields), waste dumps, and fishery discards (by-catch thrown overboard). Each of these forms of subsidy occurs globally, and large numbers of species rely partially or completely on them for food. For example, dumps are global in distribution and contain enough edible waste to attract 20–30% of all mammal and bird species in a region (particularly omnivorous and carnivorous species). Crop residues usually attract herbivorous or granivorous species (particularly birds), while fishery discards alter marine ecosystems. Eight percent of all catch (~7 million tonnes!) is simply released back into the ocean, and this supplements species across the food web, including half of all seabirds.

Food subsidies from human activities may not seem so terrible – after all, they are predictably available, easy to access, and fast to forage, and they can improve the condition and fertility of species that take advantage of them. For example, seabirds foraging among fishing boats for by-catch exploit the predictability of boat (and food) appearances, and as a result have reduced foraging times and areas, higher individual fitness and reproductive success, and ultimately increased population growth. But the authors argue that these benefits must be considered within a more complicated web of interactions. After all, human food subsidies tend to be much more predictable than natural sources of food, and quickly have large effects spanning from individuals to communities, ecosystems, and evolutionary pressures.

Individuals often, though not always, experience positive effects from subsidies – increased biomass, fertility, and survival, accompanied by changes in dispersal and ranges. But if food waste draws in high densities of individuals, it may be associated with greater disease occurrence, or draw in predators attracted to easy pickings. Populations also often respond positively to food subsidies, becoming larger and more stable as food waste availability increases. But this boon for one species can cascade through the food web and have large negative effects on communities and ecosystems. For example, yellow-legged gulls congregate around dumps and fishing trawlers, taking advantage of the quantities of food available there, and as a result their populations have grown greatly. The downside is that these larger populations in turn increase predation pressure on other, more vulnerable seabird species. Seabirds in particular can create complicated interactions between human food waste and far-flung ecosystems, connecting as they do terrestrial and marine systems, moving nutrients, pollution, and calories between systems and through trophic levels.

Snow goose exclosure in northern Canada.
Only the small green rectangle has avoided goose grazing.
A famous example of the unexpected consequences of waste subsidies is the snow goose (Chen caerulescens). Snow geese have moved from feeding predominantly on marsh plants to landing en masse in farmers’ fields to feed on grain residues. This new and widespread source of food led to a population boom, and the high numbers of geese stripped away the vegetation in the arctic habitats where they summer and breed. Agricultural food subsidies in southern habitats were felt far away in the arctic, as migratory snow geese tied these systems and food webs together. Though snow geese are unlikely to lose their new source of food, other animals may face plummeting populations or extinctions if food sources disappear. In Yellowstone, for example, grizzly bears fed nearly exclusively at a local dump until it closed in the 1970s: the result was both increased mortality and rapid changes in foraging distances and behaviours.

Finally, and of most concern, food waste subsidies can alter the selective pressures a population faces. Species that become reliant on dumps or fields for food may experience selection for the traits necessary to exploit these subsidies, and a loss of genotypic/phenotypic variation from the population. How selective pressures change depends on the situation, of course. In the case of Yellowstone, the dump closure and loss of a food source seemed to have large effects on traits important for sexual attractiveness in males, suggesting potential effects on reproductive success. In the best known (and my favourite) example of the selective effect of human food waste, dogs were eventually domesticated from their wolf ancestors. Of course, dumps can also relax selective pressures, if they allow individuals in poor condition (juveniles, the elderly) to successfully feed and reproduce.

Though food waste subsidies are clearly important and can have wide-ranging effects, it is worth noting that their effects and importance are context-dependent. Studies seem to indicate that effects are greatest when natural food availability is low or habitat quality is poor; in high-quality systems, food waste may only be used by juveniles or individuals in poor condition. Unfortunately, as humans degrade natural habitats, subsidies are only likely to increase in importance as a food source for species. The extent and effects of human food waste are yet another legacy of the global alterations the human species has made. Like so many of the changes we have made, the issues are complex and transcend political and regional boundaries. Practices in one system or nation are tied to effects in another, and this complexity can make it difficult to monitor and measure the effects of subsidies as thoroughly as is necessary. This review from Oro et al. certainly makes a case for why our garbage needs to receive more attention.
One example of the ecosystem-wide effects of subsidies: here, fisheries inputs.


Wednesday, August 28, 2013

The species we’ve neglected

Species in last 3 months' papers in Ecology Letters.
"Multiple species" tended to be meta-analyses.
Browse the abstracts of a high-profile ecological journal (for example, Ecology Letters, right) and one pattern you’ll notice is that high-impact, hypothesis-driven ecology usually involves a small pool of focal species. Plants, for example, dominate any discussion of community ecology and have since Clements’ and Gleason’s arguments. It is not hard to see why – plants don’t move, for one; they live in speciose groups, and often complete a full lifecycle in a matter of months. They also occupy the lowest trophic level, so pesky multi-trophic interactions can be omitted.

Other groups of species also show up frequently. Insects are popular for some studies of herbivory (again, it is easy to estimate damage to species that can’t move), mutualisms, and predation. Butterflies and birds, being pretty and easy to count, make nice models for studies of species populations and climate change. And while it is easy to sound critical of this kind of system-based myopia, it exists for perfectly good reasons. Immobile plants, after all, are a major source of the experimental knowledge upon which much of modern ecology relies. They are easy to work with and manipulate, and their responses are relatively easy to measure (phenology, fitness, biomass, herbivory). Further, once an experimental system is established, using that system becomes increasingly attractive. You have a growing literature to base decisions on, to put your results into context, and against which to prove the novelty or importance of your work. In contrast, if you do your work on the rare bunny-toed sloth-monkey, the novelty of the system may overwhelm the generality of the work. And so the short-term trade-off is that established systems allow immediate in-depth studies, while novel systems, though necessary to broaden ecological knowledge, may (initially) be relatively shallow in their returns.

Establishing a new system can be a time-consuming activity with a real possibility of failure. But these under-utilized species have something new to tell ecology. This is not to say that the popular systems have nothing left to tell us – not at all, given all the complexities of ecological dynamics – but they bias the story. The ecological processes at play are likely not much different between novel systems and traditional ones. But the same processes interact in different ways and differ in importance across systems, and so we may have unrealistic expectations about the importance of, say, competition, if we focus on only one or two systems. To follow Vellend’s (2011) framework, the processes of selection, drift, speciation, and dispersal are part of any ecological system. What differs is their importance, and their importance differs for reasons related to the ecological context and evolutionary history a species experiences. This is why comparing Mark McPeek’s work on neutrality in damselflies with Jonathan Losos’ findings about adaptive radiation in anoles is so interesting. No one questions that adaptive radiation may drive one set of species and neutrality another; the real question is what about their contexts produces this result. Unfortunately, if our current set of focal species is small, we are limited in our ability to make such informative comparisons.

Many of the limitations on species choice have been methodological: popular systems tend to involve amenable species. Other species may be very small, very mobile, very difficult to identify, or highly specialized in their habitats. This creates difficulties. But when we overcome them, the results are often revolutionary. Consider, for example, the current burst of interest in belowground interactions, once their incredible importance to plant community interactions became clear (e.g. Klironomos 2002, Nature). Further, techniques are continually improving in ways that make new systems tractable.

So we should continue to focus on a few well-understood systems, attempting to perfect our understanding and predictive abilities; there is much value in understanding a system as completely as possible. But on the other hand, we limit ourselves by focusing too narrowly. One of the big areas for growth in modern ecology may simply be to expand into novel ecological systems.

(**It's probably too general and a bit unfair to refer to all plants and all insects as though they are monolithic groups, since they are each large and varied (which is part of the reason they've been useful thus far). And some of their great representation may in fact relate to the number of species available to study. But I do think the general point about the problem of focusing too much holds.**)

Monday, June 17, 2013

Another round in Diamond vs. Simberloff: revisiting the checkerboard pattern debate

Edward F. Connor, Michael D. Collins, and Daniel Simberloff. 2013. "The Chequered History of Checkerboard Distributions." Ecology. http://dx.doi.org/10.1890/12-1471.1.

One of the longest-running and most vociferous debates in community ecology started in the 1970s between Jared Diamond and Dan Simberloff (and colleagues), over whether 'checkerboard patterns' of bird distributions provided evidence for interspecific competition. This was an early and particularly heated example of the pattern-versus-process debate that continues in various forms today. Diamond (1975) proposed that the distribution of birds in the Bismarck Archipelago, and particularly the fact that some pairs of bird species did not co-occur on the same islands (producing a checkerboard pattern), was evidence that competition between species limited their distributions. The issue with using this checkerboard pattern as evidence of competition, as Connor and Simberloff (1979) subsequently pointed out, was that a null model was necessary to determine whether the pattern actually differed from random patterns of apparent non-independence between species pairs. Further, other mechanisms (different habitat requirements, speciation, dispersal limitation) could also produce non-independence between species pairs. The original debate may have died down, but the methodology for null models of communities suggested by Connor and Simberloff has greatly influenced modern ecological methods, and continues to be debated and modified to this day.

The original null model of bird distributions in the Bismarck Archipelago involved a binary community matrix (rows represent islands, columns represent species), with 0s and 1s representing species absences and presences. Hence, the 1s in a row represent the species present on that island. The original null model approach involved randomly shuffling the 0s and 1s while maintaining island richness (row sums) and species range sizes (column sums). The authors of a new paper in Ecology admit that the original null models didn’t accurately capture what Diamond meant by a "checkerboard pattern". This is interesting in part because two of the authors (E.F. Connor and Dan Simberloff) led the debate against Diamond and introduced the binary matrix approach for generating null expectations. So there is a little bit of a ‘mea culpa’ here. The authors note that the earlier null models captured patterns of non-overlap between species' distributions, but didn’t differentiate non-overlap between species with overlapping ranges from non-overlap between species that simply occurred on geographically distant sets of islands (referred to here as 'regional allopatry'). The original binary matrix approach didn’t consider the spatial proximity of species' ranges.
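To make the mechanics concrete, here is a minimal sketch of that randomization in Python with NumPy – the classic 'swap' approach, which flips 2x2 checkerboard submatrices so that every null matrix keeps the observed island richnesses and species range sizes. The function names, the exclusive-pair counter, and the number of swap attempts are our illustrative choices, not code from the original papers:

```python
import numpy as np

def swap_null(matrix, n_attempts=20000, seed=0):
    """Randomize a binary island-by-species matrix while preserving
    row sums (island richness) and column sums (species range sizes)
    by flipping 2x2 'checkerboard' submatrices."""
    rng = np.random.default_rng(seed)
    m = matrix.copy()
    n_rows, n_cols = m.shape
    for _ in range(n_attempts):
        r = rng.choice(n_rows, size=2, replace=False)
        c = rng.choice(n_cols, size=2, replace=False)
        sub = m[np.ix_(r, c)]
        # Only [[1,0],[0,1]] or [[0,1],[1,0]] submatrices can be flipped
        # without changing any row or column sum.
        if sub[0, 0] == sub[1, 1] and sub[0, 1] == sub[1, 0] and sub[0, 0] != sub[0, 1]:
            m[np.ix_(r, c)] = 1 - sub
    return m

def n_exclusive_pairs(m):
    """Count species pairs (columns) that never co-occur on any island."""
    co = m.T @ m  # species-by-species co-occurrence counts
    iu = np.triu_indices(co.shape[0], k=1)
    return int((co[iu] == 0).sum())

# The observed count of mutually exclusive pairs can then be compared
# to its distribution across many null matrices, e.g.:
# null = [n_exclusive_pairs(swap_null(obs, seed=i)) for i in range(999)]
```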

With this fact in mind, the authors re-analyzed checkerboard patterns in the Bismarck Archipelago, but in a way that controls for regional allopatry. True checkerboarding was defined as: “a congeneric or within-guild pair with exclusive distribution, co-occurrence in at least one island group, and geographic ranges that overlap more or significantly more than expected under an hypothesis of pairwise independence”. This definition appears closer to Jared Diamond's original meaning, and so a null model that captures it is probably a better test of the original hypothesis. The authors looked at the overlap of the convex hulls defining species’ ranges and, when randomizing the binary matrix, added the further restriction that species could occur only within the island groups where they were actually found (instead of being shuffled randomly across all islands, as before).
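One simple way to sketch that extra restriction – reusing swap_null from above – is to allow swaps only between islands in the same island group, so a species' presences can never leave the groups where it was actually recorded. This is our conservative, illustrative reading of the constraint; the authors' actual procedure also incorporates the convex-hull range overlaps described above:

```python
def restricted_swap_null(matrix, island_group, n_attempts=20000, seed=0):
    """Like swap_null, but 2x2 swaps are attempted only between pairs of
    islands (rows) within the same island group, so species presences
    stay inside the island groups where they actually occur.
    `island_group[i]` gives the group label of island i."""
    rng = np.random.default_rng(seed)
    m = matrix.copy()
    island_group = np.asarray(island_group)
    n_cols = m.shape[1]
    for _ in range(n_attempts):
        g = rng.choice(np.unique(island_group))   # pick a group...
        rows = np.flatnonzero(island_group == g)  # ...and two islands in it
        if len(rows) < 2:
            continue
        r = rng.choice(rows, size=2, replace=False)
        c = rng.choice(n_cols, size=2, replace=False)
        sub = m[np.ix_(r, c)]
        if sub[0, 0] == sub[1, 1] and sub[0, 1] == sub[1, 0] and sub[0, 0] != sub[0, 1]:
            m[np.ix_(r, c)] = 1 - sub
    return m
```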

Even with these clarified and more precise null models, the results remain consistent: true checkerboarding appears to occur rarely compared to chance expectations. Of course, this doesn't mean that competition is unimportant, but “Rather, in echoing what we said many years ago, one can only conclude that, if they do compete, competition does not strongly affect their patterns of distribution among islands.” More generally, the endurance of this particular debate says a lot about the longstanding tension in ecology over the value and wealth of information captured by ecological patterns, and the limitations and caveats that come with such data. There is also a subtle message about the limitations of null models: they are often treated as a magic wand for dealing with observed patterns, but null models are limited by our own understanding (or ignorance) of the processes at play and by our interpretation of their meaning.

Thursday, June 6, 2013

Speaking the language: is jargon always bad?

You hear mostly about the evils of jargon in science. Undeniably, jargon is a huge barrier between scientific ideas and discoveries and the non-scientists who might use them. Translating a complex, nuanced result into a sound bite or recommendation suitable for consumption by policymakers or the public can be the most difficult aspect of a project (something Alan Alda, through his Center for Communicating Science, is attempting to help scientists with). But the implication often seems to be that scientific jargon is always undesirable. Is jargon really always a bad thing?

Even between scientists, you hear criticism about the amount of jargon in talks and papers. I have heard several times that community ecology is a frequent offender when it comes to over-reliance on jargon (defn: “words or expressions that are used by a particular profession or group and are difficult for others to understand”). It is fun to come up with a list of jargon frequently seen in community ecology, because the examples are endless: microcosm, mesocosm, niche, extinction debt, stochastic, trophic cascades, paradigm shift, priority effects, alternate stable states, or any phrase ending in ‘dynamics’ (e.g. eco-evolutionary, neutral, deterministic). Special annoyance from me at the use of multidisciplinary, trans-disciplinary, and inter-disciplinary to express the exact same thing. Despite this list, I don’t think that jargon is necessarily problematic.

If the meaning implied by a word or phrase is more than the sum of its parts, it is probably jargon. Ideally, jargon is a shared, accurate shorthand for communicating with colleagues. A paper published without any jargon at all would be much longer and not necessarily clearer. Instead of saying, “we used protist microcosms”, it would have to say, “we used a community of protist species meant to encapsulate in miniature the characteristic features of a larger community”. (And arguably ecology is still relatively understandable for a newcomer, compared to disciplines like cell and systems biology, where an abstract might seem impenetrable: “Here, we report that, during mouse somatic cell reprogramming, pluripotency can be induced with lineage specifiers that are pluripotency rivals to suppress ESC identity, most of which are not enriched in ESCs.”)

Jargon is useful as a unifying tool: if everyone is using the same nicely defined label for a phenomenon, it is easier to generalize, contrast and compare across research. Jargon is many pieces of information captured in a single phrase: for example, using the term 'ecophylogenetics' may imply not only the application of phylogenetic methods and evolutionary biology to community ecology, but also the accompanying subtext about methodology, criticism, and research history. At its best, jargon can actually stimulate and unify research activities – you could argue that introducing a new term (‘neutral dynamics’) for an old idea stimulated research into the effects of stochasticity and dispersal limitation on community structure.

That’s the best-case scenario for jargon. There are also consequences to developing a meaning-laden dialect unique to a subdiscipline. It is very difficult to enter a subdiscipline, or move between subdisciplines, if you don’t speak the language. New students often find papers difficult to penetrate because of the heavy reliance on jargon-y descriptions: obtaining new knowledge requires that you already have a foundation of knowledge. Moving between subdisciplines is hard too – a word in one area may have a completely different meaning in another. In a paper on conservation and reserve selection, complementarity might refer to the selection of regions with dissimilar species or habitats. In a biodiversity and ecosystem functioning paper – a not-very-distant discipline – complementarity might refer to functional or niche differences among co-occurring species. Giving a talk to anyone but the most specialist audience is hampered by concerns about how much jargon is acceptable or understandable.

Jargon also leads to confusion. When using jargon, you can rely on its understood meaning to delimit the boundaries of what you mean, without ever specifying anything within those boundaries. Everyone has heard a 30-second spiel so entirely made of jargon that you never develop a clear idea of what the person does. The other issue is that jargon can quickly become inaccurate, so laden with various meanings as to be no longer useful. The phrase ‘priority effect’, for example, has had so many particular mechanisms associated with it that it can be uninformative on its own. I think most ecologists are well aware that jargon can be inaccurate, but it’s a difficult trap to get out of. The word “community”, essential to studying community ecology, is so broadly and inconsistently defined as to be nearly meaningless. Multiple people have pointed this out (1, 2, 3) and even suggested solutions or precise definitions, but without lasting impact. One of the questions in my PhD defense was “how did I define an ecological community and why?”, because there is still no universal answer. How do we rescue words from becoming meaningless?

Something interesting that you rarely see expressed about jargon is that linguists tell us language is knowledge: how we understand something is not independent of the language we use to describe it. The particular language we think in shapes and limits what we think about: perhaps if you have many ways of finely delineating a concept, you will think about it as a complex and subtle idea (the 100-words-for-snow idea). On the other hand, what if you have to rely on vague catch-alls to describe an idea? For example, a phrase like ‘temporal heterogeneity’ incorporates many types of differences that occur through time: is that why most researchers continue to think about differences through time in a vague, imprecise manner? It is hard to say. It is hard to imagine where community ecology would be without jargon, and even harder to figure out how to fix all the issues jargon creates.