The EEB & Flow: publishing

Showing posts with label publishing. Show all posts

Thursday, May 28, 2015

Are scientists boring writers?

I was talking with an undergrad who is doing her honours project with me about the papers she’s reading, and she mentioned how difficult (or at least slow going) she’s found some of them. The papers are mostly reviews or straightforward experimental studies, but I remember feeling the same way as an undergraduate. Academic science writing uses its own language, and until you are familiar with the terms and phrases and article structure, it can be hard going. Some areas, for example theoretical papers, even have their own particular dialects (you don’t see the phrase “mean-field approximation” in widespread usage, for example). Grad school has the advantage of providing total immersion into the language, but for many students, lots of time/guidance and patience is necessary to understand the primary literature. But is science necessarily a boring language?

A recent blog piece argues that academic science writing needs to fundamentally change because it is boring, repetitive, and uninspired. And as a result, the scientific paper needs to evolve. The post quotes a biologist at University of Amsterdam, Filipe Branco dos Santos: he feels that the problem is rooted in the conservative nature of scientists, leading them to replicate the same article structure over and over again. Journals act as the gatekeeper for article style too – submission requirements enforce the inclusion of particular sections (Introduction, Methods, Results, Discussion, etc), and determine every thing from word counts, figure number, text size, and even title structure and length. Reviewers and editors are within their rights to require stylistic changes. The piece includes a few tips for better article writing: choose interesting titles, write in the active tense, use short sentences, avoid jargon, include a lay summary. It’s difficult to disagree with those points, but unfortunately the article makes no attempt to suggest what, precisely, we should be doing differently. Still, it suggests that consideration of the past, present and future of scientific writing is necessary.

One glaring issue with the post is that the argument that scientists are stuck in a pattern established hundreds of years ago ignores just how much science papers have changed, stylistically. Scientific papers are a very old phenomenon – the oldest, Philosophical Transactions of the Royal Society, was first published in 1665. The early papers were not formatted in the introduction / methods / results / discussion style of today, and were often excerpts from letters or reports.

From the first issue, “Of the New American Whale-fishing about the Bermudas” begins:

“Here follows a relation, somewhat more divertising, than the precedent Accounts; which is about the new Whale-fishing in the West Indies about the Bermudas, as it was delivered by an understanding and hardy Sea-man, who affirmed he had been at the killing work himself.”

Ecological papers written in the early 1900s are also strikingly different in style than those today. Sentences are long and complex, words like “heretofore”, “therefore”, and “thus” find frequent usage, and the language is rather flowery and descriptive.

From a paper in the Botanical Gazette in 1913, the first sentence:

“Plant geographers and climatologists have long been convinced that temperature is one of the most important conditions governing the distribution of plants and animals, but very little has as yet been accomplished toward finding out what sort of quantitative relationships may exist between the nature of floral and faunal associations and the temperature conditions that are geographically concomitant therewith.”

While this opening makes perfect sense and establishes the question to be dealt with in the paper, it probably wouldn’t make it past review without comment.

Some of my favourite examples that highlight how much ecological papers have changed come from R.H. Whittaker’s papers. He is clearly an avid (and verbose) naturalist and his papers are peppered with evocative phrases. For example, “If, for example, one stands on a viewpoint in the Southern Appalachian Mountains in the autumn, one sees a complex varicoloured mantle of vegetation covering the mountain topography” and “The student of vegetation seeks to construct systems of abstraction by which relationships in this mantle of vegetation may be comprehended.” Indeed!

Today, in contrast, academic science writing is minimalist – it is direct, focused, and clarity is prized. Sentences are typically shorter, with a single focal thought, and the aim is for a clear narrative without the peripatetic asides common in older work. These shifts in style reflect the prevailing thoughts about how to balance the role of scientific papers as a communication device versus as a contribution to the scientific record. It seems that science papers may be boring now because authors and editors would rather a paper be a little dry rather than be unclear or difficult to replicate. (Of course, some papers manage to be both boring and confusing, so this is not always successful….) Modern papers have a lot of modern bells and whistles too. The move away from physical copies of papers to pdfs and online only colour versions and supplementary information has made sharing results easier and more comprehensive than ever.

If there is going to be a revolution in academic science writing, it will probably be tied to the ongoing technological changes in science and publishing. The technology is certainly already present to make science more interactive to the reader, which might make it less boring? It is already possible to include videos or gifs in online supplements (a great example being this puppet show explaining Diversitree). More seriously, supplements can include data, computer code used for analyses or simulations, additional results. It’s possible to integrate GitHub repositories with articles tied to a paper’s analyses, or link markdown scripts for producing manuscripts. The one limitation is that these approaches is that they aren’t included in the main text and so most people never see them. It’s only a matter of time before we move towards a paper format that includes embedded elements (extending on current online versions that include links to reference papers). One could imagine plots that could be manipulated, or interactive maps, allowing you to explore the study site through satellite images of the vegetation and terrain.

Increasingly interactive papers might make it more fun to work through a paper, but a paper must stand alone without them. For me, the key to a well-written paper is that there is always a narrative or purpose to the writing. Papers should establish a focus and ensure connections between thoughts and paragraphs are always obvious to the reader. The goal is to never lose the reader in the details, because the bigger picture narrative can be read between the lines. That said, I rarely remember if a paper is boringly written: I remember the quality of the ideas and the science. I would always take a paper with interesting ideas and average writing over a stylish paper with no substance. So perhaps academic science writing is an acquired (or learned) taste, and certainly that taste could be improved, but it's clear that science writing is constantly evolving and will continue to do so.

Wednesday, May 20, 2015

I'll take 'things that have nothing to do with my research' for $400

I guess I do have a couple papers with the word fire in their titles?
And to Burns and Trauma's credit, this is a nicely formatted email and the reasons to publish with them are pretty convincing :-)

Monday, February 9, 2015

Can an algorithm tell you where to submit your next paper?

Optimizing the Submission Decision Process. Santiago Salinas, Stephan B. Munch. 2015. Where Should I Send It? PlosONE. DOI: 10.1371/journal.pone.0115451

Choosing where to submit a manuscript is a difficult proposition. For a strong manuscript, you might hope for a high profile, high impact journal, but also to be considered is the hope (or need) for rapid publication, preferably without too many cycles of rejection, revision, and resubmission.

Maybe you saw this around twitter, but a little over a week ago, one paper offered a solution to the “where to publish?” puzzle. Published in PLoSONE (the journal chosen using their algorithm), authors Salinas and Munch provided one answer to "Optimizing the Submission Decision Process".

Surveys usually show that journal impact factor is the highest priority for authors, and a typical measure of paper success. Recognizing the importance of citations--and journal impact factors as an indirect predictor of them--the authors use Markov decision processes to determine the optimal submission process to maximize citation total. This model is a race against time, where delays reduce the total citations and worse, increase the probability that a paper will be scooped (and therefore have minimal citation value). They also considered a more complex model which maximizes citations, while minimizing delays due to rejection, resubmission and revisions.

The top choice of journals for the first model (maximize citations), were Ecology Letters, Science, and Molecular Ecology Resources. For the second model, the top journals to balance citations and time loss, were Molecular Ecology Resources, PLoSONE, and Ecology Letters.

Finally, if you know your willingness to tradeoff the number of times you submit your paper until acceptance and the number of citations it will receive, you can choose between several strategies. One option is a path that involves submission to a high impact journal (Ecology Letters in particular, possibly Ecological Monographs), accepting that you may actually need to resubmit your paper several times but will gain high citations. Alternately, you could choose a journal such as PLoSONE where resubmission is low and citation rate is moderate. Finally, many specialized journals may be faster, but provide relatively low citations. (Fig 3 below).

From Salinas and Munch 2015.

So what the authors get right is that choosing where to submit is a difficult task. Choosing journals is a skill that a scientist hones over a career. Graduate students have the hardest time, I think, not having experience with the underlying complexities (e.g. this journal is slow, this journal prefers experimental work rather than simulation approaches, Science will probably reject you, but at least it will be very fast...). Students usually have to rely on supervisors and more experienced collaborators precisely because they lack informed priors. That being said, the approach from this paper strikes me as a silly (and just bad) way of choosing journals.

The biggest reason is that even though everyone chooses "impact factor" as their primary criteria for choosing a journal in a survey, in practice impact factor is innately balanced against manuscript quality. Sure, there's the odd soul who always starts at Science and works their way down, but most researchers have a reasonably unbiased view of their manuscript's quality, and journal choice is conditioned by that estimate of manuscript quality. (More commentary on this from Marcel Holyoak and others, here). So it's really about maximizing citations, given the quality of a particular work. This implies authors must have knowledge of the journals in their field, not a simplistic algorithm.

Scooping doesn't strike me as the biggest concern for most ecologists either. There is a cost of declining novelty, perhaps, but it would be a rare ecological paper that lost all citation value because something similar had been published slightly earlier. (Or so I think. Is scooping a big issue in ecology?)

Additionally, citations simply aren't the only thing concern for researchers, especially early-career people. The quality of journals that you publish in has important implications. Sending all of your papers to PLoSONE to reduce the time to publication while maximizing citations, while apparently a viable strategy, won't do a lot for a career application (not to pick on PLoSONE, which I think has an important role, but isn't usually the first choice journal for ecological research). Publishing in prestigious journals is usually considered an indicator of research quality.

Journal choice will probably always be a subjective, imperfect behaviour. Even if a more complicated algorithm could be constructed, there are too much subjective inputs--paper quality, subject importance and novelty, journal quality--for the choice to be so simplistic.

Tuesday, February 3, 2015

Predatory open access journals: still keep'n it classy

As most academics are aware, there are hundreds of predatory open access journals that try to trick authors into submitting to their journals, charge exorbitant fees, and do not ensure that articles are peer reviewed or live up to basic scientific standards. The most celebrated cases are journals that embarrassingly publish non-sensical fake papers. I don't know why, but I sometimes go to the journal websites to see what they publish or who is on their editorial boards. I received such an e-mail this morning from SOJ Genetic Science published by Symbiosis, a recognized predatory publisher. This journal, unlike others, actually has a single published issue with an editorial! I thought: "wow, are they trying to be legitimate?"; then I read the editorial. The editorial is probably best described as a nonsensical diatribe about genetics, which lacks any real connection to modern genetic theory. Here is my favourite paragraph:

Predatory open access journals: still keep'n it classy.

Tuesday, June 3, 2014

Guide for writing the ultimate ecology introduction**

**Or find better advice here, here, here.

If only it were so easy...

Friday, May 9, 2014

Scaling the publication obstacle: the graduate student’s Achilles’ heel

There is no doubt that graduate school can be extremely stressful and overwhelming. Increasingly, evidence points to these grad school stressors contributing to mental health problems (articles here and here). Many aspects of grad school contribute to self-doubt and unrelenting stress: is there a job for me after? am I as smart as everyone else? is what I’m doing even interesting?

But what seems to really exacerbate grad school stress is the prospect of trying to publish*. The importance of publishing can’t be dismissed. To be a scientist, you need to publish. There are differing opinions about what makes a scientist (e.g., is it knowledge, job title, etc.), but it is clear that if you are not publishing, then you are not contributing to science. This is what grad students hear, and it is easy to see how statements like this do not help with the pressure of grad school.

There are other aspects of the grad school experience that are important, like teaching, taking courses, outreach activities, and serving on University committees or in leadership positions. These other aspects can be rewarding because they expand the grad school experience. There is also the sense that they are under your control and the rewards are more directly influenced by your efforts. Here then, publishing is different. The publication process does not feel like it is under your control and that the rewards are not necessarily commensurate with your efforts.

Cartoon by Nick Kim, Massey University, Wellington, accessed here

Given the publishing necessity, how then can grad students approach it with as little trauma as possible? The publication process will be experienced differently by different people, some seem like they can shrug off negative experiences while others internalize them, with negative experiences gnawing away at their confidence. There is no magic solution to making the publishing experience better, but here are some suggestions and reassurances.

1) It will never be perfect! I find myself often telling students to just submit already. There is a tendency to hold on to a manuscript and read and re-read it. Part of this is the anxiety of actually submitting it, and procrastination is a result of anxiety. But often students say that it doesn’t feel ready, or that they are unhappy with part of the discussion, or that it is not yet perfect. Don’t ever convince yourself that you will make it perfect –you are setting yourself up for a major disappointment. Referees ALWAYS criticize, even when they say a paper is good. There is always room for improvement and you should view the review process as part of the process that improves papers. If you think of it this way, then criticisms are less personal (i.e., why didn’t they think it was perfect too?) and feel more constructive, and you are at peace with submitting something that is less than perfect.

2) Let's dwell on part of the first point: reviewers ALWAYS criticize. It is part of their job. It is not personal. Remember, the reviewers are putting time and effort into your paper, and their comments should be used to make the product better. Reviewers are very honest and will tell you exactly what could be done to improve a manuscript. They are not attacking you personally, but rather assessing the manuscript.

3) Building on point 2, the reviewers may not always be correct or provide the best advice. It is OK to state why you disagree with them. You should always appreciate their efforts (unless they are unprofessional), but you don’t have to always agree with them.

4) Not every paper is a literature masterpiece. Effective scientific communication is sometimes best served by very concise and precise papers. If you have an uncomplicated, relatively simple experiment, don’t make more complex by writing 20 pages. Notes, Brevia, Forum papers are all legitimate contributions.

5) Not every paper should be a Science or Nature paper (or whatever the top journals are in a given subdiscipline). Confirmatory or localized studies are helpful and necessary. Large meta-analyses and reviews are not possible without published evidence. Students should try to think how their work is novel or broadly general (this is important for selling yourself later on), but it is ok to acknowledge that your paper is limited in scope or context, and to just send it to the appropriate journal. It takes practice to fit papers to the best journals, so ask colleagues where they would send it. This journal matching can save time and trauma.

6) And here is the important one: rejection is ok, natural, and normal. We all get rejections. What I mean by this is that we all get rejections. Your rejection is not abnormal, you don’t suck more than others, and your experience has been experienced by all the best scientists. When your paper is reviewed, and then rejected, there is usually helpful information that should be useful in revising your work to submit elsewhere. Many journals are inundated with papers and are looking for reasons to reject. In the journal I edit, we accept only about 18% of submissions, and so it doesn’t take much to reject a paper. This is unfortunate, but currently unavoidable (though with the changing publishing landscape, this norm may change). Rejection is hard, but don’t take it personally, and feel free to express your rage to your friends.

Publishing is a tricky, but necessary, business for scientists. When you are having problems with publishing, don’t internalize it. Instead complain about it to your friends and colleagues. They will undoubtedly have very similar experiences. Students can be hesitant to share rejections with other students because they feel inferior, but sharing can be therapeutic. When I was a postdoc at NCEAS, the postdocs would share quotes from their worst rejection letters. What would have normally been a difficult, confidence-bashing experience, became a supportive, reassuring experience.

Publishing is necessary, but also very stressful and potentially adding to low-confidence and a feeling that grad school is overwhelming. I hope that the pointers above can help make the experience less onerous. But when you do get that acceptance letter telling you that your paper will be published, hang on to that. Celebrate and know that you have been rewarded for your hard work, but move on from the rejections.

*I should state that my perspective is from science, and my views on publishing are very much informed by the publishing culture in science. I have no way of knowing if the pressures in the humanities or economics are the same for science students.

Tuesday, February 18, 2014

P-values, the statistic that we love to hate

P-values are an integral part of most scientific analyses, papers, and journals, and yet they come with a hefty list of concerns and criticisms from frequentists and Bayesians alike. An editorial in Nature (by Regina Nuzzo) last week provides a good reminder of some of the more concerning issues with the p-value. In particular, she explores how the obsession with "significance" creates issues with reproducibility and significant but biologically meaningless results.

Ronald Fischer, inventor of the p-value, never intended it to be used as a definitive test of “importance” (however you interpret that word). Instead, it was an informal barometer of whether a test hypothesis was worthy of continued interest and testing. Today though, p-values are often used as the final word on whether a relationship is meaningful or important, on whether the the test or experimental hypothesis has any merit, even on whether the data is publishable. For example in ecology, significance values from a regression or species distribution model are often presented as the results.

This small but troubling shift away from the original purpose for p-values is tied to concerns about false alarms and with replicability of results. One recent suggestion for increasing replicability is to make p-values more stringent - to require that they be less that 0.005. But the point the author makes is that although p-values are typically interpreted as “the probability of obtaining a test statistic at least as extreme as the one that was actually observed, assuming that the null hypothesis is true”, this doesn't actually mean that a p-value of 0.01 in one study is exactly consistent with a p-value of 0.01 found in another study. P-values are not consistent or comparable across studies because the likelihood that there was a real (experimental) effect to start with alters the likelihood that a low p-value is just a false alarm (figure). The more unlikely the test hypothesis, the more likely a p-value of 0.05 is a false alarm. Data mining in particular will be (unwittingly) sensitive to this kind of problem. Of course one is unlikely to know what the odds of the test hypothesis are, especially a priori, making it even more difficult to correctly think about and use p-values.

from: http://www.nature.com/news/scientific-method-statistical-errors-1.14700#/b5

The other oft-repeated criticism of p-values is that a highly significant p-value make still be associated with a tiny (and thus possibly meaningless) effect size. The obsession with p-values is particularly strange then, given that the question "how large is the effect?", should be more important than just answering “is it significant?". Ignoring effect sizes leads to a trend of studies showing highly significant results, with arguably meaningless effect sizes. This creates the odd situation that publishing well requires high profile, novel, and strong results – but one of the major tools for identifying these results is flawed. The editorial lists a few suggestions for moving away from the p-value – including to have journals require effect sizes and confidence intervals be included in published papers, to require statements to the effect of “We report how we determined our sample size, all data exclusions (if any), all manipulations and all measures in the study”, in order to limit data-mining, or of course to move to a Bayesian framework, where p-values are near heresy. The best advice though, is quoted from statistician Steven Goodman: “The numbers are where the scientific discussion should start, not end.”

Monday, October 21, 2013

Is ecology really failing at theory?

Sam Scheiner. 2013. The ecological literature, an idea free distribution. Ecology Letters. Early View. DOI: 10.1111/ele.12196

“Ecology is awash with theory, but everywhere the literature is bereft”. That is Sam Scheiner's provocative start to his editorial about what he sees as a major threat to modern ecology. The crux of his argument is simple – theory is incredibly important, it allows us to understand, to predict, to apply, to generalize. Ecology began as a study rooted in system-specific knowledge or natural history in the early 1900s, and developed into a theory-dominated field in the 1960s, when many great theoreticians came to the forefront of ecology. But today, he fears that theory is dwindling in importance in ecology. To test this, he provides a small survey of ecological and evolutionary journals for comparison (Ecology Letters, Oikos, Ecology, AmNat, Evolution, Journal of Evolutionary Biology), recording papers from each journal as either containing no theory, being ‘theory motivated’, or containing theory (either tests of, development of, or reviews of theory). The results showed that papers in ecological journals on average include theory only 60% of the time, compared to 80% for evolutionary papers. Worse, ecological papers seem to be more likely to develop theory than to test it. Scheiner’s editorial (as the title makes clear) is an indictment of this shortcoming of modern ecology.

Plots made based on data table in Scheiner 2013. Results combined for all evolution and all ecology papers.
The proportion of papers in each category - all categories starting with
"Theory" refer to theory-containing papers.

Plots made based on data table in Scheiner 2013. Results for papers from individual journals.
The proportion of papers of each type - all categories starting with
"Theory" refer to theory-containing papers.

This is not the kind of conclusion that I find myself arguing against too often. And I mostly agree with Scheiner: theory is the basis of good science, and ecology has suffered from a lack of theoretical motivation for work, or pseudo-theoretical motivation (e.g. productivity-diversity, intermediate diversity patterns that may lack an explanatory mechanism). But I think the methods and interpretation, and perhaps some lack of recognition of differences between ecological and evolutionary research make the conclusions a little difficult to embrace fully. There are three reasons for this – first, is this brief literature review a good measure of how and why we use theory as ecologists? Second, do counts of papers with or without theory really scale into impact or harm? And third, is it fair to directly compare ecological and evolutionary literature, or are there differences in the scope, motivations, and approaches of these fields?

~~If we are being truly scientific, this might be a good time to point out that~~ The 95% confidence intervals for the percentage of ecology papers with theory overlap with the confidence intervals for the percentage of evolutionary papers with theory ~~suggesting the difference that is the crux of the paper is not significant.~~ [Thanks to a commenter for pointing out this difference is likely significant]. While significant at the 5% level, the amount of overlap is enough that whether this difference is meaningful is less clear. (I would accept an argument that this is due to small sample sizes though). The results also show that choice of journal makes a big difference in terms of the kinds of paper found within – Ecology Letters and AmNat had more theoretical papers or theory motivated papers, while Oikos had more tests of theory and Ecology had more case studies. This sort of unspoken division of labour between journals means that the amount of theory varies greatly. And most ecologists recognize this - if I write a theory paper, it will be undoubtedly targeted to a journal that commonly publishes theory papers. So to more fully represent ecology, a wider variety of journals and more papers would be helpful. Still, Scheiner's counterargument would likely be that even non-theory papers (case studies, etc) should include more theory.

It may be that the proportion of papers that include theory is not a good measure of theory’s importance or inclusion in ecology in general. For example, Scheiner states, “All observations presuppose a theoretical context...the simple act of counting individuals and assessing species diversity relies on the concepts of ‘individual’ and ‘species,’ both of which are complex ideas”. While absolutely true, does this suggest that any paper with a survey of species’ distributions needs to reference theory related to species’ concepts? What is the difference between acknowledging theory used via a citation and more involved discussion of theory? In neither of these cases is the paper “bereft” of theory, but it is not clear from the methods how this difference was dealt with. As well, I think that ecological literature contains far more papers about applied topics, natural history, and system-specific reports than evolutionary biology. Applied literature is an important output of ecology, and as Scheiner states, builds undeniably on years of theoretical background. But on the other hand, a paper about the efficacy of existing reserves in protecting diversity using gap analysis is both important and may not have a clear role for a theoretical section (but will no doubt cite some theoretical and methodological studies). Does this make it somehow of less value to ecology than a paper explicitly testing theory? In addition, case reports and data *are* a necessary part of the theoretical process, since they provide the raw observations on which to build or refine theory. In many ways, Scheiner's editorial is a continuation of the ongoing tension between theory and empiricism that ecology has always faced.

The point I did agree strongly with is that ecology is prone to missing the step between theory development and data collection, i.e. theory testing. Far too few papers test existing theories before the theoreticians have moved on to some new theory. The balance between data collection, theory development, and theory testing is probably more important than the absolute number of papers devoted to one or the other.

Scheiner’s conclusion, though, is eloquent and easy to support, no matter how you feel about his other conclusions: “My challenge to you is to examine the ecological literature with a critical eye towards theory engagement, especially if you are a grant or manuscript reviewer. Be sure to be explicit about the theoretical underpinnings of your study in your next paper…Strengthening the ecological literature by engaging with theory depends on you.”

Monday, September 30, 2013

Struggling (and sometimes cheating) to catch up

Scientific research is maturing in a number of developing nations, which are trying to join North American and European nations as recognized centres of research. As recent stories show, the pressure to fulfill this vision--and to publish in English-language, international journals--has led to some large-scale schemes to commit academic fraud, in addition to cases of run-of-the-mill academic dishonesty.

In China, a widely-discussed incident involved criminals with a sideline in the production of fake journal articles, and even fake versions of existing medical journals in which authors could buy slots for their articles. China has been criticized for widespread academic problems for some time, for example, 2010 the New York Times published a report suggesting academic fraud (plagiarism, falsification or fabrication) was rampant in China and would hold the country back in its goal to become an important scientific contributor. In the other recent incident, four Brazilian medical journals were caught “citation stacking”, where each journal cited the other three excessively, thus avoiding notice for obvious journal self-citation, while still increasing their journal’s impact factor. These four journals were among 14 having their impact factors suspended for a year, with other possible incidences that were flagged but could not be proven involved Italian, a Chinese, and a Swiss journal.

There are some important facts that might provide context to these outbreaks of cheating. Both Brazil and China are nations where to be a successful scientist in the national system, you need to prove that you are capable of success on the world stage. This is probably a tall order in countries where scientific research has not traditionally had an international profile and most researchers do not speak English as their first language. In particular it leads to focus on values which are comparable across the globe, such as journal impact factors, as measures of success. In China, there is great pressure to publish in journals included on the Science Citation Index (SCI), a list of leading international journals. When researcher, department, and university success is quantified with impact factors and SCI publications, it becomes a numbers game, a GDP of research. Further, bonuses for publications in high caliber journals can double a poorly-paid researcher’s salary: a 2004 survey found that for nearly half of Chinese researchers, performance based pay was 50+ percent of their income. In Brazil, the government similarly emphasizes publications in Western journals as evidence of researcher quality.

It’s easy to dismiss these problems as specific to China or Brazil, and there are some aspects of the issue that are naturally country-specific. On the other hand, if you peruse Ivan Oransky’s Retraction Watch website, you’ll notice that academic dishonesty leading to article retraction is hardly restricted to researchers from developing countries. At the moment, the leading four countries in retractions due to fraud are the US, Germany, Japan, and then China, suggesting that Western science isn’t free from guilt. But in developing nations the conditions are ripe to produce fraud. Nationalistic ambition is funnelled into pressure on national scientists to succeed on the international stage; disproportionate focus on metrics of international success; high monetary rewards to otherwise poorly paid individuals for achieving these measures of success; combined with the reality that it is particularly difficult for researchers who were educated in a less competitive scientific system and who may lack English language skills, to publish in top journals. The benefits of success for these researchers are large, but the obstacles preventing their success are also huge. Combine that with a measure of success (impact factor, h-index) that is open to being gamed, and essentially relies on honesty and shared scientific principles, and it is not surprising that system fails.

Medical research was at the heart of both of these scandals, probably because the stakes (money, prestige) are high. Fortunately (or not) for ecology and evolutionary biology, the financial incentives for fraud are rather smaller, and thus organized academic fraud is probably less common. But the ingredients that seem to lead to these issues – national pressures to succeed on the world stage and difficulty in obtaining such success; combined with reliance on susceptible metrics – would threaten any field of science. And issues of language and culture are so rarely considered by English-language science (eg.), that it can be difficult for scientists from smaller countries to integrate into global academia. There are really two ways for the scientific community to respond to these issues of fraud and dishonesty – either treat these nations as second-class scientific citizens and assume their research may be unreliable, or else be available and willing to play an active role in their development. There are a number of ways the latter could happen. For example, some reputable national journals invite submissions from established international researchers to improve the visibility of their journals. In some nations (Portugal, Iceland, Czech Republic, etc), international scientists review funding proposals, so that an unbiased and external voice on the quality of work is provided. Indeed, the most hopeful fact is that top students from many developing nations attend graduate school in Europe and North America, and then return home with the knowledge and connections they gained. Obviously this is not a total solution, but we need to recognize fraud as problem affecting and interacting with all of academia, rather than solely an issue of a few problem nations.

Monday, June 10, 2013

The slippery slope of novelty

Coming up with a novel idea in science is actually very difficult. The many years during which smart people have, thought, researched, and written about ecology and evolution means that there aren’t necessarily many easy openings remaining. If you are lucky (or unlucky) enough to know someone with an encyclopedic knowledge of the literature, it becomes quickly apparent that only rarely has an idea not been suggested anywhere in the history of the discipline. Mostly science results from careful steps, not novel leaps and bounds. The irony is that to publish in a top journal, a researcher must convince the editor and reviewers that they are making a novel contribution.

There are several ways of thinking about the role of the novelty criterion - first, the effect it has had on research and publishing, but also more fundamentally, how difficult it is to even define scientific novelty in practice. Almost every new student spends considerable effort attempting to come up with a completely "novel" idea, but a strict definition of novelty – research that is completely different than anything published in the field in the past - is nearly impossible. Science is incrementally built on a foundation of existing knowledge, so new research mostly differs from past research in terms of scale and extent. Let's say that extent characterizes how different an idea must be from a previous one to be novel. Is neutral theory different enough from island biogeography (another, earlier, explanation for diversity which doesn’t rely on species-differences) to be considered novel? Most people would suggest that it is distinct enough as to be novel, but clearly it is not unrelated to works that came before it. What about biodiversity and ecosystem functioning? Is the fact that its results are converging with expectations from niche theory (ecological diversity yields greater productivity, etc) take away from its original, apparent novelty?

Then there is the question of scale, which considers the relation of an new idea to those found in other disciplines or at previous points in time. For example, when applying ideas that originate in other disciplines, the similarity of the application or the relatedness of the other discipline alters our conclusions about its novelty. Applying fractals to ecology might be considered more novel than introducing particular statistical methods, for example. Priority (were you first?) is probably the first thing considered when evaluating scientific novelty. But ideas are so rarely unconnected to the work that came before them, so then we evaluate novelty as a matter of degree. The most common value judgment seems to be that re-inventing an obscure concept first describe many years ago is more novel than re-inventing an obscure concept that was recently described.

In practice, then, the working definition of novelty may be that something like ‘an idea or finding doesn't exist the average body of knowledge in the field’. The problem with this is that not everyone has an average body of knowledge – some will be aware of every obscure paper written 50 years ago, and for them nothing is novel. Others have a lesser knowledge or more generous judgement of novelty and for them, many things seems important and new. A great deal of inconsistency in the judgement of papers for a journal with a novelty criterion results simply from the inconsistent judgement of novelty. This is one of the points that Goran Arnqvist makes in his critique of the use of novelty as a criterion for publishing (also, best paper title in recent memory). Novelty is a slippery slope. It forces papers to be “sold” and so overvalues flashy and/or controversial conclusions and undervalues careful replication and modest advances. And worse, it ignores the truth about science, which is that science is built on tiny steps founded in the existing knowledge from hundreds of labs and thousands of papers. And that we've never really come up with a consistent way to evaluate novelty.

(Thanks Steve Walker for the bringing up the original idea)

Sunday, May 19, 2013

The end of the impact factor

Recently, both the American Society for Cell Biology (ASCB) and the journal Science both publicly proclaimed that the journal impact factor (IF) was bad for science. The ASCB statement argues that IFs limit meaningful assessment of scientific impact for both published articles and especially other scientific products. The Science statement goes further, and claims that assessments based on IFs lead researchers to alter research trajectories and try to game the system rather than focussing on the important questions that need answering.

Impact factors: tale of the tail
The impact factor was created by Thomson Reuters and is simply the number of citations a journal has received in the the previous two years, divided by the number of articles published over that time span. Thus it is a snapshot of a particular type of 'impact'. There are technical problems with this metric -for example, that citations accumulate at different rates across different subdisciplines. More importantly, and what all publishers and editors know, is that IFs generally rise and fall with the extreme tail of the distribution of the number of citations. For a smaller journal, it just takes one heavily cited paper to make the IF jump up. For example if a journal publishes one paper that accumulates 300 citations and it published just 300 articles over the two years, then its IF can jump up by 1, which can alter the optics. In ecology and evolution, IFs greater than 5 are usually are viewed as top journals.

Regardless of these issues, the main concern expressed by ACSB and Science is that a journal-level metric should not be used to assess an individual researcher's impact. Should a researcher publishing in a high IF journal be rewarded (promotion, raise, grant funded, etc.) if their paper is never cited? What about their colleague who publishes in the lower IF journal, but accrues a high number of citations?

Given that rewards are, in part, based on the journals we publish in, researchers try to game the system by writing articles for certain journals and journals try to attract papers that will accrue citations quickly. Journals with increasing IFs usually see large increases in the number of submissions, as researchers are desperate to have high IF papers on their CVs. Some researchers send papers to journals in the order of their IFs without regard for the actual fit of the paper to the journal. This results in an overloaded peer-review system.

Rise of the altmetric
Alternative metrics (altmetrics) movement means to replace journal and article assessment from one based on journal citation metrics to a composite of measures that include page views, downloads, citations, discussions on social media and blogs, and mainstream media stories. Altmetrics attempts to capture a more holistic picture of the impact of an article. Below is a screenshot from a PLoS ONE paper, showing an example of altmetrics:

By making such information available, the impact of an individual article is not the journal IF anymore, but rather how the article actually performs. Altmetrics are particularly important for subdisciplines where maximal impact is beyond the ivory towers of academia. For example, the journal I am an Editor for, the Journal of Applied Ecology, tries to reach out to practitioners, managers and policy makers. If an article is taken up by these groups, they do not return citations, but they do share and discuss these papers. Accounting for this type of impact has been an important issue for us. In fact, even though our IF may be equivalent to other, non-applied journals, our articles are viewed and downloaded at a much higher rate.

The future
Soon, how articles and journals are assessed for impact will be very different. Organizations such as Altmetric have developed new scoring systems that take into account all the different types of impact. Further, publishers have been experimenting with altmetrics and future online articles will be intimately linked to how they are being used (e.g., seeing tweets when viewing the article).

Once the culture shifts to one that bases assessment on individual article performance, where you publish should become less important, and journals can feel free to focus on an identity that is based on content and not citations. National systems that currently hire, fund and promote faculty based on the journals they publish in, need to carefully rethink their assessment schemes.

May 21st, 2013 Addendum:

You can sign the declaration against Impact Factors by clicking on the logo below:

Friday, May 3, 2013

Navigating the complexities of authorship: Part 2 -author order

Authorship can be tricky business. It is easy to establish agreed upon rules within, say, your lab or among frequent collaborators, but with large collaborations, multiple authorship traditions can cause tension. Different groups may not even agree on who should be included as an author (see Part 1), much less what order they should appear. The number of authors per paper has steadily increased over time reflecting broad cultural shifts in science. Research is now more collaborative, relying on different skill sets and expertise.

Average number of authors per publication in computer science, compiled by Sven Bittner

Within large collaborations are researchers who have contributed to differing degrees and author order needs to reflect these contribution levels. But this is where things get complicated. In different fields of study, or even among sub-disciplines, there are substantial differences in cultural norms for authorship. According to Tscharntke andcolleagues (2007), there are four main author order strategies:

Sequence determines credit (SDC), where authors are ordered according to contribution.
Equal contribution (ED), where authors are ordered alphabetically to give equal credit.
First-last-author emphasis (FLAE), where last author is viewed as being very important to the work (e.g., lab head).
Percent contribution indicated (PCI), where contributions are explicitly stated.

The main approaches in ecology and evolutionary biology are SDC and FLAE, though journals are increasingly requiring PCI, regardless of order scheme. This seems like a good compromise allowing the two main approaches (SDC & FLAE) to persist without confusing things. However, PCI only works if people read these statements. Grant applications and CVs seldom contain this information, and the perspective from these two cultures can bias career-defining decisions.

I work in a general biology department with cellular and molecular biologists who wholeheartedly follow FLAE. They may say things like “I need X papers with me as last author to get tenure”. As much as I probe them about how they determine author order in multi-lab collaborations, it is not clear to me how exactly they do this. I know that all the graduate students appear towards the front in order of contribution, but the supervisor professors appear in reverse order starting from the back. Obviously an outsider cannot disentangle the meaning of such ordering schemes without knowing who the supervisors were.

The problem is especially acute when we need to consider how much people have contributed in order to assign credit (see Part 3 on assigning credit). With SDC, you know that author #2 contributed more than the last author. With FLAE, you have no way of knowing this. Did the supervisor fully participate in carrying out the research and writing the paper? Or did they offer a few suggestions and funding? The are cases where the head of ridiculously large labs appears as last author on dozens of publications a year, and grumbling from those labs insinuate that the professor hasn’t even read half the papers.

Under SDC, this person should appear as the last author, reflecting this minimal contribution, but this shouldn’t give the person some sort of additional credit.

In my lab, I try to enforce a strict SDC policy, which is why I appear as second author on a number of multi-authored papers coming out of my lab. I do need to be clear about this when my record is being reviewed in my department, or else they will think some undergrad has a lab somewhere. Even with this policy, there are complexities, such as collaborations with other labs we follow FLAE, such as with many European colleagues. I have two views on this, which may be mutually exclusive. 1) There is a pragmatic win-win, where I get to be second author and some other lab head gets the last position and there is no debate about who deserves this last position. But 2) this enters morally ambiguous territory where we each may receive elevated credit depending on whether people look at the order through SDC or FLAE.

I guess the win-win isn’t so bad, but it would nice if there was an unambiguous criterion directing author order. And the only one that is truly unambiguous is SDC –with ED (alphabetical) for all the authors after the first couple in large collaborations. The recent paper by Adler and colleagues(2011) is a perfect example of how this should work.

References:

Adler, P. B., E. W. Seabloom, E. T. Borer, H. Hillebrand, Y. Hautier, A. Hector, W. S. Harpole, L. R. O’Halloran, J. B. Grace, T. M. Anderson, J. D. Bakker, L. A. Biederman, C. S. Brown, Y. M. Buckley, L. B. Calabrese, C.-J. Chu, E. E. Cleland, S. L. Collins, K. L. Cottingham, M. J. Crawley, E. I. Damschen, K. F. Davies, N. M. DeCrappeo, P. A. Fay, J. Firn, P. Frater, E. I. Gasarch, D. S. Gruner, N. Hagenah, J. Hille Ris Lambers, H. Humphries, V. L. Jin, A. D. Kay, K. P. Kirkman, J. A. Klein, J. M. H. Knops, K. J. La Pierre, J. G. Lambrinos, W. Li, A. S. MacDougall, R. L. McCulley, B. A. Melbourne, C. E. Mitchell, J. L. Moore, J. W. Morgan, B. Mortensen, J. L. Orrock, S. M. Prober, D. A. Pyke, A. C. Risch, M. Schuetz, M. D. Smith, C. J. Stevens, L. L. Sullivan, G. Wang, P. D. Wragg, J. P. Wright, and L. H. Yang. 2011. Productivity Is a Poor Predictor of Plant Species Richness. Science 333:1750-1753.

Tscharntke T, Hochberg ME, Rand TA, Resh VH, Krauss J (2007) Author Sequence and Credit for Contributions in Multiauthored Publications. PLoS Biol 5(1): e18. doi:10.1371/journal.pbio.0050018

Thursday, April 11, 2013

Navigating the complexities of authorship: Part 1 –inclusion

One of the highlights of grad school is publishing your very first papers in peer-reviewed journals. I can still remember the feeling of seeing my first paper appear in print (yes on paper and not a pdf). But what this novice scientist should not be fretting over is which colleagues should be included as authors and whether they are breaking any norms. The two things that should be avoided are including as authors, those that did not substantially contribute to the work, and excluding those that deserve authorship. There have been controversial instances where breaking these authorship rules caused uncomfortable situations. None of us would want someone writing a letter to a journal arguing that they deserved authorship. Nor is it comfortable to see someone squirming out of authorship, arguing they had minimal involvement when an accusation of fraud has been levelled against a paper. How to determine who should be an author can be difficult.

From PHD Comics (http://www.phdcomics.com/comics/archive.php?comicid=562)

Even though I spell out my own rules below, it is important to be flexible and to understand that different types of papers and differing situations can have an impact on this decision. That said, you do not want to be arbitrary in this decision. For example, if two people contribute similar amounts to a paper, you do not want to include only one because you personally dislike the other. You should have a benchmark for inclusion that can be defended. The cartoon above highlights the complexity and arbitrariness of authorship –and the perception that there are many instances of less than meritorious inclusion.

Journals do have their own guidelines, and many now require statements about contributions, but even these can be vague, still making it difficult to assess how much individuals actually contributed. When I discuss issues of authorship with my own students, I usually reiterate the criteria from Weltzin et al. (2006). I use four criteria to evaluate contribution:

1) Origination of the idea for the study. This would include the motivation for the study, developing the hypotheses and coming up with a plan to test hypotheses.

2) Running the experiment or data collection. This is where the blood, sweat and tears come in.

3) Analyzing the data. Basically moving from a database to results, including deciding on the best analyses, programming (or using software) and dealing with inevitable complexities, issues and problems.

4) Writing the paper. Putting everything together can sometimes be the most difficult and external motivation can be important.

My basic requirements for authorship are that one of these steps was not possible without a key person, or else there was a person who significantly contributed to more than one of these. Such requirements mean that undergraduates assisting with data collection do not meet the threshold for authorship. Obviously these are idealized and different types of studies (e.g., theory or methodological papers) do not necessarily have all these activities. Regardless, authors must have contributed in a meaningful way to the production of this research and should be able to defend it. All authors need to sign off on the final product.

While this system is idealized, there are still complexities making authorship decisions difficult or uncomfortable. Here are three obvious ones –but there are others.

Data sharing

Large, synthetic analyses require multiple datasets and some authors are loath to share their hard work without credit. This is understandable, as a particular dataset could be the product of years of work. But when is inclusion for authorship appropriate? It is certainly appropriate to offer authorship if the questions being asked in the synthesis overlap strongly with planned analyses for the dataset. Both the data owner and the synthesis architect have a mutual interest in fostering collaboration. In this case every effort should be made to include the data owner in the analyses and writing of the manuscript.

When is it not appropriate to include data owners as authors? First and foremost, if the data is publically available, then it is there for further independent investigation. No one would offer authorship to each originator of a gene sequence in Genbank. Secondly, if it is a dataset that has already been used in many publications and has fulfilled its intended goals, then it should be made available without authorship strings. I’ve personally seen scientists reserve the right of authorship for the use of datasets that are both publically available and have satisfied the intended purpose long ago.

The basic rule of thumb should be that if the dataset is recent and still being analyzed, and if the owner has an interest in examining similar questions, then authorship should be offered –with the caveat that additional work is required, beyond simply supplying the data.

Idea ontogeny

I thought about labeling this section ‘idea stealing’ but thought that wasn’t quite right. An idea is a complex entity. It lives, dies and morphs. It is fully conceivable to listen to a news story about agricultural subsidies, which somehow spurs an idea about ecosystem dynamics. We all have conversations with colleagues and go to talks, and these interactions can morph into new scientific ideas, even subconsciously. We need to be careful and acknowledge how much an idea came from a direct conservation with another scientist. Obviously if a scientist says “you should do this experiment…”, then you need to acknowledge them and perhaps turn your idea into a collaboration.

Funding

Now here is the tricky one. Often people are authors because they control the purse strings. Yes, a PI has done an excellent job of securing funding, and should be acknowledged for this. If the study is a part of a funded project, where the PI developed the original idea, then the PI fully deserves to be included. However, if the specific study is independent from the funded project in terms of ideas and work plan, but uses funding from this project, then this contribution belongs in the acknowledgements and does not deserve authorship. There are cases where the PI of an extremely large lab gets dozens of papers a year, always appearing last in the list of authors (see part 2 on author order -forthcoming), and it is legitimate to view their contributions skeptically. Their relationship to many of the papers is likely financial and they probably couldn’t defend the science. I had a non-ecologist colleague ask me if it was still the case that graduate students in ecology produce papers without their advisors, to which I said yes (Caroline has several papers without me as an author).

Clearly there are cultural differences among sub disciplines. However, I do feel that authorship norms need to be robust and enforced. Cheaters (those gratuitously appearing on numerous papers –see part 3 on assigning credit; also forthcoming) reap the rewards and benefits of authorship, with little cost. It is disingenuous to list authors that have not have a substantial input into the publication, and the lead author is responsible for the accuracy of authorship. The easiest way to ensure that authors are really authors is to make an effort to include them in various aspects of the paper. For example, give them every opportunity to provide feedback –send them the first results and early drafts, have Skype for phone meetings with them to get their input and incorporate that input. Ultimately, we all should walk away from a collaboration feeling like we have contributed and made the paper better, and we should be proud to talk about it to other colleagues.

Many of these ideas were directly informed by this great paper by Weltzin and colleagues (2006):

Weltzin, J. F., Belote, R. T., Williams, L. T., Keller, J. K. & Engel, E. C. (2006) Authorship in ecology: attribution, accountability, and responsibility. Frontiers in Ecology and the Environment, 4, 435-441.

http://www.esajournals.org/doi/abs/10.1890/1540-9295(2006)4%5B435:AIEAAA%5D2.0.CO%3B2

Thursday, January 17, 2013

Who are you writing your paper with?

Choosing who you work with plays an important role in who you become as a scientist. Every grad student knows this is true about choosing a supervisor, and we’ve all heard the good, the bad, and the ugly when it comes to student-advisor stories. But writing a paper with collaborators is like dealing with the supervisor-supervisee relationship writ small. Working with coauthors can be the most rewarding or the most frustrating process, or both. Ultimately, the combination of personalities involved merge in such a way as to produce a document that is usually more (but sometimes less) than the sum of its parts. The writing process and collaborative interactions are fascinating to consider all on their own.

Field Guide to Coauthors

The Little General

The Little General is willing to battle till the death for the paper to follow his particular pet idea. Regardless of the aim or outcome of an experiment, a Little General will want to connect it to his particular take on things. Two Little Generals on a paper can spell disaster.

The Silent Partner

These are the middle authors, the suppliers of data and computer code, people who were involved in the foundations of the work, but not actively a part of the writing process.

The Nay-sayer

These are the coauthors who disagree, seemingly on principle, with any attempt to generalize the paper. Given free rein, such authors can prevent a work from having any generality beyond the particular system and question in the paper. These authors do help a paper become reviewer-proof, since every statement left in the paper is well-supported.

The Grammar Nazi

The Grammar Nazi returns your draft of the paper covered in edits, but he has mostly corrected for grammar and style rather than content. This is not the worst coauthor type, although it can be annoying, especially if these edits are mostly about personal taste.

The Snail

This is the coauthor that you just don’t hear from. You can send them reminder emails, give them a phone call, pray to the gods, but they will take their own sweet time getting anything back to you. (And yes, they are probably really busy).

The Cheerleader

The Cheerleader can encourage you through a difficult writing process or fuel an easy one. These are the coauthors who believe in the value of the work and will help motivate you through multiple edits, rejections, or revisions, as needed.

The Good Samaritan

The Good Samaritan is a special type of person. They aren’t authors of your manuscript, but they read it for you out of pure generosity They might provide better feedback and more useful advice than any of your actual coauthors. They always end up in the acknowledgements, but you often feel like you owe them more.

The Sage

The Sage is probably your supervisor or some scientific silverback. They read your manuscript and immediately know what’s wrong with it, what it needs, and distill this to you in a short comment that will change everything. The Sage will improve your work infinitely, and make you realize how far you still have to go.

There are probably lots of other types that I haven't thought of, so feel free to describe them in the comments. And, it goes without saying that if you coauthored a paper with me, you were an excellent coauthor with whom I have no complaints. Especially Marc Cadotte, who is often both Cheerleader and Sage :)

Thanks to Lanna Jin for the amazing illustrations!