Do paper citations and indices correlate?

Evaluating the impact of research activity is a complex issue that is guaranteed to stir heated debate, whatever the audience or the context. Evaluators mostly access research output via publications of whatever type: articles, books, conference proceedings, technical reports etc. It is important to note that in many fields of research, publications are not (or should not be) themselves the output of research. In particular, in natural sciences and mathematics, the output of science comes under many guises such as theorems, software, datasets, techniques, chemical compounds and materials, patents etc., the publications being only a report on the research activity and its outcome. Nevertheless, a large part of the evaluation relies on the publications. The most obvious way to evaluate them is by reading them, the so-called peer review. This is what is done before a manuscript is accepted for publication in scientific journals (and increasingly now after it is published). However, to assess funding applications, project achievements, and individual and institutional performance, most evaluations rely in part on the analysis of publication impact.

[small digression. Let’s be clear about something. Everyone claiming that peer-review of papers is being used to evaluate funding applications, individuals for positions or promotions, or institute performance, is either a hypocrite or has never been part of such an evaluation committee. This never happens, for two reasons, one negative and one positive. The first one is that nobody would have time to perform such an exercise. Members of evaluation panels are often senior researchers, chosen for their recognized track records. They lead research groups and are completely over-committed. Reading a paper seriously, understanding its content and its novelty, takes a significant amount of time. The notion that we read dozens of papers from dozens of scientists for a given panel is just a fantasy. The second, positive, reason is that members of evaluation committees have a very limited collective expertise. For instance I am part of a committee covering the totality of the research spectrum. In this committee, there are only a handful of people covering the entirety of life sciences! It is VERY FORTUNATE that we are not actually judging the papers ourselves!]

To improve on arbitrary judgments based on unconscious bias triggered by journal names, and to complement evaluation by external reviewers, people try to use quantitative metrics developed by the field called bibliometrics, in particular citation analysis. For instance, the UK Research Excellence Framework (REF) provides guidance for the use of citation data. It is important to note that those metrics are not sufficient, and REF is actively assessing how best to use them.

In the natural sciences, a variety of citation metrics are used to evaluate the impact of articles, individuals and institutes, including citation counts, h-indices and impact factors (yes, this is very wrong, IFs are meant for journals, not for papers and authors). Recently, a new metric has been proposed to assess the impact of a given article, the Relative Citation Ratio.

Scientists are inherently navel-gazing (or maybe it is just me), and I was curious to see how all these correlated for me. So I collated my bibliometric data using Google Scholar. First let’s look at the classic measurements. If I plot the citations of each paper versus the impact factor of the journal it was published in for the year of its publication, the correlation is not overwhelming …


The paper describing SBML is clearly an outlier and makes it hard to judge the rest of the plot, so let’s discard it for the time being (yes, I should also discard the outliers in the other direction, but hey, this is a blog post, not a research paper …)

Now, the correlation is clear, but still not overwhelming. The correlation seems to disappear for the highest impact factors, above 18. However, there is an obvious correction to bring to citation counts: recent papers are less cited than old papers. Because I am now a senior scientist, I tend to publish a bit more in journals with high impact factors. Examples are papers reporting the results of large collaborations and invited reviews. So we need to correct for paper age by dividing the counts by the number of years elapsed since their publication.
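For the curious, the age correction can be sketched in a few lines of Python. This is only an illustration: the function name, the example papers and the reference year are made up, and counting at least one year for a paper published in the current year is my assumption, made to avoid dividing by zero.

```python
from datetime import date


def citations_per_year(citations, pub_year, current_year=None):
    """Divide a citation count by the number of years elapsed since publication."""
    if current_year is None:
        current_year = date.today().year
    # Assumption: count at least one year, so that a paper published in the
    # current year does not trigger a division by zero.
    years_elapsed = max(current_year - pub_year, 1)
    return citations / years_elapsed


# Hypothetical papers: (label, citations, publication year).
papers = [("old paper", 300, 2003), ("recent review", 40, 2014)]

for label, cites, year in papers:
    rate = citations_per_year(cites, year, current_year=2016)
    print(f"{label}: {rate:.1f} citations per year")
```

With these invented numbers, the old paper has many more citations in total, but once age is taken into account the two papers end up much closer to each other.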


Indeed, the correlation is clearer. But there is still a lot of noise. I would not say that choosing a higher impact factor is a foolproof way of getting more citations. And I would certainly not say that a paper in a high impact factor journal necessarily has a big impact!
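I am judging these correlations by eye, but one could put a number on them. Here is a minimal sketch using the Pearson coefficient. The choice of Pearson is an assumption (a rank-based coefficient might be more robust to outliers such as the SBML paper), and the data points are invented for illustration, not taken from my publication list.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of products of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: journal impact factors and citations per year.
impact_factors = [2.5, 4.0, 5.5, 9.0, 12.0, 30.0]
cites_per_year = [1.0, 6.0, 3.0, 10.0, 8.0, 12.0]

print(f"r = {pearson(impact_factors, cites_per_year):.2f}")
```

A value close to 1 would mean that citations per year rise almost linearly with the impact factor, a value close to 0 that they are essentially unrelated.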

Let’s now turn to the Relative Citation Ratio. How does it compare to the impact factor?


Well, the correlation is quasi-identical to the one with the average citations per year. Which of course leads us to the main comparison, which is between the RCR and the citation counts.


The correlation is much better. The outlier with 37 citations and an RCR of 0 is actually an artifact of Google Scholar. Of course, the RCR offers more than just an improved citation count. For instance, it also compares a paper’s impact to the impact of all papers reporting research funded by the NIH. A problem with the current tool, though, is that its citation data come from the Web of Science databases. Those databases do not contain all the scientific journals. They do not record citations in books. And of course they are not open. The RCR is a neat tool, but considering the strong correlation with pure citations, at least in my case, I think just looking at the citation counts is actually a good, easy-to-use proxy for impact.

All that focused on article-by-article impact. But would total citations be a good proxy to evaluate individual researchers? Continuing the navel-gazing exercise, I extracted the data for people in my institute who set up a Google Scholar profile. I omitted the PhD students, because their publication records and citations are too noisy. I divided the positions into department heads, tenured group leaders, tenure-track group leaders (5-year positions, most often a first experience as group leader), senior research associates (indefinite contracts but not group leaders) and post-doctoral fellows.


The correlation between total citations and h-index is quite impressive. This is probably due to the fact that we do not have distortions due to anomalous papers (e.g. BLAST or Clustal in bioinformatics). The occasional highly cited papers (e.g. SBML in my case) are just averaged out. And what comes out clearly is that in the majority of cases, positions match publication impact. Which is the better predictor, total citations or h-index? We can plot the rank in both classifications.
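As a reminder, a researcher has an h-index of h if h of their papers have at least h citations each. The sketch below computes it and ranks a few people by both measures. The researchers and their citation counts are entirely invented, and the helper names are mine.

```python
def h_index(citation_counts):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citation_counts, reverse=True)
    h = 0
    for position, cites in enumerate(ranked, start=1):
        if cites >= position:
            h = position
        else:
            break
    return h


def rank_by(scores):
    """Map each name to its rank (1 = highest score); ties keep input order."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {name: rank for rank, name in enumerate(ordered, start=1)}


# Hypothetical researchers and the citation counts of their papers.
researchers = {
    "A": [120, 60, 30, 10, 4, 1],
    "B": [900, 3, 2],  # one anomalous, very highly cited paper
    "C": [15, 14, 13, 12, 11, 10, 9, 8],
}

total_citations = {name: sum(counts) for name, counts in researchers.items()}
h_indices = {name: h_index(counts) for name, counts in researchers.items()}

print("rank by total citations:", rank_by(total_citations))
print("rank by h-index:", rank_by(h_indices))
```

In this toy example, researcher B tops the ranking by total citations thanks to a single paper, but comes last by h-index. This is the kind of distortion by anomalous papers mentioned above.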


The h-index seems to correlate a bit better with tenure, SRA and tenure-track positions. The separation between tenure track and post-docs is blurrier because some post-docs are quite senior and have impressive CVs. But overall, the separations are quite clear. And so is the message. In my institute, there is little hope of getting a tenure-track position if you have fewer than 1000 citations and a single-digit h-index. For tenure, the bar would be close to 3000 citations and an h-index in the mid-teens. When it comes to department heads we’re talking 10000 citations and an h-index of 50.

Now all that is of course very focused on my field of research. Molecular, cellular and systems biology is a very peculiar community. The publication habits, the criteria of excellence, everything is very homogeneous, almost military. It is also a fairly inward-looking community. Not only are there very few contacts with other sciences, but there are also very few contacts with the other components of life sciences. A fair number of its members are actually convinced all scientists in all fields are thinking and acting alike. They would be surprised, and dismayed, to witness what I once saw at a conference: German computer science students impersonating us, exchanging pompous sentences about journal articles, impact factors and citations. They had the time of their lives. Very humbling.

All that to say that everything in this blog post should be taken with more than a grain of salt.