Scientists must compete for limited funding as well as for academic positions and recognition. Many factors contribute to success, but Hirsch's h-index puts the emphasis squarely on citations (Hirsch, 2005). In such a system, it is perceived that more citations should lead to more funds, promotions, job security, et cetera. Hirsch's formula rapidly became the dominant parameter informing academic ranking and resource allocation decisions, but it has limitations and is easy to manipulate. Yet most scholars and evaluation committees continue to embrace the h-index as the major way to keep score. This persistence suggests the metric has value, especially in its aim of making scientific assessment a level playing field, but we should not get carried away with how we use it.
In particular, the response to the obvious potential to game the h-index via excessive self-citation has been problematic. Inflating one’s papers with unnecessary citations of one’s previous work can easily boost the h-index and give a direct advantage to researchers in the competitive race for academic rewards (Bartneck & Kokkelmans, 2011).
Would that advantage be undeserved? Is the use of self-citations unfair? Does it taint competition?
Clearly, anyone can decide to resort to self-citation; in this respect it is hard to define it as an unfair practice. There is nothing wrong or shameful in building on one's previous work. In fact, doing so is itself a sign of a productive scientific path and should not be treated as an academic misdemeanor. Still, the use of scientifically unnecessary self-citations is generally regarded as inappropriate.
Setting a clear line between an acceptable and a non-acceptable amount of self-citation is virtually impossible. Yet, given the present reliance on the h-index, we need to find a way to ensure that what it measures is not intentionally altered by the self-citing behaviors of individual scientists.
Calculating the h-index without self-citations is one way to go. However, this "curated" h-index is inappropriate in instances where self-cites are the result of coordinated, sustained, leading-edge effort. Let's appreciate rather than discard these legitimate citations. Furthermore, exclusion does nothing to address the knock-on effect whereby citing oneself enough prompts others to follow suit. Ignoring the data simply gives a free pass to those who use the strategy.
The best solution is transparent self-citation reporting. Implementing this is easy, as we only need to turn the h-index inward to create what we refer to as a self-citation index, or s-index: a scientist has an index of s if he or she has published s articles that have each received at least s self-citations (Flatt et al., 2017). Defining s analogously to h makes it easy for people to understand, and, importantly, pairing the h and s indices, unlike exclusion, appreciates the worth of self-cites while at the same time deterring excessive tendencies. With respect to the latter, reporting rather than excluding self-cites will allow us for the first time to see clearly how much the different scientific fields resort to self-citing, thereby making excessive behavior more identifiable, explainable, and accountable. Our aim is not to establish a numerical threshold for an acceptable s-index. Rather, we propose a transparency-based approach whereby the s-index is shown alongside the h-index, providing an indication of how self-citation contributes to the bibliometric impact of one's work. The approach does not criminalize self-citation, but it offers a tool to help make sense of it.
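Because s is defined analogously to h, both indices can be computed with the same Hirsch-style rule, differing only in which counts are fed in: total citations per paper for h, self-citations per paper for s. The sketch below is our own illustration (not code from Flatt et al., 2017), with made-up citation numbers.

```python
def hirsch_style_index(citation_counts):
    """Largest k such that at least k papers have >= k citations each.
    Fed total citations per paper, this is the h-index; fed
    self-citation counts instead, it is the s-index."""
    counts = sorted(citation_counts, reverse=True)
    k = 0
    while k < len(counts) and counts[k] >= k + 1:
        k += 1
    return k

# Total citations vs. self-citations for the same five papers (made-up numbers)
total_cites = [25, 8, 5, 4, 3]
self_cites = [4, 3, 3, 1, 0]
h = hirsch_style_index(total_cites)  # h-index
s = hirsch_style_index(self_cites)   # s-index, reported alongside h
print(h, s)
# → 4 3
```

Reporting the pair (h = 4, s = 3) rather than h alone shows at a glance how much of this hypothetical author's bibliometric impact rests on self-citation.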
Table 1: Curated h-index vs. s-index
Curated h-index | s-index
Excludes all self-cites from h-index reporting | Pairs unrestrained h and s indices |
Discards warranted self-citation | Appreciates self-citation that results from sustained, productive, leading-edge effort |
Leaves room for citing oneself to attract outside cites | Promotes good citation habits |
Serves to partially protect the h-index from gaming | Relies on peer review to score scientific impact and success |
Certainly some will question whether we really need a self-citation index (Davis, 2017). One concern is that the s-index cannot distinguish legitimate from illegitimate self-citation, or a prolific author from a shameless self-promoter for that matter. Let us consider these points in turn. First, when it comes to distinguishing legitimate from illegitimate self-referencing, there is no metric substitute for peer review. We believe that experts from the various disciplines will be able to handle the citation data responsibly and effectively. If not them, who? Second, when it comes to prolific authors, why unfairly punish them by discarding self-citations, especially in those cases where these may be the only data available? The same goes for early-career researchers with fewer publications who legitimately use self-citations to make their work visible to their peers. It is sensible to treat self-cites as signs of progress rather than scrapping them as irrelevant.
By exposing how the h-index is prone to alteration, the s-index also speaks in favor of considering other, more qualitative parameters in the assessment of academic value. Moreover, different disciplines may have developed different self-citation habits, each of which is perfectly acceptable within that discipline. The s-index thus leaves room for each disciplinary community to adjust how it weighs this measure according to its own standards.
In the end, we must decide which path to take. We can attack the gaming problem by hacking away at the citation data to remove all occurrences of self-cites. However, we and others find such handling unsatisfactory. The other option is to embrace visibility in a community of peers, relying on expert judgment rather than curated scorekeeping. If metrics are only as valid as the data behind them, then we would do well not to hide any of the useful bits.
Figure 1: A snapshot of citation habits for two researchers growing in the same field. Once the s-index is factored in, similar h-indices correspond to very different “carrots”. The s-index, however, does not say whether thin or thick carrots are to be preferred. This choice will largely depend, as it were, on the recipe, that is, on the purpose of the assessment itself. Including the s-index as an additional metric thus provides important context to guide decisions based on academic value.
The number of authors per article in the subject area Multidisciplinary is 3.3 on average with a maximum of 58 authors. The mean number of coauthors is decreasing by 0.1 per year in the respective time period (Figure 1). The articles in this analysis (n = 1111) were cited 14.5 times on average with a maximum of 348 citations.
Figure 1: Boxplot of the number of authors per paper in the subject area Multidisciplinary. The box denotes 25–75% of the values with the median (bold line) in it. The small circles are outliers. Due to a limitation of the y-axis, some outliers might not be visible. The yellow line shows a linear model of the mean number of authors per article with a confidence interval of 0.95 shown in light grey. Data source: Scopus. CC BY 4.0 Schmidt, Fecher, Kobsda.
The results of the Advanced search in Scopus were restricted by an algorithm with
For details and code see Schmidt et al. 2017.
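The subject-area analyses in this series all follow the same recipe: mean and maximum number of authors per article, plus the yearly trend in the mean (the slope of the fitted line shown in the figures). A minimal sketch of that computation follows; the function name and data are illustrative assumptions, not the actual code of Schmidt et al. 2017 (which is referenced above), and the confidence interval shown in the figures is omitted here.

```python
from statistics import mean

def summarize_author_counts(records):
    """Summarize (year, n_authors) records: overall mean and maximum
    authors per article, and the least-squares slope of the yearly
    mean number of authors (the per-year trend)."""
    counts = [n for _, n in records]
    # Group author counts by year and take the yearly means
    by_year = {}
    for year, n in records:
        by_year.setdefault(year, []).append(n)
    years = sorted(by_year)
    yearly_means = [mean(by_year[y]) for y in years]
    # Ordinary least-squares slope: change in mean authors per year
    x_bar, y_bar = mean(years), mean(yearly_means)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(years, yearly_means)) \
            / sum((x - x_bar) ** 2 for x in years)
    return {"mean": mean(counts), "max": max(counts), "trend_per_year": slope}

# Made-up example: mean authorship rising by one author per year
records = [(2010, 2), (2010, 4), (2011, 3), (2011, 5), (2012, 4), (2012, 6)]
print(summarize_author_counts(records))
# → {'mean': 4, 'max': 6, 'trend_per_year': 1.0}
```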
Scientific impact reflects the influence that a finding or publication has on science or on society. Such impact can be short term or long term and it is important not to emphasize one over the other to too great an extent. For example, a publication that reports positive results from a clinical trial that demonstrates the efficacy of a potential new therapy could have considerable short-term impact, facilitating the development of a new treatment into clinical practice and directly affecting the lives of many individuals. On the other hand, a publication that describes a new concept or mechanism may not have much short-term impact but could transform how researchers think about their fields and drive other research in completely new directions for years or decades to come.
While it is true that many published articles are never cited, this does not mean that scientific articles are not the right way to communicate research findings. Articles may go uncited for many different reasons. First, some scientific publications are not truly research publications but relatively short articles that offer perspectives on one or more research findings, and these may not be cited even if they are well read. Second, many research publications cover topics that partially or largely overlap with other publications, and modern citation practices generally limit the number of articles that are cited. Finally, some academic articles cover topics of limited interest or are of relatively low quality. With all of that said, research articles should (and will) remain an important mechanism for communicating scientific research results.
Scientific research will continue to become more and more interdisciplinary with concepts and technologies from one field being used by researchers in other, traditionally distinct, fields. This will make many research papers quite complicated with a number of different components. In order to communicate these results, it is likely that more open data sharing, data analysis, and commenting formats will evolve to allow readers to provide their insights to one another.
The number of authors per article in the subject area Mathematics is 2.9 on average with a maximum of 9 authors. The mean number of coauthors is increasing by 0.1 per year in the respective time period (Figure 1). The articles in this analysis (n = 3657) were cited 8.2 times on average with a maximum of 357 citations.
Figure 1: Boxplot of the number of authors per paper in the subject area Mathematics. The box denotes 25–75% of the values with the median (bold line) in it. The small circles are outliers. Due to a limitation of the y-axis, some outliers might not be visible. The yellow line shows a linear model of the mean number of authors per article with a confidence interval of 0.95 shown in light grey. Data source: Scopus. CC BY 4.0 Schmidt, Fecher, Kobsda.
The results of the Advanced search in Scopus were restricted by an algorithm with
For details and code see Schmidt et al. 2017.
The number of authors per article in the subject area Chemical Engineering is 2.6 on average with a maximum of 15 authors. The mean number of coauthors is increasing by 0.3 per year in the respective time period (Figure 1). The articles in this analysis (n = 1303) were cited 5.9 times on average with a maximum of 112 citations.
The number of authors per article in the subject area Chemistry is 5.5 on average with a maximum of 16 authors (Figure 2). The mean number of coauthors is increasing by 0.1 per year in the respective time period. The articles in this analysis (n = 3142) were cited 13 times on average with a maximum of 562 citations.
Figure 1: Boxplot of the number of authors per paper in the subject area Chemical Engineering. The box denotes 25–75% of the values with the median (bold line) in it. The small circles are outliers. Due to a limitation of the y-axis, some outliers might not be visible. The yellow line shows a linear model of the mean number of authors per article with a confidence interval of 0.95 shown in light grey. Data source: Scopus. CC BY 4.0 Schmidt, Fecher, Kobsda.
Figure 2: Boxplot of the number of authors per paper in the subject area Chemistry. The box denotes 25–75% of the values with the median (bold line) in it. The small circles are outliers. Due to a limitation of the y-axis, some outliers might not be visible. The yellow line shows a linear model of the mean number of authors per article with a confidence interval of 0.95 shown in light grey. Data source: Scopus. CC BY 4.0 Schmidt, Fecher, Kobsda.
The number of authors per article in the subject area Pharmacology, Toxicology, and Pharmaceutics is 5.5 on average with a maximum of 47 authors. The mean number of coauthors is increasing by 0.01 per year in the respective time period (Figure 3). The articles in this analysis (n = 1165) were cited 16.8 times on average with a maximum of 526 citations.
Figure 3: Boxplot of the number of authors per paper in the subject area Pharmacology, Toxicology, and Pharmaceutics. The box denotes 25–75% of the values with the median (bold line) in it. The small circles are outliers. Due to a limitation of the y-axis, some outliers might not be visible. The yellow line shows a linear model of the mean number of authors per article with a confidence interval of 0.95 shown in light grey. Data source: Scopus. CC BY 4.0 Schmidt, Fecher, Kobsda.
The results of the Advanced search in Scopus were restricted by an algorithm with
For details and code see Schmidt et al. 2017.
The number of authors per article in the subject area Arts and Humanities is 2.1 on average with a maximum of 8 authors. The mean number of coauthors is decreasing by 0.01 per year in the respective time period (Figure 1). The articles in this analysis (n = 749) were cited 6.6 times on average with a maximum of 50 citations.
The number of authors per article in the subject area Social Sciences is 2.7 on average with a maximum of 14 authors (Figure 2). The mean number of coauthors is increasing by 0.1 per year in the respective time period. The articles in this analysis (n = 1050) were cited 9.7 times on average with a maximum of 485 citations.
Figure 1: Boxplot of the number of authors per paper in the subject area Arts and Humanities. The box denotes 25–75% of the values with the median (bold line) in it. The small circles are outliers. The yellow line shows a linear model of the mean number of authors per article with a confidence interval of 0.95 shown in light grey. Data source: Scopus. CC BY 4.0 Schmidt, Fecher, Kobsda.
Figure 2: Boxplot of the number of authors per paper in the subject area Social Sciences. The box denotes 25–75% of the values with the median (bold line) in it. The small circles are outliers. The yellow line shows a linear model of the mean number of authors per article with a confidence interval of 0.95 shown in light grey. Data source: Scopus. CC BY 4.0 Schmidt, Fecher, Kobsda.
The results of the Advanced search in Scopus were restricted by an algorithm with
For details and code see Schmidt et al. 2017.
The number of authors per article in the subject area Computer Science is 4 on average with a maximum of 21 authors. The mean number of coauthors is increasing by 0.1 per year in the respective time period (Figure 1). The articles in this analysis (n = 1558) were cited 11.4 times on average with a maximum of 199 citations.
Figure 1: Boxplot of the number of authors per paper in the subject area Computer Science. The box denotes 25–75% of the values with the median (bold line) in it. The small circles are outliers. Due to a limitation of the y-axis, some outliers are not shown. The yellow line shows a linear model of the mean number of authors per article with a confidence interval of 0.95 shown in light grey. Data source: Scopus. CC BY 4.0 Schmidt, Fecher, Kobsda.
The number of authors per article in the subject area Decision Sciences is 3.1 on average with a maximum of 13 authors (Figure 2). The mean number of coauthors is decreasing by 0.02 per year in the respective time period. The articles in this analysis (n = 192) were cited 7 times on average with a maximum of 35 citations, the smallest maximum across the subject areas.
Figure 2: Boxplot of the number of authors per paper in the subject area Decision Sciences. The box denotes 25–75% of the values with the median (bold line) in it. The small circles are outliers. Due to a limitation of the y-axis, some outliers are not shown. The yellow line shows a linear model of the mean number of authors per article with a confidence interval of 0.95 shown in light grey. Data source: Scopus. CC BY 4.0 Schmidt, Fecher, Kobsda.
The results of the Advanced search in Scopus were restricted by an algorithm with
For details and code see Schmidt et al. 2017.
The number of authors per article in the subject area Dentistry is 5.5 on average with a maximum of 25 authors. The mean number of coauthors is increasing by 0.04 per year in the respective time period (Figure 1). The articles in this analysis (n = 1536) were cited 13.6 times on average with a maximum of 183 citations.
Figure 1: Boxplot of the number of authors per paper in the subject area Dentistry. The box denotes 25–75% of the values with the median (bold line) in it. The small circles are outliers. Due to a limitation of the y-axis, some outliers are not shown. The yellow line shows a linear model of the mean number of authors per article with a confidence interval of 0.95 shown in light grey. Data source: Scopus. CC BY 4.0 Schmidt, Fecher, Kobsda.
The number of authors per article in the subject area Health Professions is 3.7 on average with a maximum of 16 authors (Figure 2). The mean number of coauthors is increasing by 0.1 per year in the respective time period. The articles in this analysis (n = 808) were cited 7 times on average with a maximum of 140 citations.
Figure 2: Boxplot of the number of authors per paper in the subject area Health Professions. The box denotes 25–75% of the values with the median (bold line) in it. The small circles are outliers. Due to a limitation of the y-axis, some outliers are not shown. The yellow line shows a linear model of the mean number of authors per article with a confidence interval of 0.95 shown in light grey. Data source: Scopus. CC BY 4.0 Schmidt, Fecher, Kobsda.
The number of authors per article in the subject area Nursing is 3.7 on average with a maximum of 61 authors. The mean number of coauthors is decreasing by 0.3 per year in the respective time period (Figure 3). The articles in this analysis (n = 1282) were cited 7.5 times on average with a maximum of 126 citations.
Figure 3: Boxplot of the number of authors per paper in the subject area Nursing. The box denotes 25–75% of the values with the median (bold line) in it. The small circles are outliers. Due to a limitation of the y-axis, some outliers are not shown. The yellow line shows a linear model of the mean number of authors per article with a confidence interval of 0.95 shown in light grey. Data source: Scopus. CC BY 4.0 Schmidt, Fecher, Kobsda.
The number of authors per article in the subject area Psychology is 4.1 on average with a maximum of 61 authors (Figure 4). The mean number of coauthors is increasing by 0.05 per year in the respective time period. The articles in this analysis (n = 954) were cited 14.4 times on average with a maximum of 200 citations.
Figure 4: Boxplot of the number of authors per paper in the subject area Psychology. The box denotes 25–75% of the values with the median (bold line) in it. The small circles are outliers. Due to a limitation of the y-axis, some outliers are not shown. The yellow line shows a linear model of the mean number of authors per article with a confidence interval of 0.95 shown in light grey. Data source: Scopus. CC BY 4.0 Schmidt, Fecher, Kobsda.
The results of the Advanced search in Scopus were restricted by an algorithm with
For details and code see Schmidt et al. 2017.
The number of authors per article in the subject area Immunology and Microbiology is 9.7 on average with a maximum of 83 authors. The mean number of coauthors is increasing by 0.4 per year in the respective time period (Figure 1). The articles in this analysis (n = 968) were cited 34.4 times on average with a maximum of 2938 citations, the second highest value across the 27 subject areas according to our methodology (see below).
The number of authors per article in the subject area Neuroscience is 8 on average with a maximum of 342 authors (Figure 2). The mean number of coauthors is increasing by 0.5 per year in the respective time period, which is the second highest value across the 27 subject areas. The articles in this analysis (n = 1122) were cited 38.8 times on average with a maximum of 924 citations.
The number of authors per article in the subject area Biochemistry, Genetics, and Molecular Biology is 16.7 on average with a maximum of 1269 authors; both values are the second highest across all subject areas. The mean number of coauthors is decreasing by 2.1 per year in the respective time period (Figure 3). The articles in this analysis (n = 818) were cited 33.4 times on average with a maximum of 1774 citations.
Figure 1: Boxplot of the number of authors per paper in the subject area Immunology and Microbiology. The box denotes 25–75% of the values with the median (bold line) in it. The small circles are outliers. Due to a limitation of the y-axis, some outliers are not shown. The yellow line shows a linear model of the mean number of authors per article with a confidence interval of 0.95 shown in light grey. Data source: Scopus. CC BY 4.0 Schmidt, Fecher, Kobsda.
Figure 2: Boxplot of the number of authors per paper in the subject area Neuroscience. The box denotes 25–75% of the values with the median (bold line) in it. The small circles are outliers. Due to a limitation of the y-axis, some outliers are not shown. The yellow line shows a linear model of the mean number of authors per article with a confidence interval of 0.95 shown in light grey. Data source: Scopus. CC BY 4.0 Schmidt, Fecher, Kobsda.
Figure 3: Boxplot of the number of authors per paper in the subject area Biochemistry, Genetics, and Molecular Biology. The box denotes 25–75% of the values with the median (bold line) in it. The small circles are outliers. Due to a limitation of the y-axis, some outliers are not shown. The yellow line shows a linear model of the mean number of authors per article with a confidence interval of 0.95 shown in light grey. Data source: Scopus. CC BY 4.0 Schmidt, Fecher, Kobsda.
The results of the Advanced search in Scopus were restricted by an algorithm with
For details and code see Schmidt et al. 2017.
The number of authors per article in the subject area Earth and Planetary Sciences is 8.7 on average with a maximum of 468 authors, which is the fourth highest number across all subject areas in our methodology (see below). The mean number of coauthors is increasing by 0.3 per year in the respective time period (Figure 1). The articles in this analysis (n = 2818) were cited 23.4 times on average with a maximum of 1375 citations (fifth highest value among the 27 subject areas).
The number of authors per article in the subject area Environmental Sciences is 5.9 on average with a maximum of 49 authors (Figure 2). The mean number of coauthors is increasing by 0.1 per year in the respective time period. The articles in this analysis (n = 1759) were cited 17.5 times on average with a maximum of 624 citations (the seventh highest value among the 27 subject areas).
Figure 1: Boxplot of the number of authors per paper in the subject area Earth and Planetary Sciences. The box denotes 25–75% of the values with the median (bold line) in it. The small circles are outliers. Due to a limitation of the y-axis, some outliers are not shown. The yellow line shows a linear model of the mean number of authors per article with a confidence interval of 0.95 shown in light grey. Data source: Scopus. CC BY 4.0 Schmidt, Fecher, Kobsda.
Figure 2: Boxplot of the number of authors per paper in the subject area Environmental Sciences. The box denotes 25–75% of the values with the median (bold line) in it. The small circles are outliers. Due to a limitation of the y-axis, some outliers are not shown. The yellow line shows a linear model of the mean number of authors per article with a confidence interval of 0.95 shown in light grey. Data source: Scopus. CC BY 4.0 Schmidt, Fecher, Kobsda.
The results of the Advanced search in Scopus were restricted by an algorithm with
For details and code see Schmidt et al. 2017.