Field normalization at different aggregation levels

This paper studies the impact of differences in citation practices across fields using the model introduced in Crespo et al. (2012), according to which the number of citations received by an article depends on its underlying scientific influence and the field to which it belongs. Using a dataset of about 4.4 million articles published in 1998-2003 with a five-year citation window, we obtain four main results. First, we estimate a set of exchange rates (ERs) that express the citation counts of articles within a wide quantile interval as the equivalent counts in the all-sciences case. For example, in the fractional case we find that in 187 out of 219 sub-fields the ERs are reliable in the sense that their coefficient of variation is smaller than or equal to 0.10. ERs are estimated over the [660, 978] interval, which on average covers about 62% of all citations. Second, in the fractional case, normalizing the raw data using the ERs (or the sub-field mean citations) as normalization factors reduces the importance of differences in citation practices from 18% to 3.8% (3.4%) of overall citation inequality. Third, the results in the fractional case are essentially replicated when we adopt the multiplicative approach. Fourth, when we are restricted to an intermediate aggregate level with 19 fields, the estimation of the ERs and the linear normalization procedures also offer good results. However, aggregating the distributions normalized at the lowest aggregate level, using sub-field ERs (or sub-field mean citations) as normalization factors, leads to similar or slightly better results at the field level.
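As a rough illustration of the mean-based normalization mentioned in the abstract, the sketch below divides each article's citation count by the mean of its sub-field, and computes a coefficient of variation of the kind used in the 0.10 reliability threshold. The function names, the toy data, and the use of the population standard deviation are assumptions made for this example; they are not taken from the paper, and the estimation of the ERs themselves is not shown.

```python
from statistics import mean, pstdev


def normalize_by_subfield_mean(citations_by_subfield):
    """Divide every citation count by the mean of its own sub-field.

    Hypothetical helper: assumes each sub-field has a positive mean.
    """
    normalized = {}
    for subfield, counts in citations_by_subfield.items():
        mu = mean(counts)
        normalized[subfield] = [c / mu for c in counts]
    return normalized


def coefficient_of_variation(values):
    """Population standard deviation over the mean (an assumed convention)."""
    return pstdev(values) / mean(values)


# Toy data (invented): sub-field B is cited five times as heavily as A.
toy = {"A": [2, 4, 6], "B": [10, 20, 30]}
normalized = normalize_by_subfield_mean(toy)
```

In this toy case both sub-fields end up with the same normalized distribution, which is the sense in which normalization removes differences attributable to citation practices rather than to the articles themselves.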