Computing Applications Virtual extension

Distinguishing Citation Quality For Journal Impact Assessment

By Andrew Lim, Hong Ma, Qi Wen, Zhou Xu, and Brenda Cheang

Posted Aug 1 2009

The research community has long debated the topic of journal impact. But just what is the impact of a journal? Today, the Science Citation Index (SCI) recognizes over 7,000 journals. With so many journals available, researchers need to gauge a journal’s impact accurately when submitting their papers, since publishing in established journals is widely regarded as having a significant influence on peer recognition. For journals in Management Information Systems (MIS), studies of journal impact have been published continuously since the 1990s. Carol Saunders¹¹ has summarized nine of them: seven were based on respondent perceptions gathered by surveying experts, and two used citation quantity to indicate journal impact.

Citation analysis is generally accepted to be a more objective method than the expert survey.² The main reason is that citation analysis uses objective measurements, based on the view that the influence of a journal and its articles is determined by their usefulness to other journals and articles, and that this usage is reflected in the citations they receive. However, using citation quantity alone is also considered biased to some degree, because of the widely held notion that citation quantity does not represent citation quality. Regarding impact in the MIS discipline, for example, a citation by a paper published in a prestigious MIS journal should far outweigh a citation by a paper published in an unremarkable MIS journal or in a journal outside the MIS field. This intuition suggests that citation quality can be divided into the following two aspects:

Citation Relevance (CR): indicating how relevant the journal giving the citation is to the discipline we are interested in;
Citation Importance (CI): indicating how important the journal giving the citation is in the discipline we are interested in.

However, these concerns about citation quality have not been properly addressed in citation analysis literature.

To address these concerns about citation quality when assessing journal impact, we propose a method that first clusters “pure” MIS journals to identify relevant citations, and then scores the impact of each journal according to the citations it receives from pure MIS journals, weighted by citation importance. Although we apply the method only to MIS journals, it is general enough to evaluate the impact of journals in other disciplines.

Data collection

The ISI Web of Science Database is one of the most popular citation databases, covering more than 7,000 academic journals, among which 65 journals have appeared in at least one of the nine studies in the literature¹¹ on journal impact assessment for the MIS field. These 65 journals, including Communications of the ACM (CACM), European Journal of Information Systems (EJIS), Information Systems Research (ISR), Information & Management (I&M), and MIS Quarterly (MISQ), are considered to be MIS-related and are used in our study. For each journal, the frequency of its citations between 2001 and 2005 to every other journal was drawn from the ISI Web of Science and aggregated to form a 65 × 65 citation matrix. Rows and columns of the matrix correspond to the citing and the cited journals, respectively.

With regard to self-citation rates, we find that MIS-related journals are widely self-cited and depend considerably on the contribution of self-citations, which can lead to significant changes in the assessment result. Neither a complete removal of self-citations nor a complete inclusion is viable. Therefore, we assigned each journal a ceiling on its self-citations equal to the number of citations it sends to the other journal that it cites most. In the citation matrix, we thus bounded every element on the diagonal by the maximum value of the other elements in the same row. The complete citation matrix will be provided on request.

For ease of analysis, we further computed the citation proportion matrix, in which each element is the proportion of the citing journal’s total citations that go to the cited journal.
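The two preprocessing steps described above can be sketched in a few lines of Python. This is only an illustration: the 3 × 3 counts are hypothetical, the function names are our own, and we assume that the self-citation ceiling is applied before the proportions are computed.

```python
def cap_self_citations(matrix):
    """Bound each diagonal entry by the largest off-diagonal entry in its row."""
    capped = [row[:] for row in matrix]
    for i, row in enumerate(matrix):
        max_other = max((v for j, v in enumerate(row) if j != i), default=0)
        capped[i][i] = min(row[i], max_other)
    return capped


def to_proportions(matrix):
    """Divide each row by its total: shares of the citing journal's citations."""
    result = []
    for row in matrix:
        total = sum(row)
        result.append([v / total if total else 0.0 for v in row])
    return result


# Hypothetical counts: rows are citing journals, columns are cited journals.
counts = [
    [120, 40, 10],
    [30, 20, 50],
    [5, 15, 8],
]
capped = cap_self_citations(counts)
proportions = to_proportions(capped)
```

In this toy input only the first journal exceeds its ceiling, so its 120 self-citations are reduced to 40 before its row is normalized.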

Identifying Relevant Citations

Besides the well-known MIS journals, our list of MIS-related journals, like other lists in the literature,²,⁵ includes a few multidisciplinary journals, such as Management Science (MS), Journal of the ACM (JACM), and Operations Research (OR). Studies have indicated that the inclusion of these journals can pose a problem.⁴ From the viewpoint of citation analysis, citations from multidisciplinary journals do not accurately reflect the influence of the cited journals on the MIS field. For this reason, a study of citation relevance should be integral to any citation analysis, although it has often been overlooked in the literature. Instead of excluding all multidisciplinary journals, as supported by some MIS scholars,⁴ we opt to remove only the citations from multidisciplinary journals. Ideally, only citations made by pure MIS papers should be counted for impact assessment, but they cannot be identified because paper classifications are unavailable. As a reasonable approximation, we considered only citations from pure MIS journals, a category that we determined by clustering journals with similar citation patterns in the following manner:

To characterize the citation pattern of each journal, we adopted a log-multiplicative model⁸ to provide the best fit for the citation data. Unlike the practice in the literature,⁶,⁸ we applied the model to the citation proportion matrix rather than the citation matrix. This is because some MIS journals, such as CACM, restrict the number of references in each paper, which puts them at a disadvantage if raw citation counts are used.

Using the free software lEM,¹² we obtained the best fit of the model to the citation proportion matrix, which includes five dimensions of association between citations sent and received by each journal. The resulting five-dimensional vector characterizes the citation pattern of each journal.

For every pair of journals, the distance between their feature vectors was used to measure their dissimilarity. We then applied Ward’s method,¹⁰ one of the most popular variants of agglomerative hierarchical clustering, to identify inherent clusters among the 65 MIS-related journals. Six major clusters were obtained, as shown later in Table 1. The cluster with fifteen journals, including MISQ, ISR, I&M, and EJIS, was considered to form the core set of pure MIS journals. Apart from the core set, the cluster for the computer science and engineering discipline is the largest, with 14 journals, including CACM, IEEE Transactions, and JACM.
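To make the clustering step concrete, here is a minimal pure-Python sketch of Ward's criterion: at each step it merges the two clusters whose union gives the smallest increase in the within-cluster sum of squares. The two-dimensional feature vectors are hypothetical stand-ins for the five-dimensional vectors produced by the log-multiplicative model, which is not reproduced here, and the function names are our own.

```python
def centroid(points):
    """Component-wise mean of a list of equal-length vectors."""
    dim = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dim)]


def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    ca, cb = centroid(a), centroid(b)
    sq_dist = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * sq_dist


def ward_clusters(vectors, k):
    """Merge greedily until k clusters remain; returns lists of row indices."""
    clusters = [[i] for i in range(len(vectors))]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                cost = ward_cost([vectors[n] for n in clusters[i]],
                                 [vectors[n] for n in clusters[j]])
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


# Hypothetical feature vectors: three journals near the origin, two far away.
features = [[0.0, 0.1], [0.1, 0.0], [0.2, 0.1], [5.0, 5.1], [5.1, 4.9]]
groups = ward_clusters(features, 2)
```

This naive version recomputes centroids at every step, which is adequate for 65 journals; a library implementation of hierarchical clustering would be the usual choice for larger data.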

Assessing Journal Impact by Revised PageRank Method

Given the core set of 15 pure MIS journals, we considered only citations from the core as relevant citations for our analysis. To further differentiate the importance of citations, we had to understand the relationship between citation importance and journal impact. Intuitively, a citation from an influential journal should be considered more important than one from an unremarkable journal, while a journal that receives more important citations should be considered more influential. Based on this idea, an invariant method was developed in the 1970s to evaluate the impact of physics journals.⁹ More recently, Google has adopted it very successfully, under the name PageRank, to rank the impact of web pages.⁷

The PageRank method essentially solves a set of linear equations in which the impact of each journal is a variable that takes a positive value. These linear equations express the intuitive relationship, explained above, between citation importance and journal impact in a simplified manner: the impact of a journal A equals the sum, over every journal B, of the impact of B multiplied by the proportion of B’s citations that go to A. In other words, the PageRank method uses the impact of the citing journal to indicate the importance of a citation.

It turns out that the impact of journals, defined by the PageRank method above, is exactly the eigenvector of the proportional citation matrix with a unit eigenvalue.¹ However, such an eigenvector may not exist if there is a pair of journals for which one cannot reach the other by following references.⁷ Particularly for our study, ignoring irrelevant citations prevents every journal outside the core from reaching any journal inside the core.
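Written out, the relationship described above takes the following form. The symbols are our own notation rather than the article's, and the normalization to a sum of one is the convention used in the worked example later in this section.

```latex
% x_A    : impact of journal A (unknown, positive)
% p_{BA} : proportion of journal B's citations that go to journal A
x_A = \sum_{B} p_{BA}\, x_B \quad \text{for every journal } A,
\qquad \sum_{B} x_B = 1 .

% In matrix form, with P = (p_{BA}) (rows: citing, columns: cited)
% and x the column vector of impacts:
P^{\mathsf{T}} x = x
```

That is, x is an eigenvector of the transposed proportion matrix with eigenvalue 1.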

For this reason, we need to extend the PageRank method as follows to fit a citation matrix with a core. Since the pure MIS journals in the core can reach each other by following their references, we first applied the standard PageRank method to their proportional citation matrix to obtain the impact of every journal in the core. For a journal A outside the core, its impact was then defined as the sum, over every core journal B, of the impact of B multiplied by the ratio between the number of citations from B to A and the total number of citations from B to the core.
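In symbols, the definition for a journal outside the core can be written as below. The notation is ours, and the restriction of the sum to core journals reflects our reading that only citations from the core count as relevant.

```latex
% c_{BA} : number of citations from journal B to journal A
% K      : core set of pure MIS journals
% x_B    : impact of core journal B, obtained from standard PageRank on the core
x_A = \sum_{B \in K} \frac{c_{BA}}{\sum_{C \in K} c_{BC}}\; x_B
\qquad \text{for every journal } A \notin K .
```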

We call this extension of PageRank the revised PageRank, and the resulting measure of journal impact the Revised PageRank Score, or RPRS for short. The revised PageRank keeps the linear relationship between citation importance and journal impact, but normalizes each citation number by the total number of citations from the citing journal to the core, rather than to all journals. It is therefore still able to distinguish citation importance when assessing journal impact. The new method guarantees a unique feasible valuation of the impact of all MIS-related journals, and is well defined for every citation matrix structured around a core journal set.

For a better understanding of how to calculate RPRS, let us consider a simplified example, shown in Figure 1, in which MISQ, ISR, and I&M represent three pure MIS journals in the core set, CACM serves as a multidisciplinary journal outside the core, and self-citations are ignored for simplicity. Figures along the arrows indicate the frequencies of citations from the core. Of the 369 and the 993 citations from ISR and I&M to the core set, 89.16% and 82.78%, respectively, go to MISQ. This implies a linear equation for MISQ: “RPRS of MISQ = 89.16% of RPRS of ISR + 82.78% of RPRS of I&M.” Similarly, we can write down the other two linear equations by considering citations to ISR and to I&M. Assuming that the RPRS’s of MISQ, ISR, and I&M sum to one, this system of linear equations has a unique positive solution: 0.467 for MISQ, 0.401 for ISR, and 0.132 for I&M. Based on that, the RPRS of CACM is (176/306)×0.467 + (155/369)×0.401 + (428/993)×0.132 = 0.494, where each ratio in parentheses is the ratio between the citations to CACM and the citations to the core from the citing pure MIS journal.
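The example can be checked numerically with a short Python sketch. It uses power iteration, which is one common way to find the eigenvector and not necessarily how the authors computed it. The counts for ISR and I&M follow from the percentages above, since with self-citations ignored the remainder goes to the third journal. The split of MISQ's 306 citations into 248 to ISR and 58 to I&M is an assumption, taken from the MISQ counts quoted later in the article.

```python
CORE = ["MISQ", "ISR", "I&M"]

# Citation counts inside the core: citing journal -> cited journal.
core_counts = {
    "MISQ": {"ISR": 248, "I&M": 58},   # assumed split of MISQ's 306 citations
    "ISR": {"MISQ": 329, "I&M": 40},   # 89.16% of 369, remainder to I&M
    "I&M": {"MISQ": 822, "ISR": 171},  # 82.78% of 993, remainder to ISR
}
# Citations from each core journal to CACM, as given in the example.
to_cacm = {"MISQ": 176, "ISR": 155, "I&M": 428}


def core_scores(counts, journals, iterations=200):
    """Standard PageRank on the core by power iteration, normalized to sum to one."""
    totals = {b: sum(counts[b].values()) for b in journals}
    scores = {j: 1.0 / len(journals) for j in journals}
    for _ in range(iterations):
        updated = {
            a: sum(scores[b] * counts[b].get(a, 0) / totals[b] for b in journals)
            for a in journals
        }
        norm = sum(updated.values())
        scores = {j: v / norm for j, v in updated.items()}
    return scores, totals


def outside_score(cites_to_outsider, scores, totals):
    """RPRS of a non-core journal: core impacts weighted by citation ratios."""
    return sum(scores[b] * cites_to_outsider[b] / totals[b] for b in scores)


scores, totals = core_scores(core_counts, CORE)
cacm = outside_score(to_cacm, scores, totals)
print({j: round(s, 3) for j, s in scores.items()}, round(cacm, 3))
```

Under these assumptions the sketch reproduces the figures in the text to three decimals: 0.467, 0.401, and 0.132 for the core journals, and 0.494 for CACM.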

Applying the revised PageRank method to all journals in this study, we first obtained the RPRS’s of the pure MIS journals in the core set, as shown in the second column from the right of Table 1. Based on these, the RPRS’s of journals outside the core set were then calculated and are also reported in Table 1. The last column of Table 1 gives the rank of each journal in descending order of RPRS, which also determines the order in which journals are listed within each cluster.

According to the RPRS’s shown in Table 1, we are able to identify which journals have high impact on the MIS field, as well as how influential they are. Among the 15 pure MIS journals, the seven most authoritative journals, namely MISQ, ISR, JMIS, I&M, EJIS, JSIS, and IJEC, obtained more than 92% of the total RPRS of the core set. The top two, MISQ and ISR, obtained about 57% of the total RPRS of the core. It is worth noting that these journals with high RPRS’s are also well known for their good reputations in the MIS field, according to recent surveys of MIS scholars.⁵

In addition to pure MIS journals, some journals in other fields also appear to have high impact on the MIS field, as indicated by their large RPRS’s. For example, MS, CACM, and OS have the highest RPRS among journals in Operations Research, Computer Science and Engineering, and Management, respectively. The RPRS’s of these three journals are even higher than those of most pure MIS journals. Together with MISQ and ISR, they make up the top five of all 65 MIS-related journals.

Another interesting finding is that among journals with the top 20 highest RPRS’s, there are eight journals from the Management field, even one more than those from pure MIS journals. This reveals a strong impact of Management journals on the MIS field, which is also consistent with findings in the literature.³

Comparisons with other Citation Indices

As shown above, the impact of a journal can be assessed by its RPRS, which, however, is likely to be affected by the number of papers the journal publishes. To eliminate this journal size effect, we calculated the RPRS per paper (RPRS/P) by dividing a journal’s RPRS by its size, where the size of a journal was approximated by the number of papers it published between 2001 and 2005. To examine the effectiveness of RPRS and RPRS/P, another four citation indices were calculated for comparison: the total citations, and the citations per paper, that a journal received from the core set of pure MIS journals and from all journals. We therefore obtained six citation indices in total for every journal.

Table 2 summarizes the rank (rather than the value) of each of the six citation indices above, for the pure MIS journals and for journals whose RPRS’s are among the 20 highest. According to Table 2, the results of journal impact assessment differ greatly depending on whether citation relevance is taken into account. For example, ISR has an RPRS among the top 3, but is ranked only 16th in terms of its total citations received. Another convincing example comes from EJOR and OR. Neither has an RPRS among the 20 highest, but their total citations received are high, ranked 6th and 8th of all. However, of all the citations received by EJOR or OR, less than 1% come from pure MIS journals. Although EJOR and OR have good reputations in the Operations Research community, neither was ranked near the top in a recent survey of MIS scholars.⁵ Thus it is more convincing to differentiate citation relevance when assessing journal impact, as we did in calculating RPRS.

According to the ranks for RPRS, MISQ and ISR stand at the top. This may not be surprising for MISQ, since it is also ranked first in total citations received. However, a rank among the top 3 RPRS’s is significant for ISR, because its total citations received are ranked only 6th. To understand whether such a change of rank is reasonable, we compared citations to ISR with those to another journal, I&M. Although I&M receives 50% more citations than ISR, it receives only 58 citations from MISQ, far fewer than the 248 that ISR receives. Because citations from MISQ carry greater importance, which is quite natural, the RPRS of I&M is ranked only 9th, much lower than that of ISR. This relative ranking of ISR and I&M is also consistent with the recent survey of MIS scholars.⁵ We therefore believe that RPRS is likely to provide a more reasonable assessment of journal impact than other citation indices that ignore citation importance.

Comparing RPRS in Table 2 with the frequency of total citations from the core also demonstrates the different effects of self-citations. For example, JCIS is ranked 17th by its frequency of total citations from the core, but only 49th by its RPRS. As we observed, 99% of the citations JCIS received are self-citations, and it has been cited only once by MISQ and ISR. This shows that the revised PageRank method has reduced the importance of self-citations for such journals, and has therefore produced a more reasonable assessment of their impact.

Table 2 also shows some differences between the ranks for RPRS and RPRS/P. For example, EJIS and JSIS, two pure MIS journals that publish fewer than 25 articles annually, have higher ranks (10th and 8th) for RPRS/P than for RPRS (15th and 19th). This implies that a paper published in EJIS or JSIS is likely to have high impact in the MIS field, despite the small size of these journals. By contrast, some multidisciplinary journals, such as MS, CACM, and HBR, have higher ranks (2nd, 4th, and 10th) for RPRS than for RPRS/P (9th, 16th, and 19th). These three journals are large, each publishing more than 100 articles per year. Since quite a few of the papers they publish are not related to the MIS field, it would be ideal to count only MIS-related articles when calculating RPRS/P. However, identifying MIS-related articles is very difficult, due to the limited amount of available data.

Roles of Journals: Source, Hub, and Storer

Journals with high RPRS should be considered as more influential knowledge sources in the MIS field. As highlighted in Table 2, among the top 20 journals with the highest RPRS’s, only seven are pure MIS journals, while 13, the majority, are not, with eight from Management, two from Operations Research, two from Computing, and one from AI. Therefore the pure MIS journals together with Management journals form the two major knowledge sources for the MIS field.

It is interesting to see that journals with high RPRS’s differ significantly in the proportions of citations they send to the pure MIS journals. As shown in Figure 2, pure MIS journals in the core set send relatively high proportions (above 30%) of their citations to pure MIS journals. Among journals that are not pure MIS journals, only CACM (28.7%), DS (15.1%), and DSS (22.6%) cite journals in the core set frequently, while others, such as AMR (0.3%) and AMR (0.2%), scarcely cite the pure MIS journals.

The observation above implies that CACM, DS, and DSS act as hubs that exchange knowledge between MIS and other disciplines. Journals such as AMR serve only as a knowledge source for the MIS field. It is worth noting that HBR’s reason for rarely citing the pure MIS journals is likely to differ from that of the others, because most papers published in HBR have no references. Lastly, pure MIS journals with small RPRS, such as ISJ, ISM, JOCEC, IJIM, JCIS, ISF, and WIRT, still send large proportions (above 30%) of their citations to the other pure MIS journals, implying that they serve only as knowledge storers for the MIS field.

Conclusion

We have proposed a new method to differentiate citation quality for assessing journal impact. To apply the method to the MIS discipline, we first identified a core set of pure MIS journals by clustering 65 MIS-related journals according to their citation patterns. Only citations from pure MIS journals are considered relevant, and their importance is differentiated by an extension of the standard PageRank method, the revised PageRank, to assess the impact of each journal in the MIS discipline. Based on empirical results, we have demonstrated the effectiveness of the new method, and have also revealed the different roles that journals play in terms of their impact in the MIS discipline.

It must be noted that the results are specific to the selection of journals. For some notable omissions, such as DataBase, whose citation records are not available in the ISI Web of Science, we are currently looking for another data source. The method we proposed relies on citations among journals. However, since the MIS field has some multidisciplinary journals, it would be ideal to analyze the citations among articles and to identify the field to which each article belongs. This may require tremendous time and resources.

Besides assessing journal impact within a fixed time period, our method can also easily be applied to citation data for varying time periods to capture the change of journal impact, as long as this data is accessible. Although results based on the current method may be far from judging the journal impact accurately, we believe that the insights behind the method as well as its findings can serve as a valuable base for further study. Finally, to facilitate the use of our method in assessing journals of other disciplines, our team collected citation data for 7000+ journals and established an on-line journal ranking system at http://journal-ranking.com/ranking/web/index.html, whose engine for assessing journal impact has been implemented using the revised PageRank method.

Figures

Figure 1.

Figure 2.

Tables

Table 1.

Table 2.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

DOI

10.1145/1536616.1536645

August 2009 Issue

Published: August 1, 2009

Vol. 52 No. 8

Pages: 111-116
