
g-index

The g-index is a bibliometric measure designed to quantify the productivity and citation impact of a researcher's body of work, proposed by Leo Egghe in 2006 as an extension of the h-index. It is defined as the largest integer g such that, with publications ranked in descending order of citations, the top g publications have together received at least g^2 citations. Because it is based on the total citations of an author's most-cited papers, it yields a single number that reflects both the quantity of output and its citation impact. Unlike the h-index, which requires each of the h highest-cited papers to individually have at least h citations, the g-index aggregates citations across the top g papers, thereby assigning greater weight to exceptionally highly cited works. It is never lower than the h-index for the same publication set and is often higher.

Egghe introduced the g-index in his paper "Theory and practise of the g-index," published in Scientometrics, to address limitations of the h-index by better capturing the "global citation performance" of researchers with skewed citation distributions. The index is particularly sensitive to a few standout publications, which makes it useful for distinguishing differences in impact among highly cited scholars.

In practice, the g-index is computed from citation database records: publications are sorted by citation count and cumulative totals are checked against the g^2 threshold. It has been used in academic evaluation frameworks, tenure reviews, and ranking systems as a complement to the h-index, though critics note that its values are typically larger than h-index values, which can complicate direct comparisons across disciplines. Despite these concerns, the g-index remains in use for assessing long-term scholarly influence, and refinements and variants have been explored in subsequent research.

Overview

Definition

The g-index is a bibliometric indicator that measures the citation impact of a researcher's body of work by identifying the largest number g such that the top g most-cited publications collectively account for at least g^2 citations. Publications are ranked in decreasing order of their citation counts, so the index reflects the overall citation strength of the most highly cited works. Here, g serves a dual role: it is both the number of top publications considered and, squared, the minimum total number of citations those publications must have received. Formally, given a researcher's publications ordered by decreasing citation counts c_1 \geq c_2 \geq \dots \geq c_n, the g-index is the maximum integer g satisfying \sum_{i=1}^g c_i \geq g^2. This formulation extends the h-index, which it always meets or exceeds (g \geq h), by incorporating the full citation strength of the top papers rather than a uniform per-paper threshold.

Purpose and motivation

The g-index was developed by Leo Egghe as an enhancement to the h-index, aiming to provide a more comprehensive measure of a researcher's scientific output by incorporating the overall citation performance of their most cited works. Introduced in 2006, it addresses the need for an index that better reflects the global impact of an author's publications, particularly in fields where citation counts vary widely.

A key shortcoming of the h-index is its equal treatment of all papers within the h-core, regardless of significant disparities in their citation counts, which can undervalue the influence of a few standout, highly cited publications. The g-index mitigates this by emphasizing the total citations received by the top g articles, thereby giving greater weight to outliers and providing a fuller picture of an author's impact and scholarly visibility. As a result, researchers with skewed citation distributions, which are common in many disciplines, are not penalized for having a concentration of high-impact works alongside more modestly cited ones.

Conceptually, the g-index seeks a better balance between the quantity of publications (productivity) and their impact (citations), offering a tool for scientometric evaluation that separates scientists whose h-indices are similar but whose top papers differ in citation strength. By inheriting the strengths of the h-index while extending its scope to account for citation totals in the leading set of papers, it supports a more nuanced assessment of research performance across academic contexts.

History and development

Introduction by Leo Egghe

Leo Egghe, a Belgian scientometrician affiliated with Hasselt University, has made significant contributions to the field of informetrics through his extensive research on citation analysis and scholarly impact measures. His work often focuses on mathematical models for evaluating scientific productivity and influence. In 2006, Egghe introduced the g-index as a refinement to existing author-level metrics, particularly in response to the rapid adoption of the h-index proposed by Jorge E. Hirsch in 2005. This development occurred amid increasing interest in quantitative tools for assessing researchers' citation performance beyond traditional metrics like total citations. The g-index was formally presented in Egghe's paper "Theory and practise of the g-index," published in Scientometrics (volume 69, issue 1, pages 131–152), where he outlined its theoretical foundations and practical applications for measuring the global citation performance of a researcher's body of work.

Following its introduction, the g-index saw early extensions aimed at addressing the limitations of an integer-valued index. In 2009, Raf Guns and Ronald Rousseau reviewed and formalized real and rational variants of the g-index, extending prior work on similar adaptations for the h-index. The real g-index (g_r) generalizes the metric to continuous values by interpolating based on the citation threshold, while the rational g-index (g_rat) achieves non-integer results through a fractional adjustment tied to the citations of the (g+1)th paper. These variants allow values between integers, reducing abrupt changes in scores for minor citation shifts.

Related indices emerged around the same time, building on the g-index's emphasis on highly cited works. The hg-index, proposed by Sergio Alonso, Francisco José Cabrerizo, Enrique Herrera-Viedma, and Francisco Herrera in 2009 (with a preprint in 2008), combines the h-index and g-index via the geometric mean √(h × g) to balance broad productivity with citation impact from top papers. Similarly, the contemporary h-index (hc), introduced by Antonis Sidiropoulos, Dimitrios Katsaros, and Yannis Manolopoulos in 2007, incorporates temporal weighting to favor recent citations.

The g-index gained adoption in the late 2000s as bibliometric tools proliferated, reflecting its utility in complementing the h-index. By 2007, it was integrated into the initial release of Publish or Perish, a widely used program by Anne-Wil Harzing for retrieving and analyzing citations, enabling easy computation alongside other metrics. This incorporation facilitated broader application in academic evaluations during the period.

Criticisms of the g-index sparked debate about its responsiveness for prolific researchers, particularly its saturation effect: the metric is capped at the total number of publications (g ≤ P), so once g equals P, additional citations to existing papers cannot raise it without new publications. This limitation, noted in analyses of citation rank distributions, prompted adjusted formulas such as the real and rational variants to mitigate discreteness and related stagnation issues, alongside entirely new indices intended to better capture ongoing impact.

Calculation

Step-by-step method

To compute the g-index for an author or researcher, begin by compiling a complete list of their publications along with the number of citations each has received. This dataset, typically drawn from a bibliometric database, forms the basis for the calculation. Next, sort the publications in descending order of citation counts, denoted c_1 \geq c_2 \geq \dots \geq c_n, where n is the total number of publications. Then, for each possible value of g from 1 to n, calculate the cumulative sum of citations for the top g publications:
S_g = \sum_{i=1}^{g} c_i.
This step accumulates the citations of the leading publications incrementally.
Finally, identify the largest integer g such that S_g \geq g^2. This threshold condition captures the point where the collective citations of the top g papers meet or exceed the square of g, defining the g-index value. For edge cases, if an author has zero publications, or if no publication has been cited, the g-index is 0. Uncited publications (where c_i = 0) sort to the end of the list and add nothing to the cumulative sum, but they still count toward the number of papers, so they can permit a larger g when the top papers have citations to spare; for example, citation counts of 9, 0, 0 give g = 3, whereas a single paper with 9 citations gives g = 1. Computationally, the process can be implemented iteratively: initialize the cumulative sum at 0 and increment g from 1, updating S_g by adding the next c_g at each step, while checking the condition S_g \geq g^2. The search can stop at the first g where the inequality fails: each step adds c_g to the left-hand side, which never increases with g, and 2g - 1 to the right-hand side, which always does, so once S_g - g^2 becomes negative it stays negative. This avoids checking every g up to n, which helps for large publication lists.
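The iterative procedure described above can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the function name `g_index` and the plain list-of-integers input are assumptions made here, and the sketch follows this article's convention that g cannot exceed the number of publications.

```python
from typing import Iterable


def g_index(citations: Iterable[int]) -> int:
    """Largest g such that the top g papers have at least g^2 citations in total.

    Hypothetical helper for illustration; g is capped at the number of papers.
    """
    ranked = sorted(citations, reverse=True)  # c_1 >= c_2 >= ... >= c_n
    cumulative = 0  # running value of S_g
    g = 0
    for rank, count in enumerate(ranked, start=1):
        cumulative += count
        if cumulative < rank * rank:
            break  # once S_g >= g^2 fails, it fails for every larger g
        g = rank
    return g


print(g_index([10, 8, 5, 4, 3]))  # 5, because 30 >= 25
print(g_index([9, 0, 0]))         # 3, uncited papers still count toward g
print(g_index([]))                # 0 for an author with no publications
```

Sorting dominates the cost, so the sketch runs in O(n log n) time for n publications.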

Illustrative example

Consider a hypothetical researcher with 10 publications, receiving the following citation counts in descending order: 25, 12, 10, 8, 7, 5, 4, 3, 2, 1. To compute the g-index, calculate the cumulative sum S_g of the top g most-cited publications and find the largest g such that S_g \geq g^2. For g=1, S_1 = 25 \geq 1; for g=2, S_2 = 37 \geq 4; for g=3, S_3 = 47 \geq 9; for g=4, S_4 = 55 \geq 16; for g=5, S_5 = 62 \geq 25; for g=6, S_6 = 67 \geq 36; for g=7, S_7 = 71 \geq 49; for g=8, S_8 = 74 \geq 64; but for g=9, S_9 = 76 < 81. Thus, the g-index is 8. For comparison, the h-index of the same record is 5, because the fifth paper has 7 citations but the sixth has only 5. The result indicates that the researcher's top 8 publications collectively account for at least 64 citations, with the surplus citations of the most-cited papers carrying the index beyond the h-index. The following table summarizes the calculation:

| g | Cumulative citations S_g | g^2 | Threshold met? |
|---|---|---|---|
| 1 | 25 | 1 | Yes |
| 2 | 37 | 4 | Yes |
| 3 | 47 | 9 | Yes |
| 4 | 55 | 16 | Yes |
| 5 | 62 | 25 | Yes |
| 6 | 67 | 36 | Yes |
| 7 | 71 | 49 | Yes |
| 8 | 74 | 64 | Yes |
| 9 | 76 | 81 | No |
| 10 | 77 | 100 | No |

Properties and characteristics

Mathematical properties

The g-index is defined as the largest integer g such that the total number of citations received by the top g most-cited publications is at least g^2. For any finite set of publications with non-negative citation counts, this largest g exists and is unique, since it is the maximum of a finite set of candidate values (taken to be 0 when no g \geq 1 qualifies). The cumulative citation sum is a non-decreasing function of rank over the ordered publication list.

The g-index is monotone: it does not decrease when citations to any publication increase or when additional publications are added to the set. It also satisfies aggregate monotonicity, whereby transferring citations from lower-cited to higher-cited publications (or adding them to the top-cited ones) does not decrease the index. It thus shares the basic monotonicity of the h-index while responding more sensitively to increases in highly cited works.

In relation to the h-index, the g-index is always at least as large for the same set of publications (g \geq h), since the top h publications each receive at least h citations, yielding a total of at least h^2 citations and thus satisfying the g-index condition for g = h. The g-index is also bounded by the total number of citations C received by all publications: g \leq \sqrt{C}, because g^2 \leq S_g \leq C. Equality is approached in highly skewed citation distributions, where the top g publications capture most of C, leading to the approximation g \approx \sqrt{C}.

Regarding saturation, the g-index is bounded above by the total number of publications n (g \leq n), and if C < n^2, then g < n. Unlike the h-index, it responds to citation outliers among the top publications, so it can increase substantially even when the publication count is fixed. Once g reaches n, however, further citations cannot raise it.
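The bounds stated above follow in one line each from the definition. The derivation below is a sketch using only the notation already introduced (c_i for ranked citation counts, S_k for cumulative sums, C for total citations, n for the number of publications); it assumes the convention used in this article that g cannot exceed n.

```latex
% Notation: c_1 \geq c_2 \geq \dots \geq c_n, \quad S_k = \sum_{i=1}^{k} c_i, \quad C = S_n.
\begin{align*}
  S_h = \sum_{i=1}^{h} c_i \;\geq\; \sum_{i=1}^{h} h = h^2
      &\quad\Longrightarrow\quad g \geq h, \\
  g^2 \;\leq\; S_g \;\leq\; S_n = C
      &\quad\Longrightarrow\quad g \leq \sqrt{C}, \\
  g = n
      &\quad\Longleftrightarrow\quad C = S_n \geq n^2 .
\end{align*}
```

The first line uses the fact that each of the top h papers has at least h citations; the second uses the fact that citation counts are non-negative, so partial sums never exceed the total.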

Advantages over similar metrics

The g-index addresses the skewness inherent in citation distributions more effectively than the h-index by assigning higher scores to researchers whose top publications receive disproportionately high citations, thereby rewarding "blockbuster" papers that may dominate an author's impact. Unlike the h-index, which sets a uniform threshold across its core publications and disregards excess citations beyond that level, the g-index incorporates the total citations from the top g papers, provided they meet or exceed g², allowing it to better capture the contribution of a few very highly cited works.

This design makes the g-index more sensitive to high-impact contributions: it raises the visibility of researchers with a few exceptionally cited articles while still accounting for overall productivity, and it never scores a researcher with a more balanced record below their h-index. By emphasizing the cumulative citations of the leading publications, it provides a more nuanced measure of influence in fields where citation patterns are highly variable, in contrast to the h-index, which is insensitive to citations in excess of the threshold.

The g-index is as simple to compute as the h-index, requiring only a list of an author's publications sorted by descending citation count to identify the largest g for which the sum of the top g citation counts is at least g².

Empirical analyses support these strengths. A 2008 study of researchers at Spain's CSIC found the g-index more sensitive than the h-index for "selective" scientists, those with fewer but highly cited works, yielding higher g/h ratios and improved differentiation in rankings. Similarly, an examination of 26 physicists found that the g-index discriminated better between citation patterns and correlated more strongly with measures of core citation intensity than the h-index did.

Comparisons and variants

Relation to h-index

The g-index and h-index share a foundational approach in evaluating scholarly impact: both rank an author's publications in decreasing order of citations received and apply a threshold to identify a productive core of work. The h-index, introduced by Jorge E. Hirsch in 2005, defines h as the largest number such that the author has h publications each with at least h citations. In contrast, the g-index, proposed by Leo Egghe in 2006, uses a cumulative threshold: g is the largest number such that the top g publications collectively have at least g² citations. This shared ranking mechanism allows both indices to balance quantity and impact, but the g-index's cumulative criterion gives highly cited works more influence.

A key mathematical relation between the two indices is the inequality g ≥ h for any set of publications. It follows directly from the definitions: the top h publications each have at least h citations, so their total meets or exceeds h², satisfying the g-index criterion for g = h. Equality holds, for instance, when the top h publications each receive exactly h citations and the remaining publications receive fewer; more generally, g = h whenever there is no (h+1)th publication or the top h+1 publications together have fewer than (h+1)² citations. Egghe established the inequality to demonstrate the g-index's consistency with the h-index while extending its sensitivity.

In cases of uneven citation distributions, which are common in many fields, the g-index can significantly exceed the h-index, highlighting disparities that the h-index overlooks. For example, consider a researcher with one publication receiving 100 citations and nine others receiving only 1–2 each. The h-index remains low (h = 1 or 2) because few papers meet a uniform threshold, whereas the g-index reaches 10, because the highly cited paper alone supplies the 100 citations needed to satisfy the squared requirement for g = 10. Such skewed scenarios underscore the g-index's design goal of capturing the "global citation performance" contributed by standout papers.

The development of the g-index was directly prompted by the h-index's rapid adoption, with Egghe positioning it as a refinement to address the h-index's relative insensitivity to highly skewed citation patterns. This reflects ongoing efforts to adapt metrics to the reality that a few influential works often dominate an author's record.
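The relation between the two indices can be checked on small invented records. The sketch below is illustrative only: the function names `h_index` and `g_index` and the citation lists are assumptions made here, and g is capped at the number of publications as elsewhere in this article.

```python
from itertools import accumulate


def h_index(citations):
    """Largest h such that h papers each have at least h citations."""
    ranked = sorted(citations, reverse=True)
    # In a non-increasing list, the ranks r with c_r >= r form a prefix.
    return sum(1 for rank, count in enumerate(ranked, start=1) if count >= rank)


def g_index(citations):
    """Largest g such that the top g papers total at least g^2 citations."""
    ranked = sorted(citations, reverse=True)
    totals = accumulate(ranked)  # S_1, S_2, ..., S_n
    return max(
        (rank for rank, total in enumerate(totals, start=1) if total >= rank * rank),
        default=0,
    )


skewed = [100] + [1] * 9   # one standout paper, nine barely cited ones
uniform = [3, 3, 3]        # every paper cited exactly h times

print(h_index(skewed), g_index(skewed))    # 1 10
print(h_index(uniform), g_index(uniform))  # 3 3
```

The skewed record shows how far g can exceed h, while the uniform record is a case where the two coincide.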

Differences from other citation indices

The g-index differs from the i10-index, a metric employed by Google Scholar that simply counts the number of publications by an author receiving at least 10 citations each, by providing a more nuanced assessment that incorporates the total citation volume among the author's most cited works rather than applying a fixed threshold. While the i10-index emphasizes breadth through a uniform cutoff, the g-index rewards both productivity and the disproportionate impact of highly cited articles, resulting in values that better reflect citation distributions without an arbitrary cutoff.

In contrast to the AR-index, which adjusts for the recency of citations by applying age-weighting to the h-core publications (it is defined as the square root of the sum of these age-weighted citations), the g-index is a static measure that disregards temporal factors in citation accumulation. This makes the g-index less adaptive for fields with rapid publication cycles, where recent works may not yet have gathered many citations, whereas the AR-index mitigates such biases by prioritizing contemporary impact.

Unlike total citation counts, which aggregate all citations across an author's oeuvre and can be inflated by prolific output including many low-impact papers, the g-index focuses on the collective citations of the top g publications and compares them with g², thereby reducing favoritism toward high-volume but low-impact contributors. For instance, an author with numerous modestly cited articles might accumulate a high total citation count, but the g-index stays low if the cumulative citations fail to meet the g² threshold for larger g.

The hg-index, which is the geometric mean of the h-index and g-index, √(h × g), balances productivity and citation intensity and addresses some limitations of the standalone g-index by bringing in the h-index's emphasis on consistent output. Thus, while the g-index offers simplicity and greater sensitivity to outliers, it may undervalue balanced careers compared with the hg-index.
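To make the contrast concrete, the sketch below computes several of the indices discussed above for one invented citation record. All function names and the data are assumptions for illustration; the AR-index is omitted because it also needs publication ages, which this toy record does not include.

```python
import math
from itertools import accumulate


def h_index(citations):
    """Largest h such that h papers each have at least h citations."""
    ranked = sorted(citations, reverse=True)
    return sum(1 for rank, count in enumerate(ranked, start=1) if count >= rank)


def g_index(citations):
    """Largest g (at most the number of papers) with top-g total >= g^2."""
    totals = accumulate(sorted(citations, reverse=True))
    return max(
        (rank for rank, total in enumerate(totals, start=1) if total >= rank * rank),
        default=0,
    )


def i10_index(citations):
    """Number of papers with at least 10 citations."""
    return sum(1 for count in citations if count >= 10)


def hg_index(citations):
    """Geometric mean of the h-index and the g-index."""
    return math.sqrt(h_index(citations) * g_index(citations))


record = [30, 15, 12, 10, 6, 4, 2, 1, 0]  # invented citation counts

print("total citations:", sum(record))      # 80
print("i10-index:", i10_index(record))      # 4
print("h-index:", h_index(record))          # 5
print("g-index:", g_index(record))          # 8
print("hg-index:", round(hg_index(record), 2))  # 6.32
```

By construction the hg-index always lies between h and g, which is why it is described as the more balanced of the three.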

Applications and usage

In academic evaluation

The g-index is employed in various academic evaluation processes to assess researchers' productivity and impact, particularly in tenure reviews, where it helps quantify the contribution of highly cited works alongside publication counts. For instance, it serves as a supplementary metric in promotion and tenure dossiers, providing a more nuanced view of citation distribution than simpler counts. In funding allocation decisions, panels may reference the g-index to prioritize applicants whose top publications have strong citation profiles, as seen in evaluations within medical and scientific fields. It also contributes to institutional ranking systems that draw on data from major citation databases, where derivative tools compute the index to rank departments or individuals.

Since around 2010, the g-index has gained traction in faculty evaluations across Europe and Asia, often as part of multi-metric panels that include the h-index to balance breadth and depth of impact. In European contexts, such as university assessments, it has been applied to rank research performance by emphasizing cumulative citations from leading articles. In Asia, particularly in China, it features in evaluations of academic monographs in the humanities and social sciences, with data drawn from citation databases, aiding the appraisal of scholarly influence amid a rising emphasis on quantitative indicators. This adoption complements traditional peer review by offering a quantitative layer, though the g-index is typically only one of several metrics considered.

The g-index is particularly effective in the natural sciences, where citation rates are high and prolific output is common, allowing it to highlight impactful contributions in fields such as physics. In contrast, its utility diminishes in humanities disciplines, which are characterized by lower overall citation rates and fewer highly cited outliers, making the index less discriminating for comparative purposes. Tools such as Harzing's Publish or Perish software facilitate its computation from citation data, often alongside other indices, to build comprehensive profiles for evaluations.

Limitations and criticisms

The g-index, by design, assigns greater weight to highly cited publications through its requirement that the top g articles collectively receive at least g² citations. This can lead to an overemphasis on a few papers with exceptional impact while undervaluing researchers who produce a steady stream of moderately cited work. The sensitivity to citation extremes favors "selective" scientists with uneven portfolios over those with more balanced outputs, potentially distorting evaluations of overall research performance.

Like other citation-based metrics, the g-index exhibits strong field dependence, yielding lower values in disciplines with inherently modest citation rates, such as the social sciences, where even high-quality work may not accumulate citations rapidly enough to achieve a meaningful g. This disparity arises because the index relies on absolute citation counts without normalization for disciplinary norms, which disadvantages scholars in low-citation fields regardless of the merit of their contributions.

The g-index lacks built-in adjustments for confounding factors such as self-citations, co-authorship contributions, or variations in document types (e.g., journal articles versus books), which can inflate scores and compromise fairness. Studies have demonstrated that self-citations significantly inflate the g-index, with corrections often reducing its value, highlighting the risk of artificial elevation through strategic citing practices.

Furthermore, the g-index's static formulation does not account for career length or the recency of publications, resulting in a systematic bias against early-career researchers, who have had less time to amass citations even if their work is promising. This temporal insensitivity favors established academics and hinders equitable comparisons across career stages.
