
Naranjo algorithm

The Naranjo algorithm, also known as the Naranjo Adverse Drug Reaction Probability Scale, is a clinical assessment tool consisting of a 10-question questionnaire designed to evaluate the causal relationship between a suspected drug and an adverse event by assigning a probability score. Developed in 1981 by Claudio A. Naranjo and colleagues at the University of Toronto, the scale standardizes the determination of drug-induced adverse reactions through objective criteria, improving inter-rater reliability compared to unstructured clinical judgment alone.

The algorithm's purpose is to provide a systematic method for causality assessment in pharmacovigilance, particularly useful in clinical trials, post-marketing surveillance, and case reports of suspected adverse drug reactions (ADRs). It was validated in a study of 63 alleged ADRs, where it demonstrated high reliability (kappa values of 0.69–0.86 for inter-rater agreement and 0.64–0.95 for intra-rater consistency) and applicability across physicians and pharmacists. Originally created for general ADR evaluation, it has been widely adopted in various medical contexts despite not being tailored to specific conditions like drug-induced liver injury.

Each of the 10 questions addresses a key element of causality, such as previous reports of the reaction with the drug, temporal sequence, dechallenge (improvement upon withdrawal), rechallenge (recurrence upon re-administration), and alternative causes, with per-question scores ranging from -1 to +2. The total score, ranging from -4 to +13, categorizes the likelihood as definite (≥9), probable (5–8), possible (1–4), or doubtful (≤0). This scoring system facilitates consistent documentation and reporting of ADRs in pharmacovigilance and regulatory submissions.

While the algorithm remains one of the most commonly used tools for causality assessment due to its simplicity and brevity, it has limitations, including subjectivity in some questions, lack of specificity for certain organ systems, and lower performance in complex cases without rechallenge data. Specialized scales, such as the Roussel Uclaf Causality Assessment Method (RUCAM) for drug-induced liver injury, have been developed to address these gaps, but the Naranjo scale continues to influence global pharmacovigilance practices.

Background

Development

The Naranjo algorithm, formally known as the Adverse Drug Reaction Probability Scale, was originally developed and published in 1981 by Claudio A. Naranjo and a team of collaborators in the journal Clinical Pharmacology & Therapeutics. The paper, titled "A method for estimating the probability of adverse drug reactions," introduced a structured scoring system to quantify the likelihood of drug-induced adverse events. This development arose from the recognized need for a more objective and standardized approach to causality assessment in adverse drug reactions (ADRs), which previously relied on subjective clinical judgment and often produced inconsistent inter-rater agreement of only 38%–63%. By addressing these limitations, the algorithm aimed to enhance reliability in pharmacovigilance practices.

The project was a collaborative effort involving pharmacologists and clinicians affiliated with the Clinical Pharmacology Program at the Addiction Research Foundation Clinical Institute, as well as the Departments of Medicine and Pharmacology at the University of Toronto. Key contributors included Ursula Busto, Edward M. Sellers, Pedro Sandor, Isabel Ruiz, Eve A. Roberts, Eva Janecek, Carlos Domecq, and David J. Greenblatt, all working within this interdisciplinary environment. Initial validation of the scale was conducted retrospectively on 63 cases of suspected ADRs, where it achieved inter-rater agreement of 83%–92% and demonstrated strong within-rater reliability, confirming its utility for consistent causality evaluation.

Purpose

The Naranjo algorithm was developed to provide a quantitative probability score for determining whether a drug caused an observed adverse event, thereby offering a structured approach to causality assessment in clinical settings. The tool assigns a score based on key factors influencing causal relationships and categorizes the likelihood as definite, probable, possible, or doubtful, which helps clinicians move beyond purely qualitative evaluations. Introduced in 1981, it aimed to enhance the precision and reproducibility of ADR evaluations.

A primary rationale for the algorithm is to mitigate the inconsistencies inherent in subjective clinical judgments, where inter-rater agreement can be as low as 38% without a structured method. The structured questionnaire format reduces variability among healthcare professionals, with inter-rater agreement improving to 83–92%. This is particularly valuable in diverse healthcare environments, where it supports more uniform decision-making.

The algorithm is designed for broad applicability across all types of adverse drug reactions (ADRs), without restriction to specific drug classes or reaction severities, making it a versatile tool for general ADR monitoring. It supports evaluation across clinical contexts, from inpatient to outpatient settings. Furthermore, by fostering consistent causality determinations, the Naranjo algorithm promotes reliable reporting to pharmacovigilance databases, such as those maintained by the FDA and WHO, facilitating effective surveillance of drug safety and contributing to global efforts to identify and mitigate drug-related risks.

The Algorithm

Questionnaire

The Naranjo algorithm employs a structured questionnaire consisting of 10 specific questions designed to evaluate the likelihood of an adverse drug reaction (ADR) being caused by a suspected drug. Each question is answered with one of three options (Yes, No, or Do not know) and assigned a point value ranging from -1 to +2; the points collectively contribute to an overall causality assessment. This ternary response format helps minimize subjective bias by standardizing evaluations while allowing for uncertainty in cases with incomplete data.

The questions systematically probe key elements of causality, including temporal associations between drug administration and the event, the role of alternative causes, effects of withdrawal (dechallenge) and re-administration (rechallenge), dose-response relationships, and supporting objective evidence. For instance, questions on timing and rechallenge emphasize the importance of chronological links and reproducibility, while those addressing alternative causes and placebo responses help rule out confounding factors. Objective evidence, such as toxic drug levels or confirmatory tests, adds weight to the assessment when available. This approach supports a balanced evaluation grounded in established causality criteria.

Below is the complete list of the 10 questions, along with their response options and assigned scores, as originally formulated:
| Question | Yes | No | Do not know or not done |
|---|---|---|---|
| 1. Are there previous conclusive reports on this reaction? | +1 | 0 | 0 |
| 2. Did the adverse event appear after the suspected drug was administered? | +2 | -1 | 0 |
| 3. Did the adverse event improve when the drug was discontinued or a specific antagonist was administered? | +1 | 0 | 0 |
| 4. Did the adverse reaction reappear when the drug was readministered? | +2 | -1 | 0 |
| 5. Are there alternative causes (other than the drug) that could on their own have caused the reaction? | -1 | +2 | 0 |
| 6. Did the reaction reappear when a placebo was given? | -1 | +1 | 0 |
| 7. Was the drug detected in the blood (or other fluids) in concentrations known to be toxic? | +1 | 0 | 0 |
| 8. Was the reaction more severe when the dose was increased, or less severe when the dose was decreased? | +1 | 0 | 0 |
| 9. Did the patient have a similar reaction to the same or similar drugs in any previous exposure? | +1 | 0 | 0 |
| 10. Was the adverse event confirmed by any objective evidence? | +1 | 0 | 0 |
These questions were developed to provide a reproducible method for causality assessment, drawing from established criteria in ADR assessment.
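The scoring table above can be sketched as a small lookup structure. This is an illustrative sketch only: the names `POINTS`, `COLUMN`, and `naranjo_score` are assumptions introduced here and are not part of the published scale, and treating an unanswered question as "unknown" is a simplification.

```python
# Points per question as (yes, no, unknown), transcribed from the table above.
POINTS = {
    1: (1, 0, 0),
    2: (2, -1, 0),
    3: (1, 0, 0),
    4: (2, -1, 0),
    5: (-1, 2, 0),
    6: (-1, 1, 0),
    7: (1, 0, 0),
    8: (1, 0, 0),
    9: (1, 0, 0),
    10: (1, 0, 0),
}

# Column index for each accepted answer (hypothetical answer labels).
COLUMN = {"yes": 0, "no": 1, "unknown": 2}


def naranjo_score(answers):
    """Sum the points for a dict mapping question number -> answer label.

    Questions missing from `answers` are treated as "unknown" (0 points).
    """
    total = 0
    for question, points in POINTS.items():
        answer = answers.get(question, "unknown")
        total += points[COLUMN[answer]]
    return total


# The lowest and highest achievable totals follow directly from the table.
LOWEST = sum(min(points) for points in POINTS.values())
HIGHEST = sum(max(points) for points in POINTS.values())
print(LOWEST, HIGHEST)
```

Deriving the bounds from the table, rather than hard-coding them, gives a quick consistency check against the stated range of -4 to +13.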

Scoring

The Naranjo algorithm employs a scoring system to quantify the likelihood of an adverse drug reaction (ADR) based on responses to its 10-question questionnaire. Six questions (1, 3, 7, 8, 9, and 10) score +1 for a "yes" response that supports causality and 0 for "no" or "do not know". Questions 2 (temporal sequence) and 4 (rechallenge) carry more weight: +2 for "yes" and -1 for "no". Question 5 (alternative causes) scores +2 when no alternative is identified and -1 when one is present, and question 6 (placebo response) scores +1 for "no" and -1 for "yes".

To compute the overall causality score, the points from all questions are summed. The total ranges from -4 (strong evidence against causality) to +13 (strong evidence for causality), the minimum and maximum achievable across the questionnaire. Responses of "do not know" or "not done" are assigned 0 points, which avoids penalizing incomplete data while keeping the assessment neutral.

For instance, consider a hypothetical case where a patient develops an adverse event after starting a new drug. For question 1 (previous conclusive reports), if reports exist, score +1; for question 2 (event after drug administration), if yes, score +2; for question 3 (improvement on discontinuation), if applicable and yes, score +1; for question 4 (reappearance on rechallenge), if not done, score 0; for question 5 (alternative causes), if none identified, score +2; for question 6 (placebo response), if not done, score 0; for question 7 (toxic levels), if unknown, score 0; for question 8 (dose-response relationship), if unknown, score 0; for question 9 (previous similar reaction), if no prior exposure, score 0; and for question 10 (objective evidence), if confirmed by lab tests, score +1. Summing these yields a total of +7, which falls in the probable range (5–8).
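The arithmetic of the hypothetical case above can be checked with a short tally. The list name `CASE` and the answer labels are illustrative assumptions; the points are those stated in the text.

```python
# Hypothetical case from the text: (question, answer, points awarded).
CASE = [
    (1, "yes", 1),        # previous conclusive reports exist
    (2, "yes", 2),        # event appeared after drug administration
    (3, "yes", 1),        # improved on discontinuation
    (4, "not done", 0),   # no rechallenge
    (5, "no", 2),         # no alternative causes identified
    (6, "not done", 0),   # no placebo given
    (7, "unknown", 0),    # drug levels not measured
    (8, "unknown", 0),    # dose-response not assessed
    (9, "no", 0),         # no prior exposure
    (10, "yes", 1),       # confirmed by lab tests
]

# Sum only the points column.
total = sum(points for _question, _answer, points in CASE)
print(f"Total Naranjo score: {total:+d}")
```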

Interpretation

Probability Categories

The Naranjo algorithm employs a score-based system to determine the probability that an observed adverse event is caused by a specific drug, with total scores calculated as integers from responses to a standardized 10-question questionnaire. The resulting score is categorized into one of four probability levels: definite (≥9 points), probable (5–8 points), possible (1–4 points), or doubtful (≤0 points). These thresholds, established in the original validation for reliable rater agreement, have been shown in subsequent studies to provide high specificity for the definite category, which supports strong causal confidence, while maintaining sensitivity for the probable and possible categories to capture a broader range of likely drug-related events.

The definite category represents the highest level of evidence strength, indicating a robust causal association typically confirmed by a positive dechallenge (improvement upon withdrawal) and rechallenge (recurrence upon re-administration), alongside a clear temporal sequence, objective confirmation, and exclusion of alternative causes. In contrast, the probable category signifies a strong but not absolute likelihood, supported by dechallenge confirmation, a plausible temporal sequence, known drug response patterns, and the absence of more convincing alternative explanations. The possible category denotes a moderate probability of causality, where a temporal association exists and the event may align with known drug effects, but concurrent diseases or other factors could reasonably account for the reaction. Finally, the doubtful category indicates minimal or no supporting evidence for drug involvement, with the adverse event more likely attributable to extraneous factors such as underlying patient conditions or unrelated etiologies.
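The thresholds above map a total score to a category in a few comparisons. The following is a minimal sketch; the function name `naranjo_category` and the range check are assumptions added for illustration.

```python
def naranjo_category(score: int) -> str:
    """Map a total Naranjo score to its probability category."""
    # The questionnaire cannot produce totals outside -4..+13.
    if not -4 <= score <= 13:
        raise ValueError(f"score {score} is outside the possible range -4..13")
    if score >= 9:
        return "definite"
    if score >= 5:
        return "probable"
    if score >= 1:
        return "possible"
    return "doubtful"


# Boundary values for each category.
for s in (13, 9, 8, 5, 4, 1, 0, -4):
    print(s, naranjo_category(s))
```

Checking from the highest threshold downward keeps each branch a single comparison and makes the boundaries (9, 5, 1) easy to audit against the text.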

Clinical Implications

The Naranjo algorithm's probability categories help guide clinical decisions for suspected adverse drug reactions (ADRs), enabling healthcare providers to balance risk against therapeutic need. For the definite (score ≥9) and probable (score 5–8) categories, which indicate a high likelihood of drug causality, standard recommendations include immediate discontinuation of the suspected drug to prevent further harm, initiation of alternative therapies where feasible, and mandatory reporting to regulatory authorities for pharmacovigilance purposes. These actions are essential in acute settings, such as hospitals, where rapid intervention can mitigate severe outcomes like organ damage or life-threatening events.

In contrast, for the possible (score 1–4) and doubtful (score ≤0) categories, where causality is less certain, management emphasizes continued monitoring of symptoms, additional diagnostic testing to identify alternative causes, and efforts to rule out confounding factors such as comorbidities or concurrent medications before attributing the reaction to the drug. This approach allows for dosage adjustments or temporary observation without hasty withdrawal, preserving treatment efficacy for essential medications.

The algorithm facilitates multidisciplinary collaboration among physicians, pharmacists, and nurses by providing a standardized framework for risk-benefit analysis, supporting consistent communication and coordinated care plans tailored to the patient's overall health profile. For instance, in complex cases such as polypharmacy, teams use the categories to prioritize drug withdrawals and follow-up. By stratifying ADR likelihood, the Naranjo categories contribute to patient safety through targeted interventions: they help avoid unnecessary drug cessation in low-probability scenarios, which could lead to therapeutic gaps or disease progression, while prioritizing prompt action in high-probability cases to prevent further harm. This selective strategy supports broader safety initiatives, including reduced iatrogenic risk in clinical environments.

Validation and Reliability

Studies and Evidence

The Naranjo algorithm was originally developed and validated in a study by Naranjo et al., where six raters applied it to 63 cases of alleged adverse drug reactions (ADRs), demonstrating inter-rater reliability with kappa values ranging from 0.69 to 0.86, corresponding to 83–92% agreement. Three attending physicians then independently assessed a further 28 prospectively collected cases.

Subsequent research has further evaluated the tool's performance across larger datasets. A 2021 analysis evaluated 1,676 pediatric ADR cases at a pediatric hospital, classifying about 50% as probable, 49% as possible, 1.5% as definite, and 0.2% as doubtful, underscoring the algorithm's tendency to categorize the majority of events as at least possible while identifying only a small fraction as definite. In a 2025 replicability and validation study conducted in a Canadian clinical setting, the Naranjo algorithm was applied to 12 serious adverse events from hospital reports, yielding a weighted kappa of 0.92 and an unweighted kappa of 0.84 for inter-rater agreement between two reviewers, supporting its reliability in real-world settings. This study also reported a sensitivity of 1.00 in detecting true causative drugs and a specificity of 0.31 in excluding non-causative ones.

Meta-analyses and comparative studies have synthesized evidence on the algorithm's diagnostic accuracy, indicating an overall sensitivity of around 0.84 and a specificity of 0.50 when applied to diverse populations, though values vary by context and comparator tools. These findings indicate that the algorithm has higher sensitivity for detecting potential ADRs than specificity for ruling out alternatives. The algorithm is integrated into global pharmacovigilance efforts, including assessments supporting FDA reporting and analyses of the WHO's Uppsala Monitoring Centre database, where it aids in standardizing evaluations for international surveillance.
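The kappa statistics quoted above measure agreement beyond what chance alone would produce. As a rough illustration, unweighted Cohen's kappa for two raters can be computed as below. The function name `cohen_kappa` and the ratings are hypothetical and chosen only to keep the arithmetic easy to follow; they are not taken from any of the studies mentioned.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters labelling the same cases."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of cases")
    n = len(rater_a)
    # Observed agreement: fraction of cases given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Hypothetical Naranjo categories assigned by two raters to four cases.
rater_1 = ["probable", "probable", "possible", "possible"]
rater_2 = ["probable", "possible", "possible", "possible"]
kappa = cohen_kappa(rater_1, rater_2)
print(round(kappa, 2))  # observed 0.75, expected 0.50 -> 0.5
```

Here the raters agree on 75% of cases, but because half of that agreement would be expected by chance, kappa is only 0.5, which is why kappa values run lower than raw percent agreement.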

Limitations

The Naranjo algorithm relies heavily on subjective judgment for answering its 10 questions, which introduces significant inter-rater variability in assessments. Studies evaluating inter-rater reliability have reported kappa values ranging from 0.44 to 0.86 across different datasets and raters, with the lower values often attributed to differences in clinical experience, specialty, and interpretation of ambiguous items such as alternative causes or temporal relationships. This subjectivity can lead to inconsistent classifications of adverse drug reactions (ADRs), particularly when evidence is incomplete or when raters disagree on the plausibility of dechallenge or rechallenge outcomes.

The algorithm performs poorly for severe or type B ADRs, which are idiosyncratic and unpredictable, as opposed to type A reactions that follow dose-dependent pharmacology. It is better suited to assessing predictable, augmented reactions but struggles with idiosyncratic cases because of its emphasis on temporal sequence, dechallenge, and drug level testing. These factors are often unhelpful or inapplicable in type B scenarios, such as allergic reactions, where causality may involve immune-mediated mechanisms not captured by the scale.

Performance issues extend to specific populations and scenarios, including pediatric patients, the elderly, and polypharmacy. In pediatric settings, the algorithm yields high rates of "unknown" or "do not know" responses (over 85% for questions such as rechallenge), limiting its ability to differentiate ADR severity or provide actionable insights, and it often defaults to "possible" categorizations without correlating well with clinical outcomes. Among elderly patients, ethical barriers to rechallenge (a key scoring element) and the tool's focus on single-drug causality fail to account for the drug-drug interactions common in polypharmacy, reducing its reliability in multimorbid older adults, where ADRs are frequent and complex.

Developed in 1981, before the pharmacogenomic era, the Naranjo algorithm does not incorporate biomarkers, genetic polymorphisms, or pharmacogenomic factors that are now recognized as important in ADR susceptibility, such as HLA alleles associated with hypersensitivity reactions. This omission limits its applicability in modern precision medicine, where genetic testing can refine causality but is absent from the scale's criteria, potentially leading to underestimation of risk in genetically predisposed individuals.

Applications

In Pharmacovigilance

The Naranjo algorithm plays a central role in pharmacovigilance by facilitating the assessment of causality for suspected adverse drug reactions (ADRs) within spontaneous reporting systems. Healthcare professionals routinely apply the algorithm to evaluate individual case reports before submission to international databases such as the U.S. Food and Drug Administration's (FDA) Adverse Event Reporting System (FAERS) and the European Medicines Agency's (EMA) EudraVigilance. This pre-submission step helps standardize the determination of whether a drug is likely responsible for an observed event, so that reports include a preliminary causality judgment based on the algorithm's questionnaire and scoring criteria.

In signal detection processes, the algorithm contributes by classifying ADRs into probability categories (definite, probable, possible, or doubtful), which enables regulatory bodies to prioritize higher-likelihood cases for in-depth review and to identify potential safety signals during post-marketing surveillance. For instance, probable or definite categorizations flag reports for aggregation and analysis, supporting the identification of emerging drug safety issues across large datasets. This prioritization enhances the efficiency of pharmacovigilance workflows, allowing agencies to focus resources on investigating patterns that may warrant label updates or regulatory actions.

The algorithm has seen widespread global adoption in pharmacovigilance since the early 1990s and is widely used in clinical and hospital-based post-marketing surveillance practices in countries including the United States, Canada, and India. It is recommended in guidelines from organizations like the American Society of Health-System Pharmacists (ASHP) for systematic ADR evaluation. In practice, examples include its use in hospital-based ADR monitoring programs, where it standardizes causality assessments for compiling annual safety reports and contributing to national databases, as demonstrated in secondary care hospitals in India and pediatric facilities in the United States.

Comparisons with Other Tools

The Naranjo algorithm, which employs a numerical scoring system ranging from -4 to +13 to categorize adverse drug reactions (ADRs) as definite, probable, possible, or doubtful, differs fundamentally from the World Health Organization-Uppsala Monitoring Centre (WHO-UMC) causality assessment system. The latter uses a descriptive, categorical approach classifying reactions as certain, probable/likely, possible, unlikely, conditional/unclassified, or unassessable, without assigning numerical scores. Studies evaluating their concordance have generally reported moderate agreement, with kappa values typically ranging from 0.4 to 0.7, indicating that while both tools often align on probable causality, discrepancies arise in borderline cases because the Naranjo relies on quantifiable criteria whereas the WHO-UMC emphasizes clinical context.

In comparison to the Liverpool Causality Assessment Tool (LCAT), another algorithmic method with yes/no questions similar to the Naranjo but refined for broader evaluation, the Naranjo gives no explicit weight to expert clinical judgment, rendering it simpler and faster to apply but potentially less nuanced in complex scenarios involving multiple confounders. The LCAT demonstrates higher sensitivity (approximately 97%) for identifying possible ADRs compared to the Naranjo's 81%, though both exhibit low specificity (around 20–30%), leading to overclassification of non-causal events. This makes the Naranjo preferable for routine, resource-limited settings where speed outweighs detailed probabilistic refinement.

The Roussel Uclaf Causality Assessment Method (RUCAM), tailored specifically for drug-induced liver injury (DILI), contrasts with the Naranjo's general-purpose design by incorporating liver-specific parameters such as time to onset, course upon rechallenge, and exclusion of non-drug causes. This results in higher specificity (approximately 89%) and sensitivity (86%) for DILI cases compared to the Naranjo's lower performance in this domain (sensitivity around 54%, specificity variable but often under 50% for definite causality). While the Naranjo's broad applicability suits diverse ADRs, RUCAM's structured focus enhances accuracy for specialized hepatic reactions, though it requires more domain expertise.

Overall, the Naranjo algorithm is widely favored for its ease of use, brevity, and versatility across ADR types, facilitating consistent assessments in pharmacovigilance without specialized training. However, for domain-specific evaluations like DILI or scenarios demanding higher nuance or sensitivity, tools such as RUCAM or LCAT offer superior performance, underscoring the Naranjo's strengths in general rather than specialized contexts.
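The sensitivity and specificity figures used in these comparisons come from a two-by-two classification of cases against a reference standard. The sketch below shows the calculation; the function name `sensitivity_specificity` and the counts are invented for illustration and do not correspond to any study cited here.

```python
def sensitivity_specificity(tp: int, fn: int, tn: int, fp: int):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    tp: true ADRs the tool flagged        fn: true ADRs the tool missed
    tn: non-ADRs the tool ruled out       fp: non-ADRs the tool flagged
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one true ADR and one non-ADR case")
    sensitivity = tp / (tp + fn)  # share of true ADRs detected
    specificity = tn / (tn + fp)  # share of non-ADRs correctly excluded
    return sensitivity, specificity


# Hypothetical counts: 50 true ADRs and 50 non-ADRs.
sens, spec = sensitivity_specificity(tp=40, fn=10, tn=30, fp=20)
print(sens, spec)
```

A tool with high sensitivity but low specificity, the pattern reported for both the Naranjo and the LCAT, would show a small `fn` alongside a large `fp`.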
