System Usability Scale
The System Usability Scale (SUS) is a standardized, ten-item questionnaire developed to provide a quick and reliable measure of subjective perceptions of usability for systems, products, or interfaces, yielding a single score from 0 to 100 that reflects overall perceived usability.[1] It focuses on aspects such as ease of use, learnability, and integration of functions, serving as a non-diagnostic tool for benchmarking and comparing usability across designs or iterations.[1]

SUS was created in the mid-1980s by John Brooke while working at Digital Equipment Corporation (DEC) in the UK, specifically to evaluate the usability of the ALL-IN-1 office system within the company's Integrated Office Systems Group.[1] Brooke released the scale into the public domain in 1986 without copyright restrictions, allowing free use and adaptation, which contributed to its widespread adoption.[1] By 2013, it had been cited in over 1,200 publications and applied across diverse domains, including software, websites, mobile apps, and hardware, demonstrating its versatility and enduring relevance.[1] The scale aligns with international standards such as ISO 9241-11, which treats user satisfaction as a key component of usability alongside effectiveness and efficiency.[1]

The SUS questionnaire alternates between positively and negatively worded statements to reduce response bias, with respondents rating their agreement on a five-point Likert scale from 1 ("Strongly Disagree") to 5 ("Strongly Agree").[1] The ten items are:

- I think that I would like to use this system frequently.
- I found the system unnecessarily complex.
- I thought the system was easy to use.
- I think that I would need the support of a technical person to be able to use this system.
- I found the various functions in this system were well integrated.
- I thought there was too much inconsistency in this system.
- I would imagine that most people would learn to use this system very quickly.
- I found the system very cumbersome to use.
- I felt very confident using the system.
- I needed to learn a lot of things before I could get going with this system.[1]
Overview
Definition
The System Usability Scale (SUS) is a standardized, 10-item questionnaire employing a Likert scale to assess users' subjective perceptions of a system's usability.[6] It is designed as a unidimensional tool, capturing an overall impression of ease of use across diverse systems, interfaces, or products without reliance on specific technologies.[7] SUS generates a single composite score ranging from 0 to 100, with higher values reflecting stronger perceived usability; notably, this score represents a normalized measure rather than a literal percentage.[6] The questionnaire's brevity and simplicity enable quick administration, making it suitable for benchmarking usability in various contexts.[8]
Purpose and Benefits
The System Usability Scale (SUS) was developed to provide a quick, reliable, and low-cost method for assessing subjective perceptions of system usability, enabling benchmarking and comparisons across different systems or design iterations over time.[9] This approach allows evaluators to capture a broad snapshot of user satisfaction without extensive qualitative analysis or expert intervention, making it particularly valuable in resource-constrained environments.[9] Key benefits include high reliability even with small sample sizes, typically as few as 8-12 participants, which reduces the time and expense associated with large-scale user testing.[10] Administration is straightforward and brief, often taking just 1-2 minutes to complete the 10-item questionnaire after a user session, facilitating integration into various evaluation workflows.[11] Furthermore, SUS demonstrates broad applicability across diverse domains, including software interfaces, websites, hardware devices, and mobile applications, without requiring domain-specific adaptations.[9] SUS scores correlate strongly with established usability metrics, such as the Software Usability Measurement Inventory (SUMI, r = 0.79) and the Website Analysis and Measurement Inventory (WAMMI, r = 0.948), positioning SUS as a standardized tool for global usability assessments that does not rely on expert observers.[9] This alignment enhances its utility in providing consistent, quantifiable insights that inform design improvements efficiently.[8]
History and Development
Origins
The System Usability Scale (SUS) was developed in 1986 by John Brooke, a usability engineer at Digital Equipment Corporation (DEC) in Reading, United Kingdom, as part of the company's internal usability engineering program for evaluating electronic office systems.[12] This effort targeted products such as the ALL-IN-1 integrated office system, where subjective user feedback was needed alongside objective task performance data.[1] Brooke's primary motivation was to create a simple, low-cost attitude scale that could supplement task-based usability testing by capturing users' overall perceptions of system ease and effectiveness, without demanding psychometric expertise from evaluators or participants.[12] Drawing on established psychometric approaches such as Likert scaling, he simplified the design to make it practical for non-experts in industrial settings, where evaluation sessions were often limited to 25-30 minutes.[12]

To develop the scale, Brooke started with an initial pool of 50 candidate statements, refining them through informal testing with about 20 colleagues in DEC's office systems engineering group (ranging from secretaries to programmers) to select the items that produced the most consistent, strongest responses.[12] Early pilots of the SUS were conducted by Brooke on DEC products during human factors lab sessions with UK customers, using video equipment to observe interactions, and later extended to portable evaluations at customer sites in the US and Europe.[1] These initial applications demonstrated the tool's value in generating quick, actionable feedback on interface usability to support product iterations in DEC's phase review process, despite the absence of formal validation at the time.[1]
Publication and Early Adoption
The System Usability Scale (SUS) received its first formal publication in 1996, when John Brooke contributed a chapter titled "SUS: A 'Quick and Dirty' Usability Scale" to the edited volume Usability Evaluation in Industry, published by Taylor & Francis. This non-peer-reviewed chapter documented the scale, which Brooke had originally developed a decade earlier at Digital Equipment Corporation and shared informally among colleagues since 1986 to facilitate quick usability comparisons across systems.[9] The publication marked a pivotal step in making SUS accessible beyond internal corporate use, emphasizing its role as a simple, low-cost tool for subjective usability assessments.[6]

Following its release, SUS began gaining traction within the human-computer interaction (HCI) and usability communities during the late 1990s, primarily through discussions at professional conferences such as those organized by the Usability Professionals' Association (now UXPA) and citations in emerging HCI literature.[13] By the early 2000s, the scale had been adopted in a growing number of empirical studies, with researchers applying it to evaluate diverse interfaces and validating its psychometric properties; for instance, James R. Lewis conducted early reliability analyses, reporting a Cronbach's alpha of 0.85 across 77 cases, underscoring its internal consistency.[14] This period saw SUS referenced in over 100 research works, reflecting its appeal for rapid post-task evaluations in both academic and industrial settings.[15]

The inclusion of SUS in Usability Evaluation in Industry, a key resource for practitioners, further propelled its dissemination, positioning it as a de facto standard for benchmarking perceived usability without requiring extensive training or resources. Subsequent citations in HCI journals and conference proceedings during the early 2000s solidified its reputation, with studies demonstrating its versatility across product development phases and contributing to its widespread circulation among usability engineers.[13]
Questionnaire Structure
Items and Response Format
The System Usability Scale (SUS) questionnaire comprises ten specific statements that probe users' subjective experiences with a system, focusing on aspects such as ease of use, learnability, and overall satisfaction. These items were originally formulated by John Brooke to provide a simple yet effective tool for gauging perceived usability across diverse systems.[6] The ten items are worded as follows:

- I think that I would like to use this system frequently.
- I found the system unnecessarily complex.
- I thought the system was easy to use.
- I think that I would need the support of a technical person to be able to use this system.
- I found the various functions in this system were well integrated.
- I thought there was too much inconsistency in this system.
- I would imagine that most people would learn to use this system very quickly.
- I found the system very cumbersome to use.
- I felt very confident using the system.
- I needed to learn a lot of things before I could get going with this system.[6]
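For analysis pipelines it is often convenient to hold the fixed wording and numbering above in machine-readable form. The sketch below is a minimal illustration in Python; the names `SUS_ITEMS` and `is_negatively_worded` are ours, not part of any standard library. It pairs each statement with its item number and flags the even-numbered, negatively worded items that are reverse-scored later:

```python
# SUS questionnaire as data: (item number, statement text).
# Responses use a 5-point Likert scale
# (1 = "Strongly Disagree" .. 5 = "Strongly Agree").
SUS_ITEMS = [
    (1, "I think that I would like to use this system frequently."),
    (2, "I found the system unnecessarily complex."),
    (3, "I thought the system was easy to use."),
    (4, "I think that I would need the support of a technical person "
        "to be able to use this system."),
    (5, "I found the various functions in this system were well integrated."),
    (6, "I thought there was too much inconsistency in this system."),
    (7, "I would imagine that most people would learn to use this system "
        "very quickly."),
    (8, "I found the system very cumbersome to use."),
    (9, "I felt very confident using the system."),
    (10, "I needed to learn a lot of things before I could get going "
         "with this system."),
]

LIKERT_MIN, LIKERT_MAX = 1, 5


def is_negatively_worded(item_number: int) -> bool:
    """Even-numbered SUS items are negatively worded and reverse-scored."""
    return item_number % 2 == 0
```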
Design Rationale
The System Usability Scale (SUS) was designed with 10 items to provide a brief yet reliable assessment of perceived usability, minimizing respondent fatigue while capturing essential aspects of user experience. This structure emerged from an initial pool of 50 statements, from which the 10 were selected based on their strong intercorrelations (ranging from r = ±0.7 to ±0.9) and their ability to elicit extreme responses when evaluating highly usable versus unusable systems.[1] The choice of 10 items reflected the need for a "quick and dirty" tool suitable for rapid evaluations, drawing inspiration from ISO 9241-11, which defines usability through effectiveness, efficiency, and satisfaction, but simplifying the focus to subjective satisfaction as a proxy for overall usability.[1]

To mitigate common response biases such as acquiescence or extreme responding, the SUS alternates positively and negatively worded items, requiring participants to engage thoughtfully with each statement rather than defaulting to agreement patterns. Odd-numbered items (1, 3, 5, 7, 9) probe positive attitudes toward the system, while even-numbered items (2, 4, 6, 8, 10) address frustrations or shortcomings, enhancing the scale's sensitivity to nuanced user perceptions.[1]

Although the SUS was intentionally constructed as a unidimensional measure of overall perceived usability, subsequent factor analyses have identified a two-factor structure consisting of usability (items 1, 2, 3, 5, 6, 7, 8, 9) and learnability (items 4, 10).[1][14] In practice the scale retains its simplicity for global assessments while maintaining high internal consistency, as demonstrated by Cronbach's alpha exceeding 0.90.[14]
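Where the two-factor view is useful, the subscales can be scored separately. The sketch below follows the rescaling constants proposed by Lewis and Sauro in their two-factor analysis (3.125 for the eight usability items, 12.5 for the two learnability items, each mapping a subscale onto 0-100); treat these constants as an assumption from that follow-up literature, not part of Brooke's original scoring:

```python
def sus_subscales(responses: list[int]) -> tuple[float, float]:
    """Return (usability, learnability) subscale scores, each on 0-100.

    `responses` holds the raw 1-5 answers to items 1..10 in order.
    Items 4 and 10 form the learnability factor; the other eight form
    the usability factor. The multipliers 3.125 and 12.5 (per Lewis &
    Sauro's two-factor analysis) rescale each sum of 0-4 item
    contributions to a 0-100 range.
    """
    # Odd items contribute (response - 1); even items are reverse-scored.
    adjusted = [r - 1 if i % 2 == 1 else 5 - r
                for i, r in enumerate(responses, start=1)]
    learnability = (adjusted[3] + adjusted[9]) * 12.5   # items 4 and 10
    usability = (sum(adjusted) - adjusted[3] - adjusted[9]) * 3.125
    return usability, learnability
```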
Administration and Scoring
Administration Guidelines
The System Usability Scale (SUS) is typically administered after users have interacted with the system in a structured evaluation session, allowing sufficient time to form informed opinions on its usability. Guidelines recommend conducting the assessment post-task or post-session, following at least 20-30 minutes of hands-on use to ensure participants have experienced key features without undue fatigue.[6][11]

Administrators should provide clear and neutral instructions to participants, such as "Please rate your experience using the system just now" or "Answer based on your overall impressions," to minimize bias and promote honest responses. The 10-item questionnaire can be delivered via paper forms in moderated lab settings, online surveys for remote or unmoderated tests, or verbally for populations with accessibility needs, such as those with low literacy or disabilities. Ensuring anonymity (by not collecting identifying information unless necessary) further encourages candid feedback without fear of repercussions.[11][8]

Best practices emphasize recruiting representative end-users rather than technical experts or internal stakeholders, since the scale measures subjective perceptions from the target audience's perspective. Aim for a minimum sample size of 5 participants to detect initial trends in usability, though 12 or more is ideal for greater reliability in identifying patterns across responses. Avoid leading questions, comments, or influences during or immediately before administration that could sway opinions, and verify that all items are completed to maintain data integrity.[8][16][6]
Scoring Calculation
The System Usability Scale (SUS) score is computed by first adjusting the responses from each of the 10 questionnaire items to a common 0-4 scale, then summing these adjusted contributions and scaling the total to a 0-100 range.[17] For the five odd-numbered items (1, 3, 5, 7, and 9), which are positively worded, the contribution is the user's response minus 1; responses are on a 5-point Likert scale from 1 (strongly disagree) to 5 (strongly agree), so this adjustment yields values from 0 to 4.[17] For the five even-numbered items (2, 4, 6, 8, and 10), which are negatively worded, the contribution is 5 minus the user's response, again resulting in a 0-4 range after reversal.[17]

The adjusted contributions from all 10 items are then summed, producing a total ranging from 0 to 40, and this sum is multiplied by 2.5 to obtain the final SUS score on a 0-100 scale.[17] The formula can be expressed as:

\[ \text{SUS score} = \left( \sum_{i=1}^{10} \text{contribution}_i \right) \times 2.5 \]

where contribution_i is the adjusted 0-4 value for item i.[17] All items are treated equally, with no weighting applied to individual contributions.[8] For group results, such as studies involving multiple participants, the SUS score is calculated individually for each respondent before averaging the scores across the group to yield a mean SUS value.[8]
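A direct implementation of this procedure is straightforward. The following Python sketch (function names are illustrative) scores one respondent and averages a group, mirroring the odd/even adjustment and the 2.5 multiplier described above:

```python
def sus_score(responses: list[int]) -> float:
    """Compute the 0-100 SUS score for a single respondent.

    `responses` holds the raw 1-5 Likert answers to items 1..10 in order.
    Odd-numbered items contribute (response - 1); even-numbered items are
    reverse-scored and contribute (5 - response). The 0-40 sum of
    contributions is then multiplied by 2.5.
    """
    if len(responses) != 10 or any(not 1 <= r <= 5 for r in responses):
        raise ValueError("expected ten responses, each between 1 and 5")
    total = 0
    for item_number, response in enumerate(responses, start=1):
        if item_number % 2 == 1:   # positively worded item
            total += response - 1
        else:                      # negatively worded item
            total += 5 - response
    return total * 2.5


def mean_sus(all_responses: list[list[int]]) -> float:
    """Average per-respondent SUS scores across a group of participants."""
    return sum(sus_score(r) for r in all_responses) / len(all_responses)


# A respondent who strongly agrees with every positive item and strongly
# disagrees with every negative item earns the maximum score of 100.
assert sus_score([5, 1, 5, 1, 5, 1, 5, 1, 5, 1]) == 100.0
```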
Interpretation
Score Ranges and Benchmarks
System Usability Scale (SUS) scores, ranging from 0 to 100, are interpreted using adjective ratings derived from empirical mapping to user perceptions, based on data from over 2,300 participants across multiple studies. These ratings provide a qualitative framework for understanding score implications, with boundaries established through percentile rankings and mean associations from large-scale validations. Specifically, scores of 0-25 are rated as "Worst Imaginable," 25-40 as "Awful," 40-60 as "Poor," 60-70 as "OK," 70-80 as "Good," 80-90 as "Excellent," and 90-100 as "Best Imaginable."[18] These categories reflect significant differences in perceived usability, except between "Worst Imaginable" and "Awful," whose means overlap at around 20-32.[18]

SUS scores are approximately normally distributed, enabling percentile-based comparisons across studies and products. A global benchmark average is approximately 68, drawn from analyses of over 5,000 responses across nearly 500 studies spanning more than 30 years of data collection.[19][20] Scores above 80 indicate above-average usability, positioning a product in roughly the top 15% of evaluated systems, while scores below 68 fall below this norm.[20] For cross-study comparisons, SUS percentile ranks are recommended over raw scores to account for contextual variations.[20]

Domain-specific norms adjust these benchmarks to reflect category expectations; for example, software interfaces average around 70, while websites typically score about 65, based on aggregated data from over 1,000 web evaluations and various graphical user interfaces.[15] These norms, derived from diverse product types including cell phones, customer premise equipment, and interactive voice response systems, underscore the importance of contextual benchmarking for accurate interpretation.[15]

| Adjective Rating | SUS Score Range | Example Mean from Validation |
|---|---|---|
| Best Imaginable | 90-100 | 90.9 |
| Excellent | 80-90 | 85.4 |
| Good | 70-80 | 71.3 |
| OK | 60-70 | 50.9 |
| Poor | 40-60 | 46.2 |
| Awful | 25-40 | 31.7 |
| Worst Imaginable | 0-25 | 21.5 |
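These bands can also be applied programmatically. In the sketch below, the band boundaries come from the table above (lower bound inclusive), while the percentile estimate assumes the approximately normal distribution discussed earlier, with mean 68 and a standard deviation of about 12.5; the standard deviation is an assumption drawn from the benchmarking literature rather than a figure given in this section:

```python
from statistics import NormalDist

# Adjective bands from the table above, as (inclusive lower bound, label),
# checked in descending order so boundary scores take the higher band.
ADJECTIVE_BANDS = [
    (90, "Best Imaginable"),
    (80, "Excellent"),
    (70, "Good"),
    (60, "OK"),
    (40, "Poor"),
    (25, "Awful"),
    (0, "Worst Imaginable"),
]


def adjective_rating(score: float) -> str:
    """Map a 0-100 SUS score to its adjective band."""
    for lower_bound, label in ADJECTIVE_BANDS:
        if score >= lower_bound:
            return label
    raise ValueError("score must be between 0 and 100")


def approximate_percentile(score: float,
                           mean: float = 68.0,
                           sd: float = 12.5) -> float:
    """Rough percentile rank, assuming SUS scores ~ Normal(68, 12.5)."""
    return 100 * NormalDist(mu=mean, sigma=sd).cdf(score)


# Example: a score of 80 rates "Excellent" and lands near the 83rd
# percentile, consistent with the top-15% figure cited above.
print(adjective_rating(80), round(approximate_percentile(80)))
```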