Accuracy and Completeness of Clinical Coding Using ICD-10 for Ambulatory Visits

Posted on the 24 February 2019 by Tyshka


This study describes a simulation of diagnostic coding using an EHR. Twenty-three ambulatory clinicians were asked to enter appropriate codes for six standardized scenarios with two different EHRs. Their interactions with the query interface were analyzed for patterns and variations in search strategies, and the resulting sets of entered codes were assessed for accuracy and completeness. Just over half of the entered codes were appropriate for a given scenario and about a quarter were omitted. Crohn’s disease and diabetes scenarios had the highest rate of inappropriate coding and code variation. The omission rate was higher for secondary than for primary visit diagnoses. Codes for immunization, dialysis dependence and nicotine dependence were the most often omitted. We also found a high rate of variation in the search terms used to query the EHR for the same diagnoses. Changes to the training of clinicians and improved design of EHR query modules may lower the rate of inappropriate and omitted codes.


The almost seventy thousand codes that comprise the 10th revision of the International Statistical Classification of Diseases and Related Health Problems (ICD-10-CM) are far more detailed than those in the preceding version that clinicians in the United States had been working with since the late 1970s.[1] This new level of complexity is expected not only to facilitate documenting and reporting causes of mortality and morbidity but also to extend the ability to identify and manage clinical processes with information technology by identifying changes in medication management and by monitoring data for health maintenance and preventive care purposes.[2] Highly granular and more accurate data are also indispensable for rapidly expanding secondary uses such as detecting healthcare fraud, developing patient safety criteria, setting healthcare policy, developing public health initiatives, improving clinical performance and, crucially, allowing large-scale analyses for medical research. Codes are also essential in clinical care for phenotyping and predictive modeling of patient state.[3] The transition also has significant implications for reimbursement from health care insurers. Diagnostic codes may be used to determine the severity of illness of a provider’s patient population and affect payment rates under newly adopted payment models.

Codes that previously could not differentiate between several types of diabetes, for example, are now refined to capture important distinctions but require clinicians to document underlying causal conditions or whether the disease was induced by drugs.[4] A more detailed description of laterality and location in the patient’s body is also a newly added specification.
The previous emphasis on organs and disease that prioritized physician-oriented content is expanded to also cover human responses to disease that are necessary for advanced nursing and long-term care.[5]

Many electronic health record (EHR) systems integrate clinical documentation and billing information and provide cross-mapping between primarily care-oriented and reimbursement- or report-oriented data. Problem lists, for example, need to conform to standardized vocabularies based on ICD-10 or SNOMED codes for the CMS EHR Incentive Program known as Meaningful Use.[6] EHRs may employ their own proprietary reference terminologies that allow clinicians to search for and display diagnostic and other concepts in forms they find customary and meaningful, while maintaining mapped connections in the background to more or less granular codes intended for reporting, financial or automated decision-support purposes. Typically, a clinician adding a coded term to a problem list or a fully qualified ICD-10 code to a billing record starts by typing one or more words, an abbreviation or a string of characters into a free-text query field. A search engine within the EHR then returns results based on their relevance to the search string and ranks them in a list according to logic that differentiates complete-word from partial-word matches. For example, a search initiated by typing “esrd” may return many diagnostic codes related to end-stage renal disease. Further searching may be necessary, either by reading through the list or by repeating the search with different terms, to find the exact code appropriate for the intended purpose.
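
The ranking by complete-word versus partial-word matching described above can be sketched roughly as follows. This is a minimal illustration only, not the logic of either EHR in the study: the scoring weights, the function names and the tiny catalog are all assumptions, and the codes and abbreviated descriptions are included purely as examples that should be checked against the official code set.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class CodeEntry:
    code: str
    description: str


# Tiny illustrative catalog (assumption); a real EHR queries its own terminology.
CATALOG = [
    CodeEntry("N18.6", "End stage renal disease"),
    CodeEntry("Z99.2", "Dependence on renal dialysis"),
    CodeEntry("E27.9", "Disorder of adrenal gland, unspecified"),
    CodeEntry("D64.9", "Anemia, unspecified"),
]


def tokenize(text: str) -> List[str]:
    """Lower-case the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(query: str, entry: CodeEntry) -> int:
    """Hypothetical weights: 2 per complete-word match, 1 per partial-word match."""
    words = tokenize(entry.description)
    total = 0
    for term in tokenize(query):
        if term in words:
            total += 2          # complete word, e.g. "renal" == "renal"
        elif any(term in word for word in words):
            total += 1          # partial word, e.g. "renal" inside "adrenal"
    return total


def search(query: str, catalog: List[CodeEntry] = CATALOG) -> List[CodeEntry]:
    """Return matching entries, best score first; ties are ordered by code."""
    scored = [(score(query, entry), entry) for entry in catalog]
    hits = [(s, entry) for s, entry in scored if s > 0]
    hits.sort(key=lambda pair: (-pair[0], pair[1].code))
    return [entry for _, entry in hits]


for entry in search("renal dialysis"):
    print(entry.code, entry.description)
```

In this sketch a query for “renal” ranks the two complete-word matches ahead of the partial match on “adrenal”, which is the kind of result list a clinician would then have to read through or narrow with a second query. An abbreviation such as “esrd” would need a synonym table, which the sketch leaves out.
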
More sophisticated systems provide automated assistance with this refinement by using “wizards” or other support interventions to help locate the target diagnosis quickly.

The almost four-fold increase in the number of diagnoses in the current coding system presents a formidable challenge to computer engineers and designers: to develop algorithms and human interfaces that can query, compare and select the best descriptive diagnostic or other code in the vastly expanded field within the same short time available to clinicians and medical coders in routine practice. A recent survey of coders’ and physicians’ perspectives on the practical usefulness of ICD-10 showed that most agreed on the need for computer-assisted coding.[7] If the query process is not effective, however, clinicians may find themselves facing a choice between accurate and “close enough” coding when time constraints preclude further refinement of the search. This learned behavior would directly contravene the goal of improved and precise documentation of clinical care made possible by the ICD-10 system.

This study was intended to describe the search behavior of clinicians entering ICD-10 codes with tools available in large EHR systems. Our objectives were to observe interactive behavior that may contribute to incomplete or inaccurate coding and to analyze variations in coded diagnoses for standardized clinical scenarios. Findings of systematic errors or difficulties in completing the coding task may help inform or revise the training that clinicians currently receive, provide evidence and insight for improving electronic coding tools, and indicate a need to revise the coding system itself.


The study was designed as a simulation of a clinical documentation task where clinicians used standardized case scenarios to enter diagnostic codes into the EHR. We asked 23 clinicians to read short vignettes describing a variety of ambulatory visits and then enter relevant ICD-10 codes into a mock patient record. Seventeen participants completed two sets of three scenarios, using each set for a different EHR; six completed only one set, using EHR 1, for technical reasons. In total, there were 40 completed sets: 23 on EHR 1 (12 Sets A and 11 Sets B), and 17 on EHR 2 (10 Sets A and 7 Sets B). The order of set completion (A vs. B) alternated to minimize possible learning bias.

EHR 1 was a commercial system and EHR 2 an internally developed clinical information system. Both required an initial entry of a search term into a free-text field that returned a list of ICD-10 codes with descriptions. If the target term was in the results, participants could simply select it or further refine the list by entering a different search string. Decision support interventions were available on both systems and were either triggered automatically for a subset of diagnoses in EHR 1, with an option to disregard, or were designed as a part of the entry process on EHR 2.[8] Participants choosing to use decision support on EHR 1 could click on modification terms in pre-determined sets, and an algorithm would refine the results accordingly to a single ICD-10 code. The initial search on EHR 2 returned a list filtered by patient parameters such as age and gender that could also be modified by selecting answers to term-specific questions (e.g., Laterality? Left, Right, etc.). Both systems used algorithms and branching logic during the guided-search phase to refine result lists and suggest fully specified billable codes. For example, if “otitis media” was the initially entered search term, the support intervention would show subsets of optional terms for laterality, chronicity and recurrence.
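
The guided-search phase described above, in which selected modifiers narrow a result list until a single code remains, might be sketched as follows. This is an illustrative assumption about the general technique, not the algorithm of either EHR: the candidate identifiers (`OM-1` and so on) are placeholders rather than real ICD-10 codes, and the attribute names and functions are hypothetical.

```python
from typing import Dict, List, Optional

Candidate = Dict[str, object]

# Hypothetical candidates for an "otitis media" search; identifiers are placeholders.
CANDIDATES: List[Candidate] = [
    {"code": "OM-1", "laterality": "left", "chronicity": "acute", "recurrent": False},
    {"code": "OM-2", "laterality": "left", "chronicity": "acute", "recurrent": True},
    {"code": "OM-3", "laterality": "right", "chronicity": "acute", "recurrent": False},
    {"code": "OM-4", "laterality": "right", "chronicity": "chronic", "recurrent": False},
]


def refine(candidates: List[Candidate], **answers: object) -> List[Candidate]:
    """Keep only candidates whose attributes match every answered question."""
    return [c for c in candidates
            if all(c.get(key) == value for key, value in answers.items())]


def open_questions(candidates: List[Candidate]) -> List[str]:
    """Attributes that still differ among the remaining candidates."""
    if not candidates:
        return []
    keys = [key for key in candidates[0] if key != "code"]
    return [key for key in keys if len({c[key] for c in candidates}) > 1]


def resolved_code(candidates: List[Candidate]) -> Optional[str]:
    """Return the code once exactly one candidate is left, otherwise None."""
    return candidates[0]["code"] if len(candidates) == 1 else None


step1 = refine(CANDIDATES, laterality="left")
print(open_questions(step1))    # questions the interface would still need to ask
step2 = refine(step1, recurrent=True)
print(resolved_code(step2))
```

Asking only the questions that still distinguish the remaining candidates is one way branching logic can shorten the path to a fully specified code; how each system actually chose and ordered its questions is not described in the source.
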
The visual presentation of these terms, their number and content were different for each system.

Practicing ambulatory clinicians (22 physicians, 1 physician assistant) were recruited through internal email advertising as a sample of convenience. Twenty-one (91%) had ten or more years of professional experience, sixteen (70%) as primary care providers and seven as specialists. Ten (44%) had used EHR 1 for 6 months or more, and twenty-one (91%) used it daily in practice. All were proficient in using EHR 2, as it served as the primary ambulatory record system prior to an institution-wide transition to the commercial system. The ICD-10 requirement went into full effect in the hospital during the transition period, and clinicians had therefore been entering codes with both EHRs for approximately the same time. Both systems had proprietary interface terminology, and participants did not use any other sources such as ICD-10 lookups on the web.

Participants completed the task individually in the presence of an experimenter on a single workstation that was connected to both EHR systems. They were instructed to read each scenario and then find relevant codes according to their own clinical judgment, and to use their preferred strategy in order to best simulate authentic behavior. The number of codes expected for each scenario was not explicitly stated, only that more than one may be necessary. Interactions of clinicians with the two systems were recorded into an audiovisual media file using Morae [9] screen-capture software running in the background. The recordings of full screens and verbal comments were later analyzed.

Scenarios and accuracy-rating criteria for diagnostic codes

Study scenarios were taken verbatim from interactive case studies made available by the Centers for Medicare & Medicaid Services (CMS) on their Road to Ten website.[10] Since the CMS recommendations for the correct ICD-10 codes were only one example of proper coding, we developed rating criteria for the appropriateness of codes in order to accommodate other correct coding options. Two physicians (HZR, EAD) independently reviewed codes entered by the participants and rated whether they were appropriate in the context of each scenario. They reached consensus on disagreements through a series of discussions. Ratings were based on two criteria: clinical accuracy and completeness. Although the rating was binary (appropriate or not), the reviewers acknowledged the potential range of responses due to variations in clinical judgment and to the complexities of the structure and intent of the ICD-10 coding system. A code rated “appropriate” could be clinically accurate but incomplete in a way that did not alter the diagnosis or miss clinically crucial information. For example, indicating that allergic rhinitis was seasonal without explicitly naming pollen as the causative agent, or documenting tonsillitis or pharyngitis for a patient with an inflamed pharynx with tonsillar exudate, would still be considered “appropriate”. However, codes that left out clinically significant information would not. This would include omitting streptococcus as the etiology of tonsillitis in a case where the rapid strep test was known to be positive, or not specifying large intestine in a case of Crohn’s disease complicated by colonic abscess.
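
Once each entered code carries a binary consensus rating, the two summary measures reported in this study (the share of entered codes rated appropriate and the share of expected diagnoses that were omitted) could be tallied along the following lines. This is a sketch under stated assumptions: the source does not give the exact formulas, so the definition of an omission used here (an expected diagnosis with no entered code at all) is an assumption, the data are made up and are not study results, and the code identifiers are placeholders.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class RatedEntry:
    diagnosis: str      # expected diagnosis the entry was meant to capture
    code: str           # placeholder identifier, not a real ICD-10 code
    appropriate: bool   # consensus rating reached by the two reviewers


def summarize(expected: List[str], entries: List[RatedEntry]) -> Dict[str, object]:
    """Tally the appropriate share of entries and the omitted share of expected diagnoses."""
    covered = {entry.diagnosis for entry in entries}
    omitted = [d for d in expected if d not in covered]
    n_appropriate = sum(entry.appropriate for entry in entries)
    return {
        "appropriate_share": n_appropriate / len(entries) if entries else 0.0,
        "omission_rate": len(omitted) / len(expected) if expected else 0.0,
        "omitted": omitted,
    }


# Made-up example for a single scenario; these are NOT data from the study.
EXPECTED = [
    "streptococcal tonsillitis",
    "allergic rhinitis",
    "nicotine dependence",
    "immunization encounter",
]
ENTRIES = [
    RatedEntry("streptococcal tonsillitis", "X-1", False),  # etiology left out
    RatedEntry("allergic rhinitis", "X-2", True),           # seasonal, agent not named
    RatedEntry("nicotine dependence", "X-3", True),
]

result = summarize(EXPECTED, ENTRIES)
print(result)
```

Keeping the two measures separate matters because they fail differently: an inappropriate code is an entry that was made but lacks clinically significant detail, whereas an omission is a diagnosis that never reached the record.
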
