a. Intra- or interdisciplinary
3. Scope of database
a. Target patient population
c. Disease-, site- or treatment-specific
d. Cross-sectional vs. longitudinal
e. Database lifetime
f. Core dataset
4. Sources of data/links to other databases
5. Available infrastructure
6. Choice of database program/housing issues (interdependent with infrastructure available, issues of staffing, maintenance/enhancements/upgrades and cost)
b. Exact role
8. Governance plan
9. Quality assurance
a. The ALCOA data integrity test (Attributable, Legible, Contemporaneous, Original and Accurate)
b. Standardized data sources/definitions
c. Oversight plan
a. Institutional review/ethical board approval
b. Issues of consent/authorization vs. waiver
c. Applicable policies and regulations at all levels (e.g., Code of Federal Regulations; Good Clinical Practice; Health Insurance Portability and Accountability Act [HIPAA] and/or other applicable privacy laws; institutional information security policies; other policies and regulations)
11. Assessment of initial feasibility and long-term viability (baseline and maintenance costs being major considerations)
Oravec et al. 1) rightly raise concerns about the quality of diagnosis codes, which may affect the internal validity of studies using these data. Quality improvement efforts by the institutions that manage these databases, including neurosurgeon input in database design, are one approach to addressing this issue. An open-source designation of neurosurgical diagnosis codes and relevant complications could also help standardize the way the field uses these datasets. Additionally, as with other types of retrospective data, analyses drawn from administrative databases can be used for descriptive and inferential purposes but cannot establish causality. This should be stated as a limitation in any paper that uses this type of data. These limitations, however, do not negate the utility of administrative databases for answering particular types of questions and as hypothesis-generating tools. Descriptive and inferential analysis aside, the volume of data provided by administrative databases, which is unparalleled by any prospective dataset, is especially useful for generating predictive models. Ultimately, the goal of Harary et al. was to highlight that administrative databases are only one type of data under the umbrella of “Big Data,” and that they have only begun to scratch the surface of what data science, powered by Big Data and artificial intelligence, can offer the neurosurgical community and patients 2).
Zhou et al. evaluated the sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) of Cushing’s syndrome (CS) etiology-related ICD-10 codes or code combinations by comparing the hospital discharge administrative database (DAD) with established diagnoses from medical records.
Coding for patients with adrenocortical adenoma (ACA) and those with bilateral macronodular adrenal hyperplasia (BMAH) demonstrated disappointingly low sensitivity, at 78.8% (95% CI: 70.1% - 85.6%) and 83.9% (95% CI: 65.5% - 93.9%), respectively. BMAH had the lowest PPV, at 74.3% (95% CI: 56.4% - 86.9%). In confirmed ACA patients, the sensitivity of ACA code combinations was higher in patients initially admitted to the Department of Endocrinology before surgery than in patients admitted directly to the Department of Urology (90.0% vs 73.1%, P = 0.033). The same pattern was observed in the PPV of the BMAH code (100.0% vs 60.9%, P = 0.012). The main sources of coding errors were misinterpreted or confusing situations attributable to coders (68.1%) and the omission or non-standardized documentation of symptomatic diagnoses by clinicians (26.1%).
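The four validity measures used in this kind of code-validation study can be illustrated with a minimal Python sketch. It assumes each admission has been reduced to a pair of booleans: whether the ICD-10 code (or code combination) appears in the administrative database, and whether the diagnosis is confirmed in the medical record. The names `ConfusionCounts`, `tally` and `validity_metrics`, and the counts in the usage example, are hypothetical and are not taken from Zhou et al.; confidence intervals are not computed here.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Tuple


@dataclass(frozen=True)
class ConfusionCounts:
    tp: int  # code present, diagnosis confirmed in the chart
    fp: int  # code present, diagnosis not confirmed
    fn: int  # code absent, diagnosis confirmed
    tn: int  # code absent, diagnosis not confirmed


def tally(records: Iterable[Tuple[bool, bool]]) -> ConfusionCounts:
    """Count the 2x2 table from (code_present, chart_confirmed) pairs."""
    tp = fp = fn = tn = 0
    for code_present, chart_confirmed in records:
        if code_present and chart_confirmed:
            tp += 1
        elif code_present:
            fp += 1
        elif chart_confirmed:
            fn += 1
        else:
            tn += 1
    return ConfusionCounts(tp, fp, fn, tn)


def _ratio(numerator: int, denominator: int) -> Optional[float]:
    # Return None instead of raising when a margin of the table is empty.
    return numerator / denominator if denominator else None


def validity_metrics(c: ConfusionCounts) -> Dict[str, Optional[float]]:
    """Sensitivity, specificity, PPV and NPV, with the chart as reference."""
    return {
        "sensitivity": _ratio(c.tp, c.tp + c.fn),
        "specificity": _ratio(c.tn, c.tn + c.fp),
        "ppv": _ratio(c.tp, c.tp + c.fp),
        "npv": _ratio(c.tn, c.tn + c.fn),
    }


if __name__ == "__main__":
    # Hypothetical counts for illustration only.
    example = ConfusionCounts(tp=80, fp=10, fn=20, tn=890)
    for name, value in validity_metrics(example).items():
        print(f"{name}: {value:.3f}")
```

Sensitivity and specificity are conditioned on the chart diagnosis, whereas PPV and NPV are conditioned on the code, so the predictive values also depend on how common the etiology is in the cohort being validated.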
The hospital administrative database is an effective data source for evaluating the etiology of Cushing’s syndrome (CS), but not for adrenocortical adenoma (ACA) and bilateral macronodular adrenal hyperplasia (BMAH). Improving surgeons' documentation, especially the delineation of symptomatic and locative diagnoses in discharge abstracts; providing department- or disease-specific training for coders; and increasing multidisciplinary collaboration are ways to enhance the applicability of administrative data for CS etiologies 3).