
Division of Research Maintained Data Sources

The Division of Research (DOR) maintains several computerized databases to expedite research, including disease registries, compilations of data collected on specific research cohorts, member-linked vital statistics and sociodemographic records from external sources, and long-term longitudinal data on Healthplan members. These data are readily accessible and can be linked together by a single unique identifier, the enrollee medical record number (MRN).
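As a rough illustration of what linking on the MRN can look like, the sketch below joins two small in-memory extracts on a shared key. The record layouts, field names, and values are hypothetical and are not taken from any DOR file; only the idea of a single common identifier comes from the description above.

```python
# Hypothetical extracts; field names and values are illustrative only.
diabetes_registry = [
    {"mrn": "000001", "diagnosis_year": 1998},
    {"mrn": "000002", "diagnosis_year": 2001},
]
death_records = [
    {"mrn": "000002", "death_year": 2003},
]


def link_by_mrn(left, right):
    """Left-join two lists of records on the shared 'mrn' key."""
    right_by_mrn = {row["mrn"]: row for row in right}
    linked = []
    for row in left:
        # Members with no match in `right` keep only their own fields.
        match = right_by_mrn.get(row["mrn"], {})
        linked.append({**match, **row})
    return linked


linked = link_by_mrn(diabetes_registry, death_records)
```

Every registry member is kept, and fields from the second source are attached only where an MRN match exists.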

Disease Registries
Surveys and Clinical Study Cohorts
Member-linked Public Use Files
Longitudinal Datasets
Other DOR-Maintained Data
Utilization and Clinical Data
Administrative Databases


Disease Registries

DOR investigators developed and maintain several research-quality disease registries. The registries share a high degree of validation, usually via chart review; known population sensitivities and specificities for diagnoses; linkages to multiple databases; and ongoing updates. Infrastructural support is provided by TPMG. These registries are highly efficient entry points for studies of natural history or treatment outcomes, and high-quality sampling frames for randomized trials and surveys.

Several registries have existed for 10 years or longer, including the SEER-quality KP Northern California Cancer Registry; the KP Diabetes Registry, with more than 180,000 current members and rich data on treatment, laboratory results, and complications; the KP HIV/AIDS registry, with detailed laboratory, prescription, and outcomes data on more than 16,000 patients; and the Neonatal Minimum Dataset (NMDS), which captures physiologic data on all admissions of Kaiser infants to Neonatal ICUs. Each has provided data for multiple publications (>200 for the cancer registry; >20 each for the diabetes registry and NMDS).

Surveys and Clinical Study Cohorts

Multiphasic Health Checkup (MHC) and Personal Health Appraisal (PHA) Cohorts

Starting in 1964 in Oakland and San Francisco, KP implemented an innovative automated multiphasic health examination to provide cost-effective health checkups. Over the next 28 years, close to a million exams were conducted, collecting health data on several hundred thousand Healthplan members. Data collected included physiologic, blood, urine, and radiology results; demographic characteristics; family health history; and smoking and drinking behavior. A subset of patients had blood samples taken and preserved in a frozen sera repository.

Member Health Survey

The Member Health Survey (MHS) is a mailed questionnaire survey that has been conducted with independent stratified random samples of approximately 40,000 adult Healthplan members aged 20 and over every three years starting in 1993.  The confidential survey covers sociodemographic characteristics, health status and health conditions, health-related behaviors and lifestyle risk factors, use of selected medications, use of complementary and alternative therapies, receipt of preventive services, preferred methods of receiving health information, and opinions about illness care, preventive services, and health information services.  Supplemental questions asked of members aged 65 and over focus on different aspects of functional status, current living/transportation situation and availability of helpers, medication review, and end-of-life care issues.

Member-linked Public Use Files

Linked California State and Social Security Administration (SSA) Death Records

Starting in 1992, all KPNC Healthplan members have been probabilistically linked on an annual basis to death certificate data supplied by the California Department of Health Services. For years 2001 and later, U.S. SSA records will be used to help capture out-of-state deaths. The linked records allow researchers to ascertain vital status and ICD-coded cause of death for our current and past enrolled population.
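The page does not describe the linkage algorithm itself. Purely to illustrate what "probabilistic" linkage generally means, the sketch below scores a candidate member/death-record pair with a generic Fellegi-Sunter-style match weight. The comparison fields, the m and u probabilities, and the function name are all assumptions made for this example, not the method or parameters used for the KPNC linkage.

```python
import math

# Illustrative probabilities (assumptions, not values used by DOR):
# m = P(field agrees | records belong to the same person)
# u = P(field agrees | records belong to different people)
FIELD_PARAMS = {
    "last_name": (0.95, 0.01),
    "birth_date": (0.98, 0.003),
    "sex": (0.99, 0.5),
}


def match_weight(member, death_record):
    """Sum log2 agreement/disagreement weights across the compared fields."""
    total = 0.0
    for field, (m, u) in FIELD_PARAMS.items():
        if member[field] == death_record[field]:
            total += math.log2(m / u)              # agreement adds evidence
        else:
            total += math.log2((1 - m) / (1 - u))  # disagreement subtracts it
    return total


member = {"last_name": "DOE", "birth_date": "1930-05-01", "sex": "F"}
same_person = {"last_name": "DOE", "birth_date": "1930-05-01", "sex": "F"}
one_mismatch = {"last_name": "DOE", "birth_date": "1931-05-01", "sex": "F"}
different = {"last_name": "ROE", "birth_date": "1942-11-30", "sex": "M"}
```

In a scheme of this kind, pairs scoring above an upper threshold would be accepted as links, pairs below a lower threshold rejected, and the remainder reviewed by hand; the thresholds would have to be chosen for the data at hand.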

2000 US Census Linkage

All KPNC active Healthplan members in 2000 have been geocoded by their home addresses and linked to the 2000 U.S. Census data (SF3) by their census block location.  This linkage allows ecological measures of race, ethnicity and other sociodemographic variables to be calculated for Healthplan members based on the assumption that the distribution of each characteristic is the same for Kaiser members as for other residents of each census block or block group.
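A minimal sketch of how such an ecological measure might be attached to members follows. The block-group identifiers, the poverty measure, and all values are made up for illustration; the only idea taken from the text is that each member inherits the summary statistic of the census area where he or she lives.

```python
# Hypothetical census summaries keyed by block-group identifier.
block_group_stats = {
    "BG-A": {"pct_below_poverty": 0.08},
    "BG-B": {"pct_below_poverty": 0.21},
}

# Hypothetical geocoding result: member MRN -> block group of home address.
member_block_group = {"000001": "BG-A", "000002": "BG-B", "000003": "BG-A"}


def ecological_measure(mrn, measure):
    """Return the area-level value assigned to a member, or None if not geocoded."""
    block_group = member_block_group.get(mrn)
    if block_group is None or block_group not in block_group_stats:
        return None
    return block_group_stats[block_group][measure]
```

The value returned describes the member's neighborhood, not the member, which is why the equal-distribution assumption stated above matters when interpreting results.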

Linked Births

Since 1990, records covering over 99% of all births occurring to KPNC Healthplan members have been consolidated into the Infant Cohort file, which contains the infant's birth date and time, birth weight, birth order, hospital of birth, gestational age, length of stay, sex, race/ethnicity, and name, and the mother's medical record number, name, age, race/ethnicity, and delivery length of stay. Linkage with maternal hospitalization records provides delivery information including labor and delivery complications, obstetric conditions, mode of delivery, and date and time of delivery. Risk factors for preterm labor and prenatal substance abuse are also recorded for the approximately 85% of women screened at the initiation of prenatal care. From 1995 to 1999, California State Birth records were linked into the Infant Cohort file, providing additional maternal demographic information, including mother's educational level, race, and ethnicity.


Longitudinal Datasets

Inpatient Hospitalizations

DOR maintains detailed diagnostic and procedural data for the cumulative population of Northern California Healthplan members hospitalized in KP hospitals, dating back to 1976. Diagnostic and procedural data prior to 1979 are ICD-8 coded; subsequent data are ICD-9-CM coded.

Healthplan Membership

DOR maintains a historical record, dating back to 1971, of enrollment and disenrollment dates, zip code of residence, and primary source of insurance, for each KPNC member.

Other DOR-Maintained Data

Membership Retention

Reference tables provide 10-year prospective Healthplan disenrollment rates computed from 1982 to 2002, by sex, age group, and duration of previous membership. These statistics allow researchers developing new studies to predict study cohort attrition rates due to Healthplan disenrollment.
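As a hedged example of how such a table could be applied, the sketch below estimates how many members of a planned cohort would still be enrolled after 10 years. The strata labels, rates, and cohort counts are invented for illustration and are not values from the DOR reference tables.

```python
# Hypothetical 10-year disenrollment rates keyed by
# (sex, age group, duration of previous membership).
disenrollment_rate_10yr = {
    ("F", "40-49", "5+ years"): 0.30,
    ("M", "40-49", "5+ years"): 0.35,
}

# Hypothetical planned cohort: number of members in each stratum.
cohort_counts = {
    ("F", "40-49", "5+ years"): 1000,
    ("M", "40-49", "5+ years"): 800,
}


def expected_retained(counts, rates):
    """Expected number still enrolled after 10 years, summed over strata."""
    return sum(n * (1 - rates[stratum]) for stratum, n in counts.items())


retained = expected_retained(cohort_counts, disenrollment_rate_10yr)
```

With these made-up inputs, 700 of the 1,000 women and 520 of the 800 men would be expected to remain, or about 1,220 of 1,800 members.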

No-Contact File

A cumulative list is maintained of Healthplan members who have died or indicated that they do not want to participate in research studies. This data source allows researchers to avoid inappropriate recruitment efforts.

Northern California Kaiser Permanente Data Sources Maintained Outside of DOR

Administrative and clinical databases maintained at the Northern California Regional level provide a detailed and comprehensive record of Healthplan members’ utilization of medical services and associated costs, clinical status, demographic characteristics, and insured benefits.

Utilization and Clinical Data

Hospitalization

Detailed records of each inpatient hospitalization or hospital-based encounter occurring in KPNC medical centers are available online dating back to 1986.  Data include dates and times of admission, discharge, and transfers within special units such as operating rooms, emergency department, intensive care, labor and delivery; DRGs and ICD9-CM coded diagnoses and procedures; and admitting and attending physicians. Approximately 200,000 inpatient discharges occur per year, as well as 120,000 other hospital-based procedures and encounters.

Ambulatory Visits

Approximately 14 million outpatient visits per year are documented in data retained online for 27 months.  Archived visit data is available back to 1987; diagnostic and procedural data back to 1995.  Encounter records include the date and time of each visit, the ICD9-CM and CPT-4 coded diagnoses and procedures, the specific department and subdepartment involved, and the type of provider seen.

Ancillary Services: Laboratory and Imaging

Two types of laboratory test data are available in computerized format: Clinical, which includes chemistry, hematology, microbiology and anatomic pathology; and Laboratory, which includes cytology, histology and pathology. Laboratory records include the date and time of the ordered and performed tests, ordered and performed test codes, ordering facility and caregiver, and specimen collection date and time. Test results are available as continuous values or text fields, depending on the type of test. Data are retained online for 28 months; archived data are available back to 1994.

Records of each ordered radiology procedure are available back to 1996; results back to 1997.  Data include the order date and time, completion date and time, the exam ordered coded as a CPT-4 code, and ordering facility and provider.  Results are available as text strings. For certain types of imaging, e.g., mammography,  discrete alert codes are provided to flag abnormal findings.

Pharmacy

Records of all drugs prescribed by KP physicians and dispensed in KP inpatient or outpatient pharmacies are captured back to 1994 (outpatient) and 1996 (inpatient). Information stored includes drug name, NDC code, dosage, and therapeutic class; dates of prescription, dispensing, and refills; identity of prescribing physician; and prescription cost.

Outside Referrals and Claims

Encounter and cost data for medical services authorized by KP providers but supplied by non-KP vendors is available starting in 1992. Records include the coded reason for referral; the department and identity of the referring provider; the identity and type of vendor or provider performing the referred service; ICD9-CM and CPT4 coded diagnoses and procedures; and billed and paid amounts.  Similarly, emergency or out-of-area claims for medical services are documented back to 1990. 

Immunizations

Records of all adult and pediatric inoculations, including skin tests, occurring in KP facilities since 1991 are available online in the Kaiser Immunization Tracking System, allowing researchers to examine long-term protective and adverse outcomes.

Chronic Disease Registries

KPNC maintains several registries of Healthplan members with various chronic and high risk conditions, including: Asthma, Coronary Artery Disease, Congestive Heart Failure, Diabetes, and Maternal Prenatal Drug/Alcohol Use.

Administrative Databases

Healthplan Membership

Monthly files include current Healthplan enrollment status, benefit structure, and source of insurance for each active member.  Healthplan subscribers and their covered spouses and dependents can be easily linked.

Patient Demographics

Members’ names, birthdates, sex, physical disabilities, preferred language, addresses, and contact information are included for approximately 8 million past and current members.

Service Population

Service population provides meaningful denominators for utilization analysis by incorporating information about members’ geographic utilization patterns. Service population, computed each quarter, assigns each member to a facility based on the majority of the member's primary care visits, if any; otherwise, the member is assigned to the primary facility of his or her personal care physician.
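The assignment rule can be sketched as below. This is an illustration only: the facility codes are hypothetical, "majority" is read here as the facility with the most primary care visits, and the alphabetical tie-break is an assumption, since the text does not say how ties are resolved.

```python
from collections import Counter


def assign_service_facility(primary_care_visit_facilities, pcp_facility):
    """Assign a member to a facility for one quarter.

    primary_care_visit_facilities: facility code for each primary care visit
    pcp_facility: primary facility of the member's personal care physician
    """
    if primary_care_visit_facilities:
        counts = Counter(primary_care_visit_facilities)
        top = max(counts.values())
        # Assumed tie-break: alphabetically first among the most-visited.
        return min(f for f, c in counts.items() if c == top)
    # No primary care visits: fall back to the physician's facility.
    return pcp_facility
```

For example, a member with visits at `["OAK", "OAK", "SF"]` would be assigned to `"OAK"`, while a member with no visits would be assigned to whatever facility is passed as `pcp_facility`.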

Service Costs

The KPNC cost accounting system provides fully loaded service costs for each patient encounter and provided service, computed using an activity-based costing methodology.  Fixed and variable components of cost are broken out, and costs are available based either on local  facility or regionally-averaged unit costs.

Member Patient Satisfaction Survey

The KPNC Regional Member and Patient Survey (MPS) has tracked satisfaction with service and access since 1994. The survey consists of two parts: (1) a daily stratified random sample of patients immediately following an office visit; and (2) a quarterly stratified random sample of members who have not had a visit in one year. Approximately 105,000 patient surveys and 7,000 member surveys are returned per quarter, constituting a response rate of about 48% for the patient survey, about 17% for the member survey, and a total response rate of 43%.
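The total response rate is a pooled figure, not an average of the two component rates. The short calculation below back-computes the approximate number of surveys distributed from the rounded figures quoted above and then pools them; the distributed counts are therefore estimates implied by those figures, not reported values.

```python
# Returned surveys per quarter and approximate response rates, as quoted above.
patient_returned, patient_rate = 105_000, 0.48
member_returned, member_rate = 7_000, 0.17

# Implied number of surveys distributed (estimates from rounded rates).
patient_sent = patient_returned / patient_rate
member_sent = member_returned / member_rate

# Pooled response rate: all returned surveys over all distributed surveys.
overall_rate = (patient_returned + member_returned) / (patient_sent + member_sent)
```

Because the patient survey accounts for most of the distributed surveys, the pooled rate of roughly 43% sits much closer to 48% than to 17%.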