Data source
The Clinical Practice Research Datalink (CPRD) is the world’s largest database of anonymised primary care data, spanning 674 practices in the UK [18]. The database contains the records of 11.3 million patients, 4.4 million of whom were alive and under observation on 2 July 2013, representing 6.9% of the UK population [18]. Data on patient diagnostic codes, prescriptions, referrals, and laboratory and other test results are automatically collected from each contributing practice. The protocol for this research was approved by the Independent Scientific Advisory Committee (ISAC) of the Medicines and Healthcare Products Regulatory Agency (protocol number 15_011A), and the approved protocol was made available to the journal and reviewers during peer review. Ethical approval for observational research using the CPRD with approval from ISAC has been granted by a National Research Ethics Service committee (Trent MultiResearch Ethics Committee, REC reference number 05/MRE04/87).
Study population
Patient eligibility was defined using only data uploaded by GP practices after the date they were classified as up-to-standard. Patients were required to be aged 35 years or over with T2DM on the study start date, have gender unambiguously recorded, have CPRD records that could be linked to Office for National Statistics data and have at least 2 years of uninterrupted follow-up data from their current registered general practice before the study start date for the evaluation of baseline risk factors. The presence of T2DM was established using a combination of medical codes, product codes and patient age.
Patients were excluded if, at any point prior to the study start date, their records contained medical codes for other forms of diabetes, polycystic ovary syndrome, total pancreatectomy, or pancreatic or renal transplant.
Follow-up commenced on the study start date of 1 January 2005 and terminated on the study end date of 1 January 2014. Patient follow-up ceased at the earliest of the study end date, the date of patient mortality, the date of last data collection from the patient’s practice, the date the patient transferred out of their current practice or the date the patient underwent total pancreatectomy, or pancreatic or renal transplant.
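The follow-up rule above can be illustrated with a minimal sketch. The function and argument names are hypothetical and are not taken from the study code; dates that do not apply to a patient are assumed to be represented as None.

```python
from datetime import date
from typing import Optional

STUDY_START = date(2005, 1, 1)
STUDY_END = date(2014, 1, 1)


def follow_up_end(
    death: Optional[date],
    last_collection: Optional[date],
    transfer_out: Optional[date],
    surgery: Optional[date],
) -> date:
    """Return the earliest applicable end-of-follow-up date.

    `surgery` stands for total pancreatectomy or pancreatic/renal transplant.
    """
    candidates = [STUDY_END, death, last_collection, transfer_out, surgery]
    return min(d for d in candidates if d is not None)


# Hypothetical patient who transferred out before the practice's last upload.
end = follow_up_end(
    death=None,
    last_collection=date(2013, 6, 30),
    transfer_out=date(2011, 3, 15),
    surgery=None,
)
years = (end - STUDY_START).days / 365.25  # follow-up time in years
```

Here follow-up ends at the transfer-out date because it precedes every other candidate date.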
Main exposure variable
Albuminuria status was defined according to the 2013 Kidney Disease Improving Global Outcomes guidelines [12]. Under these guidelines, albuminuria stage A1 was classified as normoalbuminuria, while albuminuria stages A2 or A3 were classified as albuminuria. The presence or absence of albuminuria in CPRD patient records was established using a combination of Read codes and test results.
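As an illustration of the staging step only, the sketch below maps a urinary albumin-to-creatinine ratio (ACR) onto the guideline categories. It assumes the exposure is derived from a numerical ACR in mg/mmol and uses the guideline cut-points of 3 and 30 mg/mmol; the function names are hypothetical, and the handling of Read codes and categorical test results is not shown because it is not specified here.

```python
def kdigo_albuminuria_stage(acr_mg_per_mmol: float) -> str:
    """Map an ACR value (mg/mmol) to albuminuria stage A1, A2 or A3."""
    if acr_mg_per_mmol < 0:
        raise ValueError("ACR cannot be negative")
    if acr_mg_per_mmol < 3:
        return "A1"
    if acr_mg_per_mmol <= 30:
        return "A2"
    return "A3"


def albuminuria_status(acr_mg_per_mmol: float) -> str:
    """Collapse the stages into the binary exposure described in the text."""
    stage = kdigo_albuminuria_stage(acr_mg_per_mmol)
    return "normoalbuminuria" if stage == "A1" else "albuminuria"


statuses = [albuminuria_status(x) for x in (1.2, 3.0, 45.0)]
```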
Study adjustment variables
Data on the following adjustment variables were extracted from CPRD for each patient: age, gender, body mass index (BMI), smoking status, systolic blood pressure (SBP), glycated haemoglobin (HbA1c) and total-to-high-density-lipoprotein (Total:HDL) cholesterol ratio.
Outcomes
Outcome data were established using International Classification of Diseases version 10 (ICD-10) codes for the ‘underlying cause’ of death listed on UK Office for National Statistics death certificates. Within these, cardiovascular mortality was defined as death with ICD-10 codes I10-I79, cancer mortality was defined as death with ICD-10 codes C00-C97 and other mortality as death with ICD-10 codes other than those listed for cardiovascular or cancer mortality.
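The outcome definitions above amount to a lookup on the three-character ICD-10 category. A minimal sketch follows, assuming codes arrive as strings such as "I21.9"; the function name is hypothetical.

```python
import re


def classify_cause(icd10: str) -> str:
    """Classify an underlying-cause ICD-10 code into the three study outcomes."""
    match = re.match(r"^([A-Z])(\d{2})", icd10.strip().upper())
    if match is None:
        raise ValueError(f"Unrecognised ICD-10 code: {icd10!r}")
    chapter, category = match.group(1), int(match.group(2))
    if chapter == "I" and 10 <= category <= 79:
        return "cardiovascular"
    if chapter == "C" and category <= 97:
        return "cancer"
    return "other"


causes = [classify_cause(code) for code in ("I21.9", "C34", "J44.0", "I05")]
```

Note that a code such as I05 falls outside the I10-I79 range and is therefore counted as other mortality under this definition.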
Summary analysis
Baseline patient characteristics were compared between albuminuria status groups using unpaired two-tailed t tests [19] for continuous variables and Fisher’s exact test [20] for categorical variables. All p values were adjusted using Bonferroni’s method [21,22,23] to correct for multiple comparisons.
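For reference, the Bonferroni adjustment multiplies each p value by the number of comparisons and caps the result at 1. A short sketch with made-up p values:

```python
def bonferroni(p_values):
    """Return Bonferroni-adjusted p values, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical unadjusted p values from three baseline comparisons.
adjusted = bonferroni([0.01, 0.04, 0.5])
```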
Log-rank tests [24,25,26,27] were used to appraise the equality of the survival functions for each albuminuria status towards each outcome, while Gray’s K-sample test [28] was used to appraise the equality of the cumulative incidence functions for each albuminuria status towards each outcome.
Estimators and models
Absolute risk
The complement of the Kaplan-Meier estimate of survival probability (1-KM) [1], herein referred to as the ‘Kaplan-Meier method’, estimates marginal risk: the cumulative risk by time t is an estimate of the risk of failure from a specific cause in the hypothetical case that all other causes of failure are absent. Briefly, the Kaplan-Meier method achieves this by removing from the at-risk set, at any instant, individuals who have previously experienced either the event of interest or any censoring event that prevents observation of the event of interest [4]. Competing outcomes are not considered except as a form of censoring.
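A minimal pure-Python sketch of the 1-KM calculation is given below for illustration. The data are invented, and the function name and the coding of causes (0 for censoring, 1 for the event of interest, 2 for a competing event) are assumptions. Competing events are passed in as censored observations, as described above.

```python
def one_minus_km(times, events):
    """Return [(time, 1 - KM survival)] at each time with an event of interest.

    events[i] is True for the event of interest and False for any form of
    censoring, which here includes competing events.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = n_leaving = 0
        # Gather everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            n_events += int(data[i][1])
            n_leaving += 1
            i += 1
        if n_events > 0:
            survival *= 1 - n_events / n_at_risk
            curve.append((t, 1 - survival))
        n_at_risk -= n_leaving
    return curve


times = [1, 2, 3, 4, 5]
causes = [1, 2, 1, 0, 1]  # 0 = censored, 1 = event of interest, 2 = competing
km_curve = one_minus_km(times, [c == 1 for c in causes])
```

In this toy example the estimated risk reaches 1 by the last event time, because the individual lost to the competing event is simply dropped from the at-risk set.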
The cumulative incidence competing risk (CICR) [3] method estimates absolute risk accounting for competing risks: the cumulative risk by time t is an estimate of the risk of failure from a specific cause, acknowledging that the absolute risk of the event is lowered by the presence of other competing risks. Individuals are removed from the at-risk set, at any instant, only if they have previously experienced the primary event or any censoring event that is explicitly assumed to be non-informative; they are retained in the at-risk set if they have experienced a competing outcome [4].
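The sketch below illustrates one common way to compute a cumulative incidence function, in which the cause-specific event fraction at each time is weighted by the probability of being free of all events just before that time. Treating this as the formulation behind the CICR estimates is an assumption, as are the function name, the invented data and the coding of causes (0 for censoring, positive integers for event types).

```python
def cumulative_incidence(times, causes, cause):
    """Return [(time, cumulative incidence)] for one cause of failure."""
    data = sorted(zip(times, causes))
    n_at_risk = len(data)
    event_free = 1.0  # probability of no event of any type so far
    cif = 0.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        d_cause = d_any = n_leaving = 0
        # Gather everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            c = data[i][1]
            d_cause += int(c == cause)
            d_any += int(c != 0)
            n_leaving += 1
            i += 1
        if d_cause > 0:
            cif += event_free * d_cause / n_at_risk
            curve.append((t, cif))
        event_free *= 1 - d_any / n_at_risk
        n_at_risk -= n_leaving
    return curve


times = [1, 2, 3, 4, 5]
causes = [1, 2, 1, 0, 1]  # 0 = censored, 1 = primary event, 2 = competing
cif_primary = cumulative_incidence(times, causes, cause=1)
cif_competing = cumulative_incidence(times, causes, cause=2)
```

With these data the cumulative incidence of the primary event ends at about 0.8 and that of the competing event at about 0.2, so the two risks sum to at most 1, whereas treating the competing event as censoring would push the primary risk to 1.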
Measures of association
The Cox proportional hazards (Cox-PH) model [2] estimates cause-specific relative hazard: the ratio of the instantaneous risk in at-risk individuals with one exposure status to the instantaneous risk in at-risk individuals with another exposure status. To obtain its estimate of the cause-specific hazard ratio, the Cox proportional hazards model assumes all individuals under observation experience either the primary outcome or non-informative censoring [29]. The unstratified Lunn-McNeil competing risk model [6], herein referred to as the ‘Lunn-McNeil model’, also estimates the cause-specific hazard ratio but allows for the modelling of non-informative censoring mechanisms as competing outcomes, while assuming a common baseline hazard distribution between outcomes. In these estimates of cause-specific relative hazard, individuals are considered to be at-risk at any instant if they have not yet experienced any of the study outcomes. When using the Lunn-McNeil model to evaluate cardiovascular mortality, cancer mortality and other mortality were modelled as separate competing outcomes. When using the Lunn-McNeil model to evaluate cancer mortality, cardiovascular mortality and other mortality were modelled as separate competing outcomes.
The Fine-Gray competing risk model [7] estimates the subdistribution hazard ratio: the ratio of the instantaneous risks defined as above, except that individuals are considered to be at-risk if they have not yet experienced the primary outcome [29]. Individuals are retained in the at-risk set if they have previously experienced competing risk events, analogously to the CICR method for absolute risk. When using the Fine-Gray model to evaluate cardiovascular mortality, non-cardiovascular mortality was modelled as a single competing outcome. When using the Fine-Gray model to evaluate cancer mortality, non-cancer mortality was modelled as a single competing outcome.
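The difference between the two at-risk definitions can be made concrete with a small sketch. The function name, the invented data and the coding of causes (0 for censoring, positive integers for event types) are assumptions, and the censoring weights used when fitting a Fine-Gray model are deliberately omitted.

```python
def risk_sets(times, causes, primary, t):
    """Return (cause-specific, subdistribution) risk sets at time t.

    Each risk set is a list of indices into `times`.
    """
    cause_specific = []
    subdistribution = []
    for i, (time_i, cause_i) in enumerate(zip(times, causes)):
        still_event_free = time_i >= t
        had_competing_event = time_i < t and cause_i not in (0, primary)
        if still_event_free:
            cause_specific.append(i)
        # Subdistribution view: earlier competing events stay at risk.
        if still_event_free or had_competing_event:
            subdistribution.append(i)
    return cause_specific, subdistribution


times = [1, 2, 3, 4, 5]
causes = [1, 2, 1, 0, 1]  # 0 = censored, 1 = primary event, 2 = competing
cs_set, sd_set = risk_sets(times, causes, primary=1, t=3)
```

At t = 3 the individual with index 1, who experienced the competing event at time 2, has left the cause-specific risk set but remains in the subdistribution risk set.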
When competing risks are present, the different risk sets employed by cause-specific hazard models (like the Cox-PH or Lunn-McNeil model) and subdistribution hazard models (like the Fine-Gray model) give rise to different measures of association. The cause-specific hazard ratio may be thought of as a measure of ‘aetiological association’, i.e. best suited to quantifying causal relationships. Conversely, the subdistribution hazard ratio may be thought of as a measure of ‘prognostic association’, i.e. best suited to quantifying predictive relationships [30].
The proportional cause-specific hazard assumptions of the Cox-PH and Lunn-McNeil models were assessed using Schoenfeld residuals [31]. Schoenfeld-type residuals [32] were used to assess the proportional subdistribution hazard assumption of the Fine-Gray models.
Sensitivity analyses
Sensitivity analyses were conducted to assess the robustness of all risk estimates to the misclassification of cause of mortality on patient death certificates. Within these analyses, mortality was reclassified as being attributable to either cardiovascular disease or cancer if the competing cause was listed as a contributing factor. The potential misspecification of the study primary exposure was assessed by re-assigning patient baseline albuminuria status using only Read codes, numerical test values or categorical test values. Additionally, the robustness of estimates of absolute risk was evaluated through the use of an alternate cumulative risk estimator (the Nelson-Aalen estimator), while the robustness of estimates of relative risk was evaluated through the inclusion of time-interaction terms to correct for violations of the proportionality assumption, and the use of alternate parameterisations of the Lunn-McNeil model, in which competing mortality was restructured into a single outcome, representing all mortality not attributable to the primary outcome.
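For illustration, the Nelson-Aalen estimator accumulates the ratio of events to individuals at risk at each event time. The sketch below uses invented data and a hypothetical function name, and it assumes that cumulative risk is obtained from the cumulative hazard H(t) as 1 - exp(-H(t)); how the estimator was applied in the sensitivity analysis is not specified here.

```python
import math


def nelson_aalen(times, events):
    """Return [(time, cumulative hazard, 1 - exp(-cumulative hazard))]."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    cum_hazard = 0.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = n_leaving = 0
        # Gather everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            n_events += int(data[i][1])
            n_leaving += 1
            i += 1
        if n_events > 0:
            cum_hazard += n_events / n_at_risk
            curve.append((t, cum_hazard, 1 - math.exp(-cum_hazard)))
        n_at_risk -= n_leaving
    return curve


times = [1, 2, 3, 4, 5]
events = [True, False, True, False, True]  # False = censored
na_curve = nelson_aalen(times, events)
```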