Calibrate Censoring Adjustment to Match DGM Reference Distribution
Source: R/simulate_from_dgm.R (calibrate_cens_adjust.Rd)
Uses root-finding to select a value of cens_adjust for
simulate_from_dgm such that a chosen censoring summary
statistic in the simulated data matches the corresponding statistic from
the DGM reference data (dgm$df_super).
Arguments
- dgm
  An "aft_dgm_flex" object from generate_aft_dgm_flex.
- target
  Character. Calibration target: "rate" (default) or "km_median".
- n
  Integer. Sample size passed to simulate_from_dgm. Default 1000.
- rand_ratio
  Numeric. Randomisation ratio passed to simulate_from_dgm. Default 1.
- analysis_time
  Numeric. Calendar analysis time passed to simulate_from_dgm. Must be on the DGM time scale. Default 48.
- max_entry
  Numeric. Maximum staggered entry time passed to simulate_from_dgm. Default 24.
- seed
  Integer. Base random seed. Each evaluation of the objective function uses this seed for reproducibility. Default 42.
- interval
  Numeric vector of length 2. Search interval for cens_adjust on the log scale. Default c(-3, 3) (corresponding roughly to a 20-fold decrease/increase in censoring times).
- tol
  Numeric. Root-finding tolerance. Default 1e-4.
- n_eval
  Integer. Sample size used inside the objective function during root-finding. Smaller values are faster but noisier; increase for precision. Default 2000.
- verbose
  Logical. Print search progress and final result. Default TRUE.
- ...
  Additional arguments passed to simulate_from_dgm (e.g. strata_rand, time_eos).
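As a quick check of the default interval, and assuming cens_adjust acts as a log multiplier on censoring times (which is how the description of interval reads), the endpoints can be converted to multiplicative factors:

```r
# Default search interval on the log scale
interval <- c(-3, 3)

# Assumed interpretation: censoring times are scaled by exp(cens_adjust)
factors <- exp(interval)

# Lower factor is about 1/20 and upper factor about 20
round(factors, 3)
```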
Value
A named list with elements:
- cens_adjust
  Calibrated cens_adjust value.
- target
  Calibration target used.
- ref_value
  Reference metric value from dgm$df_super.
- sim_value
  Achieved metric value in simulated data at the calibrated cens_adjust.
- residual
  Absolute difference between sim_value and ref_value.
- iterations
  Number of uniroot iterations.
- diagnostic
  Output of check_censoring_dgm at the calibrated value (invisibly).
Details
Two calibration targets are supported:
- "rate"
  Overall censoring rate (proportion censored). Finds cens_adjust such that mean(event_sim == 0) in simulated data equals mean(event == 0) in dgm$df_super.
- "km_median"
  KM-based median censoring time, estimated by reversing the event indicator so censored observations become the "event" of interest. Finds cens_adjust such that the simulated KM median matches the reference KM median.
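The two metrics can be illustrated on toy data. This is a minimal sketch, not the package's internal code: the data frame toy, its columns time and event, and the exponential rates are all made up for illustration, and the KM median is computed with survival::survfit on the reversed indicator.

```r
library(survival)

# Hypothetical toy data: exponential event and censoring times
set.seed(1)
n <- 500
event_time <- rexp(n, rate = 1 / 30)
cens_time  <- rexp(n, rate = 1 / 40)
toy <- data.frame(
  time  = pmin(event_time, cens_time),
  event = as.integer(event_time <= cens_time)
)

# Target "rate": proportion of censored observations
cens_rate <- mean(toy$event == 0)

# Target "km_median": reverse the indicator so censoring is the "event",
# then read the median off the Kaplan-Meier fit
fit_cens <- survfit(Surv(time, 1 - event) ~ 1, data = toy)
km_median_cens <- unname(summary(fit_cens)$table["median"])

cens_rate
km_median_cens
```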
How the objective function works
At each candidate cens_adjust value, the objective function:
- Calls simulate_from_dgm() with n = n_eval and the candidate cens_adjust.
- Calls check_censoring_dgm() with verbose = FALSE to extract the target metric.
- Returns sim_metric - ref_metric.
uniroot finds the zero crossing, i.e. the cens_adjust at
which simulated and reference metrics are equal.
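The steps above can be sketched with a self-contained stand-in. Here toy_simulate is a hypothetical replacement for simulate_from_dgm(), ref_rate is an arbitrary reference value, and the assumption that censoring times are multiplied by exp(cens_adjust) is inferred from the description of interval rather than taken from the package source.

```r
# Hypothetical stand-in simulator (not the package function)
toy_simulate <- function(n, cens_adjust, seed) {
  set.seed(seed)  # same seed at every evaluation, as described for `seed`
  event_time <- rexp(n, rate = 1 / 30)
  cens_time  <- rexp(n, rate = 1 / 40) * exp(cens_adjust)  # assumed scaling
  data.frame(
    y_sim     = pmin(event_time, cens_time),
    event_sim = as.integer(event_time <= cens_time)
  )
}

ref_rate <- 0.30  # arbitrary reference censoring rate for this sketch

# Objective: simulated metric minus reference metric
objective <- function(cens_adjust) {
  sim <- toy_simulate(n = 2000, cens_adjust = cens_adjust, seed = 42)
  mean(sim$event_sim == 0) - ref_rate
}

# uniroot locates the sign change of the objective within the interval
fit <- uniroot(objective, interval = c(-3, 3), tol = 1e-4)
fit$root  # calibrated value for the toy simulator
fit$iter  # number of uniroot iterations
```

For this exponential toy model the exact answer is log(1.75), so the returned root can be compared against it to gauge the Monte Carlo error.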
Monotonicity
The objective is monotone in cens_adjust for both targets:
- Larger cens_adjust → longer censoring times → lower censoring rate and higher KM median.
- Smaller cens_adjust → shorter censoring times → higher censoring rate and lower KM median.
If uniroot fails because the objective has the same sign at both ends of
the search interval (the target lies outside the interval), the boundary
values are printed and a wider interval should be tried.
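A bracketing check of this kind might look as follows. This is a sketch under assumptions: toy_rate is a closed-form stand-in for the simulated censoring rate (exponential event and censoring times, with censoring scaled by exp(cens_adjust)), and the wider interval c(-6, 6) is an arbitrary choice for illustration.

```r
# Closed-form censoring rate for a hypothetical exponential toy model
toy_rate <- function(cens_adjust) 30 / (30 + 40 * exp(cens_adjust))

ref_rate <- 0.01  # very low target rate, chosen to fall outside c(-3, 3)
f <- function(cens_adjust) toy_rate(cens_adjust) - ref_rate

interval <- c(-3, 3)
f_lo <- f(interval[1])
f_hi <- f(interval[2])

# uniroot needs opposite signs at the two boundaries
if (f_lo * f_hi > 0) {
  message("No sign change on [", interval[1], ", ", interval[2], "]: ",
          "f(lower) = ", signif(f_lo, 3), ", f(upper) = ", signif(f_hi, 3))
  interval <- c(-6, 6)  # widen the search interval and retry
}

root <- uniroot(f, interval = interval, tol = 1e-4)$root
root
```

Because the objective is monotone, a same-sign result at both boundaries also tells you which direction to widen: both positive here means the censoring rate is still too high at the upper bound, so larger cens_adjust values are needed.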
Stochastic noise
Because the objective function involves simulation, there is Monte Carlo
noise. Setting a fixed seed and a sufficiently large n_eval
(>= 2000) reduces noise enough for reliable root-finding. The
tol argument controls the root-finding tolerance on the
cens_adjust scale (not the metric scale).
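The effect of n_eval on noise can be seen in a small experiment. This sketch uses a hypothetical stand-in, toy_rate, instead of simulate_from_dgm(), and varies the seed only to expose the Monte Carlo spread that a fixed seed would otherwise hide.

```r
# Hypothetical stand-in: simulated censoring rate at cens_adjust = 0
toy_rate <- function(n_eval, seed) {
  set.seed(seed)
  event_time <- rexp(n_eval, rate = 1 / 30)
  cens_time  <- rexp(n_eval, rate = 1 / 40)
  mean(event_time > cens_time)
}

# Spread of the metric across seeds for a given n_eval
noise_sd <- function(n_eval, seeds = 1:50) {
  sd(vapply(seeds, function(s) toy_rate(n_eval, s), numeric(1)))
}

sd_small <- noise_sd(200)
sd_large <- noise_sd(2000)

# The larger n_eval should give a visibly smaller spread
c(n_eval_200 = sd_small, n_eval_2000 = sd_large)
```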
Examples
if (FALSE) { # \dontrun{
library(survival)

# Build DGM on months scale
gbsg$time_months <- gbsg$rfstime / 30.4375
dgm <- generate_aft_dgm_flex(
  data = gbsg,
  continuous_vars = c("age", "size", "nodes", "pgr", "er"),
  factor_vars = c("meno", "grade"),
  outcome_var = "time_months",
  event_var = "status",
  treatment_var = "hormon",
  subgroup_vars = c("er", "meno"),
  subgroup_cuts = list(er = 20, meno = 0)
)

# Calibrate so simulated censoring rate matches reference
cal_rate <- calibrate_cens_adjust(
  dgm = dgm,
  target = "rate",
  n = 1000,
  analysis_time = 84,
  max_entry = 24
)
cat("Calibrated cens_adjust (rate):", cal_rate$cens_adjust, "\n")

# Calibrate to KM median censoring time instead
cal_km <- calibrate_cens_adjust(
  dgm = dgm,
  target = "km_median",
  n = 1000,
  analysis_time = 84,
  max_entry = 24
)
cat("Calibrated cens_adjust (km_median):", cal_km$cens_adjust, "\n")

# Use calibrated value in simulation
sim <- simulate_from_dgm(
  dgm = dgm,
  n = 1000,
  analysis_time = 84,
  max_entry = 24,
  cens_adjust = cal_rate$cens_adjust,
  seed = 123
)
mean(sim$event_sim)      # event rate
mean(sim$event_sim == 0) # censoring rate; should match the reference
} # }