Creates a data generating mechanism (DGM) for survival simulations based on the German Breast Cancer Study Group (GBSG) dataset. Supports heterogeneous treatment effects via treatment-subgroup interactions.
Arguments
- model
Character. Either "alt" for alternative hypothesis with heterogeneous treatment effects, or "null" for uniform treatment effect. Default: "alt"
- k_treat
Numeric. Treatment effect multiplier applied to the treatment coefficient from the fitted AFT model. Values > 1 strengthen the treatment effect. Default: 1
- k_inter
Numeric. Interaction effect multiplier for the treatment-subgroup interaction (z1 * z3). Only used when model = "alt". Higher values create more heterogeneity between HR(H) and HR(Hc). Default: 1
- k_z3
Numeric. Effect multiplier for the z3 (menopausal status) coefficient. Default: 1
- z1_quantile
Numeric. Quantile threshold for z1 (estrogen receptor). Observations with ER <= quantile are coded as z1 = 1. Default: 0.25
- n_super
Integer. Size of super-population for empirical HR estimation. Default: 5000
- cens_type
Character. Censoring distribution type: "weibull" or "uniform". Default: "weibull"
- use_rand_params
Logical. If TRUE, modifies the confounder coefficients using estimates from the randomized subset (meno == 0). Default: FALSE
- seed
Integer. Random seed for super-population generation. Default: 8316951
- verbose
Logical. Print diagnostic information. Default: FALSE
Value
A list of class "gbsg_dgm" containing:
- df_super_rand
Data frame with randomized super-population including potential outcomes (theta_0, theta_1, loghr_po)
- hr_H_true
Empirical hazard ratio in harm subgroup (Cox-based)
- hr_Hc_true
Empirical hazard ratio in complement subgroup (Cox-based)
- hr_causal
Overall causal (ITT) hazard ratio (Cox-based)
- AHR
Overall average hazard ratio (from loghr_po)
- AHR_H_true
Average hazard ratio in harm subgroup
- AHR_Hc_true
Average hazard ratio in complement subgroup
- hazard_ratios
List matching generate_aft_dgm_flex output format
- model_params
List with AFT model parameters (mu, sigma, gamma, etc.)
- cens_params
List with censoring model parameters
- subgroup_info
List with subgroup definitions and true factor names
- analysis_vars
Character vector of analysis variable names
- model_type
Character indicating "alt" or "null"
Details
This version is aligned with generate_aft_dgm_flex() and
calculate_hazard_ratios() methodology, computing individual-level
potential outcomes and average hazard ratios (AHR).
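As a rough illustration of the potential-outcome columns (theta_0, theta_1, loghr_po), the sketch below assumes that loghr_po is the individual-level difference theta_1 - theta_0 and that an average hazard ratio is the exponentiated mean of loghr_po over the relevant population. This page does not spell out either formula, so treat both as assumptions. The data and the variable names n, H, AHR_H and AHR_Hc are made up for the example.

```r
# Toy super-population; every number here is hypothetical
set.seed(2)
n <- 1000
H <- rbinom(n, 1, 0.15)              # harm-subgroup indicator (1 = in H)

# Assumed log-hazard linear predictors under control and under treatment
theta_0 <- rnorm(n, mean = 0, sd = 0.3)
theta_1 <- theta_0 - 0.4 + 0.9 * H   # benefit outside H, harm inside H

# Assumed definition: individual-level log hazard ratio
loghr_po <- theta_1 - theta_0

# Assumed definition: AHR = exp(mean of individual log hazard ratios)
AHR    <- exp(mean(loghr_po))
AHR_H  <- exp(mean(loghr_po[H == 1]))
AHR_Hc <- exp(mean(loghr_po[H == 0]))

round(c(AHR = AHR, AHR_H = AHR_H, AHR_Hc = AHR_Hc), 3)
```

Under these assumptions the overall AHR always lies between the two subgroup values, because it averages the same individual log hazard ratios over both groups.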
Subgroup Definition
The harm subgroup H is defined as: z1 = 1 AND z3 = 1, where:
z1: Low estrogen receptor (ER <= 25th percentile by default)
z3: Premenopausal status (meno == 0)
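The subgroup rule above can be sketched in base R. This is a minimal illustration on simulated values, not the package's internal code: the data frame toy, its column er and the simulated distributions are assumptions, while meno == 0 and the 0.25 quantile default come from the definitions above.

```r
# Toy covariates standing in for the GBSG data (values are simulated)
set.seed(1)
toy <- data.frame(
  er   = rexp(200, rate = 1 / 100),  # hypothetical estrogen receptor levels
  meno = rbinom(200, 1, 0.5)         # 0 = premenopausal, 1 = postmenopausal
)

z1_quantile <- 0.25
er_cut <- quantile(toy$er, probs = z1_quantile)  # ER threshold at the chosen quantile

toy$z1 <- as.integer(toy$er <= er_cut)           # low estrogen receptor
toy$z3 <- as.integer(toy$meno == 0)              # premenopausal
toy$H  <- as.integer(toy$z1 == 1 & toy$z3 == 1)  # harm subgroup: z1 = 1 AND z3 = 1

table(H = toy$H)
```

Raising z1_quantile widens the low-ER group and therefore enlarges H.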
Model Specification
The AFT model uses covariates: treat, z1, z2, z3, z4, z5, and (for "alt") the interaction zh = treat * z1 * z3.
Interaction Effect (k_inter)
The k_inter parameter modifies the zh coefficient in the AFT model:
gamma[zh] <- k_inter * gamma[zh]
This affects the hazard ratio for the harm subgroup:
HR(H) = exp(-gamma[treat]/sigma - gamma[zh]/sigma)
HR(Hc) = exp(-gamma[treat]/sigma)
When k_inter = 0, HR(H) = HR(Hc) (no heterogeneity).
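The formulas above can be evaluated directly to see how k_inter moves HR(H) while leaving HR(Hc) unchanged. The values of sigma, gamma_treat and gamma_zh below are hypothetical and chosen only for illustration; they are not estimates from the GBSG fit, and hr_by_k_inter is a helper defined here, not a package function.

```r
# Hypothetical AFT parameters (not taken from the fitted model)
sigma       <- 0.8    # AFT scale parameter
gamma_treat <- 0.4    # treatment coefficient on the log-time scale
gamma_zh    <- -0.6   # fitted treat * z1 * z3 interaction coefficient

# Apply the k_inter multiplier, then evaluate the two HR formulas
hr_by_k_inter <- function(k_inter) {
  gamma_zh_scaled <- k_inter * gamma_zh
  c(
    HR_H  = exp(-gamma_treat / sigma - gamma_zh_scaled / sigma),
    HR_Hc = exp(-gamma_treat / sigma)
  )
}

# One column per multiplier; the k0 column has identical HR_H and HR_Hc
sapply(c(k0 = 0, k1 = 1, k2 = 2), hr_by_k_inter)
```

With a negative gamma[zh], as assumed here, larger k_inter pushes HR(H) upward, which is the direction needed to create a harm subgroup.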
See also
simulate_from_gbsg_dgm for generating data from the DGM
calibrate_k_inter for finding the k_inter value that achieves a target HR
Examples
if (FALSE) { # \dontrun{
# Alternative hypothesis with default parameters
dgm_alt <- create_gbsg_dgm(model = "alt", verbose = TRUE)
# Null hypothesis
dgm_null <- create_gbsg_dgm(model = "null", verbose = TRUE)
# Custom subgroup HR via k_inter
dgm_custom <- create_gbsg_dgm(
model = "alt",
k_treat = 1.2,
k_inter = 2.0,
verbose = TRUE
)
# Access AHR metrics (aligned with generate_aft_dgm_flex)
dgm_alt$hazard_ratios$AHR_harm
dgm_alt$hazard_ratios$AHR_no_harm
} # }