ForestSearch Repeated K-Fold Cross-Validation
Source: R/forestsearch_cross_validation.R
forestsearch_tenfold.Rd
This function performs multiple independent K-fold cross-validations to assess the variability in subgroup identification. Each simulation:
- Randomly shuffles the data
- Performs K-fold CV
- Records sensitivity and agreement metrics
Results are summarized across all simulations.
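The shuffle-then-split step can be sketched in base R as follows. This is a minimal illustration, not the package's code: the helper `assign_folds()`, the sample size, and the seed are all hypothetical.

```r
# Hypothetical helper: assign n rows to K folds after a random shuffle.
assign_folds <- function(n, K) {
  shuffled <- sample.int(n)                    # random permutation of row indices
  fold_id <- integer(n)
  fold_id[shuffled] <- rep_len(seq_len(K), n)  # deal shuffled rows into folds in turn
  fold_id
}

set.seed(1)  # arbitrary seed, used only for this sketch
n <- 103
K <- 10

# Three independent repetitions, each with its own shuffle
folds_by_sim <- lapply(1:3, function(ksim) assign_folds(n, K))

# Fold sizes differ by at most one row
fold_sizes <- table(folds_by_sim[[1]])
print(fold_sizes)
```

Because every repetition reshuffles the rows, the same observation generally lands in a different fold each time, which is what lets the repetitions expose variability in the identified subgroup.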
Usage
forestsearch_tenfold(
fs.est,
sims,
Kfolds = 10,
details = TRUE,
seed = 8316951L,
parallel_args = list(plan = "multisession", workers = 6, show_message = TRUE)
)
Arguments
- fs.est
List. ForestSearch results object from forestsearch.
- sims
Integer. Number of simulation repetitions.
- Kfolds
Integer. Number of folds per simulation (default: 10).
- details
Logical. Print progress details (default: TRUE).
- seed
Integer. Base random seed for fold shuffling. Default 8316951L. Each simulation uses seed + 1000 * ksim for reproducibility.
- parallel_args
List. Parallelization configuration (default: list(plan = "multisession", workers = 6, show_message = TRUE)).
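The per-simulation seed rule described for seed can be illustrated with a short sketch. It assumes ksim runs from 1 to sims, which the documentation does not state; `shuffle_for_sim()` is a hypothetical helper, not a package function.

```r
base_seed <- 8316951L  # documented default for `seed`
sims <- 3L

# Assumption: ksim indexes simulations as 1, 2, ..., sims
sim_seeds <- base_seed + 1000L * seq_len(sims)
print(sim_seeds)

# Hypothetical helper: reshuffle n row indices for simulation ksim
shuffle_for_sim <- function(ksim, n) {
  set.seed(base_seed + 1000L * ksim)
  sample.int(n)
}

# The same ksim always yields the same shuffle
first_run <- shuffle_for_sim(2L, 20L)
second_run <- shuffle_for_sim(2L, 20L)
stopifnot(identical(first_run, second_run))
```

Deriving each seed from the base seed and the simulation index means a single repetition can be re-run in isolation and still reproduce its original fold assignment.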
Value
List with components:
- sens_summary
Named vector of median sensitivity metrics across simulations
- find_summary
Named vector of median subgroup-finding metrics
- sens_out
Matrix of sensitivity metrics (sims x metrics)
- find_out
Matrix of finding metrics (sims x metrics)
- timing_minutes
Total execution time, in minutes
- sims
Number of simulations run
- Kfolds
Number of folds per simulation
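The relationship between the per-simulation matrices and their summaries can be sketched as a column-wise median. The metric names and values below are invented for illustration, and the package may handle missing values or naming differently.

```r
# Hypothetical sims x metrics matrix (3 simulations, 2 made-up metrics)
sens_out <- matrix(
  c(0.80, 0.70,
    0.90, 0.60,
    0.85, 0.75),
  nrow = 3, byrow = TRUE,
  dimnames = list(NULL, c("metric_a", "metric_b"))
)

# One median per metric, taken across simulations (rows)
sens_summary <- apply(sens_out, 2, median)
print(sens_summary)
```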
Details
Runs repeated K-fold cross-validation simulations for ForestSearch and summarizes subgroup identification stability across repetitions.
Parallelization Strategy
Unlike the single K-fold function (forestsearch_Kfold), which parallelizes across folds, this function parallelizes across simulations, which is more efficient when running many repetitions. Within each simulation, the K folds are processed sequentially.
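A rough sketch of this outer-parallel, inner-sequential layout is shown below. It assumes the future and future.apply packages are the backend (suggested by plan = "multisession", but not confirmed here) and falls back to a plain lapply when they are not installed. `run_one_sim()` is a hypothetical stand-in for one simulation, with the per-fold work replaced by a placeholder.

```r
# Hypothetical stand-in for one simulation: shuffle, then loop over folds sequentially
run_one_sim <- function(ksim, n = 50L, K = 5L) {
  fold_id <- sample(rep_len(seq_len(K), n))  # shuffled fold labels
  # Inner loop over folds is sequential; real per-fold fitting is replaced by a count
  fold_sizes <- vapply(seq_len(K), function(k) sum(fold_id == k), integer(1))
  list(ksim = ksim, fold_sizes = fold_sizes)
}

sims <- 4L

if (requireNamespace("future.apply", quietly = TRUE)) {
  # Outer loop over simulations runs on parallel workers
  old_plan <- future::plan(future::multisession, workers = 2)
  results <- future.apply::future_lapply(
    seq_len(sims), run_one_sim,
    future.seed = TRUE  # parallel-safe random number streams
  )
  future::plan(old_plan)  # restore the previous plan
} else {
  # Sequential fallback when the future backend is unavailable
  results <- lapply(seq_len(sims), run_one_sim)
}

length(results)
```

With many repetitions, each worker handles whole simulations, so the number of workers can be used fully even when Kfolds is small.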
See also
forestsearch_Kfold for single K-fold CV
forestsearch_KfoldOut for summarizing CV results