Glossary Entry

Cross-Validation

An evaluation technique that repeatedly splits the data into training and validation folds so every observation is used for both fitting and scoring.

Training Generalization

Also called: cross validation, k-fold cross-validation, k-fold CV

Seed source: scikit-learn documentation

The most common form is k-fold cross-validation: split the data into k folds, train on k-1 of them, validate on the held-out fold, and rotate until every fold has served as the validation set once. Averaging the k scores gives a more stable performance estimate than a single train/validation split, at the cost of fitting the model k times.

It is the standard scoring backbone for hyperparameter tuning, since comparing candidate configurations on one lucky split can easily mislead. The final test set should still be kept outside the whole procedure for an honest last measurement.