TimeSliceCrossValidator.run
- TimeSliceCrossValidator.run(X, y, sampler_config=None, yaml_path=None, mmm=None, model_names=None, original_scale_vars=None, df_lift_test=None, lift_test_date_column=None, return_models=False)
- Overloads:
self, X (pd.DataFrame), y (pd.Series), sampler_config (dict[str, Any] | None), yaml_path (str | None), mmm (MMMBuilder | None), model_names (list[str] | None), original_scale_vars (list[str] | None), df_lift_test (pd.DataFrame | None), lift_test_date_column (str | None), return_models (Literal[False]) → az.InferenceData
self, X (pd.DataFrame), y (pd.Series), sampler_config (dict[str, Any] | None), yaml_path (str | None), mmm (MMMBuilder | None), model_names (list[str] | None), original_scale_vars (list[str] | None), df_lift_test (pd.DataFrame | None), lift_test_date_column (str | None), return_models (Literal[True]) → tuple[az.InferenceData, list[MMMBuilder]]
Run the complete time-slice cross-validation loop.
Executes cross-validation by iterating through all folds, fitting a model for each training set, and generating predictions on the combined train+test data.
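The loop described above can be pictured with a small, library-independent sketch. It assumes an expanding training window that starts at n_init periods and advances by a hypothetical step of one period per fold. The function name expanding_window_folds and its step argument are illustrative only and are not part of TimeSliceCrossValidator; the folds actually used by run come from split.

```python
def expanding_window_folds(n_periods, n_init, forecast_horizon, step=1):
    """Yield (train_positions, test_positions) for an expanding training window.

    Illustrative only: the window rule and `step` are assumptions, not the
    library's split logic.
    """
    train_end = n_init
    while train_end + forecast_horizon <= n_periods:
        train = list(range(train_end))
        test = list(range(train_end, train_end + forecast_horizon))
        yield train, test
        train_end += step


folds = list(expanding_window_folds(n_periods=13, n_init=10, forecast_horizon=2))

for train, test in folds:
    # A model would be fitted on `train`, then asked to predict on train + test.
    combined = train + test
    print(len(train), test, len(combined))
```

Each fold only ever trains on periods that precede its test window, which is what makes the procedure suitable for time-ordered data.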
- Parameters:
- X
pd.DataFrame Feature matrix containing the date column and predictor variables.
- y
pd.Series Target variable.
- sampler_config
dict, optional Sampler configuration to override the validator-level configuration for all folds in this run. If provided, takes precedence over the configuration passed at construction time.
- yaml_path
str, optional Path to a YAML configuration file for building the MMM model per fold. Mutually exclusive with mmm.
- mmm
object, optional An MMM instance or a builder/factory object with a build_model(X, y) method. If build_model returns a new model object (factory pattern), that object is used for the fold. If it returns None (in-place pattern, as with the library’s MMM class), the instance itself is used after deep-copying. Mutually exclusive with yaml_path.
- model_names
list of str, optional Names to assign to each CV fold in the combined InferenceData. If provided, length must match the number of splits. If not provided, names are generated from each model’s _model_name attribute or as 'Iteration {i}'.
- original_scale_vars
list of str, optional Contribution variables to register with add_original_scale_contribution_variable(var=...) on each fold-local model after build and before fit.
- df_lift_test
pd.DataFrame, optional Lift-test measurements to apply on each fold-local model. Rows are filtered leakage-safely per fold using lift_test_date_column and the fold training end date.
- lift_test_date_column
str, optional Name of the date column in df_lift_test. Required when df_lift_test is provided.
- return_models
bool, optional If True, return the fitted MMM instances for each fold alongside the combined InferenceData. Default is False.
- Returns:
arviz.InferenceData Combined InferenceData where each fold is concatenated along a new coordinate named ‘cv’. Includes a ‘cv_metadata’ group with per-fold train/test data. Returned when return_models is False (the default).
tuple[arviz.InferenceData, list[MMMBuilder]] A tuple of the combined InferenceData and a list of fitted MMM instances (one per fold). Returned when return_models is True.
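To make the idea of a new ‘cv’ coordinate concrete, the sketch below concatenates two toy xarray datasets along a labelled ‘cv’ dimension and then selects one fold by name. The datasets, the variable name intercept, and the fold labels are made up for illustration, and this is not a reproduction of how the library assembles the combined InferenceData.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Two toy per-fold datasets standing in for fold-level results (hypothetical data).
fold_a = xr.Dataset({"intercept": (("chain", "draw"), np.zeros((2, 5)))})
fold_b = xr.Dataset({"intercept": (("chain", "draw"), np.ones((2, 5)))})

# Concatenate along a new "cv" dimension labelled with the fold names.
fold_names = pd.Index(["Iteration 0", "Iteration 1"], name="cv")
combined = xr.concat([fold_a, fold_b], dim=fold_names)

# Select a single fold by its label; the "cv" dimension is dropped.
second_fold = combined.sel(cv="Iteration 1")
print(combined.sizes["cv"], second_fold["intercept"].dims)
```

With the object returned by run, the same label-based selection would presumably apply to whichever groups carry the ‘cv’ coordinate.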
- Raises:
ValueError If neither yaml_path nor mmm is provided. If model_names length doesn’t match the number of splits. If no InferenceData objects are produced during CV.
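The first two conditions can be checked on the caller's side before starting an expensive run. The helper below is a hypothetical sketch: check_run_arguments is not a library function, its error messages are invented, and n_splits is assumed to come from get_n_splits.

```python
def check_run_arguments(yaml_path, mmm, model_names, n_splits):
    """Raise ValueError for the argument problems listed under Raises.

    Hypothetical pre-flight check, not part of the library.
    """
    if yaml_path is None and mmm is None:
        raise ValueError("Provide either yaml_path or mmm.")
    if model_names is not None and len(model_names) != n_splits:
        raise ValueError(
            f"model_names has {len(model_names)} entries but there are "
            f"{n_splits} splits."
        )


# Passes silently: a model source is given and the names match the split count.
check_run_arguments("model_config.yml", None, ["a", "b", "c"], n_splits=3)

# Fails: neither yaml_path nor mmm is given.
try:
    check_run_arguments(yaml_path=None, mmm=None, model_names=None, n_splits=3)
except ValueError as err:
    message = str(err)
    print(message)
```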
See also
split Generate train/test indices for cross-validation.
get_n_splits Return the number of splits.
Notes
Per-fold results are also stored in self._cv_results after calling this method.
Examples
Using a YAML configuration:
>>> cv = TimeSliceCrossValidator(
...     n_init=100, forecast_horizon=10, date_column="date"
... )
>>> combined_idata = cv.run(X, y, yaml_path="model_config.yml")
Using an MMM instance directly:
>>> cv = TimeSliceCrossValidator(
...     n_init=100, forecast_horizon=10, date_column="date"
... )
>>> combined_idata = cv.run(
...     X,
...     y,
...     mmm=mmm,
...     original_scale_vars=[
...         "channel_contribution",
...         "fourier_contribution",
...         "intercept_contribution",
...     ],
... )
Using leakage-safe lift tests in CV:
>>> combined_idata = cv.run(
...     X,
...     y,
...     mmm=mmm,
...     df_lift_test=df_lift_test,
...     lift_test_date_column="date",
... )
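The per-fold filtering that keeps lift tests leakage-safe can be sketched with plain pandas. This is an assumption-laden illustration rather than the library's code: filter_lift_tests is a hypothetical helper, the column names other than the date column are invented, and treating the training end date as inclusive is a guess.

```python
import pandas as pd


def filter_lift_tests(df_lift_test, lift_test_date_column, train_end_date):
    """Keep only lift tests dated on or before the fold's training end date.

    Hypothetical helper; the inclusive comparison is an assumption.
    """
    dates = pd.to_datetime(df_lift_test[lift_test_date_column])
    return df_lift_test.loc[dates <= pd.Timestamp(train_end_date)]


# Toy lift-test table (illustrative columns and values).
df_lift_test = pd.DataFrame(
    {
        "date": ["2024-01-15", "2024-03-01", "2024-06-10"],
        "channel": ["tv", "search", "tv"],
        "delta_y": [120.0, 80.0, 95.0],
    }
)

# A fold whose training data ends on 2024-03-31 sees only the first two tests.
fold_lift_tests = filter_lift_tests(df_lift_test, "date", "2024-03-31")
print(fold_lift_tests)
```

Dropping the later rows prevents a fold's model from being calibrated with an experiment that ran inside or after its own test window.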
Using a model builder object:
>>> cv = TimeSliceCrossValidator(
...     n_init=100, forecast_horizon=10, date_column="date"
... )
>>> combined_idata = cv.run(X, y, mmm=mmm_builder)
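The two build_model conventions accepted by the mmm parameter (factory and in-place) can be illustrated with toy classes. Everything here is hypothetical: ToyModel, FactoryBuilder, InPlaceModel and model_for_fold are stand-ins, and copying the object before calling build_model is an assumption about ordering, not a statement of how the library does it.

```python
import copy


class ToyModel:
    """Stand-in for a per-fold model (hypothetical)."""

    def __init__(self, n_rows):
        self.n_rows = n_rows


class FactoryBuilder:
    """Factory pattern: build_model returns a new model object."""

    def build_model(self, X, y):
        return ToyModel(n_rows=len(X))


class InPlaceModel:
    """In-place pattern: build_model mutates self and returns None."""

    def __init__(self):
        self.n_rows = None

    def build_model(self, X, y):
        self.n_rows = len(X)


def model_for_fold(mmm, X_train, y_train):
    """Return a fold-local model under either convention."""
    candidate = copy.deepcopy(mmm)  # leave the caller's object untouched
    built = candidate.build_model(X_train, y_train)
    return candidate if built is None else built


X_train, y_train = [[1.0], [2.0], [3.0]], [1.0, 2.0, 3.0]
template = InPlaceModel()
from_factory = model_for_fold(FactoryBuilder(), X_train, y_train)
from_in_place = model_for_fold(template, X_train, y_train)
print(type(from_factory).__name__, from_in_place.n_rows, template.n_rows)
```

Working on a copy means state from one fold cannot carry over into the next.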
Returning fitted models alongside the combined InferenceData:
>>> combined_idata, models = cv.run(X, y, mmm=mmm, return_models=True)