TimeSliceCrossValidator.run

TimeSliceCrossValidator.run(X, y, sampler_config=None, yaml_path=None, mmm=None, model_names=None, original_scale_vars=None, df_lift_test=None, lift_test_date_column=None, return_models=False)

Overloads:

self, X (pd.DataFrame), y (pd.Series), sampler_config (dict[str, Any] | None), yaml_path (str | None), mmm (MMMBuilder | None), model_names (list[str] | None), original_scale_vars (list[str] | None), df_lift_test (pd.DataFrame | None), lift_test_date_column (str | None), return_models (Literal[False]) → az.InferenceData
self, X (pd.DataFrame), y (pd.Series), sampler_config (dict[str, Any] | None), yaml_path (str | None), mmm (MMMBuilder | None), model_names (list[str] | None), original_scale_vars (list[str] | None), df_lift_test (pd.DataFrame | None), lift_test_date_column (str | None), return_models (Literal[True]) → tuple[az.InferenceData, list[MMMBuilder]]

Run the complete time-slice cross-validation loop.

Executes cross-validation by iterating through all folds, fitting a model for each training set, and generating predictions on the combined train+test data.
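
The loop described above can be pictured with a minimal, library-free sketch. It assumes an expanding-window scheme in which the first n_init rows form the initial training set and each fold tests on the next forecast_horizon rows. The helper expanding_folds and its step parameter are hypothetical names used only for illustration; the validator's actual splitting logic (see split) may differ.

```python
from typing import Iterator


def expanding_folds(
    n_rows: int, n_init: int, forecast_horizon: int, step: int = 1
) -> Iterator[tuple[range, range]]:
    """Yield (train_rows, test_rows) pairs for an expanding training window.

    Hypothetical helper: an assumption about how time-slice folds might be
    laid out, not the library's implementation.
    """
    train_end = n_init
    # Stop once a full forecast horizon no longer fits in the data.
    while train_end + forecast_horizon <= n_rows:
        yield range(0, train_end), range(train_end, train_end + forecast_horizon)
        train_end += step


folds = list(expanding_folds(n_rows=130, n_init=100, forecast_horizon=10, step=10))
for i, (train, test) in enumerate(folds):
    # In the real loop, a model would be fit on the `train` rows and then
    # used to predict over the combined train+test rows.
    print(
        f"fold {i}: train rows 0..{train.stop - 1}, "
        f"test rows {test.start}..{test.stop - 1}"
    )
```

Each later fold trains on strictly more history than the one before it, so no fold ever sees rows from its own test window.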

Parameters:

X (pd.DataFrame): Feature matrix containing the date column and predictor variables.
y (pd.Series): Target variable.
sampler_config (dict, optional): Sampler configuration to override the validator-level configuration for all folds in this run. If provided, takes precedence over the configuration passed at construction time.
yaml_path (str, optional): Path to a YAML configuration file for building the MMM model per fold. Mutually exclusive with mmm.
mmm (object, optional): An MMM instance or a builder/factory object with a build_model(X, y) method. If build_model returns a new model object (factory pattern), that object is used for the fold. If it returns None (in-place pattern, as with the library’s MMM class), the instance itself is used after deep-copying. Mutually exclusive with yaml_path.
model_names (list of str, optional): Names to assign to each CV fold in the combined InferenceData. If provided, length must match the number of splits. If not provided, names are generated from each model’s _model_name attribute or as 'Iteration {i}'.
original_scale_vars (list of str, optional): Contribution variables to register with add_original_scale_contribution_variable(var=...) on each fold-local model after build and before fit.
df_lift_test (pd.DataFrame, optional): Lift-test measurements to apply on each fold-local model. For each fold, rows are filtered to avoid leakage, using lift_test_date_column and the fold's training end date.
lift_test_date_column (str, optional): Name of the date column in df_lift_test. Required when df_lift_test is provided.
return_models (bool, optional): If True, return the fitted MMM instances for each fold alongside the combined InferenceData. Default is False.
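
The per-fold lift-test filtering described for df_lift_test can be sketched without the library. This is only an illustration: filter_lift_tests is a hypothetical helper, the "on or before the training end date" cutoff is an assumption (the library may use a different boundary), and the channel and delta_y fields are made-up toy values.

```python
from datetime import date


def filter_lift_tests(rows, date_column, train_end):
    """Keep only lift-test rows dated on or before the fold's training end.

    Hypothetical helper; the inclusive cutoff is an assumption.
    """
    return [row for row in rows if row[date_column] <= train_end]


# Toy lift-test records; column names other than "date" are illustrative.
lift_tests = [
    {"date": date(2024, 1, 15), "channel": "tv", "delta_y": 0.8},
    {"date": date(2024, 3, 1), "channel": "search", "delta_y": 0.3},
    {"date": date(2024, 6, 10), "channel": "tv", "delta_y": 1.1},
]

# A fold whose training window ends on 2024-03-31 keeps the first two rows;
# the June experiment lies in the future of that fold and is dropped.
fold_rows = filter_lift_tests(lift_tests, "date", train_end=date(2024, 3, 31))
print([row["channel"] for row in fold_rows])
```

Filtering per fold matters because a lift test run after the training window would otherwise feed future information into the fitted model.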

Returns:

arviz.InferenceData: Combined InferenceData where each fold is concatenated along a new coordinate named ‘cv’. Includes a ‘cv_metadata’ group with per-fold train/test data. Returned when return_models is False (the default).
tuple[arviz.InferenceData, list[MMMBuilder]]: A tuple of the combined InferenceData and a list of fitted MMM instances (one per fold). Returned when return_models is True.

Raises:

ValueError: If neither yaml_path nor mmm is provided, if the length of model_names doesn’t match the number of splits, or if no InferenceData objects are produced during CV.
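
The fold-naming rules and the model_names length check can be sketched as follows. The function resolve_fold_names is a hypothetical stand-in, not the library's code, and the zero-based index in 'Iteration {i}' is an assumption.

```python
def resolve_fold_names(n_splits, model_names=None, models=None):
    """Return one name per fold, mirroring the documented fallback order.

    Hypothetical helper: explicit names win, then each model's _model_name
    attribute, then 'Iteration {i}' (zero-based here, by assumption).
    """
    if model_names is not None:
        if len(model_names) != n_splits:
            raise ValueError(
                f"model_names has {len(model_names)} entries "
                f"but there are {n_splits} splits"
            )
        return list(model_names)

    models = models if models is not None else [None] * n_splits
    return [
        getattr(model, "_model_name", None) or f"Iteration {i}"
        for i, model in enumerate(models)
    ]


print(resolve_fold_names(2))
print(resolve_fold_names(2, model_names=["early", "late"]))
```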

See also

split: Generate train/test indices for cross-validation.
get_n_splits: Return the number of splits.

Notes

Per-fold results are also stored in self._cv_results after calling this method.

Examples

Using a YAML configuration:

>>> cv = TimeSliceCrossValidator(
...     n_init=100, forecast_horizon=10, date_column="date"
... )
>>> combined_idata = cv.run(X, y, yaml_path="model_config.yml")

Using an MMM instance directly:

>>> cv = TimeSliceCrossValidator(
...     n_init=100, forecast_horizon=10, date_column="date"
... )
>>> combined_idata = cv.run(
...     X,
...     y,
...     mmm=mmm,
...     original_scale_vars=[
...         "channel_contribution",
...         "fourier_contribution",
...         "intercept_contribution",
...     ],
... )

Using leakage-safe lift tests in CV:

>>> combined_idata = cv.run(
...     X,
...     y,
...     mmm=mmm,
...     df_lift_test=df_lift_test,
...     lift_test_date_column="date",
... )

Using a model builder object:

>>> cv = TimeSliceCrossValidator(
...     n_init=100, forecast_horizon=10, date_column="date"
... )
>>> combined_idata = cv.run(X, y, mmm=mmm_builder)
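
The difference between the factory pattern and the in-place pattern described for the mmm parameter can be illustrated with stand-in classes. Everything here is hypothetical: InPlaceModel, ModelFactory, and model_for_fold are illustrative names, and deep-copying before calling build_model in both cases is an assumption about ordering rather than the library's confirmed behavior.

```python
import copy


class InPlaceModel:
    """Hypothetical model whose build_model mutates itself and returns None."""

    def __init__(self):
        self.built_on = None

    def build_model(self, X, y):
        self.built_on = len(X)  # record how many rows the model was built on


class ModelFactory:
    """Hypothetical builder whose build_model returns a fresh model object."""

    def build_model(self, X, y):
        model = InPlaceModel()
        model.build_model(X, y)
        return model


def model_for_fold(mmm, X, y):
    """Return a fold-local model without mutating the caller's object."""
    candidate = copy.deepcopy(mmm)
    built = candidate.build_model(X, y)
    # Factory pattern: use the returned object. In-place pattern: use the copy.
    return built if built is not None else candidate


X_train, y_train = [[1.0], [2.0], [3.0]], [10.0, 20.0, 30.0]
template = InPlaceModel()
fold_model = model_for_fold(template, X_train, y_train)
factory_model = model_for_fold(ModelFactory(), X_train, y_train)
print(fold_model is template, template.built_on, fold_model.built_on)
```

Working on a copy keeps the object passed as mmm reusable across folds, since each fold builds and fits its own model.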

Returning fitted models alongside the combined InferenceData:

>>> combined_idata, models = cv.run(X, y, mmm=mmm, return_models=True)