TimeSliceCrossValidator.run

TimeSliceCrossValidator.run(X, y, sampler_config=None, yaml_path=None, mmm=None, model_names=None, original_scale_vars=None, df_lift_test=None, lift_test_date_column=None, return_models=False)
Overloads:
  • self, X (pd.DataFrame), y (pd.Series), sampler_config (dict[str, Any] | None), yaml_path (str | None), mmm (MMMBuilder | None), model_names (list[str] | None), original_scale_vars (list[str] | None), df_lift_test (pd.DataFrame | None), lift_test_date_column (str | None), return_models (Literal[False]) → az.InferenceData

  • self, X (pd.DataFrame), y (pd.Series), sampler_config (dict[str, Any] | None), yaml_path (str | None), mmm (MMMBuilder | None), model_names (list[str] | None), original_scale_vars (list[str] | None), df_lift_test (pd.DataFrame | None), lift_test_date_column (str | None), return_models (Literal[True]) → tuple[az.InferenceData, list[MMMBuilder]]
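
The two overloads differ only in the `return_models` literal and the resulting return type. A minimal sketch of that typing pattern follows; `run_sketch` is a hypothetical stand-in, and plain strings take the place of `az.InferenceData` and the fitted models, so this is not the library's code.

```python
from typing import Literal, overload


@overload
def run_sketch(return_models: Literal[False] = False) -> str: ...
@overload
def run_sketch(return_models: Literal[True]) -> tuple[str, list[str]]: ...


def run_sketch(return_models: bool = False):
    """Hypothetical stand-in for run(): strings replace InferenceData and models."""
    combined = "combined_idata"
    fold_models = ["model_fold_0", "model_fold_1"]
    if return_models:
        # Mirrors the Literal[True] overload: (combined result, per-fold models).
        return combined, fold_models
    # Mirrors the Literal[False] overload: combined result only.
    return combined


idata_only = run_sketch()
idata, models = run_sketch(return_models=True)
```

With overloads of this shape, a type checker can infer whether the call site receives a single object or a tuple, so unpacking is only valid when `return_models=True` is passed literally.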

Run the complete time-slice cross-validation loop.

Executes cross-validation by iterating through all folds, fitting a model for each training set, and generating predictions on the combined train+test data.

Parameters:
X : pd.DataFrame

Feature matrix containing the date column and predictor variables.

y : pd.Series

Target variable.

sampler_config : dict, optional

Sampler configuration to override the validator-level configuration for all folds in this run. If provided, takes precedence over the configuration passed at construction time.

yaml_path : str, optional

Path to a YAML configuration file for building the MMM model per fold. Mutually exclusive with mmm.

mmm : object, optional

An MMM instance or a builder/factory object with a build_model(X, y) method. If build_model returns a new model object (factory pattern), that object is used for the fold. If it returns None (in-place pattern, as with the library’s MMM class), the instance itself is used after deep-copying. Mutually exclusive with yaml_path.

model_names : list of str, optional

Names to assign to each CV fold in the combined InferenceData. If provided, length must match the number of splits. If not provided, names are generated from each model’s _model_name attribute or as 'Iteration {i}'.

original_scale_vars : list of str, optional

Contribution variables to register with add_original_scale_contribution_variable(var=...) on each fold-local model after build and before fit.

df_lift_test : pd.DataFrame, optional

Lift-test measurements to apply on each fold-local model. To avoid leakage, rows are filtered per fold using lift_test_date_column and the fold's training end date.

lift_test_date_column : str, optional

Name of the date column in df_lift_test. Required when df_lift_test is provided.

return_models : bool, optional

If True, return the fitted MMM instances for each fold alongside the combined InferenceData. Default is False.

Returns:
arviz.InferenceData

Combined InferenceData where each fold is concatenated along a new coordinate named 'cv'. Includes a 'cv_metadata' group with per-fold train/test data. Returned when return_models is False (the default).

tuple[arviz.InferenceData, list[MMMBuilder]]

A tuple of the combined InferenceData and a list of fitted MMM instances (one per fold). Returned when return_models is True.

Raises:
ValueError

Raised if neither yaml_path nor mmm is provided, if the length of model_names does not match the number of splits, or if no InferenceData objects are produced during CV.
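
The fold-naming rule for model_names and its length check can be sketched as below. `resolve_fold_names` is a hypothetical helper written for illustration, not the library's implementation, and the 0-based index in the 'Iteration {i}' fallback is an assumption.

```python
def resolve_fold_names(n_splits, model_names=None, models=None):
    """Hypothetical helper: choose one name per CV fold."""
    if model_names is not None:
        # Explicit names must cover every fold exactly once.
        if len(model_names) != n_splits:
            raise ValueError(
                f"model_names has {len(model_names)} entries, expected {n_splits}"
            )
        return list(model_names)
    if models is None:
        models = [None] * n_splits
    # Prefer a model's _model_name; otherwise fall back to 'Iteration {i}'
    # (0-based numbering is an assumption made for this sketch).
    return [
        getattr(models[i], "_model_name", None) or f"Iteration {i}"
        for i in range(n_splits)
    ]


class NamedModel:
    """Hypothetical model that carries a _model_name attribute."""

    _model_name = "weekly_mmm"


default_names = resolve_fold_names(3)
mixed_names = resolve_fold_names(2, models=[NamedModel(), object()])
```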

See also

split

Generate train/test indices for cross-validation.

get_n_splits

Return the number of splits.

Notes

Per-fold results are also stored in self._cv_results after calling this method.

Examples

Using a YAML configuration:

>>> cv = TimeSliceCrossValidator(
...     n_init=100, forecast_horizon=10, date_column="date"
... )
>>> combined_idata = cv.run(X, y, yaml_path="model_config.yml")

Using an MMM instance directly:

>>> cv = TimeSliceCrossValidator(
...     n_init=100, forecast_horizon=10, date_column="date"
... )
>>> combined_idata = cv.run(
...     X,
...     y,
...     mmm=mmm,
...     original_scale_vars=[
...         "channel_contribution",
...         "fourier_contribution",
...         "intercept_contribution",
...     ],
... )

Using leakage-safe lift tests in CV:

>>> combined_idata = cv.run(
...     X,
...     y,
...     mmm=mmm,
...     df_lift_test=df_lift_test,
...     lift_test_date_column="date",
... )
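
To make the per-fold filtering concrete, here is a minimal sketch of what it might look like for a single fold. `filter_lift_tests` is a hypothetical helper, the `channel` and `delta_y` columns are illustrative, and treating the training end date as inclusive is an assumption; the library's actual rule may differ.

```python
import pandas as pd


def filter_lift_tests(df_lift_test, date_column, train_end):
    """Hypothetical helper: keep lift tests dated on or before train_end.

    The inclusive comparison is an assumption made for this sketch.
    """
    dates = pd.to_datetime(df_lift_test[date_column])
    keep = dates <= pd.Timestamp(train_end)
    return df_lift_test.loc[keep].reset_index(drop=True)


# Illustrative lift-test table; column names other than "date" are made up.
df_lift_test = pd.DataFrame(
    {
        "date": ["2024-01-08", "2024-03-04", "2024-06-10"],
        "channel": ["tv", "search", "tv"],
        "delta_y": [120.0, 45.0, 80.0],
    }
)

# A fold whose training window ends on 2024-03-31 would not see the June test.
fold_lift = filter_lift_tests(df_lift_test, "date", "2024-03-31")
```

Filtering against the training end date keeps any experiment that finished inside the test window from informing the fold's fit.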

Using a model builder object:

>>> cv = TimeSliceCrossValidator(
...     n_init=100, forecast_horizon=10, date_column="date"
... )
>>> combined_idata = cv.run(X, y, mmm=mmm_builder)
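
The difference between the factory pattern and the in-place pattern for build_model(X, y) can be sketched as follows. All names here (`FactoryBuilder`, `InPlaceModel`, `resolve_fold_model`) are hypothetical, and copying before calling build_model is an assumption about ordering; this shows the dispatch idea, not the library's code.

```python
import copy


class FactoryBuilder:
    """Hypothetical builder: build_model returns a new model object."""

    def build_model(self, X, y):
        return {"kind": "factory_model", "n_obs": len(y)}


class InPlaceModel:
    """Hypothetical model: build_model mutates self and returns None."""

    def __init__(self):
        self.built = False

    def build_model(self, X, y):
        self.built = True


def resolve_fold_model(mmm, X, y):
    """Return the object that would be fitted for one fold."""
    # Work on a copy so the caller's template is not mutated across folds
    # (copying before the build is an assumption of this sketch).
    candidate = copy.deepcopy(mmm)
    built = candidate.build_model(X, y)
    # Factory pattern: use the returned object. In-place pattern: use the copy.
    return candidate if built is None else built


X_demo, y_demo = [[1.0], [2.0]], [10.0, 20.0]
template = InPlaceModel()
in_place_fold = resolve_fold_model(template, X_demo, y_demo)
factory_fold = resolve_fold_model(FactoryBuilder(), X_demo, y_demo)
```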

Returning fitted models alongside the combined InferenceData:

>>> combined_idata, models = cv.run(X, y, mmm=mmm, return_models=True)