| | Type | Default | Details |
|---|---|---|---|
| model | Any | | A scikit-learn compatible regression model instance (e.g. LGBMRegressor(), CatBoostRegressor(), LinearRegression(), etc.). |
| target_cols | List[str] | | List of target variable names to forecast. |
| lags | Optional[Dict[str, Union[int, List[int]]]] | None | Dictionary specifying lag features to create for each target variable. The value can be an integer (number of lags) or a list of specific lag periods. |
| lag_transform | Optional[Dict[str, list]] | None | Dictionary specifying lag-based transformations to apply for each target variable. The value should be a list of transformation functions (e.g. rolling_mean, expanding_std) with their parameters encapsulated in the function instance. |
| difference | Optional[Dict[str, int]] | None | Dictionary specifying the order of ordinary differencing to apply for each target variable. |
| seasonal_diff | Optional[Dict[str, int]] | None | Dictionary specifying the order of seasonal differencing to apply for each target variable. |
| trend | Optional[Dict[str, str]] | None | Dictionary specifying the trend removal strategy for each target variable. Supported values are ‘linear’, ‘ets’, ‘feature_lr’, and ‘feature_ets’. |
| pol_degree | Optional[Union[int, Dict[str, int]]] | 1 | Polynomial degree for linear trend removal. Can be a single integer applied to all targets or a dictionary specifying the degree for each target variable. |
| ets_params | Optional[Dict[str, Any]] | None | Dictionary specifying ETS model and fit parameters for each target variable when using ‘ets’ trend removal. Each value is a dictionary of parameters for the ExponentialSmoothing model and fitting process. |
| change_points | Optional[Dict[str, List[int]]] | None | Dictionary specifying change points for piecewise linear trend removal for each target variable. The value should be a list of integer indices where the trend slope can change. |
| box_cox | Optional[Dict[str, Union[bool, float, int]]] | None | Dictionary specifying whether to apply Box-Cox transformation for each target variable. The value can be a boolean (True to apply with lambda estimated from data, False to skip) or a float (specific lambda value to use). |
| box_cox_biasadj | Optional[Dict[str, bool]] | None | Dictionary specifying whether to apply bias adjustment when inverting Box-Cox transformation for each target variable. |
| cat_variables | Optional[List[str]] | None | List of categorical feature column names to encode. These will be shared across all target variables. |
| categorical_encoder | Optional[Union[Dict[str, Any], Any]] | None | A categorical encoder instance, or a single-entry dictionary mapping the target column to the encoder when the encoder requires access to the target variable during fitting (e.g. {target_col: MeanEncoder()}). If an encoder that requires target access is passed directly rather than in the dict format, the first target column in target_cols is used to fit the encoder. For encoders that do not require target access, pass the encoder instance directly (e.g. OneHotEncoder()). |
| Returns | None | | |
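To make the two accepted forms of `lags` concrete, the sketch below expands an integer or a list into explicit lag periods and builds the lagged values for a toy series. It assumes that an integer `n` means lags 1 through `n`. The helpers `expand_lags` and `lagged_rows` and the target names are hypothetical and are not part of the library.

```python
from typing import Dict, List, Optional, Union

LagSpec = Union[int, List[int]]


def expand_lags(spec: LagSpec) -> List[int]:
    """Turn a lag spec into an explicit, sorted list of lag periods.

    Assumption: an integer n is read as lags 1..n.
    """
    if isinstance(spec, int):
        return list(range(1, spec + 1))
    return sorted(set(spec))


def lagged_rows(series: List[float], lag_periods: List[int]) -> List[Dict[str, Optional[float]]]:
    """Build one feature row per time step; lags that reach before the start are None."""
    rows = []
    for t in range(len(series)):
        row = {f"lag_{k}": (series[t - k] if t - k >= 0 else None) for k in lag_periods}
        rows.append(row)
    return rows


# Hypothetical target names, used only for illustration.
lags = {"sales": 3, "price": [1, 7]}
expanded = {col: expand_lags(spec) for col, spec in lags.items()}

sales = [10.0, 12.0, 11.0, 15.0, 14.0]
features = lagged_rows(sales, expanded["sales"])
```

The first rows contain `None` because no earlier observations exist for them; a real pipeline would typically drop those rows before fitting.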
ml_mv_forecaster
def ml_mv_forecaster(
model:Any, # A scikit-learn compatible regression model instance (e.g. LGBMRegressor(), CatBoostRegressor(), LinearRegression(), etc.).
target_cols:List[str], # List of target variable names to forecast.
lags:Optional[Dict[str, Union[int, List[int]]]]=None, # Dictionary specifying lag features to create for each target variable. The value can be an integer (number of lags) or a list of specific lag periods.
lag_transform:Optional[Dict[str, list]]=None, # Dictionary specifying lag-based transformations to apply for each target variable. The value should be a list of transformation functions (e.g. rolling_mean, expanding_std) with their parameters encapsulated in the function instance.
difference:Optional[Dict[str, int]]=None, # Dictionary specifying the order of ordinary differencing to apply for each target variable.
seasonal_diff:Optional[Dict[str, int]]=None, # Dictionary specifying the order of seasonal differencing to apply for each target variable.
trend:Optional[Dict[str, str]]=None, # Dictionary specifying the trend removal strategy for each target variable. Supported values are 'linear', 'ets', 'feature_lr', and 'feature_ets'.
pol_degree:Optional[Union[int, Dict[str, int]]]=1, # Polynomial degree for linear trend removal. Can be a single integer applied to all targets or a dictionary specifying the degree for each target variable.
ets_params:Optional[Dict[str, Any]]=None, # Dictionary specifying ETS model and fit parameters for each target variable when using 'ets' trend removal. Each value is a dictionary of parameters for the ExponentialSmoothing model and fitting process.
change_points:Optional[Dict[str, List[int]]]=None, # Dictionary specifying change points for piecewise linear trend removal for each target variable. The value should be a list of integer indices where the trend slope can change.
box_cox:Optional[Dict[str, Union[bool, float, int]]]=None, # Dictionary specifying whether to apply Box-Cox transformation for each target variable. The value can be a boolean (True to apply with lambda estimated from data, False to skip) or a float (specific lambda value to use).
box_cox_biasadj:Optional[Dict[str, bool]]=None, # Dictionary specifying whether to apply bias adjustment when inverting Box-Cox transformation for each target variable.
cat_variables:Optional[List[str]]=None, # List of categorical feature column names to encode. These will be shared across all target variables.
categorical_encoder:Optional[Union[Dict[str, Any], Any]]=None, # A categorical encoder instance, or a single-entry dictionary mapping the target column to the encoder when the encoder requires access to the target variable during fitting (e.g. {target_col: MeanEncoder()}). If an encoder that requires target access is passed directly rather than in the dict format, the first target column in target_cols is used to fit the encoder. For encoders that do not require target access, pass the encoder instance directly (e.g. OneHotEncoder()).
)->None:
Initialize the multi-target machine learning forecaster with specified transformations and model.
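As a sketch of how the per-target dictionaries fit together, the block below assembles a set of constructor arguments as plain Python dictionaries and checks that every dictionary key names a declared target. The column names `sales` and `price`, the chosen values, and the helper `check_target_keys` are illustrative assumptions. The model instance and the import of `ml_mv_forecaster` are left out because the package path is not given here.

```python
from typing import Any, Dict, List

target_cols: List[str] = ["sales", "price"]  # hypothetical column names

config: Dict[str, Any] = {
    "target_cols": target_cols,
    "lags": {"sales": 3, "price": [1, 7]},     # int = number of lags, list = specific lag periods
    "difference": {"sales": 1},                # ordinary differencing for one target only
    "trend": {"price": "linear"},              # one of 'linear', 'ets', 'feature_lr', 'feature_ets'
    "pol_degree": {"price": 2},                # per-target polynomial degree
    "box_cox": {"sales": True, "price": 0.5},  # True = estimate lambda, float = fixed lambda
    "box_cox_biasadj": {"sales": True},
}

PER_TARGET_KEYS = ["lags", "difference", "trend", "pol_degree", "box_cox", "box_cox_biasadj"]


def check_target_keys(cfg: Dict[str, Any]) -> List[str]:
    """Return 'argument.key' strings for dictionary keys that are not declared targets."""
    unknown: List[str] = []
    for name in PER_TARGET_KEYS:
        value = cfg.get(name)
        if isinstance(value, dict):
            unknown += [f"{name}.{key}" for key in value if key not in cfg["target_cols"]]
    return unknown


problems = check_target_keys(config)  # empty list when every key is a declared target
```

Once the class is imported, a dictionary like this could be unpacked into the constructor together with a model instance. A check of this kind catches misspelled column names before any fitting starts.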
ml_mv_forecaster.fit
def fit(
df:pd.DataFrame, # Training DataFrame containing all target and feature columns.
)->None:
Fit the model to the data passed in df.
| | Type | Details |
|---|---|---|
| df | pd.DataFrame | Training DataFrame containing all target and feature columns. |
| Returns | None | |
ml_mv_forecaster.forecast
def forecast(
H:int, # Forecast horizon (number of steps to forecast ahead).
exog:Optional[pd.DataFrame]=None, # Future exogenous regressors (must contain at least H rows).
)->Dict[str, np.ndarray]: # A dictionary where keys are target column names and values are arrays of H forecasted values for each target variable.
Generate forecasts for H future time steps.
| | Type | Default | Details |
|---|---|---|---|
| H | int | | Forecast horizon (number of steps to forecast ahead). |
| exog | Optional[pd.DataFrame] | None | Future exogenous regressors (must contain at least H rows). |
| Returns | Dict[str, np.ndarray] | | A dictionary where keys are target column names and values are arrays of H forecasted values for each target variable. |
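Because `forecast` returns a dictionary of arrays keyed by target name, a little glue code is often useful downstream. The sketch below uses a hand-written stand-in for that return value (the numbers and target names are made up, not produced by the library) and stacks it into a single `(H, n_targets)` array after checking that every array has length `H`. The helper `stack_forecasts` is hypothetical.

```python
from typing import Dict, List, Tuple

import numpy as np


def stack_forecasts(forecasts: Dict[str, np.ndarray], H: int) -> Tuple[List[str], np.ndarray]:
    """Stack per-target forecast arrays into one (H, n_targets) array."""
    names = list(forecasts)
    for name in names:
        if len(forecasts[name]) != H:
            raise ValueError(f"{name}: expected {H} values, got {len(forecasts[name])}")
    return names, np.column_stack([forecasts[name] for name in names])


H = 3
# Stand-in for the dictionary returned by forecast(H); values are illustrative only.
stand_in = {
    "sales": np.array([101.0, 103.5, 99.8]),
    "price": np.array([9.9, 10.1, 10.0]),
}
names, matrix = stack_forecasts(stand_in, H)
```

Column `j` of `matrix` holds the forecasts for `names[j]`, which keeps the target order explicit when the array is passed to plotting or evaluation code.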
ml_mv_forecaster.cross_validate
def cross_validate(
df:pd.DataFrame, # Input dataframe.
target_col:str, # Target variable for evaluation.
cv_split:int, # Number of cross-validation folds.
test_size:int, # Test size per fold.
metrics:List[Callable], # Metric functions (e.g. ``[MAE, RMSE]``) used to evaluate forecast accuracy across folds. Call ``.cv_summary()`` after cross-validation to retrieve the aggregated scores.
step_size:int=1, # Step size for rolling window. Default is 1.
h_split_point:Optional[int]=None, # Point to split the test set for separate evaluation. Default is None.
)->Union[pd.DataFrame, Tuple[pd.DataFrame, pd.DataFrame]]: # DataFrame with overall performance metrics averaged across folds. If h_split_point is provided, also includes separate performance before and after the split point.
Perform cross-validation.
| | Type | Default | Details |
|---|---|---|---|
| df | pd.DataFrame | | Input dataframe. |
| target_col | str | | Target variable for evaluation. |
| cv_split | int | | Number of cross-validation folds. |
| test_size | int | | Test size per fold. |
| metrics | List[Callable] | | Metric functions (e.g. [MAE, RMSE]) used to evaluate forecast accuracy across folds. Call .cv_summary() after cross-validation to retrieve the aggregated scores. |
| step_size | int | 1 | Step size for rolling window. Default is 1. |
| h_split_point | Optional[int] | None | Point to split the test set for separate evaluation. Default is None. |
| Returns | Union[pd.DataFrame, Tuple[pd.DataFrame, pd.DataFrame]] | | DataFrame with overall performance metrics averaged across folds. If h_split_point is provided, also includes separate performance before and after the split point. |
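The interaction of `cv_split`, `test_size`, `step_size`, and `h_split_point` is easier to see with explicit index ranges. The sketch below lays out folds under one common rolling-origin scheme, in which the last test window ends at the final observation and each earlier fold is shifted back by `step_size`. Whether `cross_validate` uses exactly this layout is an assumption, and `rolling_folds` and `split_window` are hypothetical helpers.

```python
from typing import List, Tuple


def rolling_folds(n_obs: int, cv_split: int, test_size: int, step_size: int = 1) -> List[Tuple[range, range]]:
    """Return (train_index, test_index) ranges for an expanding-window layout.

    Assumption: the last test window ends at n_obs and earlier folds
    are shifted back by step_size observations each.
    """
    folds = []
    for i in range(cv_split):
        test_end = n_obs - (cv_split - 1 - i) * step_size
        test_start = test_end - test_size
        if test_start <= 0:
            raise ValueError("not enough observations for the requested folds")
        folds.append((range(0, test_start), range(test_start, test_end)))
    return folds


def split_window(test_idx: range, h_split_point: int) -> Tuple[range, range]:
    """Split one test window into the steps before and after h_split_point."""
    cut = test_idx.start + h_split_point
    return range(test_idx.start, cut), range(cut, test_idx.stop)


folds = rolling_folds(n_obs=20, cv_split=3, test_size=4, step_size=2)
early, late = split_window(folds[-1][1], h_split_point=2)
```

With 20 observations, the three test windows cover indices 12 to 15, 14 to 17, and 16 to 19, and the last window is split into steps 16 to 17 and 18 to 19 for separate scoring.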