ml_mv_forecaster


def ml_mv_forecaster(
    model:Any, # A scikit-learn compatible regression model instance (e.g. LGBMRegressor(), CatBoostRegressor(), LinearRegression(), etc.).
    target_cols:List[str], # List of target variable names to forecast.
    lags:Optional[Dict[str, Union[int, List[int]]]]=None, # Dictionary specifying lag features to create for each target variable. The value can be an integer (number of lags) or a list of specific lag periods.
    lag_transform:Optional[Dict[str, list]]=None, # Dictionary specifying lag-based transformations to apply for each target variable. The value should be a list of transformation functions (e.g. rolling_mean, expanding_std) with their parameters encapsulated in the function instance.
    difference:Optional[Dict[str, int]]=None, # Dictionary specifying the order of ordinary differencing to apply for each target variable.
    seasonal_diff:Optional[Dict[str, int]]=None, # Dictionary specifying the order of seasonal differencing to apply for each target variable.
    trend:Optional[Dict[str, str]]=None, # Dictionary specifying the trend removal strategy for each target variable. Supported values are 'linear', 'ets', 'feature_lr', and 'feature_ets'.
    pol_degree:Optional[Union[int, Dict[str, int]]]=1, # Polynomial degree for linear trend removal. Can be a single integer applied to all targets or a dictionary specifying the degree for each target variable.
    ets_params:Optional[Dict[str, Any]]=None, # Dictionary specifying ETS model and fit parameters for each target variable when using 'ets' trend removal. Each value is a dictionary of parameters for the ExponentialSmoothing model and fitting process.
    change_points:Optional[Dict[str, List[int]]]=None, # Dictionary specifying change points for piecewise linear trend removal for each target variable. The value should be a list of integer indices where the trend slope can change.
    box_cox:Optional[Dict[str, Union[bool, float, int]]]=None, # Dictionary specifying whether to apply Box-Cox transformation for each target variable. The value can be a boolean (True to apply with lambda estimated from data, False to skip) or a number (specific lambda value to use).
    box_cox_biasadj:Optional[Dict[str, bool]]=None, # Dictionary specifying whether to apply bias adjustment when inverting Box-Cox transformation for each target variable.
    cat_variables:Optional[List[str]]=None, # List of categorical feature column names to encode. These will be shared across all target variables.
    categorical_encoder:Optional[Union[Dict[str, Any], Any]]=None, # A categorical encoder instance, or a single-entry dictionary mapping the target column to the encoder when the encoder requires access to the target variable during fitting (e.g. {target_col: MeanEncoder()}). If an encoder that requires target access is passed directly instead of in the dict format, the first target column in target_cols is used to fit the encoder. For encoders that do not require target access, pass the encoder instance directly (e.g. OneHotEncoder()).
)->None:

Initialize the multi-target machine learning forecaster with specified transformations and model.

| Parameter | Type | Default | Details |
| --- | --- | --- | --- |
| model | Any | | A scikit-learn compatible regression model instance (e.g. LGBMRegressor(), CatBoostRegressor(), LinearRegression(), etc.). |
| target_cols | List[str] | | List of target variable names to forecast. |
| lags | Optional[Dict[str, Union[int, List[int]]]] | None | Dictionary specifying lag features to create for each target variable. The value can be an integer (number of lags) or a list of specific lag periods. |
| lag_transform | Optional[Dict[str, list]] | None | Dictionary specifying lag-based transformations to apply for each target variable. The value should be a list of transformation functions (e.g. rolling_mean, expanding_std) with their parameters encapsulated in the function instance. |
| difference | Optional[Dict[str, int]] | None | Dictionary specifying the order of ordinary differencing to apply for each target variable. |
| seasonal_diff | Optional[Dict[str, int]] | None | Dictionary specifying the order of seasonal differencing to apply for each target variable. |
| trend | Optional[Dict[str, str]] | None | Dictionary specifying the trend removal strategy for each target variable. Supported values are 'linear', 'ets', 'feature_lr', and 'feature_ets'. |
| pol_degree | Optional[Union[int, Dict[str, int]]] | 1 | Polynomial degree for linear trend removal. Can be a single integer applied to all targets or a dictionary specifying the degree for each target variable. |
| ets_params | Optional[Dict[str, Any]] | None | Dictionary specifying ETS model and fit parameters for each target variable when using 'ets' trend removal. Each value is a dictionary of parameters for the ExponentialSmoothing model and fitting process. |
| change_points | Optional[Dict[str, List[int]]] | None | Dictionary specifying change points for piecewise linear trend removal for each target variable. The value should be a list of integer indices where the trend slope can change. |
| box_cox | Optional[Dict[str, Union[bool, float, int]]] | None | Dictionary specifying whether to apply Box-Cox transformation for each target variable. The value can be a boolean (True to apply with lambda estimated from data, False to skip) or a number (specific lambda value to use). |
| box_cox_biasadj | Optional[Dict[str, bool]] | None | Dictionary specifying whether to apply bias adjustment when inverting Box-Cox transformation for each target variable. |
| cat_variables | Optional[List[str]] | None | List of categorical feature column names to encode. These will be shared across all target variables. |
| categorical_encoder | Optional[Union[Dict[str, Any], Any]] | None | A categorical encoder instance, or a single-entry dictionary mapping the target column to the encoder when the encoder requires access to the target variable during fitting (e.g. {target_col: MeanEncoder()}). If an encoder that requires target access is passed directly instead of in the dict format, the first target column in target_cols is used to fit the encoder. For encoders that do not require target access, pass the encoder instance directly (e.g. OneHotEncoder()). |
| Returns | None | | |
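
As a rough illustration of the two forms a lags value can take, the sketch below expands an integer into lags 1 through n, keeps an explicit list as given, and then builds lagged feature rows for two targets. It uses plain Python rather than the library itself. The helper names expand_lags and lag_rows, the feature naming pattern, the column names, and the reading of an integer as "lags 1 to n" are assumptions made for this example, not confirmed details of ml_mv_forecaster.

```python
from typing import Dict, List, Union

LagSpec = Dict[str, Union[int, List[int]]]


def expand_lags(lags: LagSpec) -> Dict[str, List[int]]:
    """Normalize a lags spec so every target maps to an explicit list of lag periods."""
    expanded: Dict[str, List[int]] = {}
    for col, spec in lags.items():
        if isinstance(spec, int):
            # Assumption: an integer n means "use lags 1, 2, ..., n".
            expanded[col] = list(range(1, spec + 1))
        else:
            expanded[col] = sorted(spec)
    return expanded


def lag_rows(series: Dict[str, List[float]], lags: Dict[str, List[int]]) -> List[Dict[str, float]]:
    """Build one feature row per time step that has every requested lag available."""
    max_lag = max(lag for periods in lags.values() for lag in periods)
    n_obs = min(len(values) for values in series.values())
    rows = []
    for t in range(max_lag, n_obs):
        rows.append({
            f"{col}_lag_{lag}": series[col][t - lag]
            for col, periods in lags.items()
            for lag in periods
        })
    return rows


# Hypothetical targets "sales" and "price" with four observations each.
spec = expand_lags({"sales": 2, "price": [1]})
rows = lag_rows(
    {"sales": [10.0, 11.0, 12.0, 13.0], "price": [1.0, 2.0, 3.0, 4.0]},
    spec,
)
for row in rows:
    print(row)
```

Because each target carries its own lag list, the first usable row is determined by the largest lag across all targets, which is why the two earliest observations produce no row here.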

ml_mv_forecaster.fit


def fit(
    df:pd.DataFrame, # Training DataFrame containing all target and feature columns.
)->None:

Fit the model to the data passed in df.

| Parameter | Type | Details |
| --- | --- | --- |
| df | pd.DataFrame | Training DataFrame containing all target and feature columns. |
| Returns | None | |
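
Since df is expected to contain every target and feature column, a quick pre-flight check before calling fit can catch a missing column early. The helper below is a hypothetical convenience written in plain Python, not part of the library, and the column names are made up for the example; with pandas, the first argument would typically be df.columns.

```python
from typing import Iterable, List


def missing_columns(available: Iterable[str], required: Iterable[str]) -> List[str]:
    """Return the required column names that are absent, in the order requested."""
    present = set(available)
    return [name for name in required if name not in present]


# Hypothetical column names; with pandas, `available` would be df.columns.
columns = ["date", "sales", "price", "promo"]
problems = missing_columns(columns, required=["sales", "price", "region"])
if problems:
    print(f"missing before fit: {problems}")
```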

ml_mv_forecaster.forecast


def forecast(
    H:int, # Forecast horizon (number of steps to forecast ahead).
    exog:Optional[pd.DataFrame]=None, # Future exogenous regressors (must contain at least H rows).
)->Dict[str, np.ndarray]: # A dictionary where keys are target column names and values are arrays of H forecasted values for each target variable.

Generate forecasts for H future time steps.

| Parameter | Type | Default | Details |
| --- | --- | --- | --- |
| H | int | | Forecast horizon (number of steps to forecast ahead). |
| exog | Optional[pd.DataFrame] | None | Future exogenous regressors (must contain at least H rows). |
| Returns | Dict[str, np.ndarray] | | A dictionary where keys are target column names and values are arrays of H forecasted values for each target variable. |
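
One common way to produce H steps from lag-based features is recursion: predict one step for every target, append those predictions to the history, and repeat. Whether forecast works exactly this way is not stated here, so the sketch below is only an illustration of the idea and of the shape of the result (one sequence of H values per target). The function recursive_forecast, the toy predictors, and the target names "a" and "b" are all hypothetical.

```python
from typing import Callable, Dict, List

History = Dict[str, List[float]]


def recursive_forecast(
    history: History,
    predict_one: Dict[str, Callable[[History], float]],
    H: int,
) -> Dict[str, List[float]]:
    """Roll forward H steps, feeding each step's predictions back in as new history."""
    state = {col: list(values) for col, values in history.items()}  # work on a copy
    out: Dict[str, List[float]] = {col: [] for col in predict_one}
    for _ in range(H):
        # Predict every target from the same state before updating any of them.
        step = {col: predict_one[col](state) for col in predict_one}
        for col, value in step.items():
            state[col].append(value)
            out[col].append(value)
    return out


# Toy one-step predictors standing in for a fitted regression model.
predictors = {
    "a": lambda s: s["a"][-1] + 1.0,                # last value of a, plus one
    "b": lambda s: 0.5 * (s["a"][-1] + s["b"][-1]),  # mean of the latest a and b
}
forecasts = recursive_forecast({"a": [1.0, 2.0], "b": [4.0, 4.0]}, predictors, H=3)
print(forecasts)
```

Computing all targets from the same state before appending keeps the targets aligned in time, so a prediction for "b" at step k never sees the step-k prediction for "a".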

ml_mv_forecaster.cross_validate


def cross_validate(
    df:pd.DataFrame, # Input dataframe.
    target_col:str, # Target variable for evaluation.
    cv_split:int, # Number of cross-validation folds.
    test_size:int, # Test size per fold.
    metrics:List[Callable], # Metric functions (e.g. ``[MAE, RMSE]``) used to evaluate forecast accuracy across folds. Call ``.cv_summary()`` after cross-validation to retrieve the aggregated scores.
    step_size:int=1, # Step size for rolling window. Default is 1.
    h_split_point:Optional[int]=None, # Point to split the test set for separate evaluation. Default is None.
)->Union[pd.DataFrame, Tuple[pd.DataFrame, pd.DataFrame]]: # DataFrame with overall performance metrics averaged across folds. If h_split_point is provided, also includes separate performance before and after the split point.

Perform cross-validation.

| Parameter | Type | Default | Details |
| --- | --- | --- | --- |
| df | pd.DataFrame | | Input dataframe. |
| target_col | str | | Target variable for evaluation. |
| cv_split | int | | Number of cross-validation folds. |
| test_size | int | | Test size per fold. |
| metrics | List[Callable] | | Metric functions (e.g. [MAE, RMSE]) used to evaluate forecast accuracy across folds. Call .cv_summary() after cross-validation to retrieve the aggregated scores. |
| step_size | int | 1 | Step size for rolling window. Default is 1. |
| h_split_point | Optional[int] | None | Point to split the test set for separate evaluation. Default is None. |
| Returns | Union[pd.DataFrame, Tuple[pd.DataFrame, pd.DataFrame]] | | DataFrame with overall performance metrics averaged across folds. If h_split_point is provided, also includes separate performance before and after the split point. |
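
To make the interaction of cv_split, test_size, step_size, and h_split_point more concrete, here is a minimal plain-Python sketch of rolling-window folds and of scoring a test window before and after a split point. The exact fold layout used by cross_validate is not documented above, so the layout here (last fold ends at the final observation, earlier folds shifted back by step_size) is an assumption, and rolling_folds, mae, and score_with_split are hypothetical helpers rather than library functions.

```python
from typing import List, Optional, Sequence, Tuple


def rolling_folds(n_obs: int, cv_split: int, test_size: int, step_size: int = 1) -> List[Tuple[int, int]]:
    """Return (train_end, test_end) index pairs, oldest fold first.

    Assumption: the last fold's test window ends at the final observation and
    each earlier fold is shifted back by step_size observations.
    """
    folds = []
    for i in range(cv_split):
        test_end = n_obs - (cv_split - 1 - i) * step_size
        train_end = test_end - test_size
        if train_end <= 0:
            raise ValueError("not enough observations for the requested folds")
        folds.append((train_end, test_end))
    return folds


def mae(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute error over paired values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def score_with_split(
    actual: Sequence[float],
    predicted: Sequence[float],
    h_split_point: Optional[int] = None,
) -> Tuple[float, ...]:
    """Score the whole window, plus the parts before and after the split if given."""
    overall = mae(actual, predicted)
    if h_split_point is None:
        return (overall,)
    before = mae(actual[:h_split_point], predicted[:h_split_point])
    after = mae(actual[h_split_point:], predicted[h_split_point:])
    return (overall, before, after)


folds = rolling_folds(n_obs=20, cv_split=3, test_size=4, step_size=2)
scores = score_with_split([10, 12, 14, 16], [11, 12, 12, 20], h_split_point=2)
print(folds)
print(scores)
```

Splitting the score this way shows whether accuracy degrades toward the end of the horizon, which a single averaged metric would hide.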