ml_mv_forecast – peshbeen

ml_mv_forecaster


def ml_mv_forecaster(
    model:Any, # A scikit-learn compatible regression model instance (e.g. LGBMRegressor(), CatBoostRegressor(), LinearRegression(), etc.).
    target_cols:List[str], # List of target variable names to forecast.
    lags:Optional[Dict[str, Union[int, List[int]]]]=None, # Dictionary specifying lag features to create for each target variable. The value can be an integer (number of lags) or a list of specific lag periods.
    lag_transform:Optional[Dict[str, list]]=None, # Dictionary specifying lag-based transformations to apply for each target variable. The value should be a list of transformation functions (e.g. rolling_mean, expanding_std) with their parameters encapsulated in the function instance.
    difference:Optional[Dict[str, int]]=None, # Dictionary specifying the order of ordinary differencing to apply for each target variable.
    seasonal_diff:Optional[Dict[str, int]]=None, # Dictionary specifying the order of seasonal differencing to apply for each target variable.
    trend:Optional[Dict[str, str]]=None, # Dictionary specifying the trend removal strategy for each target variable. Supported values are 'linear', 'ets', 'feature_lr', and 'feature_ets'.
    pol_degree:Optional[Union[int, Dict[str, int]]]=1, # Polynomial degree for linear trend removal. Can be a single integer applied to all targets or a dictionary specifying the degree for each target variable.
    ets_params:Optional[Dict[str, Any]]=None, # Dictionary specifying ETS model and fit parameters for each target variable when using 'ets' trend removal. Each value is a dictionary of parameters for the ExponentialSmoothing model and fitting process.
    change_points:Optional[Dict[str, List[int]]]=None, # Dictionary specifying change points for piecewise linear trend removal for each target variable. The value should be a list of integer indices where the trend slope can change.
    box_cox:Optional[Dict[str, Union[bool, float, int]]]=None, # Dictionary specifying whether to apply Box-Cox transformation for each target variable. The value can be a boolean (True to apply with lambda estimated from data, False to skip) or a float (specific lambda value to use).
    box_cox_biasadj:Optional[Dict[str, bool]]=None, # Dictionary specifying whether to apply bias adjustment when inverting Box-Cox transformation for each target variable.
    cat_variables:Optional[List[str]]=None, # List of categorical feature column names to encode. These will be shared across all target variables.
    categorical_encoder:Optional[Union[Dict[str, Any], Any]]=None, # A categorical encoder instance, or a single-entry dictionary mapping the target column to the encoder when the encoder requires access to the target variable during fitting (e.g. {target_col: MeanEncoder()}). If an encoder requiring target access is provided directly without the dict format, the first target column in target_cols will be used for fitting the encoder. For encoders that do not require target access, pass the encoder instance directly (e.g. OneHotEncoder()).
)->None:

Initialize the multi-target machine learning forecaster with specified transformations and model.

	Type	Default	Details
model	Any		A scikit-learn compatible regression model instance (e.g. LGBMRegressor(), CatBoostRegressor(), LinearRegression(), etc.).
target_cols	List[str]		List of target variable names to forecast.
lags	Optional[Dict[str, Union[int, List[int]]]]	None	Dictionary specifying lag features to create for each target variable. The value can be an integer (number of lags) or a list of specific lag periods.
lag_transform	Optional[Dict[str, list]]	None	Dictionary specifying lag-based transformations to apply for each target variable. The value should be a list of transformation functions (e.g. rolling_mean, expanding_std) with their parameters encapsulated in the function instance.
difference	Optional[Dict[str, int]]	None	Dictionary specifying the order of ordinary differencing to apply for each target variable.
seasonal_diff	Optional[Dict[str, int]]	None	Dictionary specifying the order of seasonal differencing to apply for each target variable.
trend	Optional[Dict[str, str]]	None	Dictionary specifying the trend removal strategy for each target variable. Supported values are ‘linear’, ‘ets’, ‘feature_lr’, and ‘feature_ets’.
pol_degree	Optional[Union[int, Dict[str, int]]]	1	Polynomial degree for linear trend removal. Can be a single integer applied to all targets or a dictionary specifying the degree for each target variable.
ets_params	Optional[Dict[str, Any]]	None	Dictionary specifying ETS model and fit parameters for each target variable when using ‘ets’ trend removal. Each value is a dictionary of parameters for the ExponentialSmoothing model and fitting process.
change_points	Optional[Dict[str, List[int]]]	None	Dictionary specifying change points for piecewise linear trend removal for each target variable. The value should be a list of integer indices where the trend slope can change.
box_cox	Optional[Dict[str, Union[bool, float, int]]]	None	Dictionary specifying whether to apply Box-Cox transformation for each target variable. The value can be a boolean (True to apply with lambda estimated from data, False to skip) or a float (specific lambda value to use).
box_cox_biasadj	Optional[Dict[str, bool]]	None	Dictionary specifying whether to apply bias adjustment when inverting Box-Cox transformation for each target variable.
cat_variables	Optional[List[str]]	None	List of categorical feature column names to encode. These will be shared across all target variables.
categorical_encoder	Optional[Union[Dict[str, Any], Any]]	None	A categorical encoder instance, or a single-entry dictionary mapping the target column to the encoder when the encoder requires access to the target variable during fitting (e.g. {target_col: MeanEncoder()}). If an encoder requiring target access is provided directly without the dict format, the first target column in target_cols will be used for fitting the encoder. For encoders that do not require target access, pass the encoder instance directly (e.g. OneHotEncoder()).
Returns	None
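
The `lags` parameter accepts either an integer or a list per target. The sketch below illustrates how such a specification could be read, assuming that an integer `n` means lags 1 through `n`. It is a dependency-free illustration, not peshbeen's implementation; the helper names `expand_lags` and `lag_rows`, the `lag_{p}` feature names, and the `sales`/`price` targets are all hypothetical.

```python
from typing import Dict, List, Optional, Union

LagSpec = Dict[str, Union[int, List[int]]]


def expand_lags(lags: Optional[LagSpec]) -> Dict[str, List[int]]:
    """Normalise a per-target lag specification to explicit lag periods."""
    expanded: Dict[str, List[int]] = {}
    for target, spec in (lags or {}).items():
        if isinstance(spec, int):
            # Assumption: an integer n is read as "the first n lags" (1..n).
            expanded[target] = list(range(1, spec + 1))
        else:
            # A list names the specific lag periods to use.
            expanded[target] = sorted(spec)
    return expanded


def lag_rows(series: List[float], periods: List[int]) -> List[Dict[str, float]]:
    """Build one feature row per time step that has every requested lag available."""
    start = max(periods)  # earlier steps lack at least one lag
    return [
        {f"lag_{p}": series[t - p] for p in periods}
        for t in range(start, len(series))
    ]


# Hypothetical targets: three consecutive lags for one, two specific lags for the other.
spec = expand_lags({"sales": 3, "price": [1, 7]})
rows = lag_rows([10.0, 11.0, 12.0, 13.0, 14.0], spec["sales"])
```

With five observations and a maximum lag of 3, only the last two time steps have a complete set of lag features, which is why the first `max(periods)` rows of a training set are typically dropped.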


ml_mv_forecaster.fit


def fit(
    df:pd.DataFrame, # Training DataFrame containing all target and feature columns.
)->None:

Fit the model to the data passed in `df`.

	Type	Details
df	pd.DataFrame	Training DataFrame containing all target and feature columns.
Returns	None


ml_mv_forecaster.forecast


def forecast(
    H:int, # Forecast horizon (number of steps to forecast ahead).
    exog:Optional[pd.DataFrame]=None, # Future exogenous regressors (must contain at least H rows).
)->Dict[str, np.ndarray]: # A dictionary where keys are target column names and values are arrays of H forecasted values for each target variable.

Generate forecasts for H future time steps.

	Type	Default	Details
H	int		Forecast horizon (number of steps to forecast ahead).
exog	Optional[pd.DataFrame]	None	Future exogenous regressors (must contain at least H rows).
Returns	Dict[str, np.ndarray]		A dictionary where keys are target column names and values are arrays of H forecasted values for each target variable.
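
The return value maps each target name to `H` forecasted values. This page does not state how multi-step forecasts are produced, so the sketch below assumes a recursive strategy, in which each step's predictions are appended to the history and reused as lag inputs for the next step. The names `recursive_forecast` and `mean_of_last_two` are hypothetical, a toy averaging rule stands in for the fitted regressor, and plain lists are used instead of `np.ndarray` to keep the example dependency-free.

```python
from typing import Callable, Dict, List


def recursive_forecast(
    history: Dict[str, List[float]],
    H: int,
    predict_one: Callable[[str, Dict[str, List[float]]], float],
) -> Dict[str, List[float]]:
    """Forecast H steps by feeding each step's predictions back into the history."""
    if H < 1:
        raise ValueError("H must be a positive integer")
    # Work on copies so the caller's history is left untouched.
    extended = {name: list(values) for name, values in history.items()}
    forecasts: Dict[str, List[float]] = {name: [] for name in history}
    for _ in range(H):
        # Predict every target from the same snapshot, then append together,
        # so no target sees another target's value for the current step.
        step = {name: predict_one(name, extended) for name in extended}
        for name, value in step.items():
            extended[name].append(value)
            forecasts[name].append(value)
    return forecasts


def mean_of_last_two(name: str, series_by_target: Dict[str, List[float]]) -> float:
    """Toy stand-in for a fitted regressor: average of the last two values."""
    last_two = series_by_target[name][-2:]
    return sum(last_two) / len(last_two)


result = recursive_forecast(
    {"y1": [1.0, 3.0], "y2": [10.0, 10.0]}, H=3, predict_one=mean_of_last_two
)
```

Because later steps depend on earlier predictions, any future exogenous regressors must cover every forecasted step, which is consistent with the requirement that `exog` contain at least `H` rows.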


ml_mv_forecaster.cross_validate


def cross_validate(
    df:pd.DataFrame, # Input dataframe.
    target_col:str, # Target variable for evaluation.
    cv_split:int, # Number of cross-validation folds.
    test_size:int, # Test size per fold.
    metrics:List[Callable], # Metric functions (e.g. ``[MAE, RMSE]``) used to evaluate forecast accuracy across folds. Call ``.cv_summary()`` after cross-validation to retrieve the aggregated scores.
    step_size:int=1, # Step size for rolling window. Default is 1.
    h_split_point:Optional[int]=None, # Point to split the test set for separate evaluation. Default is None.
)->Union[pd.DataFrame, Tuple[pd.DataFrame, pd.DataFrame]]: # DataFrame with overall performance metrics averaged across folds. If h_split_point is provided, also includes separate performance before and after the split point.

Perform rolling-window cross-validation, scoring the forecasts for `target_col` on each fold with the supplied metrics.

	Type	Default	Details
df	pd.DataFrame		Input dataframe.
target_col	str		Target variable for evaluation.
cv_split	int		Number of cross-validation folds.
test_size	int		Test size per fold.
metrics	List[Callable]		Metric functions (e.g. `[MAE, RMSE]`) used to evaluate forecast accuracy across folds. Call `.cv_summary()` after cross-validation to retrieve the aggregated scores.
step_size	int	1	Step size for rolling window. Default is 1.
h_split_point	Optional[int]	None	Point to split the test set for separate evaluation. Default is None.
Returns	Union[pd.DataFrame, Tuple[pd.DataFrame, pd.DataFrame]]		DataFrame with overall performance metrics averaged across folds. If h_split_point is provided, also includes separate performance before and after the split point.
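
To make the roles of `cv_split`, `test_size`, `step_size`, and `h_split_point` concrete, here is a minimal sketch of one common rolling-origin scheme. It assumes that the last fold's test window ends at the final row and that each earlier fold is shifted back by `step_size`; peshbeen's actual fold layout may differ. The functions `rolling_origin_splits`, `mae`, and `mae_with_split` are hypothetical helpers written for this illustration.

```python
from typing import List, Optional, Sequence, Tuple


def rolling_origin_splits(
    n_rows: int, cv_split: int, test_size: int, step_size: int = 1
) -> List[Tuple[int, int]]:
    """Return (train_end, test_end) pairs; rows [train_end, test_end) form the test window."""
    splits = []
    for fold in range(cv_split):
        # Assumption: the final fold ends at the last row; earlier folds shift back.
        test_end = n_rows - (cv_split - 1 - fold) * step_size
        train_end = test_end - test_size
        if train_end <= 0:
            raise ValueError("not enough rows for the requested folds")
        splits.append((train_end, test_end))
    return splits


def mae(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute error over paired values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mae_with_split(
    actual: Sequence[float],
    predicted: Sequence[float],
    h_split_point: Optional[int] = None,
) -> Tuple[float, Optional[float], Optional[float]]:
    """Score the whole test window and, optionally, the parts before and after a split."""
    overall = mae(actual, predicted)
    if h_split_point is None:
        return overall, None, None
    before = mae(actual[:h_split_point], predicted[:h_split_point])
    after = mae(actual[h_split_point:], predicted[h_split_point:])
    return overall, before, after


folds = rolling_origin_splits(n_rows=20, cv_split=3, test_size=4, step_size=2)
scores = mae_with_split([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 5.0, 8.0], h_split_point=2)
```

Splitting the test window this way shows whether accuracy degrades at longer horizons: in the toy data above the first two steps are predicted exactly while the last two carry all of the error.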