| | Type | Default | Details |
|---|---|---|---|
| model | Any | | A regression model object (e.g. LGBMRegressor(), XGBRegressor(), LinearRegression(), etc.) |
| target_col | str | | Name of the target variable column in the input DataFrame. |
| H | Optional[Union[int, List[int]]] | | Forecast horizon(s). If int, forecasts 1..H. If list, forecasts specified horizons. |
| lags | Optional[Union[int, List[int]]] | None | Lags to include as features. Default is None. |
| lag_transform | Optional[list] | None | Lag-transform functions to apply to the target variable. Default is None. |
| difference | Optional[int] | None | Order of ordinary differencing. Default is None. |
| seasonal_diff | Optional[int] | None | Seasonal period for seasonal differencing. Default is None. |
| trend | Optional[str] | None | Trend strategy: ‘linear’ or ‘ets’. Default is None. |
| pol_degree | int | 1 | Polynomial degree for linear trend. Default is 1. |
| ets_params | Optional[Dict[str, Any]] | None | Parameters for ExponentialSmoothing when trend=‘ets’. Default is None. |
| change_points | Optional[List[int]] | None | Breakpoint indices for piecewise linear trend. Default is None. |
| box_cox | Union[bool, float, int] | False | Box-Cox transformation. If float/int, used as lambda. If True, lambda is estimated. Default is False. |
| box_cox_biasadj | bool | False | Bias adjustment when inverting Box-Cox. Default is False. |
| cat_variables | Optional[List[str]] | None | List of categorical feature column names. If provided, these columns will be treated as categorical variables and encoded accordingly. Default is None (no categorical variables). |
| categorical_encoder | Optional[Any] | None | Categorical encoder object (e.g. OneHotEncoder(), MeanEncoder(), etc.) to apply to the categorical variables specified in cat_variables. The encoder should have fit() and transform() methods that can be applied to the input DataFrame. Default is None (no categorical encoding) and if None, categorical variables can only be used if the model can handle them natively (e.g. LGBM or CatBoost). |
| Returns | None | | |
ml_direct_forecaster
def ml_direct_forecaster(
model:Any, # A regression model object (e.g. LGBMRegressor(), XGBRegressor(), LinearRegression(), etc.)
target_col:str, # Name of the target variable column in the input DataFrame.
H:Optional[Union[int, List[int]]], # Forecast horizon(s). If int, forecasts 1..H. If list, forecasts specified horizons.
lags:Optional[Union[int, List[int]]]=None, # Lags to include as features. Default is None.
lag_transform:Optional[list]=None, # Lag-transform functions to apply to the target variable. Default is None.
difference:Optional[int]=None, # Order of ordinary differencing. Default is None.
seasonal_diff:Optional[int]=None, # Seasonal period for seasonal differencing. Default is None.
trend:Optional[str]=None, # Trend strategy: 'linear' or 'ets'. Default is None.
pol_degree:int=1, # Polynomial degree for linear trend. Default is 1.
ets_params:Optional[Dict[str, Any]]=None, # Parameters for ExponentialSmoothing when trend='ets'. Default is None.
change_points:Optional[List[int]]=None, # Breakpoint indices for piecewise linear trend. Default is None.
box_cox:Union[bool, float, int]=False, # Box-Cox transformation. If float/int, used as lambda. If True, lambda is estimated. Default is False.
box_cox_biasadj:bool=False, # Bias adjustment when inverting Box-Cox. Default is False.
cat_variables:Optional[List[str]]=None, # List of categorical feature column names. If provided, these columns will be treated as categorical variables and encoded accordingly. Default is None (no categorical variables).
categorical_encoder:Optional[Any]=None, # Categorical encoder object (e.g. OneHotEncoder(), MeanEncoder(), etc.) to apply to the categorical variables specified in cat_variables. The encoder should have fit() and transform() methods that can be applied to the input DataFrame. Default is None (no categorical encoding) and if None, categorical variables can only be used if the model can handle them natively (e.g. LGBM or CatBoost).
)->None:
Initialize the ml_direct_forecaster with the specified model and preprocessing options. Unlike ml_forecaster, this class uses a direct forecasting strategy: a separate model is trained for each horizon h.
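To make the contrast with a recursive strategy concrete, the sketch below fits a plain least-squares model on lag features in two ways: as one model per horizon (direct), and as a single one-step model whose predictions are rolled forward (recursive). It uses only NumPy and does not call this library. `make_lagged`, `fit_ols`, `predict_ols`, and the toy series are illustrative assumptions, not part of the documented API.

```python
import numpy as np

def make_lagged(y, n_lags, h):
    """Return (X, target): X[i] holds the n_lags values up to time t,
    and target[i] is the value h steps after t."""
    rows, targets = [], []
    for t in range(n_lags - 1, len(y) - h):
        rows.append(y[t - n_lags + 1 : t + 1][::-1])  # most recent value first
        targets.append(y[t + h])
    return np.array(rows), np.array(targets)

def fit_ols(X, target):
    """Least-squares fit with an intercept; returns the coefficient vector."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, target, rcond=None)
    return coef

def predict_ols(coef, x):
    return coef[0] + x @ coef[1:]

rng = np.random.default_rng(0)
y = np.sin(np.arange(200) * 0.3) + 0.1 * rng.normal(size=200)  # toy series
n_lags, H = 3, 4
last = y[-n_lags:][::-1]  # latest observed values, most recent first

# Direct: one model per horizon, all fed the same observed lags.
direct_models = {h: fit_ols(*make_lagged(y, n_lags, h)) for h in range(1, H + 1)}
direct = np.array([predict_ols(direct_models[h], last) for h in range(1, H + 1)])

# Recursive: a single one-step model whose predictions are fed back as lags.
one_step = direct_models[1]
window, recursive = last.copy(), []
for _ in range(H):
    pred = predict_ols(one_step, window)
    recursive.append(pred)
    window = np.concatenate([[pred], window[:-1]])
recursive = np.array(recursive)
```

The two forecasts agree at horizon 1, where both use the same one-step model on observed data. They can differ from horizon 2 onward, because only the recursive path consumes its own earlier predictions.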
ml_direct_forecaster.fit
def fit(
df:pd.DataFrame, # Training DataFrame containing the target and any feature columns.
)->None:
Fit a separate model for each requested horizon h (1..H when H is an int, or the listed horizons when H is a list). For each h, the target is shifted h steps forward so the model learns to predict the value h steps ahead directly from the current lag features, bypassing recursive error accumulation.
| | Type | Details |
|---|---|---|
| df | pd.DataFrame | Training DataFrame containing the target and any feature columns. |
| Returns | None | |
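The target shift described above can be illustrated with pandas. This is a minimal sketch under an assumed indexing convention (row t carries the lags observed before t, and horizon h pairs that row with the value h steps after the newest lag). The library's internal layout may differ, and the names `lags`, `training_sets`, and `rows_per_horizon` are hypothetical.

```python
import pandas as pd

y = pd.Series([10, 12, 13, 15, 18, 20, 23, 25], name="y", dtype=float)
n_lags, H = 2, 3

# Lag features for row t: the n_lags values observed before t.
lags = pd.DataFrame({f"lag_{k}": y.shift(k) for k in range(1, n_lags + 1)})

training_sets = {}
for h in range(1, H + 1):
    # Pair row t with the value h steps after the newest lag, i.e. y[t + h - 1].
    target = y.shift(-(h - 1)).rename(f"y_h{h}")
    # Rows without a full set of lags or without a future target are dropped.
    frame = pd.concat([lags, target], axis=1).dropna()
    training_sets[h] = (frame[lags.columns], frame[target.name])

# Each extra step of horizon costs one training row at the end of the series.
rows_per_horizon = {h: len(X) for h, (X, _) in training_sets.items()}
```

With eight observations and two lags, the three horizons end up with 6, 5, and 4 training rows respectively, so longer horizons are always fitted on slightly less data.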
ml_direct_forecaster.forecast
def forecast(
H:int, # Forecast horizon. Must be <= self.H (models are only trained up to self.H).
exog:Optional[pd.DataFrame]=None, # Optional future exogenous variables (H rows).
)->np.ndarray: # Forecast values of length H.
Generate direct multi-step forecasts. Each horizon h is predicted independently by its own model using the most recent lag features — no predictions are fed back as inputs.
| | Type | Default | Details |
|---|---|---|---|
| H | int | | Forecast horizon. Must be <= self.H (models are only trained up to self.H). |
| exog | Optional[pd.DataFrame] | None | Optional future exogenous variables (H rows). |
| Returns | np.ndarray | | Forecast values of length H. |
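The forecasting step and the `H <= self.H` constraint can be sketched as follows. The per-horizon "models" here are hypothetical hand-written callables standing in for fitted regressors, and `direct_forecast` is an illustrative helper rather than the library's method. Exogenous inputs are left out.

```python
import numpy as np

def direct_forecast(models, last_lags, H):
    """models maps horizon h -> callable(lag_vector) -> float."""
    trained_H = max(models)
    if H > trained_H:
        raise ValueError(f"H={H} exceeds the trained horizon {trained_H}")
    # Every horizon sees the same observed lags; no prediction is reused as input.
    return np.array([models[h](last_lags) for h in range(1, H + 1)])

# Hypothetical pre-fitted models: horizon h adds h times the last observed change.
models = {h: (lambda lags, h=h: lags[0] + h * (lags[0] - lags[1])) for h in (1, 2, 3)}
last_lags = np.array([25.0, 23.0])  # most recent value first

forecast = direct_forecast(models, last_lags, H=2)
```

Requesting a horizon beyond the trained range fails immediately in this sketch, since no model exists for it. A recursive forecaster has no such limit, because it can keep feeding its own predictions forward.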
ml_direct_forecaster.cross_validate
def cross_validate(
df:pd.DataFrame, # DataFrame containing the target and any feature columns.
cv_split:int, # Number of cross-validation splits.
metrics:List[Callable], # Metric functions (e.g. ``[MAE, RMSE]``) used to evaluate forecast accuracy across folds. Call ``.cv_summary()`` after cross-validation to retrieve the aggregated scores.
step_size:int=1, # Step size to move the test window forward in each split.
h_split_point:Optional[int]=None, # Optional index to split the test set into two parts for separate evaluation (e.g. to evaluate short-term vs long-term performance). If None, no split is done.
)->Tuple[pd.DataFrame, pd.DataFrame]: # DataFrame containing overall performance metrics averaged across splits, and a DataFrame with predictions and true values for each split.
Run cross-validation using time series splits.
| | Type | Default | Details |
|---|---|---|---|
| df | pd.DataFrame | | DataFrame containing the target and any feature columns. |
| cv_split | int | | Number of cross-validation splits. |
| metrics | List[Callable] | | Metric functions (e.g. [MAE, RMSE]) used to evaluate forecast accuracy across folds. Call .cv_summary() after cross-validation to retrieve the aggregated scores. |
| step_size | int | 1 | Step size to move the test window forward in each split. |
| h_split_point | Optional[int] | None | Optional index to split the test set into two parts for separate evaluation (e.g. to evaluate short-term vs long-term performance). If None, no split is done. |
| Returns | Tuple[pd.DataFrame, pd.DataFrame] | | DataFrame containing overall performance metrics averaged across splits, and a DataFrame with predictions and true values for each split. |
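One common way to build time series splits is an expanding training window, with a fixed-length test window that moves forward by `step_size` each fold. The sketch below assumes that scheme. The exact fold layout used by `cross_validate` is not stated here, and `time_series_splits` and its `test_size` argument are hypothetical names.

```python
def time_series_splits(n_obs, n_splits, test_size, step_size=1):
    """Yield (train_indices, test_indices) with an expanding training window.

    The last fold's test window ends at the final observation; each earlier
    fold starts step_size observations sooner.
    """
    first_test_start = n_obs - test_size - (n_splits - 1) * step_size
    if first_test_start <= 0:
        raise ValueError("not enough observations for the requested splits")
    for i in range(n_splits):
        start = first_test_start + i * step_size
        yield range(0, start), range(start, start + test_size)

splits = [
    (list(train), list(test))
    for train, test in time_series_splits(10, n_splits=3, test_size=2, step_size=2)
]
```

With 10 observations, the three test windows are `[4, 5]`, `[6, 7]`, and `[8, 9]`, and each training window contains every observation before its test window, so no fold trains on future data.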