| | Type | Default | Details |
|---|---|---|---|
| target_col | str | | Name of the target variable column in the input DataFrame. |
| order | Optional[Tuple[int, int, int]] | (0, 0, 0) | The (p, d, q) order of the ARIMA model. Default is (0, 0, 0). |
| seasonal_order | Optional[Tuple[int, int, int]] | (0, 0, 0) | The (P, D, Q) order of the seasonal ARIMA model. Default is (0, 0, 0). |
| seasonal_length | Optional[int] | 1 | The seasonal period for the seasonal ARIMA model. Default is 1. |
| lag_transform | Optional[list] | None | List of lag-transform function objects to apply to the target variable (e.g. [expanding_mean(shift=1), rolling_std(window_size=3, shift=1)]). Each function should take a pandas Series as input and return a Series of the same length. Default is None (no lag transforms). |
| trend | Optional[str] | None | Trend strategy to use. Options are ‘linear’ for linear trend removal, ‘ets’ for ETS-based trend removal, ‘feature_lr’ for using linear trend components as features, and ‘feature_ets’ for using ETS trend components as features. Default is None (no trend handling). |
| pol_degree | int | 1 | Degree of polynomial trend to fit when using ‘linear’ or ‘feature_lr’ trend strategy. Default is 1 (linear trend). |
| ets_params | Optional[Dict[str, Any]] | None | Dictionary of parameters for the ExponentialSmoothing model when using ‘ets’ trend strategy. The keys should be the parameter names and the values should be the parameter values. Default is None (use default ETS parameters). |
| change_points | Optional[List[int]] | None | List of indices in the time series where change points occur for piecewise linear trend fitting. Only used when trend strategy is ‘linear’ or ‘feature_lr’. Default is None (no change points, fit a single linear trend). |
| box_cox | Union[bool, float, int] | False | Whether to apply Box-Cox transformation to the target variable. If a float or int value is provided, it will be used as the lambda parameter for the Box-Cox transformation. If True, the lambda parameter will be estimated from the data. |
| box_cox_biasadj | bool | False | Whether to apply bias adjustment when inverting the Box-Cox transformation on forecasts. Default is False. |
| cat_variables | Optional[List[str]] | None | List of categorical feature column names. If provided, these columns will be treated as categorical variables and encoded accordingly. Default is None (no categorical variables). |
| categorical_encoder | Optional[Any] | None | Categorical encoder object (e.g. OneHotEncoder(), MeanEncoder(), etc.) to apply to the categorical variables specified in cat_variables. The encoder should have fit() and transform() methods that can be applied to the input DataFrame. Default is None (no categorical encoding); in that case, categorical variables can only be used if the model handles them natively (e.g. LGBM or CatBoost). |
| target_encode | bool | False | |
| Returns | None | | |
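The `box_cox` and `box_cox_biasadj` options in the table above can be illustrated with a small NumPy sketch. It assumes the standard Box-Cox definition and the commonly used bias-adjusted back-transformation that scales the naive inverse by a term depending on the forecast variance. The function names and the `fvar` argument are hypothetical, and the library's own implementation may differ.

```python
import numpy as np

def box_cox(y, lam):
    """Box-Cox transform: log(y) when lam == 0, else (y**lam - 1) / lam."""
    y = np.asarray(y, dtype=float)
    return np.log(y) if lam == 0 else (y**lam - 1.0) / lam

def inv_box_cox(w, lam, biasadj=False, fvar=None):
    """Invert the transform; optionally apply a bias adjustment.

    `fvar` is the forecast variance on the transformed scale
    (an assumed input, only needed when biasadj=True).
    """
    w = np.asarray(w, dtype=float)
    if lam == 0:
        out = np.exp(w)
        if biasadj:
            out = out * (1.0 + fvar / 2.0)
        return out
    base = lam * w + 1.0
    out = base ** (1.0 / lam)
    if biasadj:
        out = out * (1.0 + fvar * (1.0 - lam) / (2.0 * base**2))
    return out

y = np.array([1.0, 2.0, 4.0])
w = box_cox(y, lam=0.5)
y_back = inv_box_cox(w, lam=0.5)                                  # plain inverse
y_adj = inv_box_cox(box_cox(y, 0), 0, biasadj=True, fvar=0.1)     # bias-adjusted
```

Without the adjustment, the back-transformed forecast corresponds to the median on the original scale; the adjustment moves it toward the mean, which is why it needs a variance estimate.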
arima
def arima(
target_col:str, # Name of the target variable column in the input DataFrame.
order:Optional[Tuple[int, int, int]]=(0, 0, 0), # The (p, d, q) order of the ARIMA model. Default is (0, 0, 0).
seasonal_order:Optional[Tuple[int, int, int]]=(0, 0, 0), # The (P, D, Q) order of the seasonal ARIMA model. Default is (0, 0, 0).
seasonal_length:Optional[int]=1, # The seasonal period for the seasonal ARIMA model. Default is 1.
lag_transform:Optional[list]=None, # List of lag-transform function objects to apply to the target variable (e.g. [expanding_mean(shift=1), rolling_std(window_size=3, shift=1)]). Each function should take a pandas Series as input and return a Series of the same length. Default is None (no lag transforms).
trend:Optional[str]=None, # Trend strategy to use. Options are 'linear' for linear trend removal, 'ets' for ETS-based trend removal, 'feature_lr' for using linear trend components as features, and 'feature_ets' for using ETS trend components as features. Default is None (no trend handling).
pol_degree:int=1, # Degree of polynomial trend to fit when using 'linear' or 'feature_lr' trend strategy. Default is 1 (linear trend).
ets_params:Optional[Dict[str, Any]]=None, # Dictionary of parameters for the ExponentialSmoothing model when using 'ets' trend strategy. The keys should be the parameter names and the values should be the parameter values. Default is None (use default ETS parameters).
change_points:Optional[List[int]]=None, # List of indices in the time series where change points occur for piecewise linear trend fitting. Only used when trend strategy is 'linear' or 'feature_lr'. Default is None (no change points, fit a single linear trend).
box_cox:Union[bool, float, int]=False, # Whether to apply Box-Cox transformation to the target variable. If a float or int value is provided, it will be used as the lambda parameter for the Box-Cox transformation. If True, the lambda parameter will be estimated from the data.
box_cox_biasadj:bool=False, # Whether to apply bias adjustment when inverting the Box-Cox transformation on forecasts. Default is False.
cat_variables:Optional[List[str]]=None, # List of categorical feature column names. If provided, these columns will be treated as categorical variables and encoded accordingly. Default is None (no categorical variables).
categorical_encoder:Optional[Any]=None, # Categorical encoder object (e.g. OneHotEncoder(), MeanEncoder(), etc.) to apply to the categorical variables specified in cat_variables. The encoder should have fit() and transform() methods that can be applied to the input DataFrame. Default is None (no categorical encoding); in that case, categorical variables can only be used if the model handles them natively (e.g. LGBM or CatBoost).
target_encode:bool=False
)->None:
Initialize the arima model with the specified parameters and configurations.
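The `lag_transform` parameter is described as a list of callables that map a pandas Series to a Series of the same length. The sketch below shows what such transforms might compute using plain pandas. The two factories reuse the names from the parameter description but are hypothetical reimplementations for illustration, not the library's own objects.

```python
import pandas as pd

def expanding_mean(shift=1):
    """Hypothetical factory: expanding mean of the series lagged by `shift`."""
    def transform(y: pd.Series) -> pd.Series:
        return y.shift(shift).expanding().mean()
    return transform

def rolling_std(window_size=3, shift=1):
    """Hypothetical factory: rolling std of the series lagged by `shift`."""
    def transform(y: pd.Series) -> pd.Series:
        return y.shift(shift).rolling(window_size).std()
    return transform

y = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0])
features = pd.DataFrame({
    "expanding_mean": expanding_mean(shift=1)(y),
    "rolling_std": rolling_std(window_size=3, shift=1)(y),
})
```

Shifting before aggregating keeps each feature value free of the observation it is meant to help predict, which is the usual reason for `shift=1`.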
arima.fit
def fit(
df:pd.DataFrame, # Training DataFrame containing the target and any feature columns.
)->None:
Fit the model to the training data by applying the specified data preparation steps and then fitting the ARIMA model.
| | Type | Details |
|---|---|---|
| df | pd.DataFrame | Training DataFrame containing the target and any feature columns. |
| Returns | None | |
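One of the data preparation steps that `fit` may apply is trend removal with optional `change_points`. As a rough sketch of the idea, a piecewise linear trend can be fitted by least squares on an intercept, a time index, and one hinge term per change point. The helper name and the hinge formulation are assumptions for illustration; the library's internal procedure may differ.

```python
import numpy as np

def piecewise_trend_design(n, change_points):
    """Design matrix: intercept, time index, and max(0, t - cp) per change point."""
    t = np.arange(n, dtype=float)
    cols = [np.ones(n), t]
    for cp in change_points:
        cols.append(np.maximum(0.0, t - cp))
    return np.column_stack(cols)

# Synthetic series whose slope increases from 0.5 to 2.0 at index 10.
t = np.arange(20, dtype=float)
y = 1.0 + 0.5 * t + 1.5 * np.maximum(0.0, t - 10)

X = piecewise_trend_design(len(y), change_points=[10])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
trend = X @ coef
detrended = y - trend  # the part a downstream ARIMA model would see
```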
arima.forecast
def forecast(
H:int, # Forecast horizon.
exog:Optional[pd.DataFrame]=None, # Optional dataframe of future regressors.
)->np.ndarray: # Forecast values of length `H`.
Recursive multi-step forecast.
| | Type | Default | Details |
|---|---|---|---|
| H | int | | Forecast horizon. |
| exog | Optional[pd.DataFrame] | None | Optional dataframe of future regressors. |
| Returns | np.ndarray | | Forecast values of length H. |
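A recursive multi-step forecast produces one step at a time and feeds each prediction back in as an input for the next step. The sketch below shows that loop for a plain autoregressive model with known coefficients. The function and its arguments are hypothetical and only illustrate the recursion, not how `forecast` is implemented.

```python
import numpy as np

def recursive_forecast(history, coefs, intercept, H):
    """Forecast H steps ahead with an AR model.

    coefs[0] multiplies the most recent value, coefs[1] the one before, etc.
    """
    p = len(coefs)
    buf = list(history[-p:])          # last p observations, oldest first
    preds = []
    for _ in range(H):
        lags = buf[::-1]              # most recent value first
        y_hat = intercept + float(np.dot(coefs, lags))
        preds.append(y_hat)
        buf = (buf + [y_hat])[-p:]    # prediction becomes the newest "observation"
    return np.array(preds)

preds = recursive_forecast([1.0, 2.0, 3.0], coefs=[0.5, 0.25], intercept=1.0, H=3)
```

Because later steps depend on earlier predictions rather than on observed values, errors can accumulate as the horizon grows.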
arima.cross_validate
def cross_validate(
df:pd.DataFrame, # DataFrame containing the target and any feature columns.
cv_split:int, # Number of cross-validation splits.
test_size:int, # Number of periods in each test set.
metrics:List[Callable], # Metric functions (e.g. ``[MAE, RMSE]``) used to evaluate forecast accuracy across folds. Call ``.cv_summary()`` after cross-validation to retrieve the aggregated scores.
step_size:int=1, # Step size to move the test window forward in each split.
h_split_point:Optional[int]=None, # Optional index to split the test set into two parts for separate evaluation (e.g. to evaluate short-term vs long-term performance). If None, no split is done.
)->Tuple[pd.DataFrame, pd.DataFrame]: # DataFrame containing overall performance metrics averaged across splits, and a DataFrame with predictions and true values for each split.
Run cross-validation using time series splits.
| | Type | Default | Details |
|---|---|---|---|
| df | pd.DataFrame | | DataFrame containing the target and any feature columns. |
| cv_split | int | | Number of cross-validation splits. |
| test_size | int | | Number of periods in each test set. |
| metrics | List[Callable] | | Metric functions (e.g. [MAE, RMSE]) used to evaluate forecast accuracy across folds. Call .cv_summary() after cross-validation to retrieve the aggregated scores. |
| step_size | int | 1 | Step size to move the test window forward in each split. |
| h_split_point | Optional[int] | None | Optional index to split the test set into two parts for separate evaluation (e.g. to evaluate short-term vs long-term performance). If None, no split is done. |
| Returns | Tuple[pd.DataFrame, pd.DataFrame] | | DataFrame containing overall performance metrics averaged across splits, and a DataFrame with predictions and true values for each split. |
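To make the roles of `cv_split`, `test_size`, `step_size`, and `h_split_point` concrete, the sketch below builds expanding-window folds over integer positions. It assumes the last fold's test window ends at the final observation and earlier folds are shifted back by `step_size`; this layout and the helper name are assumptions, and the library may arrange its folds differently.

```python
def time_series_splits(n, cv_split, test_size, step_size=1):
    """Return a list of (train_positions, test_positions) ranges."""
    splits = []
    for k in range(cv_split):
        test_end = n - (cv_split - 1 - k) * step_size
        test_start = test_end - test_size
        if test_start <= 0:
            raise ValueError("not enough observations for the requested folds")
        splits.append((range(0, test_start), range(test_start, test_end)))
    return splits

splits = time_series_splits(n=10, cv_split=3, test_size=2, step_size=1)

# Splitting each test window at a hypothetical h_split_point of 1
# separates the first forecast step from the remaining ones.
h_split_point = 1
short_term = [test[:h_split_point] for _, test in splits]
long_term = [test[h_split_point:] for _, test in splits]
```

Each training range ends right before its test range, so no fold is evaluated on data the model has already seen.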