

<!-- WARNING: THIS FILE WAS AUTOGENERATED! DO NOT EDIT! -->

## Hyperparameter tuning methods for univariate machine learning models

------------------------------------------------------------------------

<a
href="https://github.com/mustafaslanCoto/peshbeen/blob/main/peshbeen/model_selection.py#L15"
target="_blank" style="float:right; font-size:smaller">source</a>

### SplitTimeSeries

``` python

def SplitTimeSeries(
    n_splits:int, # Number of splits to generate.
    test_size:int, # The number of samples in each test set.
    step_size:Optional=None, # The number of samples to move the test set forward for each split. If None, it defaults to the test_size, meaning non-overlapping test sets.
)->None:

```

*A time series cross-validator that generates train/test splits with a
fixed test size and a configurable step size.*

<table>
<colgroup>
<col style="width: 6%" />
<col style="width: 25%" />
<col style="width: 34%" />
<col style="width: 34%" />
</colgroup>
<thead>
<tr>
<th></th>
<th><strong>Type</strong></th>
<th><strong>Default</strong></th>
<th><strong>Details</strong></th>
</tr>
</thead>
<tbody>
<tr>
<td>n_splits</td>
<td>int</td>
<td></td>
<td>Number of splits to generate.</td>
</tr>
<tr>
<td>test_size</td>
<td>int</td>
<td></td>
<td>The number of samples in each test set.</td>
</tr>
<tr>
<td>step_size</td>
<td>Union</td>
<td>None</td>
<td>The number of samples to move the test set forward for each split.
If None, it defaults to the test_size, meaning non-overlapping test
sets.</td>
</tr>
<tr>
<td><strong>Returns</strong></td>
<td><strong>None</strong></td>
<td></td>
<td></td>
</tr>
</tbody>
</table>
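
Although documented here via its constructor, `SplitTimeSeries` behaves like a scikit-learn splitter. A minimal sketch, assuming it exposes a `split` generator yielding train/test index arrays (check the source if the method name differs):

``` python
import pandas as pd
from peshbeen.model_selection import SplitTimeSeries

# 100 daily observations of a toy series
y = pd.Series(range(100),
              index=pd.date_range("2024-01-01", periods=100, freq="D"))

# Three folds with 12 test samples each; step_size=None means the test
# windows do not overlap (the step defaults to test_size).
cv = SplitTimeSeries(n_splits=3, test_size=12)
for train_idx, test_idx in cv.split(y):  # assumed sklearn-style API
    print(f"train={len(train_idx)} test={len(test_idx)}")
```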

------------------------------------------------------------------------

<a
href="https://github.com/mustafaslanCoto/peshbeen/blob/main/peshbeen/model_selection.py#L72"
target="_blank" style="float:right; font-size:smaller">source</a>

### hyperopt_tune

``` python

def hyperopt_tune(
    model:object, # Forecasting model object with .fit and .forecast methods and relevant attributes.
    df:DataFrame, # Time series data with a datetime index and a target column and optionally exogenous features.
    cv_split:int, # Number of cross-validation splits.
    test_size:int, # Number of samples in each test set. For ml_direct_forecaster, this will be overridden to be the maximum horizon in model.H.
    eval_metric:Callable, # Evaluation metric function.
    param_space:dict, # Hyperparameter search space for the forecasting model.
    step_size:int=None, # Step size to move the test window forward in each split.
    eval_num:int=100, # Number of hyperparameter combinations to evaluate. Default is 100.
    verbose:bool=False, # Whether to print the evaluation metric for each hyperparameter combination. Default is False.
)->Tuple: # A tuple containing the best hyperparameters, selected lags, and selected transforms.

```

*Tune forecasting model hyperparameters using time series
cross-validation and hyperopt.*

<table>
<colgroup>
<col style="width: 6%" />
<col style="width: 25%" />
<col style="width: 34%" />
<col style="width: 34%" />
</colgroup>
<thead>
<tr>
<th></th>
<th><strong>Type</strong></th>
<th><strong>Default</strong></th>
<th><strong>Details</strong></th>
</tr>
</thead>
<tbody>
<tr>
<td>model</td>
<td>object</td>
<td></td>
<td>Forecasting model object with .fit and .forecast methods and
relevant attributes.</td>
</tr>
<tr>
<td>df</td>
<td>DataFrame</td>
<td></td>
<td>Time series data with a datetime index and a target column and
optionally exogenous features.</td>
</tr>
<tr>
<td>cv_split</td>
<td>int</td>
<td></td>
<td>Number of cross-validation splits.</td>
</tr>
<tr>
<td>test_size</td>
<td>int</td>
<td></td>
<td>Number of samples in each test set. For ml_direct_forecaster, this
will be overridden to be the maximum horizon in model.H.</td>
</tr>
<tr>
<td>eval_metric</td>
<td>Callable</td>
<td></td>
<td>Evaluation metric function.</td>
</tr>
<tr>
<td>param_space</td>
<td>dict</td>
<td></td>
<td>Hyperparameter search space for the forecasting model.</td>
</tr>
<tr>
<td>step_size</td>
<td>int</td>
<td>None</td>
<td>Step size to move the test window forward in each split.</td>
</tr>
<tr>
<td>eval_num</td>
<td>int</td>
<td>100</td>
<td>Number of hyperparameter combinations to evaluate. Default is
100.</td>
</tr>
<tr>
<td>verbose</td>
<td>bool</td>
<td>False</td>
<td>Whether to print the evaluation metric for each hyperparameter
combination. Default is False.</td>
</tr>
<tr>
<td><strong>Returns</strong></td>
<td><strong>Tuple</strong></td>
<td></td>
<td><strong>A tuple containing the best hyperparameters, selected lags,
and selected transforms.</strong></td>
</tr>
</tbody>
</table>
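
A minimal usage sketch. The `param_space` uses hyperopt's `hp` primitives; the forecaster construction, the `(y_true, y_pred)` metric signature, and the hyperparameter names are assumptions to adapt to your model:

``` python
import numpy as np
from hyperopt import hp
from peshbeen.model_selection import hyperopt_tune

def mae(y_true, y_pred):  # assumed (y_true, y_pred) -> float metric signature
    return float(np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred))))

param_space = {
    "n_estimators": hp.quniform("n_estimators", 50, 500, 25),  # yields floats; cast if needed
    "learning_rate": hp.uniform("learning_rate", 0.01, 0.3),
}

# `model` is any configured forecaster with .fit/.forecast, and `df` a
# datetime-indexed DataFrame containing the target column (built elsewhere).
best_params, best_lags, best_transforms = hyperopt_tune(
    model=model,
    df=df,
    cv_split=3,
    test_size=12,
    eval_metric=mae,
    param_space=param_space,
    eval_num=50,
    verbose=True,
)
```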


------------------------------------------------------------------------

<a
href="https://github.com/mustafaslanCoto/peshbeen/blob/main/peshbeen/model_selection.py#L223"
target="_blank" style="float:right; font-size:smaller">source</a>

### optuna_tune

``` python

def optuna_tune(
    model:object, # Forecasting model with .fit and .forecast methods.
    df:DataFrame, # Time series data (datetime index, target column, optional exogenous features).
    cv_split:int, # Number of cross-validation splits.
    test_size:int, # Number of samples in each test fold. For ml_direct_forecaster, this will be overridden to be the maximum horizon in model.H.
    eval_metric:Callable, # Metric function to minimise.
    param_space:Dict, # Each value must be a callable that accepts an Optuna `trial` and returns a value.
    step_size:int=None, # Step size between CV folds.
    eval_num:int=100, # Number of Optuna trials. Default 100.
    verbose:bool=False, # Print score for every trial. Default False.
)->Tuple: # Best hyperparameters and best lags (if 'lags' is in param_space).

```

*Tune forecasting model hyperparameters using time series
cross-validation and Optuna.*

<table>
<colgroup>
<col style="width: 6%" />
<col style="width: 25%" />
<col style="width: 34%" />
<col style="width: 34%" />
</colgroup>
<thead>
<tr>
<th></th>
<th><strong>Type</strong></th>
<th><strong>Default</strong></th>
<th><strong>Details</strong></th>
</tr>
</thead>
<tbody>
<tr>
<td>model</td>
<td>object</td>
<td></td>
<td>Forecasting model with .fit and .forecast methods.</td>
</tr>
<tr>
<td>df</td>
<td>DataFrame</td>
<td></td>
<td>Time series data (datetime index, target column, optional exogenous
features).</td>
</tr>
<tr>
<td>cv_split</td>
<td>int</td>
<td></td>
<td>Number of cross-validation splits.</td>
</tr>
<tr>
<td>test_size</td>
<td>int</td>
<td></td>
<td>Number of samples in each test fold. For ml_direct_forecaster, this
will be overridden to be the maximum horizon in model.H.</td>
</tr>
<tr>
<td>eval_metric</td>
<td>Callable</td>
<td></td>
<td>Metric function to minimise.</td>
</tr>
<tr>
<td>param_space</td>
<td>Dict</td>
<td></td>
<td>Each value must be a callable that accepts an Optuna
<code>trial</code> and returns a value.</td>
</tr>
<tr>
<td>step_size</td>
<td>int</td>
<td>None</td>
<td>Step size between CV folds.</td>
</tr>
<tr>
<td>eval_num</td>
<td>int</td>
<td>100</td>
<td>Number of Optuna trials. Default 100.</td>
</tr>
<tr>
<td>verbose</td>
<td>bool</td>
<td>False</td>
<td>Print score for every trial. Default False.</td>
</tr>
<tr>
<td><strong>Returns</strong></td>
<td><strong>Tuple</strong></td>
<td></td>
<td><strong>Best hyperparameters and best lags (if ‘lags’ is in
param_space).</strong></td>
</tr>
</tbody>
</table>
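
The key difference from `hyperopt_tune` is the shape of `param_space`: every value must be a callable that takes an Optuna `trial` and returns a value. A sketch under the same assumptions as above (`model`, `df`, and `mae` defined elsewhere):

``` python
from peshbeen.model_selection import optuna_tune

param_space = {
    "n_estimators": lambda trial: trial.suggest_int("n_estimators", 50, 500),
    "learning_rate": lambda trial: trial.suggest_float("learning_rate", 0.01, 0.3, log=True),
    "lags": lambda trial: trial.suggest_int("lags", 1, 24),  # enables the lag search
}

best_params, best_lags = optuna_tune(  # best lags returned because 'lags' is in param_space
    model=model, df=df, cv_split=3, test_size=12,
    eval_metric=mae, param_space=param_space, eval_num=50,
)
```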

## Hyperparameter tuning methods for multivariate machine learning models

------------------------------------------------------------------------

<a
href="https://github.com/mustafaslanCoto/peshbeen/blob/main/peshbeen/model_selection.py#L376"
target="_blank" style="float:right; font-size:smaller">source</a>

### mv_hyperopt_tune

``` python

def mv_hyperopt_tune(
    model:object, # Forecasting model object with .fit and .forecast methods and relevant attributes.
    df:DataFrame, # Time series data with a datetime index and a target column and optionally exogenous features.
    target_col:str, # Name of the target column to minimize the evaluation metric on.
    cv_split:int, # Number of cross-validation splits.
    test_size:int, # Number of samples in each test set.
    eval_metric:Callable, # Evaluation metric function.
    param_space:dict, # Hyperparameter search space for the forecasting model.
    step_size:int=None, # Step size to move the test window forward in each split.
    eval_num:int=100, # Number of hyperparameter combinations to evaluate. Default is 100.
    verbose:bool=False, # Whether to print the evaluation metric for each hyperparameter combination. Default is False.
)->Tuple: # A tuple containing the best hyperparameters, selected lags, and selected transforms.

```

*Tune forecasting model hyperparameters using time series
cross-validation and hyperopt for multivariate models.*

<table>
<colgroup>
<col style="width: 6%" />
<col style="width: 25%" />
<col style="width: 34%" />
<col style="width: 34%" />
</colgroup>
<thead>
<tr>
<th></th>
<th><strong>Type</strong></th>
<th><strong>Default</strong></th>
<th><strong>Details</strong></th>
</tr>
</thead>
<tbody>
<tr>
<td>model</td>
<td>object</td>
<td></td>
<td>Forecasting model object with .fit and .forecast methods and
relevant attributes.</td>
</tr>
<tr>
<td>df</td>
<td>DataFrame</td>
<td></td>
<td>Time series data with a datetime index and a target column and
optionally exogenous features.</td>
</tr>
<tr>
<td>target_col</td>
<td>str</td>
<td></td>
<td>Name of the target column to minimize the evaluation metric on.</td>
</tr>
<tr>
<td>cv_split</td>
<td>int</td>
<td></td>
<td>Number of cross-validation splits.</td>
</tr>
<tr>
<td>test_size</td>
<td>int</td>
<td></td>
<td>Number of samples in each test set.</td>
</tr>
<tr>
<td>eval_metric</td>
<td>Callable</td>
<td></td>
<td>Evaluation metric function.</td>
</tr>
<tr>
<td>param_space</td>
<td>dict</td>
<td></td>
<td>Hyperparameter search space for the forecasting model.</td>
</tr>
<tr>
<td>step_size</td>
<td>int</td>
<td>None</td>
<td>Step size to move the test window forward in each split.</td>
</tr>
<tr>
<td>eval_num</td>
<td>int</td>
<td>100</td>
<td>Number of hyperparameter combinations to evaluate. Default is
100.</td>
</tr>
<tr>
<td>verbose</td>
<td>bool</td>
<td>False</td>
<td>Whether to print the evaluation metric for each hyperparameter
combination. Default is False.</td>
</tr>
<tr>
<td><strong>Returns</strong></td>
<td><strong>Tuple</strong></td>
<td></td>
<td><strong>A tuple containing the best hyperparameters, selected lags,
and selected transforms.</strong></td>
</tr>
</tbody>
</table>

------------------------------------------------------------------------

<a
href="https://github.com/mustafaslanCoto/peshbeen/blob/main/peshbeen/model_selection.py#L497"
target="_blank" style="float:right; font-size:smaller">source</a>

### mv_optuna_tune

``` python

def mv_optuna_tune(
    model:object, # Forecasting model with .fit and .forecast methods.
    df:DataFrame, # Time series data (datetime index, target column, optional exogenous features).
    target_col:str, # Name of the target column to minimize the evaluation metric on.
    cv_split:int, # Number of cross-validation splits.
    test_size:int, # Number of samples in each test fold.
    eval_metric:Callable, # Metric function to minimise.
    param_space:Dict, # Each value must be a callable that accepts an Optuna `trial` and returns a value.
    step_size:int=None, # Step size between CV folds.
    eval_num:int=100, # Number of Optuna trials. Default 100.
    verbose:bool=False, # Print score for every trial. Default False.
)->Tuple: # Best hyperparameters and best lags (if 'lags' is in param_space).

```

*Tune forecasting model hyperparameters using time series
cross-validation and Optuna.*

<table>
<colgroup>
<col style="width: 6%" />
<col style="width: 25%" />
<col style="width: 34%" />
<col style="width: 34%" />
</colgroup>
<thead>
<tr>
<th></th>
<th><strong>Type</strong></th>
<th><strong>Default</strong></th>
<th><strong>Details</strong></th>
</tr>
</thead>
<tbody>
<tr>
<td>model</td>
<td>object</td>
<td></td>
<td>Forecasting model with .fit and .forecast methods.</td>
</tr>
<tr>
<td>df</td>
<td>DataFrame</td>
<td></td>
<td>Time series data (datetime index, target column, optional exogenous
features).</td>
</tr>
<tr>
<td>target_col</td>
<td>str</td>
<td></td>
<td>Name of the target column to minimize the evaluation metric on.</td>
</tr>
<tr>
<td>cv_split</td>
<td>int</td>
<td></td>
<td>Number of cross-validation splits.</td>
</tr>
<tr>
<td>test_size</td>
<td>int</td>
<td></td>
<td>Number of samples in each test fold.</td>
</tr>
<tr>
<td>eval_metric</td>
<td>Callable</td>
<td></td>
<td>Metric function to minimise.</td>
</tr>
<tr>
<td>param_space</td>
<td>Dict</td>
<td></td>
<td>Each value must be a callable that accepts an Optuna
<code>trial</code> and returns a value.</td>
</tr>
<tr>
<td>step_size</td>
<td>int</td>
<td>None</td>
<td>Step size between CV folds.</td>
</tr>
<tr>
<td>eval_num</td>
<td>int</td>
<td>100</td>
<td>Number of Optuna trials. Default 100.</td>
</tr>
<tr>
<td>verbose</td>
<td>bool</td>
<td>False</td>
<td>Print score for every trial. Default False.</td>
</tr>
<tr>
<td><strong>Returns</strong></td>
<td><strong>Tuple</strong></td>
<td></td>
<td><strong>Best hyperparameters and best lags (if ‘lags’ is in
param_space).</strong></td>
</tr>
</tbody>
</table>
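
The multivariate variants take the same arguments plus `target_col`, the series on which the metric is minimised. A sketch reusing the `param_space` and `mae` from the `optuna_tune` example; `mv_model` and the `"sales"` column are assumptions:

``` python
from peshbeen.model_selection import mv_optuna_tune

best_params, best_lags = mv_optuna_tune(
    model=mv_model,      # a multivariate forecaster with .fit/.forecast
    df=df,               # contains "sales" plus the other modelled series
    target_col="sales",  # the metric is evaluated on this column only
    cv_split=3, test_size=12,
    eval_metric=mae, param_space=param_space, eval_num=50,
)
```

`mv_hyperopt_tune` mirrors `hyperopt_tune` in the same way, returning the best hyperparameters, selected lags, and selected transforms.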

## Feature selection methods for univariate time series models

------------------------------------------------------------------------

<a
href="https://github.com/mustafaslanCoto/peshbeen/blob/main/peshbeen/model_selection.py#L623"
target="_blank" style="float:right; font-size:smaller">source</a>

### forward_feature_selection

``` python

def forward_feature_selection(
    model:object, # A *configured but unfitted* [`ml_forecaster`](https://mustafaslanCoto.github.io/peshbeen/modules/02_models/ml_forecast.html#ml_forecaster) instance.  The function works exclusively on deep copies and never mutates the object passed in.
    df:DataFrame, # Full training DataFrame. Must contain the target column and any candidate exogenous columns.
    cv_split:int, # Number of time-series cross-validation folds.
    H:int, # Forecast horizon (test window size for each fold).
    step_size:Optional=None, # Step size between consecutive CV folds.  If `None` (default) the step equals `H`, producing non-overlapping folds — consistent with the default behaviour of [`ml_forecaster.cross_validate`](https://mustafaslanCoto.github.io/peshbeen/modules/02_models/ml_forecast.html#ml_forecaster.cross_validate).
    metrics:Union=None, # One or more metric functions accepted by [`ml_forecaster.cross_validate`](https://mustafaslanCoto.github.io/peshbeen/modules/02_models/ml_forecast.html#ml_forecaster.cross_validate) (e.g. `[MAE, RMSE]`). Selection is driven by the **first** metric in the list; a candidate is only accepted when it improves **all** metrics simultaneously.
    lags_to_consider:Optional=None, # Consider lags `1, 2, ..., lags_to_consider` as candidates.  If `None`, lag selection is skipped.
    candidate_features:Optional=None, # Column names in `df` that are exogenous feature candidates.  The function never modifies this list.  If `None`, exogenous feature selection is skipped.
    transformations:Optional=None, # Lag-transform objects to test as candidates (e.g. `[rolling_mean(3, 1), expanding_std(1)]`).  The function never modifies this list.  If `None`, transform selection is skipped.
    starting_lags:Optional=None, # Lags to include in the initial feature set before the search begins. These are *not* candidates — they are always included.  Must be a list (e.g. `[1]` or `[1, 2, 3]`).
    starting_transforms:Optional=None, # Lag-transform objects to include in the initial feature set before the search begins.  Must be a list.
    best_start_score:Optional=None, # Initial best scores for each metric. If not provided, the function will compute the baseline score using the model with the starting features (if any) before beginning the search.
    verbose:bool=False, # Print a message each time a candidate is accepted.
): # A dictionary with keys `best_lags`, `best_exogs`, and `best_transforms` containing the selected features.

```

*Forward stepwise feature selection for
[`ml_forecaster`](https://mustafaslanCoto.github.io/peshbeen/modules/02_models/ml_forecast.html#ml_forecaster)
models.*

At each iteration every remaining candidate (lag, exogenous column, or
lag-transform) is tested individually by adding it to the current best
feature set. The candidate that produces the largest cross-validation
improvement is permanently added. The loop continues until no remaining
candidate improves any of the evaluation metrics.

<table>
<colgroup>
<col style="width: 6%" />
<col style="width: 25%" />
<col style="width: 34%" />
<col style="width: 34%" />
</colgroup>
<thead>
<tr>
<th></th>
<th><strong>Type</strong></th>
<th><strong>Default</strong></th>
<th><strong>Details</strong></th>
</tr>
</thead>
<tbody>
<tr>
<td>model</td>
<td>object</td>
<td></td>
<td>A <em>configured but unfitted</em> <a
href="https://mustafaslanCoto.github.io/peshbeen/modules/02_models/ml_forecast.html#ml_forecaster"><code>ml_forecaster</code></a>
instance. The function works exclusively on deep copies and never
mutates the object passed in.</td>
</tr>
<tr>
<td>df</td>
<td>DataFrame</td>
<td></td>
<td>Full training DataFrame. Must contain the target column and any
candidate exogenous columns.</td>
</tr>
<tr>
<td>cv_split</td>
<td>int</td>
<td></td>
<td>Number of time-series cross-validation folds.</td>
</tr>
<tr>
<td>H</td>
<td>int</td>
<td></td>
<td>Forecast horizon (test window size for each fold).</td>
</tr>
<tr>
<td>step_size</td>
<td>Union</td>
<td>None</td>
<td>Step size between consecutive CV folds. If <code>None</code>
(default) the step equals <code>H</code>, producing non-overlapping
folds — consistent with the default behaviour of <a
href="https://mustafaslanCoto.github.io/peshbeen/modules/02_models/ml_forecast.html#ml_forecaster.cross_validate"><code>ml_forecaster.cross_validate</code></a>.</td>
</tr>
<tr>
<td>metrics</td>
<td>Union</td>
<td>None</td>
<td>One or more metric functions accepted by <a
href="https://mustafaslanCoto.github.io/peshbeen/modules/02_models/ml_forecast.html#ml_forecaster.cross_validate"><code>ml_forecaster.cross_validate</code></a>
(e.g. <code>[MAE, RMSE]</code>). Selection is driven by the
<strong>first</strong> metric in the list; a candidate is only accepted
when it improves <strong>all</strong> metrics simultaneously.</td>
</tr>
<tr>
<td>lags_to_consider</td>
<td>Union</td>
<td>None</td>
<td>Consider lags <code>1, 2, ..., lags_to_consider</code> as
candidates. If <code>None</code>, lag selection is skipped.</td>
</tr>
<tr>
<td>candidate_features</td>
<td>Union</td>
<td>None</td>
<td>Column names in <code>df</code> that are exogenous feature
candidates. The function never modifies this list. If <code>None</code>,
exogenous feature selection is skipped.</td>
</tr>
<tr>
<td>transformations</td>
<td>Union</td>
<td>None</td>
<td>Lag-transform objects to test as candidates
(e.g. <code>[rolling_mean(3, 1), expanding_std(1)]</code>). The function
never modifies this list. If <code>None</code>, transform selection is
skipped.</td>
</tr>
<tr>
<td>starting_lags</td>
<td>Union</td>
<td>None</td>
<td>Lags to include in the initial feature set before the search begins.
These are <em>not</em> candidates — they are always included. Must be a
list (e.g. <code>[1]</code> or <code>[1, 2, 3]</code>).</td>
</tr>
<tr>
<td>starting_transforms</td>
<td>Union</td>
<td>None</td>
<td>Lag-transform objects to include in the initial feature set before
the search begins. Must be a list.</td>
</tr>
<tr>
<td>best_start_score</td>
<td>Union</td>
<td>None</td>
<td>Initial best scores for each metric. If not provided, the function
will compute the baseline score using the model with the starting
features (if any) before beginning the search.</td>
</tr>
<tr>
<td>verbose</td>
<td>bool</td>
<td>False</td>
<td>Print a message each time a candidate is accepted.</td>
</tr>
<tr>
<td><strong>Returns</strong></td>
<td></td>
<td></td>
<td><strong>A dictionary with keys <code>best_lags</code>,
<code>best_exogs</code>, and <code>best_transforms</code> containing the
selected features.</strong></td>
</tr>
</tbody>
</table>
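
A sketch of a forward search. `model` is a configured but unfitted `ml_forecaster`; the exogenous column names and the `mae` metric are assumptions:

``` python
from peshbeen.model_selection import forward_feature_selection

result = forward_feature_selection(
    model=model,
    df=df,
    cv_split=3,
    H=12,                                     # 12-step test window per fold
    metrics=[mae],                            # the first metric drives selection
    lags_to_consider=12,                      # candidates: lags 1..12
    candidate_features=["promo", "holiday"],  # hypothetical exogenous columns
    starting_lags=[1],                        # lag 1 is always included
    verbose=True,
)
print(result["best_lags"], result["best_exogs"], result["best_transforms"])
```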

------------------------------------------------------------------------

<a
href="https://github.com/mustafaslanCoto/peshbeen/blob/main/peshbeen/model_selection.py#L868"
target="_blank" style="float:right; font-size:smaller">source</a>

### backward_feature_selection

``` python

def backward_feature_selection(
    model:object, # A *configured but unfitted* [`ml_forecaster`](https://mustafaslanCoto.github.io/peshbeen/modules/02_models/ml_forecast.html#ml_forecaster) instance.  The function works exclusively on deep copies and never mutates the object passed in.
    df:DataFrame, # Full training DataFrame. Must contain the target column and any candidate exogenous columns.
    cv_split:int, # Number of time-series cross-validation folds.
    H:int, # Forecast horizon (test window size for each fold).
    step_size:Optional=None, # Step size between consecutive CV folds.  If `None` (default) the step equals `H`, producing non-overlapping folds.
    metrics:Union=None, # One or more metric functions accepted by [`ml_forecaster.cross_validate`](https://mustafaslanCoto.github.io/peshbeen/modules/02_models/ml_forecast.html#ml_forecaster.cross_validate) (e.g. `[MAE, RMSE]`). Selection is driven by the **first** metric in the list; a feature is only removed when doing so improves **all** metrics simultaneously.
    lags_to_consider:Optional=None, # Lags to include in the initial feature set and test for removal (e.g. `[1, 2, 3, 4]`).  If `None`, no lag removal is attempted.
    candidate_features:Optional=None, # Column names in `df` that start in the model and are tested for removal.  If `None`, exogenous feature removal is skipped.
    transformations:Optional=None, # Lag-transform objects that start in the model and are tested for removal (e.g. `[rolling_mean(3, 1), expanding_std(1)]`).  If `None`, transform removal is skipped.
    verbose:bool=False, # Print a message each time a feature is removed.
): # A dictionary with keys `best_lags`, `best_exogs`, and `best_transforms` containing the surviving features after backward selection.

```

*Backward stepwise feature selection for
[`ml_forecaster`](https://mustafaslanCoto.github.io/peshbeen/modules/02_models/ml_forecast.html#ml_forecaster)
models.*

Starts with the full feature set (all provided lags, exogenous columns,
and lag-transforms) and at each iteration tries removing each current
feature individually. The feature whose removal produces the largest
cross-validation improvement is permanently dropped. The loop continues
until no remaining feature can be removed without hurting any of the
evaluation metrics.

<table>
<colgroup>
<col style="width: 6%" />
<col style="width: 25%" />
<col style="width: 34%" />
<col style="width: 34%" />
</colgroup>
<thead>
<tr>
<th></th>
<th><strong>Type</strong></th>
<th><strong>Default</strong></th>
<th><strong>Details</strong></th>
</tr>
</thead>
<tbody>
<tr>
<td>model</td>
<td>object</td>
<td></td>
<td>A <em>configured but unfitted</em> <a
href="https://mustafaslanCoto.github.io/peshbeen/modules/02_models/ml_forecast.html#ml_forecaster"><code>ml_forecaster</code></a>
instance. The function works exclusively on deep copies and never
mutates the object passed in.</td>
</tr>
<tr>
<td>df</td>
<td>DataFrame</td>
<td></td>
<td>Full training DataFrame. Must contain the target column and any
candidate exogenous columns.</td>
</tr>
<tr>
<td>cv_split</td>
<td>int</td>
<td></td>
<td>Number of time-series cross-validation folds.</td>
</tr>
<tr>
<td>H</td>
<td>int</td>
<td></td>
<td>Forecast horizon (test window size for each fold).</td>
</tr>
<tr>
<td>step_size</td>
<td>Union</td>
<td>None</td>
<td>Step size between consecutive CV folds. If <code>None</code>
(default) the step equals <code>H</code>, producing non-overlapping
folds.</td>
</tr>
<tr>
<td>metrics</td>
<td>Union</td>
<td>None</td>
<td>One or more metric functions accepted by <a
href="https://mustafaslanCoto.github.io/peshbeen/modules/02_models/ml_forecast.html#ml_forecaster.cross_validate"><code>ml_forecaster.cross_validate</code></a>
(e.g. <code>[MAE, RMSE]</code>). Selection is driven by the
<strong>first</strong> metric in the list; a feature is only removed
when doing so improves <strong>all</strong> metrics simultaneously.</td>
</tr>
<tr>
<td>lags_to_consider</td>
<td>Union</td>
<td>None</td>
<td>Lags to include in the initial feature set and test for removal
(e.g. <code>[1, 2, 3, 4]</code>). If <code>None</code>, no lag removal
is attempted.</td>
</tr>
<tr>
<td>candidate_features</td>
<td>Union</td>
<td>None</td>
<td>Column names in <code>df</code> that start in the model and are
tested for removal. If <code>None</code>, exogenous feature removal is
skipped.</td>
</tr>
<tr>
<td>transformations</td>
<td>Union</td>
<td>None</td>
<td>Lag-transform objects that start in the model and are tested for
removal (e.g. <code>[rolling_mean(3, 1), expanding_std(1)]</code>). If
<code>None</code>, transform removal is skipped.</td>
</tr>
<tr>
<td>verbose</td>
<td>bool</td>
<td>False</td>
<td>Print a message each time a feature is removed.</td>
</tr>
<tr>
<td><strong>Returns</strong></td>
<td></td>
<td></td>
<td><strong>A dictionary with keys <code>best_lags</code>,
<code>best_exogs</code>, and <code>best_transforms</code> containing the
surviving features after backward selection.</strong></td>
</tr>
</tbody>
</table>
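
Backward selection starts from the full set and prunes; note that `lags_to_consider` is an explicit list here, not a maximum lag. Same assumptions as the forward example:

``` python
from peshbeen.model_selection import backward_feature_selection

result = backward_feature_selection(
    model=model, df=df, cv_split=3, H=12,
    metrics=[mae],
    lags_to_consider=[1, 2, 3, 4, 12],        # all start selected, tested for removal
    candidate_features=["promo", "holiday"],  # hypothetical exogenous columns
    verbose=True,
)
```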

## Feature selection methods for multivariate time series models

------------------------------------------------------------------------

<a
href="https://github.com/mustafaslanCoto/peshbeen/blob/main/peshbeen/model_selection.py#L1064"
target="_blank" style="float:right; font-size:smaller">source</a>

### mv_forward_feature_selection

``` python

def mv_forward_feature_selection(
    model:object, # Template model — never mutated.
    df:DataFrame, # DataFrame containing the target variable and any candidate features.
    target_col:str, # Target variable used to evaluate cross-validation score.
    cv_split:int, # Number of time-series cross-validation folds.
    H:int, # Forecast horizon / test size per fold.
    step_size:NoneType=None, # Rolling-window step size (defaults to H).
    metrics:NoneType=None, # One or more metric functions (e.g. `[MAE, RMSE]`). Selection is driven by the **first** metric in the list; a candidate is only accepted when it improves **all** metrics simultaneously.
    lags_to_consider:NoneType=None, # ``{col: max_lag}`` — lags 1..max_lag are candidates.
    candidate_features:NoneType=None, # Exogenous columns to consider adding.
    transformations:NoneType=None, # ``{col: [transform_objects]}`` — transform candidates per target.
    starting_lags:NoneType=None, # Lags already included before search begins.
    starting_transforms:NoneType=None, # Transforms already included before search begins.
    verbose:bool=False, # Print a message each time a candidate is accepted.
): # `{"best_lags": {col: [...]}, "best_exogs": [...], "best_transforms": {col: [name_str, ...]}}`

```

*Forward stepwise feature selection for
[`ml_mv_forecaster`](https://mustafaslanCoto.github.io/peshbeen/modules/02_models/ml_mv_forecast.html#ml_mv_forecaster).*

<table>
<colgroup>
<col style="width: 6%" />
<col style="width: 25%" />
<col style="width: 34%" />
<col style="width: 34%" />
</colgroup>
<thead>
<tr>
<th></th>
<th><strong>Type</strong></th>
<th><strong>Default</strong></th>
<th><strong>Details</strong></th>
</tr>
</thead>
<tbody>
<tr>
<td>model</td>
<td>object</td>
<td></td>
<td>Template model — never mutated.</td>
</tr>
<tr>
<td>df</td>
<td>DataFrame</td>
<td></td>
<td>DataFrame containing the target variable and any candidate
features.</td>
</tr>
<tr>
<td>target_col</td>
<td>str</td>
<td></td>
<td>Target variable used to evaluate cross-validation score.</td>
</tr>
<tr>
<td>cv_split</td>
<td>int</td>
<td></td>
<td>Number of time-series cross-validation folds.</td>
</tr>
<tr>
<td>H</td>
<td>int</td>
<td></td>
<td>Forecast horizon / test size per fold.</td>
</tr>
<tr>
<td>step_size</td>
<td>NoneType</td>
<td>None</td>
<td>Rolling-window step size (defaults to H).</td>
</tr>
<tr>
<td>metrics</td>
<td>NoneType</td>
<td>None</td>
<td>One or more metric functions (e.g. <code>[MAE, RMSE]</code>).
Selection is driven by the <strong>first</strong> metric in the list; a
candidate is only accepted when it improves <strong>all</strong> metrics
simultaneously.</td>
</tr>
<tr>
<td>lags_to_consider</td>
<td>NoneType</td>
<td>None</td>
<td><code>{col: max_lag}</code> — lags 1..max_lag are candidates.</td>
</tr>
<tr>
<td>candidate_features</td>
<td>NoneType</td>
<td>None</td>
<td>Exogenous columns to consider adding.</td>
</tr>
<tr>
<td>transformations</td>
<td>NoneType</td>
<td>None</td>
<td><code>{col: [transform_objects]}</code> — transform candidates per
target.</td>
</tr>
<tr>
<td>starting_lags</td>
<td>NoneType</td>
<td>None</td>
<td>Lags already included before search begins.</td>
</tr>
<tr>
<td>starting_transforms</td>
<td>NoneType</td>
<td>None</td>
<td>Transforms already included before search begins.</td>
</tr>
<tr>
<td>verbose</td>
<td>bool</td>
<td>False</td>
<td>Print a message each time a candidate is accepted.</td>
</tr>
<tr>
<td><strong>Returns</strong></td>
<td></td>
<td></td>
<td><strong><code>{"best_lags": {col: [...]}, "best_exogs": [...],    "best_transforms": {col: [name_str, ...]}}</code></strong></td>
</tr>
</tbody>
</table>
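
In the multivariate version, lag and transform candidates are specified per column. A sketch; the column names are assumptions:

``` python
from peshbeen.model_selection import mv_forward_feature_selection

result = mv_forward_feature_selection(
    model=mv_model, df=df, target_col="sales",
    cv_split=3, H=12, metrics=[mae],
    lags_to_consider={"sales": 6, "price": 3},  # lags 1..6 of sales, 1..3 of price
    candidate_features=["promo"],               # hypothetical exogenous column
)
# result["best_lags"] maps each column to its selected lags
```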

------------------------------------------------------------------------

<a
href="https://github.com/mustafaslanCoto/peshbeen/blob/main/peshbeen/model_selection.py#L1276"
target="_blank" style="float:right; font-size:smaller">source</a>

### mv_backward_feature_selection

``` python

def mv_backward_feature_selection(
    model:object, # Template model — never mutated.
    df:DataFrame, # All candidate exog columns must already be present.
    target_col:str, # Target variable used to evaluate cross-validation score.
    cv_split:int, # Number of time-series cross-validation folds.
    H:int, # Forecast horizon / test size per fold.
    step_size:NoneType=None, # Rolling-window step size (defaults to H).
    metrics:NoneType=None, # One or more metric functions (e.g. `[MAE, RMSE]`). A feature is only removed when its removal improves **all** metrics simultaneously.
    lags_to_consider:NoneType=None, # ``{col: max_lag}`` — all lags 1..max_lag start as selected.
    candidate_features:NoneType=None, # Exogenous columns that start as selected.
    transformations:NoneType=None, # ``{col: [transform_objects]}`` — all transforms start as selected.
    verbose:bool=False, # Print a message each time a feature is removed.
): # `{"best_lags": {col: [...]}, "best_exogs": [...], "best_transforms": {col: [name_str, ...]}}`

```

*Backward stepwise feature selection for
[`ml_mv_forecaster`](https://mustafaslanCoto.github.io/peshbeen/modules/02_models/ml_mv_forecast.html#ml_mv_forecaster).*

Starts with all candidate features included and iteratively removes the
one whose removal most improves cross-validation score.

<table>
<colgroup>
<col style="width: 6%" />
<col style="width: 25%" />
<col style="width: 34%" />
<col style="width: 34%" />
</colgroup>
<thead>
<tr>
<th></th>
<th><strong>Type</strong></th>
<th><strong>Default</strong></th>
<th><strong>Details</strong></th>
</tr>
</thead>
<tbody>
<tr>
<td>model</td>
<td>object</td>
<td></td>
<td>Template model — never mutated.</td>
</tr>
<tr>
<td>df</td>
<td>DataFrame</td>
<td></td>
<td>All candidate exog columns must already be present.</td>
</tr>
<tr>
<td>target_col</td>
<td>str</td>
<td></td>
<td>Target variable used to evaluate cross-validation score.</td>
</tr>
<tr>
<td>cv_split</td>
<td>int</td>
<td></td>
<td>Number of time-series cross-validation folds.</td>
</tr>
<tr>
<td>H</td>
<td>int</td>
<td></td>
<td>Forecast horizon / test size per fold.</td>
</tr>
<tr>
<td>step_size</td>
<td>NoneType</td>
<td>None</td>
<td>Rolling-window step size (defaults to H).</td>
</tr>
<tr>
<td>metrics</td>
<td>NoneType</td>
<td>None</td>
<td>One or more metric functions (e.g. <code>[MAE, RMSE]</code>). A
feature is only removed when its removal improves <strong>all</strong>
metrics simultaneously.</td>
</tr>
<tr>
<td>lags_to_consider</td>
<td>NoneType</td>
<td>None</td>
<td><code>{col: max_lag}</code> — all lags 1..max_lag start as
selected.</td>
</tr>
<tr>
<td>candidate_features</td>
<td>NoneType</td>
<td>None</td>
<td>Exogenous columns that start as selected.</td>
</tr>
<tr>
<td>transformations</td>
<td>NoneType</td>
<td>None</td>
<td><code>{col: [transform_objects]}</code> — all transforms start as
selected.</td>
</tr>
<tr>
<td>verbose</td>
<td>bool</td>
<td>False</td>
<td>Print a message each time a feature is removed.</td>
</tr>
<tr>
<td><strong>Returns</strong></td>
<td></td>
<td></td>
<td><strong><code>{"best_lags": {col: [...]}, "best_exogs": [...],    "best_transforms": {col: [name_str, ...]}}</code></strong></td>
</tr>
</tbody>
</table>

## Feature selection methods for Markov Switching Autoregressive Regression

------------------------------------------------------------------------

<a
href="https://github.com/mustafaslanCoto/peshbeen/blob/main/peshbeen/model_selection.py#L1448"
target="_blank" style="float:right; font-size:smaller">source</a>

### ms_arr_forward_feature_selection

``` python

def ms_arr_forward_feature_selection(
    model:object, # A configured [`ms_arr`](https://mustafaslanCoto.github.io/peshbeen/modules/02_models/ms_arr.html#ms_arr) instance with `fit_em()` already called (recommended to use few EM iterations for this initial fit, e.g. `iterations=10`) or a template model with the same configuration but not yet fitted.  The model is copied internally and never mutated, so the caller's instance remains unchanged.
    df:DataFrame, # Full training DataFrame. Must contain the target column and any candidate exogenous columns.
    cv_split:int, # Number of time-series cross-validation folds.
    H:int, # Forecast horizon (test window size for each fold).
    step_size:Optional=None, # Step size between consecutive CV folds. Defaults to H.
    metrics:Union=None, # Required when validation_type='cv'. Selection driven by first metric; a candidate is accepted only when it improves all metrics.
    lags_to_consider:Union=None, # Candidate lags. Int → 1..n; list → specific lags.
    candidate_features:Optional=None, # Exogenous column names to test as candidates.
    transformations:Optional=None, # Lag-transform objects to test as candidates.
    starting_lags:Optional=None, # Lags always included in the initial set (not candidates).
    starting_transforms:Optional=None, # Transforms always included in the initial set (not candidates).
    validation_type:str='cv', # Criterion for selection: 'cv', 'AIC', 'BIC', or 'AIC_BIC'. When 'cv', metrics must be provided and drive selection. When 'AIC' or 'BIC', the respective information criterion is used. When 'AIC_BIC', a candidate is accepted only if it improves both AIC and BIC.
    iterations:int=10, # EM iterations used inside fit_em() for each candidate evaluation.
    verbose:bool=False, # Print a message each time a candidate is accepted.
): # `{"best_lags": [...], "best_exogs": [...], "best_transforms": [...]}`

```

*Forward stepwise feature selection for
[`ms_arr`](https://mustafaslanCoto.github.io/peshbeen/modules/02_models/ms_arr.html#ms_arr)
models.*

At each iteration every remaining candidate (lag, exogenous column, or
lag-transform) is tested individually by adding it to the current best
feature set. The candidate that produces the largest improvement is
permanently added. The loop continues until no remaining candidate
improves the evaluation criterion.

The HMM state (A, pi, stds, coeffs) is warm-started from the round
winner and propagated to subsequent rounds for consistent
initialisation.
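
A sketch showing selection driven by information criteria rather than cross-validation; `ms_model` is a configured `ms_arr` instance and the exogenous column is an assumption:

``` python
from peshbeen.model_selection import ms_arr_forward_feature_selection

result = ms_arr_forward_feature_selection(
    model=ms_model, df=df, cv_split=3, H=12,
    validation_type="AIC_BIC",     # accept only if both AIC and BIC improve; no metrics needed
    lags_to_consider=4,            # candidates: lags 1..4
    candidate_features=["promo"],  # hypothetical exogenous column
    iterations=10,                 # cheap EM fits per candidate evaluation
)
```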

------------------------------------------------------------------------

<a
href="https://github.com/mustafaslanCoto/peshbeen/blob/main/peshbeen/model_selection.py#L1714"
target="_blank" style="float:right; font-size:smaller">source</a>

### ms_arr_backward_feature_selection

``` python

def ms_arr_backward_feature_selection(
    df:DataFrame, # Full training DataFrame. All candidate exogenous columns must be present.
    cv_split:int, # Number of time-series cross-validation folds.
    H:int, # Forecast horizon (test window size for each fold).
    step_size:Optional=None, # Step size between consecutive CV folds. Defaults to H.
    model:object=None, # A configured but unfitted ms_arr instance. Never mutated.
    metrics:Union=None, # Required when validation_type='cv'. A feature is only removed when doing so improves all metrics simultaneously.
    lags_to_consider:Union=None, # Initial lag set. Int → 1..n; list → specific lags.
    candidate_features:Optional=None, # Exogenous columns that start in the model and are tested for removal.
    transformations:Optional=None, # Lag-transform objects that start in the model and are tested for removal.
    validation_type:str='cv', # Criterion for selection: 'cv', 'AIC', 'BIC', or 'AIC_BIC'.
    iterations:int=100, # EM iterations used inside fit_em() for each candidate evaluation.
    verbose:bool=False, # Print a message each time a feature is removed.
): # `{"best_lags": [...], "best_exogs": [...], "best_transforms": [...]}`

```

*Backward stepwise feature selection for
[`ms_arr`](https://mustafaslanCoto.github.io/peshbeen/modules/02_models/ms_arr.html#ms_arr)
models.*

Starts with the full feature set and at each iteration tries removing
each current feature individually. The feature whose removal produces
the largest improvement is permanently dropped. The loop continues until
no removal improves the evaluation criterion.

The HMM state (A, pi, stds, coeffs) is warm-started from the round
winner and propagated to subsequent rounds.

