

<!-- WARNING: THIS FILE WAS AUTOGENERATED! DO NOT EDIT! -->

------------------------------------------------------------------------

<a
href="https://github.com/mustafaslanCoto/peshbeen/blob/main/peshbeen/models/ms_arr.py#L28"
target="_blank" style="float:right; font-size:smaller">source</a>

### ms_arr

``` python

def ms_arr(
    n_components:int, # Number of hidden states (regimes).
    target_col:str, # Name of the target variable.
    lags:Optional[Union[int, List[int]]]=None, # Lags for the autoregressive model.
    lag_transform:Optional[list]=None, # List of lag-transform function objects applied to the target.
    difference:Optional[int]=None, # Order of ordinary differencing (e.g. 1 for first difference).
    seasonal_diff:Optional[int]=None, # Seasonal period for seasonal differencing.
    trend:Optional[str]=None, # Trend strategy: 'linear' or 'ets'.
    pol_degree:int=1, # Degree of polynomial trend (default: 1). Used when trend='linear'.
    ets_params:Optional[Dict[str, Any]]=None, # Parameters passed to the ExponentialSmoothing model when using the 'ets' trend strategy. Default is None (use default ETS parameters).
    change_points:Optional[List[int]]=None, # Change points for piecewise linear trend. List of indices where the trend slope can change.
    box_cox:Union[bool, float, int]=False, # Whether to apply Box-Cox transformation to the target variable. If a float or int value is provided, it will be used as the lambda parameter for the Box-Cox transformation. If True, the lambda parameter will be estimated from the data.
    box_cox_biasadj:bool=False, # Whether to apply bias adjustment when inverting Box-Cox transformation (default: False).
    add_constant:bool=True, # If True, prepend a constant column to the regressor matrix (default: True).
    cat_variables:Optional[List[str]]=None, # List of categorical feature column names. If provided, these columns will be treated as categorical variables and encoded accordingly. Default is None (no categorical variables).
    categorical_encoder:Optional[Any]=None, # Categorical encoder object (e.g. OneHotEncoder(), MeanEncoder(), etc.) applied to the columns listed in cat_variables. The encoder must expose fit() and transform() methods that accept the input DataFrame. Default is None (no categorical encoding); in that case categorical variables can only be used if the model handles them natively (e.g. LGBM or CatBoost).
    method:str='posterior', # State assignment method: 'posterior' (soft) or 'viterbi' (hard). Default: 'posterior'.
    switching_var:bool=True, # If True, each regime has its own variance. If False, uses pooled variance. Default: True.
    startprob_prior:float=1000.0, # Dirichlet concentration for initial state distribution. Default: 1e3.
    transmat_prior:float=1000.0, # Dirichlet concentration for transition matrix rows. Default: 1e3.
    n_iter:int=100, # Maximum EM iterations. Default: 100.
    tol:float=0.001, # Convergence tolerance on log-likelihood. Default: 1e-3.
    ridge:float=1e-05, # Ridge regularisation parameter for coefficient estimation. Default: 1e-5.
    coefficients:Optional[np.ndarray]=None, # Initial regression coefficients (shape: n_states x n_features).
    stds:Optional[np.ndarray]=None, # Initial state standard deviations (shape: n_states,).
    init_state:Optional[np.ndarray]=None, # Initial state probability vector (shape: n_states,).
    trans_matrix:Optional[np.ndarray]=None, # Initial transition matrix (shape: n_states x n_states).
    random_state:int=42, # Random seed for reproducibility. Default: 42.
    verbose:bool=False, # If True, print EM progress. Default: False.
)->None:

```

*Initialize the MS-ARR model with the specified parameters.*

<table>
<colgroup>
<col style="width: 6%" />
<col style="width: 25%" />
<col style="width: 34%" />
<col style="width: 34%" />
</colgroup>
<thead>
<tr>
<th></th>
<th><strong>Type</strong></th>
<th><strong>Default</strong></th>
<th><strong>Details</strong></th>
</tr>
</thead>
<tbody>
<tr>
<td>n_components</td>
<td>int</td>
<td></td>
<td>Number of hidden states (regimes).</td>
</tr>
<tr>
<td>target_col</td>
<td>str</td>
<td></td>
<td>Name of the target variable.</td>
</tr>
<tr>
<td>lags</td>
<td>Optional[Union[int, List[int]]]</td>
<td>None</td>
<td>Lags for the autoregressive model.</td>
</tr>
<tr>
<td>lag_transform</td>
<td>Optional[list]</td>
<td>None</td>
<td>List of lag-transform function objects applied to the target.</td>
</tr>
<tr>
<td>difference</td>
<td>Optional[int]</td>
<td>None</td>
<td>Order of ordinary differencing (e.g. 1 for first difference).</td>
</tr>
<tr>
<td>seasonal_diff</td>
<td>Optional[int]</td>
<td>None</td>
<td>Seasonal period for seasonal differencing.</td>
</tr>
<tr>
<td>trend</td>
<td>Optional[str]</td>
<td>None</td>
<td>Trend strategy: ‘linear’ or ‘ets’.</td>
</tr>
<tr>
<td>pol_degree</td>
<td>int</td>
<td>1</td>
<td>Degree of polynomial trend (default: 1). Used when
trend=‘linear’.</td>
</tr>
<tr>
<td>ets_params</td>
<td>Optional[Dict[str, Any]]</td>
<td>None</td>
<td>Parameters passed to the ExponentialSmoothing model when using the
‘ets’ trend strategy. Default is None (use default ETS
parameters).</td>
</tr>
<tr>
<td>change_points</td>
<td>Optional[List[int]]</td>
<td>None</td>
<td>Change points for piecewise linear trend. List of indices where the
trend slope can change.</td>
</tr>
<tr>
<td>box_cox</td>
<td>Union[bool, float, int]</td>
<td>False</td>
<td>Whether to apply Box-Cox transformation to the target variable. If a
float or int value is provided, it will be used as the lambda parameter
for the Box-Cox transformation. If True, the lambda parameter will be
estimated from the data.</td>
</tr>
<tr>
<td>box_cox_biasadj</td>
<td>bool</td>
<td>False</td>
<td>Whether to apply bias adjustment when inverting Box-Cox
transformation (default: False).</td>
</tr>
<tr>
<td>add_constant</td>
<td>bool</td>
<td>True</td>
<td>If True, prepend a constant column to the regressor matrix (default:
True).</td>
</tr>
<tr>
<td>cat_variables</td>
<td>Optional[List[str]]</td>
<td>None</td>
<td>List of categorical feature column names. If provided, these columns
will be treated as categorical variables and encoded accordingly.
Default is None (no categorical variables).</td>
</tr>
<tr>
<td>categorical_encoder</td>
<td>Optional[Any]</td>
<td>None</td>
<td>Categorical encoder object (e.g. OneHotEncoder(), MeanEncoder(),
etc.) applied to the columns listed in cat_variables. The encoder must
expose fit() and transform() methods that accept the input DataFrame.
Default is None (no categorical encoding); in that case categorical
variables can only be used if the model handles them natively
(e.g. LGBM or CatBoost).</td>
</tr>
<tr>
<td>method</td>
<td>str</td>
<td>posterior</td>
<td>State assignment method: ‘posterior’ (soft) or ‘viterbi’ (hard).
Default: ‘posterior’.</td>
</tr>
<tr>
<td>switching_var</td>
<td>bool</td>
<td>True</td>
<td>If True, each regime has its own variance. If False, uses pooled
variance. Default: True.</td>
</tr>
<tr>
<td>startprob_prior</td>
<td>float</td>
<td>1000.0</td>
<td>Dirichlet concentration for initial state distribution. Default:
1e3.</td>
</tr>
<tr>
<td>transmat_prior</td>
<td>float</td>
<td>1000.0</td>
<td>Dirichlet concentration for transition matrix rows. Default:
1e3.</td>
</tr>
<tr>
<td>n_iter</td>
<td>int</td>
<td>100</td>
<td>Maximum EM iterations. Default: 100.</td>
</tr>
<tr>
<td>tol</td>
<td>float</td>
<td>0.001</td>
<td>Convergence tolerance on log-likelihood. Default: 1e-3.</td>
</tr>
<tr>
<td>ridge</td>
<td>float</td>
<td>1e-05</td>
<td>Ridge regularisation parameter for coefficient estimation. Default:
1e-5.</td>
</tr>
<tr>
<td>coefficients</td>
<td>Optional[np.ndarray]</td>
<td>None</td>
<td>Initial regression coefficients (shape: n_states x n_features).</td>
</tr>
<tr>
<td>stds</td>
<td>Optional[np.ndarray]</td>
<td>None</td>
<td>Initial state standard deviations (shape: n_states,).</td>
</tr>
<tr>
<td>init_state</td>
<td>Optional[np.ndarray]</td>
<td>None</td>
<td>Initial state probability vector (shape: n_states,).</td>
</tr>
<tr>
<td>trans_matrix</td>
<td>Optional[np.ndarray]</td>
<td>None</td>
<td>Initial transition matrix (shape: n_states x n_states).</td>
</tr>
<tr>
<td>random_state</td>
<td>int</td>
<td>42</td>
<td>Random seed for reproducibility. Default: 42.</td>
</tr>
<tr>
<td>verbose</td>
<td>bool</td>
<td>False</td>
<td>If True, print EM progress. Default: False.</td>
</tr>
<tr>
<td><strong>Returns</strong></td>
<td><strong>None</strong></td>
<td></td>
<td></td>
</tr>
</tbody>
</table>
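The `lags`, `difference` and `add_constant` parameters together determine the regressor matrix that each regime's regression is fitted on. The sketch below (a toy helper, not peshbeen's internals) shows the preprocessing they imply: difference the target, stack lagged copies as regressors, and prepend a constant column.

``` python
import numpy as np
import pandas as pd

def build_design(y: pd.Series, lags, difference=None, add_constant=True):
    """Toy sketch of the preprocessing implied by `lags`, `difference`
    and `add_constant` (illustrative helper, not peshbeen's code)."""
    if difference:
        y = y.diff(difference).dropna()          # ordinary differencing
    cols = {f"lag_{l}": y.shift(l) for l in lags}  # lagged regressors
    X = pd.DataFrame(cols).dropna()              # drop rows lost to lagging
    if add_constant:
        X.insert(0, "const", 1.0)                # intercept column
    target = y.loc[X.index]                      # align target with X
    return X, target

y = pd.Series(np.arange(10, dtype=float))  # simple trending series
X, t = build_design(y, lags=[1, 2], difference=1)
print(X.shape)  # (7, 3): 9 differenced points minus 2 lost to lag 2
```

With `difference=1` the series of length 10 loses one observation, and `lags=[1, 2]` costs two more, so 7 usable rows remain with columns `const`, `lag_1`, `lag_2`.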

------------------------------------------------------------------------

<a
href="https://github.com/mustafaslanCoto/peshbeen/blob/main/peshbeen/models/ms_arr.py#L474"
target="_blank" style="float:right; font-size:smaller">source</a>

### ms_arr.fit

``` python

def fit(
    df:pd.DataFrame, # Training DataFrame containing the target and any feature columns.
)->float: # Final log-likelihood after EM convergence.

```

*Fit the model using the EM algorithm.*

<table>
<colgroup>
<col style="width: 9%" />
<col style="width: 38%" />
<col style="width: 52%" />
</colgroup>
<thead>
<tr>
<th></th>
<th><strong>Type</strong></th>
<th><strong>Details</strong></th>
</tr>
</thead>
<tbody>
<tr>
<td>df</td>
<td>pd.DataFrame</td>
<td>Training DataFrame containing the target and any feature
columns.</td>
</tr>
<tr>
<td><strong>Returns</strong></td>
<td><strong>float</strong></td>
<td><strong>Final log-likelihood after EM convergence.</strong></td>
</tr>
</tbody>
</table>
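`fit` runs EM: an E-step that computes posterior state probabilities and an M-step that refits each regime's regression. The sketch below illustrates the kind of M-step such a model uses (ridge-regularised weighted least squares per regime, with the `switching_var` pooled-variance alternative); it is a hypothetical reimplementation, not peshbeen's internals.

``` python
import numpy as np

def m_step(X, y, gamma, ridge=1e-5, switching_var=True):
    """Illustrative M-step for a Markov-switching regression: given
    posterior state weights `gamma` (T x K), refit ridge-weighted
    least squares and a standard deviation per regime."""
    T, K = gamma.shape
    d = X.shape[1]
    coefs = np.zeros((K, d))
    stds = np.zeros(K)
    for k in range(K):
        w = gamma[:, k]
        XtWX = X.T @ (w[:, None] * X) + ridge * np.eye(d)  # ridge term
        XtWy = X.T @ (w * y)
        coefs[k] = np.linalg.solve(XtWX, XtWy)
        resid = y - X @ coefs[k]
        stds[k] = np.sqrt((w * resid**2).sum() / w.sum())
    if not switching_var:  # pooled variance shared across regimes
        stds[:] = np.sqrt((gamma * (y[:, None] - X @ coefs.T)**2).sum() / T)
    return coefs, stds

rng = np.random.default_rng(42)
X = np.column_stack([np.ones(200), rng.normal(size=200)])
# Two regimes with opposite slopes, hard-assigned for illustration:
y = np.where(np.arange(200) < 100, 2.0 * X[:, 1], -2.0 * X[:, 1])
gamma = np.zeros((200, 2)); gamma[:100, 0] = 1; gamma[100:, 1] = 1
coefs, stds = m_step(X, y, gamma)
print(np.round(coefs[:, 1], 2))  # slopes near [2., -2.]
```

With `method='posterior'` the weights `gamma` are soft probabilities from the forward-backward pass; with `method='viterbi'` they are hard 0/1 assignments like the toy weights above.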

------------------------------------------------------------------------

<a
href="https://github.com/mustafaslanCoto/peshbeen/blob/main/peshbeen/models/ms_arr.py#L599"
target="_blank" style="float:right; font-size:smaller">source</a>

### ms_arr.forecast

``` python

def forecast(
    H:int, # Forecast horizon (number of steps to forecast ahead).
    exog:Optional[pd.DataFrame]=None, # Future exogenous regressors (must contain at least H rows). Should have the same columns as the training data (excluding the target variable).
)->np.ndarray:

```

*Generate forecasts for H future time steps.*

<table>
<colgroup>
<col style="width: 6%" />
<col style="width: 25%" />
<col style="width: 34%" />
<col style="width: 34%" />
</colgroup>
<thead>
<tr>
<th></th>
<th><strong>Type</strong></th>
<th><strong>Default</strong></th>
<th><strong>Details</strong></th>
</tr>
</thead>
<tbody>
<tr>
<td>H</td>
<td>int</td>
<td></td>
<td>Forecast horizon (number of steps to forecast ahead).</td>
</tr>
<tr>
<td>exog</td>
<td>Optional[pd.DataFrame]</td>
<td>None</td>
<td>Future exogenous regressors (must contain at least H rows). Should
have the same columns as the training data (excluding the target
variable).</td>
</tr>
<tr>
<td><strong>Returns</strong></td>
<td><strong>np.ndarray</strong></td>
<td></td>
<td></td>
</tr>
</tbody>
</table>
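Multi-step forecasting in a Markov-switching model must propagate the state distribution forward as well as the target. The sketch below shows this for a 2-regime AR(1): at each step the state probabilities are pushed through the transition matrix and the per-regime predictions are probability-weighted. It is an illustrative simplification, not peshbeen's `forecast` implementation.

``` python
import numpy as np

def forecast_ms_ar1(H, y_last, coefs, trans_matrix, p0):
    """Illustrative H-step forecast for a 2-regime AR(1) switching
    model (a sketch under simplifying assumptions)."""
    preds = []
    p = p0.copy()
    y_prev = y_last
    for _ in range(H):
        p = p @ trans_matrix                        # predicted state dist.
        per_state = coefs[:, 0] + coefs[:, 1] * y_prev  # const + phi * y
        y_prev = float(p @ per_state)               # probability-weighted mix
        preds.append(y_prev)
    return np.array(preds)

coefs = np.array([[0.0, 0.9], [1.0, 0.5]])   # (const, AR coef) per regime
P = np.array([[0.95, 0.05], [0.10, 0.90]])   # sticky transition matrix
out = forecast_ms_ar1(H=3, y_last=2.0, coefs=coefs,
                      trans_matrix=P, p0=np.array([1.0, 0.0]))
print(out.shape)  # (3,)
```

When `exog` is supplied, each step's per-regime prediction would additionally include the exogenous regressors for that future period, which is why `exog` must cover at least `H` rows.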

------------------------------------------------------------------------

<a
href="https://github.com/mustafaslanCoto/peshbeen/blob/main/peshbeen/models/ms_arr.py#L717"
target="_blank" style="float:right; font-size:smaller">source</a>

### ms_arr.cross_validate

``` python

def cross_validate(
    df:pd.DataFrame, # DataFrame containing the target and any feature columns.
    cv_split:int, # Number of cross-validation splits.
    test_size:int, # Number of time steps in the test set for each split.
    metrics:List[Callable], # Metric functions (e.g. ``[MAE, RMSE]``) used to evaluate forecast accuracy across folds. Call ``.cv_summary()`` after cross-validation to retrieve the aggregated scores.
    step_size:int=1, # Step size between the start of each test set in the splits.
    n_iter:int=1, # Number of EM iterations to run for each training fold.
    h_split_point:Optional[int]=None, # If provided, split the test set into two parts at this index and evaluate metrics separately on each part.
)->Union[pd.DataFrame, Tuple[pd.DataFrame, pd.DataFrame]]: # DataFrame containing the average score for each metric across all splits. If h_split_point is provided, also includes separate scores for the two parts of the test set.

```

*Run cross-validation.*

<table>
<colgroup>
<col style="width: 6%" />
<col style="width: 25%" />
<col style="width: 34%" />
<col style="width: 34%" />
</colgroup>
<thead>
<tr>
<th></th>
<th><strong>Type</strong></th>
<th><strong>Default</strong></th>
<th><strong>Details</strong></th>
</tr>
</thead>
<tbody>
<tr>
<td>df</td>
<td>pd.DataFrame</td>
<td></td>
<td>DataFrame containing the target and any feature columns.</td>
</tr>
<tr>
<td>cv_split</td>
<td>int</td>
<td></td>
<td>Number of cross-validation splits.</td>
</tr>
<tr>
<td>test_size</td>
<td>int</td>
<td></td>
<td>Number of time steps in the test set for each split.</td>
</tr>
<tr>
<td>metrics</td>
<td>List[Callable]</td>
<td></td>
<td>Metric functions (e.g. <code>[MAE, RMSE]</code>) used to evaluate
forecast accuracy across folds. Call <code>.cv_summary()</code> after
cross-validation to retrieve the aggregated scores.</td>
</tr>
<tr>
<td>step_size</td>
<td>int</td>
<td>1</td>
<td>Step size between the start of each test set in the splits.</td>
</tr>
<tr>
<td>n_iter</td>
<td>int</td>
<td>1</td>
<td>Number of EM iterations to run for each training fold.</td>
</tr>
<tr>
<td>h_split_point</td>
<td>Optional[int]</td>
<td>None</td>
<td>If provided, split the test set into two parts at this index and
evaluate metrics separately on each part.</td>
</tr>
<tr>
<td><strong>Returns</strong></td>
<td><strong>Union[pd.DataFrame, Tuple[pd.DataFrame,
pd.DataFrame]]</strong></td>
<td></td>
<td><strong>DataFrame containing the average score for each metric
across all splits. If h_split_point is provided, also includes separate
scores for the two parts of the test set.</strong></td>
</tr>
</tbody>
</table>
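The sketch below shows one plausible reading of how `cv_split`, `test_size` and `step_size` carve a series into rolling-origin folds: each fold trains on everything before its test window, and consecutive test windows start `step_size` apart, with the last window ending at the final observation. The helper is illustrative only and may not match peshbeen's exact split logic.

``` python
def rolling_splits(n_obs, cv_split, test_size, step_size=1):
    """Illustrative rolling-origin splits (an assumed scheme, not
    necessarily peshbeen's): fold i's test window ends
    `i * step_size` observations before the end of the series."""
    splits = []
    for i in range(cv_split):
        test_end = n_obs - i * step_size
        test_start = test_end - test_size
        splits.append((range(0, test_start), range(test_start, test_end)))
    return splits[::-1]  # earliest fold first

for train, test in rolling_splits(n_obs=20, cv_split=3, test_size=4, step_size=2):
    print(len(train), list(test)[0], list(test)[-1])
```

For 20 observations with `cv_split=3`, `test_size=4`, `step_size=2`, this yields test windows 12–15, 14–17 and 16–19, each preceded by its full training history.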
