

<!-- WARNING: THIS FILE WAS AUTOGENERATED! DO NOT EDIT! -->

------------------------------------------------------------------------

<a
href="https://github.com/mustafaslanCoto/peshbeen/blob/main/peshbeen/models/glm.py#L21"
target="_blank" style="float:right; font-size:smaller">source</a>

### glm

``` python

def glm(
    family:Any, # A statsmodels family object specifying the error distribution and link function for the GLM (e.g. family=sm.families.Poisson() for count data, family=sm.families.Binomial() for binary data, etc.). `import statsmodels.api as sm`, so you can access the families via `sm.families`.
    target_col:str, # Name of the target variable column in the input DataFrame.
    lags:Optional[Union[int, List[int]]]=None, # Lags to include as features. If an integer is provided, lags from 1 to that integer will be included. If a list of integers is provided, those specific lags will be included. Default is None (no lag features).
    lag_transform:Optional[list]=None, # List of lag-transform function objects to apply to the target variable (e.g. [expanding_mean(shift=1), rolling_std(window_size=3, shift=1)]). Each function should take a pandas Series as input and return a Series of the same length. Default is None (no lag transforms).
    difference:Optional[int]=None, # Order of ordinary differencing to apply to the target variable (e.g. 1 for first difference). Default is None (no differencing).
    seasonal_diff:Optional[int]=None, # Seasonal period for seasonal differencing (e.g. 12 for monthly data with yearly seasonality). Default is None (no seasonal differencing).
    trend:Optional[str]=None, # Trend strategy to use. Options are 'linear' for linear trend removal, 'ets' for ETS-based trend removal, 'feature_lr' for using linear trend components as features, and 'feature_ets' for using ETS trend components as features. Default is None (no trend handling).
    pol_degree:int=1, # Degree of polynomial trend to fit when using 'linear' or 'feature_lr' trend strategy. Default is 1 (linear trend).
    ets_params:Optional[Dict[str, Any]]=None, # Dictionary of parameters for the ExponentialSmoothing model when using 'ets' trend strategy. The keys should be the parameter names and the values should be the parameter values. Default is None (use default ETS parameters).
    change_points:Optional[List[int]]=None, # List of indices in the time series where change points occur for piecewise linear trend fitting. Only used when trend strategy is 'linear' or 'feature_lr'. Default is None (no change points, fit a single linear trend).
    box_cox:Union[bool, float, int]=False, # Whether to apply Box-Cox transformation to the target variable. If a float or int value is provided, it will be used as the lambda parameter for the Box-Cox transformation. If True, the lambda parameter will be estimated from the data.
    box_cox_biasadj:bool=False, # Whether to apply bias adjustment when inverting the Box-Cox transformation on forecasts. Default is False.
    add_constant:bool=True, # Whether to add a constant (intercept) column to the regressors. Default is True.
    cat_variables:Optional[List[str]]=None, # List of categorical feature column names. If provided, these columns will be treated as categorical variables and encoded accordingly. Default is None (no categorical variables).
    categorical_encoder:Optional[Any]=None, # Categorical encoder object (e.g. OneHotEncoder(), MeanEncoder(), etc.) to apply to the categorical variables specified in cat_variables. The encoder should have fit() and transform() methods that can be applied to the input DataFrame. Default is None (no categorical encoding); in that case, categorical variables can only be used if the underlying model handles them natively (e.g. LGBM or CatBoost).
    offset:Optional[np.ndarray]=None, # An offset to be included in the model. If provided, must be an array whose length is the number of rows in exog.
    exposure:Optional[np.ndarray]=None, # Log(exposure) will be added to the linear prediction in the model. Exposure is only valid if the log link is used. If provided, it must be an array with the same length as endog.
    freq_weights:Optional[np.ndarray]=None, # 1d array of frequency weights. If None, it is replaced with an array of 1s with length equal to endog. WARNING: using weights is not yet verified for all possible options and results; see the Notes in the statsmodels documentation.
    var_weights:Optional[np.ndarray]=None, # 1d array of variance (analytic) weights. If None, it is replaced with an array of 1s with length equal to endog. WARNING: using weights is not yet verified for all possible options and results; see the Notes in the statsmodels documentation.
    missing:Optional[str]=None, # How to handle NaNs. Options are 'none', 'drop', and 'raise'. If 'none', no NaN checking is done; if 'drop', any observations with NaNs are dropped; if 'raise', an error is raised. Default is None (equivalent to 'none').
)->None:

```

*Initialize the glm forecaster with the specified model and data
preparation parameters.*

<table>
<colgroup>
<col style="width: 6%" />
<col style="width: 25%" />
<col style="width: 34%" />
<col style="width: 34%" />
</colgroup>
<thead>
<tr>
<th></th>
<th><strong>Type</strong></th>
<th><strong>Default</strong></th>
<th><strong>Details</strong></th>
</tr>
</thead>
<tbody>
<tr>
<td>family</td>
<td>Any</td>
<td></td>
<td>A statsmodels family object specifying the error distribution and
link function for the GLM (e.g. family=sm.families.Poisson() for count
data, family=sm.families.Binomial() for binary data, etc.).
<code>import statsmodels.api as sm</code>, so you can access the
families via <code>sm.families</code>.</td>
</tr>
<tr>
<td>target_col</td>
<td>str</td>
<td></td>
<td>Name of the target variable column in the input DataFrame.</td>
</tr>
<tr>
<td>lags</td>
<td>Optional[Union[int, List[int]]]</td>
<td>None</td>
<td>Lags to include as features. If an integer is provided, lags from 1
to that integer will be included. If a list of integers is provided,
those specific lags will be included. Default is None (no lag
features).</td>
</tr>
<tr>
<td>lag_transform</td>
<td>Optional[list]</td>
<td>None</td>
<td>List of lag-transform function objects to apply to the target
variable (e.g. [expanding_mean(shift=1), rolling_std(window_size=3,
shift=1)]). Each function should take a pandas Series as input and
return a Series of the same length. Default is None (no lag
transforms).</td>
</tr>
<tr>
<td>difference</td>
<td>Optional[int]</td>
<td>None</td>
<td>Order of ordinary differencing to apply to the target variable
(e.g. 1 for first difference). Default is None (no differencing).</td>
</tr>
<tr>
<td>seasonal_diff</td>
<td>Optional[int]</td>
<td>None</td>
<td>Seasonal period for seasonal differencing (e.g. 12 for monthly data
with yearly seasonality). Default is None (no seasonal
differencing).</td>
</tr>
<tr>
<td>trend</td>
<td>Optional[str]</td>
<td>None</td>
<td>Trend strategy to use. Options are ‘linear’ for linear trend
removal, ‘ets’ for ETS-based trend removal, ‘feature_lr’ for using
linear trend components as features, and ‘feature_ets’ for using ETS
trend components as features. Default is None (no trend handling).</td>
</tr>
<tr>
<td>pol_degree</td>
<td>int</td>
<td>1</td>
<td>Degree of polynomial trend to fit when using ‘linear’ or
‘feature_lr’ trend strategy. Default is 1 (linear trend).</td>
</tr>
<tr>
<td>ets_params</td>
<td>Optional[Dict[str, Any]]</td>
<td>None</td>
<td>Dictionary of parameters for the ExponentialSmoothing model when
using ‘ets’ trend strategy. The keys should be the parameter names and
the values should be the parameter values. Default is None (use default
ETS parameters).</td>
</tr>
<tr>
<td>change_points</td>
<td>Optional[List[int]]</td>
<td>None</td>
<td>List of indices in the time series where change points occur for
piecewise linear trend fitting. Only used when trend strategy is
‘linear’ or ‘feature_lr’. Default is None (no change points, fit a
single linear trend).</td>
</tr>
<tr>
<td>box_cox</td>
<td>Union[bool, float, int]</td>
<td>False</td>
<td>Whether to apply Box-Cox transformation to the target variable. If a
float or int value is provided, it will be used as the lambda parameter
for the Box-Cox transformation. If True, the lambda parameter will be
estimated from the data.</td>
</tr>
<tr>
<td>box_cox_biasadj</td>
<td>bool</td>
<td>False</td>
<td>Whether to apply bias adjustment when inverting the Box-Cox
transformation on forecasts. Default is False.</td>
</tr>
<tr>
<td>add_constant</td>
<td>bool</td>
<td>True</td>
<td>Whether to add a constant (intercept) column to the regressors.
Default is True.</td>
</tr>
<tr>
<td>cat_variables</td>
<td>Optional[List[str]]</td>
<td>None</td>
<td>List of categorical feature column names. If provided, these columns
will be treated as categorical variables and encoded accordingly.
Default is None (no categorical variables).</td>
</tr>
<tr>
<td>categorical_encoder</td>
<td>Optional[Any]</td>
<td>None</td>
<td>Categorical encoder object (e.g. OneHotEncoder(), MeanEncoder(),
etc.) to apply to the categorical variables specified in cat_variables.
The encoder should have fit() and transform() methods that can be
applied to the input DataFrame. Default is None (no categorical
encoding); in that case, categorical variables can only be used if the
underlying model handles them natively (e.g. LGBM or CatBoost).</td>
</tr>
<tr>
<td>offset</td>
<td>Optional[np.ndarray]</td>
<td>None</td>
<td>An offset to be included in the model. If provided, must be an array
whose length is the number of rows in exog.</td>
</tr>
<tr>
<td>exposure</td>
<td>Optional[np.ndarray]</td>
<td>None</td>
<td>Log(exposure) will be added to the linear prediction in the model.
Exposure is only valid if the log link is used. If provided, it must be
an array with the same length as endog.</td>
</tr>
<tr>
<td>freq_weights</td>
<td>Optional[np.ndarray]</td>
<td>None</td>
<td>1d array of frequency weights. If None, it is replaced with an array
of 1s with length equal to endog. WARNING: using weights is not yet
verified for all possible options and results; see the Notes in the
statsmodels documentation.</td>
</tr>
<tr>
<td>var_weights</td>
<td>Optional[np.ndarray]</td>
<td>None</td>
<td>1d array of variance (analytic) weights. If None, it is replaced
with an array of 1s with length equal to endog. WARNING: using weights
is not yet verified for all possible options and results; see the Notes
in the statsmodels documentation.</td>
</tr>
<tr>
<td>missing</td>
<td>Optional[str]</td>
<td>None</td>
<td>How to handle NaNs. Options are ‘none’, ‘drop’, and ‘raise’. If
‘none’, no NaN checking is done; if ‘drop’, any observations with NaNs
are dropped; if ‘raise’, an error is raised. Default is None (equivalent
to ‘none’).</td>
</tr>
<tr>
<td><strong>Returns</strong></td>
<td><strong>None</strong></td>
<td></td>
<td></td>
</tr>
</tbody>
</table>
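To make the data-preparation options concrete, here is a minimal sketch of what `lags` and `difference` correspond to conceptually (illustrative pandas code, not peshbeen's internal implementation; the series values and column names are hypothetical):

``` python
import pandas as pd

# Toy target series (hypothetical data).
y = pd.Series([10, 12, 13, 15, 18, 21], name="sales")
df = pd.DataFrame({"sales": y})

# lags=3 -> include lags 1 through 3 of the target as features.
for k in range(1, 4):
    df[f"sales_lag{k}"] = df["sales"].shift(k)

# difference=1 -> model the first difference of the target.
df["sales_diff1"] = df["sales"].diff(1)

# Rows lost to lagging/differencing are dropped before fitting.
df = df.dropna()
```

Lag transforms such as `expanding_mean(shift=1)` follow the same pattern: a shifted rolling/expanding statistic is appended as an extra feature column.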

------------------------------------------------------------------------

<a
href="https://github.com/mustafaslanCoto/peshbeen/blob/main/peshbeen/models/glm.py#L288"
target="_blank" style="float:right; font-size:smaller">source</a>

### glm.fit

``` python

def fit(
    df:pd.DataFrame, # Training DataFrame containing the target and any feature columns.
)->None:

```

*Fit the model to the training data after applying the specified data
preparation steps.*

<table>
<colgroup>
<col style="width: 9%" />
<col style="width: 38%" />
<col style="width: 52%" />
</colgroup>
<thead>
<tr>
<th></th>
<th><strong>Type</strong></th>
<th><strong>Details</strong></th>
</tr>
</thead>
<tbody>
<tr>
<td>df</td>
<td>pd.DataFrame</td>
<td>Training DataFrame containing the target and any feature
columns.</td>
</tr>
<tr>
<td><strong>Returns</strong></td>
<td><strong>None</strong></td>
<td></td>
</tr>
</tbody>
</table>

------------------------------------------------------------------------

<a
href="https://github.com/mustafaslanCoto/peshbeen/blob/main/peshbeen/models/glm.py#L387"
target="_blank" style="float:right; font-size:smaller">source</a>

### glm.forecast

``` python

def forecast(
    H:int, # Forecast horizon.
    exog:Optional[pd.DataFrame]=None, # Optional DataFrame of future exogenous regressors.
)->np.ndarray: # Forecast values of length `H`.

```

*Recursive multi-step forecast.*

<table>
<colgroup>
<col style="width: 6%" />
<col style="width: 25%" />
<col style="width: 34%" />
<col style="width: 34%" />
</colgroup>
<thead>
<tr>
<th></th>
<th><strong>Type</strong></th>
<th><strong>Default</strong></th>
<th><strong>Details</strong></th>
</tr>
</thead>
<tbody>
<tr>
<td>H</td>
<td>int</td>
<td></td>
<td>Forecast horizon.</td>
</tr>
<tr>
<td>exog</td>
<td>Optional[pd.DataFrame]</td>
<td>None</td>
<td>Optional DataFrame of future exogenous regressors.</td>
</tr>
<tr>
<td><strong>Returns</strong></td>
<td><strong>np.ndarray</strong></td>
<td></td>
<td><strong>Forecast values of length <code>H</code>.</strong></td>
</tr>
</tbody>
</table>
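The recursive strategy behind `forecast` can be sketched as follows (an illustrative stand-in, not peshbeen's internal code; `predict_one` is a hypothetical one-step model call):

``` python
import numpy as np

def recursive_forecast(history, H, predict_one):
    """Predict H steps ahead, feeding each prediction back as history
    so it can populate the lag features of the next step."""
    history = list(history)
    preds = []
    for _ in range(H):
        yhat = predict_one(history)  # one-step-ahead model call
        preds.append(yhat)
        history.append(yhat)         # prediction becomes the next lag
    return np.array(preds)

# Toy one-step model (stand-in for the fitted GLM):
# next value = last value + 1.
forecasts = recursive_forecast([1.0, 2.0, 3.0], H=4,
                               predict_one=lambda h: h[-1] + 1.0)
```

If `exog` is supplied, each step's row of future regressors would be joined with the lag features before the one-step prediction; any differencing, trend, or Box-Cox transforms are inverted afterwards to return forecasts on the original scale.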

------------------------------------------------------------------------

<a
href="https://github.com/mustafaslanCoto/peshbeen/blob/main/peshbeen/models/glm.py#L482"
target="_blank" style="float:right; font-size:smaller">source</a>

### glm.cross_validate

``` python

def cross_validate(
    df:pd.DataFrame, # DataFrame containing the target and any feature columns.
    cv_split:int, # Number of cross-validation splits.
    test_size:int, # Number of periods in each test set.
    metrics:List[Callable], # Metric functions (e.g. ``[MAE, RMSE]``) used to evaluate forecast accuracy across folds. Call ``.cv_summary()`` after cross-validation to retrieve the aggregated scores.
    step_size:int=1, # Step size to move the test window forward in each split.
    h_split_point:Optional[int]=None, # Optional index to split the test set into two parts for separate evaluation (e.g. to evaluate short-term vs long-term performance). If None, no split is done.
)->Tuple[pd.DataFrame, pd.DataFrame]: # DataFrame containing overall performance metrics averaged across splits, and a DataFrame with predictions and true values for each split.

```

*Run cross-validation using time series splits.*

<table>
<colgroup>
<col style="width: 6%" />
<col style="width: 25%" />
<col style="width: 34%" />
<col style="width: 34%" />
</colgroup>
<thead>
<tr>
<th></th>
<th><strong>Type</strong></th>
<th><strong>Default</strong></th>
<th><strong>Details</strong></th>
</tr>
</thead>
<tbody>
<tr>
<td>df</td>
<td>pd.DataFrame</td>
<td></td>
<td>DataFrame containing the target and any feature columns.</td>
</tr>
<tr>
<td>cv_split</td>
<td>int</td>
<td></td>
<td>Number of cross-validation splits.</td>
</tr>
<tr>
<td>test_size</td>
<td>int</td>
<td></td>
<td>Number of periods in each test set.</td>
</tr>
<tr>
<td>metrics</td>
<td>List[Callable]</td>
<td></td>
<td>Metric functions (e.g. <code>[MAE, RMSE]</code>) used to evaluate
forecast accuracy across folds. Call <code>.cv_summary()</code> after
cross-validation to retrieve the aggregated scores.</td>
</tr>
<tr>
<td>step_size</td>
<td>int</td>
<td>1</td>
<td>Step size to move the test window forward in each split.</td>
</tr>
<tr>
<td>h_split_point</td>
<td>Optional[int]</td>
<td>None</td>
<td>Optional index to split the test set into two parts for separate
evaluation (e.g. to evaluate short-term vs long-term performance). If
None, no split is done.</td>
</tr>
<tr>
<td><strong>Returns</strong></td>
<td><strong>Tuple[pd.DataFrame, pd.DataFrame]</strong></td>
<td></td>
<td><strong>DataFrame containing overall performance metrics averaged
across splits, and a DataFrame with predictions and true values for each
split.</strong></td>
</tr>
</tbody>
</table>
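The rolling-origin splitting that `cross_validate` iterates over can be sketched like this (hypothetical index arithmetic for illustration; the actual fold construction inside peshbeen may differ):

``` python
def ts_splits(n_obs, cv_split, test_size, step_size=1):
    """Return (train_idx, test_idx) pairs for rolling-origin CV.
    The last fold's test window ends at the final observation; each
    earlier fold steps the window back by `step_size`."""
    splits = []
    for i in range(cv_split):
        test_end = n_obs - i * step_size
        test_start = test_end - test_size
        splits.append((list(range(0, test_start)),
                       list(range(test_start, test_end))))
    return splits[::-1]  # chronological order

# 10 observations, 3 folds, test windows of 2, stepping by 1.
splits = ts_splits(n_obs=10, cv_split=3, test_size=2, step_size=1)
```

The model is refit on each training slice and evaluated on the corresponding test window with the supplied `metrics`; `h_split_point` would further partition each test window (e.g. into short- and long-horizon segments) before scoring.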
