API Index

hts

class hts.HTSRegressor(model: str = 'prophet', revision_method: str = 'OLS', transform: Union[hts._t.Transform, bool, None] = False, n_jobs: int = 1, low_memory: bool = False, **kwargs)[source]

Bases: sklearn.base.BaseEstimator, sklearn.base.RegressorMixin

Main regressor class for scikit-hts. Likely the only import you’ll need for using this project. It takes a pandas dataframe, the nodes specifying the hierarchies, model kind, revision method, and a few other parameters. See Examples to get an idea of how to use it.

Variables:
  • transform (Union[NamedTuple[str, Callable], bool]) – Function transform to be applied to inputs and outputs. If True, scipy.stats.boxcox and scipy.special._ufuncs.inv_boxcox will be applied to the input and output data
  • sum_mat (array_like) – The summing matrix, explained in depth in Forecasting
  • nodes (Dict[str, List[str]]) – Nodes representing node, edges of the hierarchy. Keys are nodes, values are list of edges.
  • df (pandas.DataFrame) – The dataframe containing the nodes and edges specified above
  • revision_method (str) – One of: "OLS", "WLSS", "WLSV", "FP", "PHA", "AHP", "BU", "NONE"
  • models (dict) – Dictionary that holds the trained models
  • mse (dict) – Dictionary that holds the mse scores for the trained models
  • residuals (dict) – Dictionary that holds the residuals for the trained models
  • forecasts (dict) – Dictionary that holds the forecasts for the trained models
  • model_instance (TimeSeriesModel) – Reference to the class implementing the actual time series model
__init__(model: str = 'prophet', revision_method: str = 'OLS', transform: Union[hts._t.Transform, bool, None] = False, n_jobs: int = 1, low_memory: bool = False, **kwargs)[source]
Parameters:
  • model (str) – One of the models supported by hts: "prophet", "auto_arima", "holt_winters", or "sarimax" (see hts._t.ModelT)
  • revision_method (str) – The revision method to be used. One of: "OLS", "WLSS", "WLSV", "FP", "PHA", "AHP", "BU", "NONE"
  • transform (Boolean or NamedTuple) –

    If True, scipy.stats.boxcox and scipy.special._ufuncs.inv_boxcox will be applied before and after fitting. If False (default), no transform is applied. If you want to use custom functions, pass a NamedTuple like:

    from collections import namedtuple
    import numpy

    Transform = namedtuple('Transform', ['func', 'inv_func'])
    transform = Transform(func=numpy.exp, inv_func=numpy.log)

    ht = HTSRegressor(transform=transform, ...)
    

    The signatures for the func as well as inv_func parameters must both be Callable[[numpy.ndarray], numpy.ndarray], i.e. they must take an array and return an array, both of equal dimensions

  • n_jobs (int) – Number of parallel jobs to run the forecasting on
  • low_memory (Bool) – If True, models will be fit, serialized, and released from memory. Usually a good idea if you are dealing with a large number of nodes
  • kwargs – Keyword arguments to be passed to the underlying model to be instantiated
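
For orientation, a minimal construction sketch; the argument values below are illustrative, only the parameter names come from the signature above:

from hts import HTSRegressor

# Prophet base models, OLS reconciliation, four parallel jobs.
reg = HTSRegressor(model='prophet', revision_method='OLS', n_jobs=4)
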
fit(df: Optional[pandas.core.frame.DataFrame] = None, nodes: Optional[Dict[str, List[str]]] = None, tree: Optional[hts.hierarchy.HierarchyTree] = None, exogenous: Optional[Dict[str, List[str]]] = None, root: str = 'total', distributor: Optional[hts.utilities.distribution.DistributorBaseClass] = None, disable_progressbar=False, show_warnings=False, **fit_kwargs) → hts.core.regressor.HTSRegressor[source]

Fit hierarchical model to dataframe containing hierarchical data as specified in the nodes parameter.

Exogenous variables can also be passed as a dict of (string, list) pairs, where the string is the specific node key and the list contains the names of the columns to be used as exogenous variables for that node.

Alternatively, a pre-built HierarchyTree can be passed without specifying nodes and df. See more at hts.hierarchy.HierarchyTree

Parameters:
  • df (pandas.DataFrame) – A Dataframe of time series with a DateTimeIndex. Each column represents a node in the hierarchy. Ignored if tree argument is passed
  • nodes (Dict[str, List[str]]) –
    The hierarchy defined as a dict of (string, list), as specified in
    HierarchyTree.from_nodes
  • tree (HierarchyTree) – A pre-built HierarchyTree. Ignored if df and nodes are passed, as the tree will be built from these instead
  • distributor (Optional[DistributorBaseClass]) – A distributor, for parallel/distributed processing
  • exogenous (Dict[str, List[str]] or None) – Node key mapping to columns that contain the exogenous variable for that node
  • root (str) – The name of the root node
  • disable_progressbar (Bool) – Disable or enable progressbar
  • show_warnings (Bool) – Enable or disable warnings
  • fit_kwargs (Any) – Any arguments to be passed to the underlying forecasting model’s fit function
Returns:

The fitted HTSRegressor instance

Return type:

HTSRegressor
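
A hedged fitting sketch follows; the column names, values, and nodes mapping are illustrative assumptions, not taken from this reference. The dataframe needs a DateTimeIndex and one column per node in the hierarchy:

import pandas
from hts import HTSRegressor

# Build a small hierarchical dataframe: four bottom-level series plus their aggregates.
index = pandas.date_range('2020-01-01', periods=8, freq='D')
bottom = pandas.DataFrame(
    {'a_x': range(1, 9), 'a_y': range(1, 9), 'b_x': range(1, 9), 'b_y': range(1, 9)},
    index=index, dtype=float,
)
hier_df = bottom.assign(
    a=bottom['a_x'] + bottom['a_y'],
    b=bottom['b_x'] + bottom['b_y'],
    total=bottom.sum(axis=1),
)
nodes = {'total': ['a', 'b'], 'a': ['a_x', 'a_y'], 'b': ['b_x', 'b_y']}

reg = HTSRegressor(model='holt_winters', revision_method='OLS')
reg = reg.fit(df=hier_df, nodes=nodes)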

predict(exogenous_df: pandas.core.frame.DataFrame = None, steps_ahead: int = None, distributor: Optional[hts.utilities.distribution.DistributorBaseClass] = None, disable_progressbar: bool = False, show_warnings: bool = False, **predict_kwargs) → pandas.core.frame.DataFrame[source]
Parameters:
  • distributor (Optional[DistributorBaseClass]) – A distributor, for parallel/distributed processing
  • disable_progressbar (Bool) – Disable or enable progressbar
  • show_warnings (Bool) – Enable or disable warnings
  • predict_kwargs (Any) – Any arguments to be passed to the underlying forecasting model’s predict function
  • exogenous_df (pandas.DataFrame) –

    A dataframe of length == steps_ahead containing the exogenous data for each of the nodes. Only required when using prophet or auto_arima models. See fbprophet’s additional regression docs and AutoARIMA’s exogenous handling docs for more information.

    Other models do not require additional regressors at predict time.

  • steps_ahead (int) – The number of forecasting steps for which to produce a forecast
Returns:

  • Revised forecasts, as a pandas.DataFrame in the same format as the one passed for fitting, extended by steps_ahead time steps
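
Continuing the fitting sketch above, a hedged prediction example (the steps_ahead value is illustrative):

# Forecast 4 steps beyond the end of the fitting dataframe; exogenous_df is not needed
# here because no exogenous variables were supplied at fit time.
revised = reg.predict(steps_ahead=4)
# `revised` has the same columns as the fitting dataframe, extended by 4 time steps,
# with the base forecasts reconciled by the chosen revision method.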

class hts.RevisionMethod(name: str, sum_mat: numpy.ndarray, transformer)[source]

Bases: object

revise(forecasts=None, mse=None, nodes=None) → numpy.ndarray[source]
Parameters:
  • forecasts – Dictionary of base forecasts keyed by node name
  • mse – Dictionary of in-sample MSE values keyed by node name
  • nodes – The hierarchy tree of nodes

hts.convenience

hts.convenience.revise_forecasts(method: str, forecasts: Dict[str, Union[numpy.ndarray, pandas.core.series.Series, pandas.core.frame.DataFrame]], errors: Optional[Dict[str, float]] = None, residuals: Optional[Dict[str, Union[numpy.ndarray, pandas.core.series.Series, pandas.core.frame.DataFrame]]] = None, summing_matrix: numpy.ndarray = None, nodes: hts._t.NAryTreeT = None, transformer: Union[hts._t.Transform, bool] = None)[source]

Convenience function to get revised forecasts for pre-computed base forecasts

Parameters:
  • method (str) – The reconciliation method to use
  • forecasts (Dict[str, ArrayLike]) – A dict mapping key name to its forecasts (including in-sample forecasts). Required, can be of type numpy.ndarray of ndim == 1, pandas.Series, or single columned pandas.DataFrame
  • errors (Dict[str, float]) – A dict mapping key name to the in-sample errors. Required for methods: OLS, WLSS, WLSV if residuals is not passed
  • residuals (Dict[str, ArrayLike]) – A dict mapping key name to the residuals of in-sample forecasts. Required for methods: OLS, WLSS, WLSV, can be of type numpy.ndarray of ndim == 1, pandas.Series, or single columned pandas.DataFrame. If residuals are passed, the errors dict is not required; errors will instead be calculated using the MSE metric: numpy.mean(numpy.array(residual) ** 2)
  • summing_matrix (numpy.ndarray) – Not required if nodes argument is passed, or if using BU approach
  • nodes (NAryTreeT) – The tree of nodes as specified in HierarchyTree. Required if using the AHP, PHA, FP methods, or if using the OLS, WLSS, WLSV methods without passing the summing_matrix parameter
  • transformer (TransformT) – A transform whose inv_func will be applied to the revised forecasts
Returns:

revised forecasts – The revised forecasts

Return type:

pandas.DataFrame
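
A hedged sketch of reconciling pre-computed base forecasts; the node names, values, and summing matrix are illustrative, and the forecast keys are assumed to be ordered consistently with the summing-matrix rows (aggregates first, bottom level last):

import numpy
from hts.convenience import revise_forecasts

forecasts = {
    'total': numpy.array([11.0, 12.0]),
    'a': numpy.array([6.0, 6.5]),
    'b': numpy.array([4.0, 5.0]),
}
sum_mat = numpy.array([[1, 1],   # total = a + b
                       [1, 0],   # a
                       [0, 1]])  # b
revised = revise_forecasts(
    method='OLS',
    forecasts=forecasts,
    errors={'total': 1.0, 'a': 0.25, 'b': 0.25},
    summing_matrix=sum_mat,
)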

hts.defaults

hts.functions

hts.functions._create_bl_str_col(df: pandas.core.frame.DataFrame, level_names: List[str]) → List[str][source]

Concatenate the column values of all the specified level_names by row into a single column.

Parameters:
  • df (pandas.DataFrame) – Tabular data.
  • level_names (List[str]) – Levels in the hierarchy.
Returns:

Concatenated column values by row.

Return type:

List[str]

hts.functions._get_bl(grouped_levels: List[str], bottom_levels: List[str]) → List[List[str]][source]

Get bottom level columns required to sum to create grouped columns.

Parameters:
  • grouped_levels (List[str]) – Grouped level, underscore delimited, column names.
  • bottom_levels (List[str]) – Bottom level, underscore delimited, column names.
Returns:

Bottom level column names that make up each individual aggregated node in the hierarchy.

Return type:

List[List[str]]

hts.functions.add_agg_series_to_df(df: pandas.core.frame.DataFrame, grouped_levels: List[str], bottom_levels: List[str]) → pandas.core.frame.DataFrame[source]

Add aggregate series columns to wide dataframe.

Parameters:
  • df (pandas.DataFrame) – Wide dataframe containing bottom level series.
  • grouped_levels (List[str]) – Grouped level, underscore delimited, column names.
  • bottom_levels (List[str]) – Bottom level, underscore delimited, column names.
Returns:

Wide dataframe with all series in hierarchy.

Return type:

pandas.DataFrame

hts.functions.forecast_proportions(forecasts, nodes)[source]
Cons:
Produces biased revised forecasts even if base forecasts are unbiased
hts.functions.get_agg_series(df: pandas.core.frame.DataFrame, levels: List[List[str]]) → List[str][source]

Get aggregate level series names.

Parameters:
  • df (pandas.DataFrame) – Tabular data.
  • levels (List[List[str]]) – List of lists containing the desired level of aggregation.
Returns:

Aggregate series names.

Return type:

List[str]

hts.functions.get_hierarchichal_df(df: pandas.core.frame.DataFrame, level_names: List[str], hierarchy: List[List[str]], date_colname: str, val_colname: str) → Tuple[pandas.core.frame.DataFrame, numpy.array, List[str]][source]

Transform your tabular dataframe to a wide dataframe with the desired levels of the hierarchy.

Parameters:
  • df (pd.DataFrame) – Tabular dataframe
  • level_names (List[str]) – Levels in the hierarchy.
  • hierarchy (List[List[str]]) – Desired levels in your hierarchy.
  • date_colname (str) – Date column name
  • val_colname (str) – Name of column containing series values.
Returns:

  • pd.DataFrame – Wide dataframe with levels of specified aggregation.
  • np.array – Summing matrix.
  • List[str] – Summing matrix labels.

Examples

>>> import pandas
>>> import hts.functions
>>> hier_df = pandas.DataFrame(
    data={
        'ds': ['2020-01', '2020-02'] * 5,
        "lev1": ['A', 'A',
                 'A', 'A',
                 'A', 'A',
                 'B', 'B',
                 'B', 'B'],
        "lev2": ['X', 'X',
                 'Y', 'Y',
                 'Z', 'Z',
                 'X', 'X',
                 'Y', 'Y'],
        "val": [1, 2,
                3, 4,
                5, 6,
                7, 8,
                9, 10]
    }
)
>>> hier_df
        ds lev1 lev2  val
0  2020-01    A    X    1
1  2020-02    A    X    2
2  2020-01    A    Y    3
3  2020-02    A    Y    4
4  2020-01    A    Z    5
5  2020-02    A    Z    6
6  2020-01    B    X    7
7  2020-02    B    X    8
8  2020-01    B    Y    9
9  2020-02    B    Y   10
>>> level_names = ['lev1', 'lev2']
>>> hierarchy = [['lev1'], ['lev2']]
>>> wide_df, sum_mat, sum_mat_labels = hts.functions.get_hierarchichal_df(hier_df,
                                                                          level_names=level_names,
                                                                          hierarchy=hierarchy,
                                                                          date_colname='ds',
                                                                          val_colname='val')
>>> wide_df
    lev1_lev2  A_X  A_Y  A_Z  B_X  B_Y  total   A   B   X   Y  Z
    ds
    2020-01      1    3    5    7    9     25   9  16   8  12  5
    2020-02      2    4    6    8   10     30  12  18  10  14  6
hts.functions.optimal_combination(forecasts: Dict[str, pandas.core.frame.DataFrame], sum_mat: numpy.ndarray, method: str, mse: Dict[str, float])[source]

Produces the optimal combination of forecasts by trace minimization (as described by Wickramasuriya, Athanasopoulos, Hyndman in “Optimal Forecast Reconciliation for Hierarchical and Grouped Time Series Through Trace Minimization”)

Parameters:
  • forecasts (dict) – Dictionary of pandas.DataFrames containing the future predictions
  • sum_mat (np.ndarray) – The summing matrix
  • method (str) –
    One of:
    • OLS (ordinary least squares)
    • WLSS (structurally weighted least squares)
    • WLSV (variance weighted least squares)
  • mse (dict) – Dictionary of in-sample mean squared errors keyed by node name, used for variance weighting
hts.functions.project(hat_mat: numpy.ndarray, sum_mat: numpy.ndarray, optimal_mat: numpy.ndarray) → numpy.ndarray[source]
hts.functions.proportions(nodes, forecasts, sum_mat, method='PHA')[source]
hts.functions.to_sum_mat(ntree: hts._t.NAryTreeT = None, node_labels: List[str] = None) → Tuple[numpy.ndarray, List[str]][source]

This function creates a summing matrix for the bottom-up and optimal combination approaches. All the inputs are the same as above. The output is a summing matrix; see Section 9.4 of Rob Hyndman’s “Forecasting: Principles and Practice”.

Parameters:
  • ntree (NAryTreeT) – The hierarchy tree from which to build the summing matrix
  • node_labels (List[str]) – Labels corresponding to the node names / summing matrix rows. Obtain them from hts.functions.get_hierarchichal_df(…)
Returns:

  • numpy.ndarray – Summing matrix.
  • List[str] – Row order list of the level in the hierarchy represented by each row in the summing matrix.
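
A hedged sketch of obtaining the summing matrix from a pre-built tree; the nodes mapping is illustrative, and hier_df is assumed to hold one column per node (as in the fitting sketch for HTSRegressor.fit above):

import hts.functions
from hts.hierarchy import HierarchyTree

nodes = {'total': ['a', 'b'], 'a': ['a_x', 'a_y'], 'b': ['b_x', 'b_y']}
tree = HierarchyTree.from_nodes(nodes=nodes, df=hier_df)
sum_mat, sum_mat_labels = hts.functions.to_sum_mat(tree)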

hts.functions.y_hat_matrix(forecasts, keys=None)[source]

hts.revision

class hts.revision.RevisionMethod(name: str, sum_mat: numpy.ndarray, transformer)[source]

Bases: object

revise(forecasts=None, mse=None, nodes=None) → numpy.ndarray[source]
Parameters:
  • forecasts – Dictionary of base forecasts keyed by node name
  • mse – Dictionary of in-sample MSE values keyed by node name
  • nodes – The hierarchy tree of nodes

hts.transforms

class hts.transforms.BoxCoxTransformer[source]

Bases: sklearn.base.BaseEstimator, sklearn.base.TransformerMixin

fit(x: pandas.core.series.Series, y=None, **fit_params)[source]
fit_transform(x: pandas.core.series.Series, y=None, **fit_params)[source]

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters:
  • X (array-like of shape (n_samples, n_features)) – Input samples.
  • y (array-like of shape (n_samples,) or (n_samples, n_outputs), default=None) – Target values (None for unsupervised transformations).
  • **fit_params (dict) – Additional fit parameters.
Returns:

X_new – Transformed array.

Return type:

ndarray array of shape (n_samples, n_features_new)

inverse_transform(x: Union[pandas.core.series.Series, numpy.ndarray])[source]
transform(x: pandas.core.series.Series)[source]
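
A hedged usage sketch for BoxCoxTransformer; the series values are illustrative and must be strictly positive for the Box-Cox transform:

import pandas
from hts.transforms import BoxCoxTransformer

series = pandas.Series([1.0, 2.0, 4.0, 8.0, 16.0])
boxcox = BoxCoxTransformer()
transformed = boxcox.fit_transform(series)
restored = boxcox.inverse_transform(transformed)  # approximately the original values
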
class hts.transforms.FunctionTransformer(func: callable = None, inv_func: callable = None)[source]

Bases: sklearn.base.BaseEstimator, sklearn.base.TransformerMixin

fit(x: pandas.core.series.Series, y=None, **fit_params)[source]
fit_transform(x: pandas.core.series.Series, y=None, **fit_params)[source]

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters:
  • X (array-like of shape (n_samples, n_features)) – Input samples.
  • y (array-like of shape (n_samples,) or (n_samples, n_outputs), default=None) – Target values (None for unsupervised transformations).
  • **fit_params (dict) – Additional fit parameters.
Returns:

X_new – Transformed array.

Return type:

ndarray array of shape (n_samples, n_features_new)

inverse_transform(x: Union[pandas.core.series.Series, numpy.ndarray])[source]
transform(x: pandas.core.series.Series)[source]
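
A hedged usage sketch for FunctionTransformer, mirroring the Transform namedtuple accepted by HTSRegressor(transform=...); the choice of numpy.log1p / numpy.expm1 is illustrative:

import numpy
import pandas
from hts.transforms import FunctionTransformer

log_tf = FunctionTransformer(func=numpy.log1p, inv_func=numpy.expm1)
series = pandas.Series([0.0, 1.0, 3.0, 7.0])
transformed = log_tf.fit_transform(series)
restored = log_tf.inverse_transform(transformed)  # back on the original scale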

hts._t

class hts._t.ExtendedEnum[source]

Bases: enum.Enum

An enumeration.

list = <bound method ExtendedEnum.list of <enum 'ExtendedEnum'>>[source]
names = <bound method ExtendedEnum.names of <enum 'ExtendedEnum'>>[source]
class hts._t.HierarchyVisualizerT[source]

Bases: object

create_map()[source]
class hts._t.MethodT[source]

Bases: hts._t.ExtendedEnum

An enumeration.

AHP = 'AHP'
BU = 'BU'
FP = 'FP'
NONE = 'NONE'
OLS = 'OLS'
PHA = 'PHA'
WLSS = 'WLSS'
WLSV = 'WLSV'
class hts._t.ModelT[source]

Bases: str, hts._t.ExtendedEnum

An enumeration.

auto_arima = 'auto_arima'
holt_winters = 'holt_winters'
prophet = 'prophet'
sarimax = 'sarimax'
class hts._t.NAryTreeT[source]

Bases: object

Type definition of an NAryTree

add_child(key=None, item=None, exogenous=None) → hts._t.NAryTreeT[source]
exogenous = None
get_height() → int[source]
get_level_order_labels() → List[List[str]][source]
get_node_height(key: str) → int[source]
get_series() → pandas.core.series.Series[source]
is_leaf() → bool[source]
leaf_sum() → int[source]
level_order_traversal() → List[List[int]][source]
num_nodes() → int[source]
parent
string_repr(prefix='', _last=True)[source]
sum_at_height(level) → int[source]
to_pandas() → pandas.core.frame.DataFrame[source]
traversal_level() → List[hts._t.NAryTreeT][source]
value_at_height(level: int) → List[T][source]
class hts._t.TimeSeriesModelT[source]

Bases: sklearn.base.BaseEstimator, sklearn.base.RegressorMixin

Type definition of a TimeSeriesModel

create_model(**kwargs)[source]
fit(**fit_args) → hts._t.TimeSeriesModelT[source]
predict(node: hts._t.NAryTreeT, **predict_args)[source]
class hts._t.Transform(func, inv_func)[source]

Bases: tuple

func

Alias for field number 0

inv_func

Alias for field number 1

class hts._t.UnivariateModelT[source]

Bases: str, hts._t.ExtendedEnum

An enumeration.

arima = 'arima'
auto_arima = 'auto_arima'
holt_winters = 'holt_winters'
prophet = 'prophet'
sarimax = 'sarimax'