
Optimisation of historical forecast for regression models #1885

Merged
merged 50 commits on Aug 1, 2023

Conversation

@madtoinou (Collaborator) commented Jul 7, 2023

Fixes #1233

Summary

Reduce historical_forecasts() runtime for all RegressionModels using a few tricks: enforce retrain=False, use the boundaries of the "forecastable indexes" instead of a full range, and vectorize the predictions with stride.
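One of the tricks above, vectorizing the predictions instead of looping over each forecastable index, can be sketched as follows. This is illustrative only; `model`, `series`, and the helper name are stand-ins, not the actual Darts internals:

```python
import numpy as np

def vectorized_hist_forecasts(model, series, lags, start_idx, stride=1):
    """Build the lagged feature matrix for every forecastable index at
    once and call predict() a single time, instead of predicting
    point by point. `model` is any fitted estimator with a
    scikit-learn-style predict(); `series` is a 1-D numpy array."""
    # all forecastable indexes, strided, derived from the boundaries
    forecastable = np.arange(start_idx, len(series), stride)
    # one row of `lags` past values per forecastable index
    X = np.stack([series[i - lags:i] for i in forecastable])
    # a single vectorized predict() call replaces the per-index loop
    return model.predict(X)
```

The saving comes from replacing many small predict() calls with one batched call over the whole forecastable range.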

Other Information

forecast_horizon > model.output_chunk_length is not supported at the moment because it would require auto-regression, and num_samples > 1 needs a bit more work before being fully supported.
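To illustrate why a longer horizon would need auto-regression, here is a minimal hypothetical sketch (not the Darts implementation): a model that emits output_chunk_length steps per call must have its own outputs fed back in as lagged inputs until the horizon is covered.

```python
import numpy as np

def forecast_beyond_ocl(model, history, lags, horizon):
    """Hypothetical helper: `model.predict` returns output_chunk_length
    steps per call, so forecasting horizon > output_chunk_length
    requires feeding predictions back as inputs (auto-regression)."""
    window = list(history)
    out = []
    while len(out) < horizon:
        # predict the next output chunk from the last `lags` values
        step = model.predict(np.array(window[-lags:])[None, :]).ravel()
        out.extend(step.tolist())
        window.extend(step.tolist())  # feed predictions back as inputs
    return np.array(out[:horizon])
```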

[EDIT]: Some speed-up statistics, including a comparison between the refactored and "legacy" implementations.

import time

from darts.models import LinearRegressionModel
from darts.utils import timeseries_generation as tg

# parameters
multi_models = [True, False]
multivariate = [True, False]
forecast_horizon = [1, 7]
stride = 1
start = [700, 800, 900, 990]
length_ts = 1000

ts_univariate = tg.linear_timeseries(start_value=1, end_value=length_ts, length=length_ts)
ts_multivariate = ts_univariate.stack(tg.sine_timeseries(length=length_ts))

# two models, to test forecast_horizon different from or equal to output_chunk_length
model1 = LinearRegressionModel(lags=3, output_chunk_length=forecast_horizon)
model2 = LinearRegressionModel(lags=3, output_chunk_length=7)

# loop iterating over all the parameters (the sweep over the lists above is
# elided; model, ts, last_points_only, num_samples and replicates take the
# current sweep values)
t0 = time.time()
for i in range(replicates):
    hist_fct = model.historical_forecasts(
        series=ts,
        start=start,
        stride=stride,
        last_points_only=last_points_only,
        forecast_horizon=forecast_horizon,
        num_samples=num_samples,
        retrain=False,
        enable_optimisation=False,
    )
t1 = time.time()
for i in range(replicates):
    opti_hist_fct = model.historical_forecasts(
        series=ts,
        start=start,
        stride=stride,
        last_points_only=last_points_only,
        forecast_horizon=forecast_horizon,
        num_samples=num_samples,
        retrain=False,
        enable_optimisation=True,
    )
t2 = time.time()

Refactoring of the original method did not affect its performance, but it should be easier to read.

The ratio on the y-axis is (t1 - t0) / (t2 - t1), i.e. legacy runtime over optimized runtime.

Speed up when last_points_only=True

[figure: legacy_v_opti]
When the conditions are met, the gain scales with the length of the forecasted period.

Speed up when last_points_only=False

[figure: legacy_v_opti_allpoints]
The bottleneck is the creation of the returned TimeSeries, so the gains are less significant.

@codecov-commenter commented Jul 13, 2023

Codecov Report

Patch coverage: 92.30% and project coverage change: -0.11% ⚠️

Comparison is base (7f64c92) 93.82% compared to head (ce508ea) 93.72%.


Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1885      +/-   ##
==========================================
- Coverage   93.82%   93.72%   -0.11%     
==========================================
  Files         128      131       +3     
  Lines       12475    12646     +171     
==========================================
+ Hits        11705    11852     +147     
- Misses        770      794      +24     
Files Changed Coverage Δ
darts/utils/__init__.py 100.00% <ø> (ø)
darts/utils/timeseries_generation.py 96.15% <ø> (ø)
darts/utils/utils.py 80.58% <ø> (-11.09%) ⬇️
darts/models/forecasting/forecasting_model.py 95.21% <85.00%> (+0.20%) ⬆️
...orical_forecasts/optimized_historical_forecasts.py 88.46% <88.46%> (ø)
darts/models/forecasting/regression_model.py 95.23% <93.75%> (-0.13%) ⬇️
darts/utils/historical_forecasts/utils.py 94.73% <94.73%> (ø)
darts/dataprocessing/encoders/encoder_base.py 94.69% <100.00%> (+0.02%) ⬆️
darts/utils/historical_forecasts/__init__.py 100.00% <100.00%> (ø)

... and 6 files with indirect coverage changes


@dennisbader (Collaborator) left a comment

Great PR, thanks a lot and congratulations @madtoinou! This will give such a boost to historical forecasting and backtesting for RegressionModels! 🚀

I left some comments and suggestions regarding what we discussed offline, to add some more information on specific parts, etc.

(review threads on darts/models/forecasting/forecasting_model.py and darts/utils/optimised_historical_forecasts.py, resolved)
hist_fct_pc_start -= shift_start * unit
hist_fct_fc_start -= shift_start * unit

if model.output_chunk_length == forecast_horizon:
Collaborator:

can you explain why this is not required for ocl < forecast_horizon?

Collaborator Author:

It's the other way around, forecast_horizon < ocl, and it's because these timestamps will be covered by the last prediction of length output_chunk_length. If we left them in, the prediction would go too far.

(review thread on darts/utils/optimised_historical_forecasts.py, resolved)
require_auto_regression: bool = forecast_horizon > model.output_chunk_length

# reshape and stride the forecast into (forecastable_index, forecast_horizon, n_components, num_samples)
if model.multi_models:
Collaborator:

This part is tricky to understand. Could you add some more comments describing how the output of _predict_and_sample differs for multi_models True/False, and share a bit more info on the slicing/reshaping?

We will be thankful in the future :)

Collaborator Author:

Updated the comments, let me know if it's enough
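The reshape-and-stride step discussed here can be illustrated with a small NumPy sketch. All dimensions below are hypothetical, chosen only to mirror the shape named in the diff comment, (forecastable_index, forecast_horizon, n_components, num_samples):

```python
import numpy as np

# hypothetical dimensions: 8 forecastable indexes, output_chunk_length 5,
# 2 components, 1 sample, with forecast_horizon 3 and stride 2
n_idx, ocl, n_comp, n_samples = 8, 5, 2, 1
forecast_horizon, stride = 3, 2

# raw model output, flat over (forecastable_index * output_chunk_length)
raw = np.arange(n_idx * ocl * n_comp * n_samples, dtype=float)

# first recover the per-index blocks, then stride the forecastable indexes
# and trim each block of output_chunk_length steps down to forecast_horizon
pred = raw.reshape(n_idx, ocl, n_comp, n_samples)
pred = pred[::stride, :forecast_horizon]
```

The trimming to forecast_horizon is the same reason the earlier thread drops trailing timestamps when forecast_horizon < output_chunk_length: the last block would otherwise reach too far.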

(review thread on darts/utils/optimised_historical_forecasts.py, resolved)
@dennisbader (Collaborator) left a comment

Very, very cool stuff @madtoinou 🚀

I really only had some minor suggestions :)

  • add some tests
  • add support for optimization with encoders (or at least make it non-optimizable if they use encoders)

We can

(review threads on darts/models/forecasting/forecasting_model.py, darts/utils/historical_forecasts/utils.py, and darts/utils/optimised_historical_forecasts.py, resolved)
)

# retrieve stored covariates, usually handled by RegressionModel.predict()
if past_covariates is None and self.past_covariate_series is not None:
@dennisbader (Collaborator) commented Jul 29, 2023

In the current implementation, models using encoders are not yet optimizable.
Do you think we could add support for this?

Edit: I'm thinking about adding something like a generate_fit_predict_encodings which would make this a bit easier. Maybe we could drop optimization support for encoders until then.

Edit 2: I added the generate_fit_predict_encodings in #1925. Would be cool to merge that one and then add the support for optimization with encodings here as well :)

Collaborator Author:

Good catch, I tend to forget about the encoders... Thank you for implementing this, I will adjust this PR as soon as the other one is merged.

@dennisbader (Collaborator) left a comment

🚀 Super nice and great job @madtoinou 👏
We're ready to merge now once the unit tests have passed (I adapted them slightly to reduce testing time).

@dennisbader dennisbader merged commit 44c730a into master Aug 1, 2023
9 checks passed
@dennisbader dennisbader deleted the refactor/hist_fc_regression branch August 1, 2023 12:35
Successfully merging this pull request may close these issues.

historical_forecast enhancement for RegressionModel