Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix: broken dataset links in transfer learning notebook #2067

Merged
merged 5 commits into from
Nov 16, 2023

Conversation

madtoinou
Copy link
Collaborator

Fixes #1948.

Summary

  • Replaced links to a private github repository with links to public websites
  • Pre-processing takes more time as the datasets are not prepared

Other informations

  • Had to rerun the whole notebook since the datasets are slightly different
  • Changed some observations/conclusions accordingly

Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@codecov-commenter
Copy link

codecov-commenter commented Nov 14, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (d206055) 93.82% compared to head (3655e32) 93.81%.

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2067      +/-   ##
==========================================
- Coverage   93.82%   93.81%   -0.01%     
==========================================
  Files         134      134              
  Lines       13091    13077      -14     
==========================================
- Hits        12282    12268      -14     
  Misses        809      809              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link

review-notebook-app bot commented Nov 15, 2023

View / edit / reply to this conversation on ReviewNB

dennisbader commented on 2023-11-15T09:30:48Z
----------------------------------------------------------------

The compute durations written for the different models have been obtained by running the notebook on a i9-10900K CPU, with an RTX 2080s GPU, with Python 3.9.7 and Darts 0.18.0.

Can you update this with the setup you used?


madtoinou commented on 2023-11-15T22:37:27Z
----------------------------------------------------------------

I updated the config, I could try to run the notebook on a more powerful config if you think that it would be more meaningful for users.

dennisbader commented on 2023-11-16T07:26:20Z
----------------------------------------------------------------

All good like this, thanks

Copy link

review-notebook-app bot commented Nov 15, 2023

View / edit / reply to this conversation on ReviewNB

dennisbader commented on 2023-11-15T09:30:49Z
----------------------------------------------------------------

you might want to cut out the cell output here to hide your machine paths :)


Copy link

review-notebook-app bot commented Nov 15, 2023

View / edit / reply to this conversation on ReviewNB

dennisbader commented on 2023-11-15T09:30:50Z
----------------------------------------------------------------

sMAPE of about 27.2.


Copy link

review-notebook-app bot commented Nov 15, 2023

View / edit / reply to this conversation on ReviewNB

dennisbader commented on 2023-11-15T09:30:51Z
----------------------------------------------------------------

The median is better for with the naive seasonal

->

Slightly better than the naive seasonal model.


Copy link

review-notebook-app bot commented Nov 15, 2023

View / edit / reply to this conversation on ReviewNB

dennisbader commented on 2023-11-15T09:30:51Z
----------------------------------------------------------------

with good forecasting accuracy and about 10x faster

->

with good forecasting accuracy and lower computational cost.

---

Not related to this PR but it's a bit worrying how ARIMA takes much longer than in the original version


madtoinou commented on 2023-11-15T22:38:52Z
----------------------------------------------------------------

I don't know if it's caused by the hardware or a change in the implementation since darts 0.18.0. Another difference is that the series are slightly longer on average with this version of the passengers per carrier.

Copy link

review-notebook-app bot commented Nov 15, 2023

View / edit / reply to this conversation on ReviewNB

dennisbader commented on 2023-11-15T09:30:52Z
----------------------------------------------------------------

Line #20.            # change this one to "gpu" if your notebook does run in a GPU environment:

you can use “auto” so it automatically uses gpu if available


madtoinou commented on 2023-11-15T22:41:02Z
----------------------------------------------------------------

I forced "cpu" because with the Apple Silicon macbook, the gpu (mps) acceleration does not support float64 and would require all the datasets to be converted to float32. Let me know what you think is the best.

dennisbader commented on 2023-11-16T07:28:31Z
----------------------------------------------------------------

Ah okay, than it's alright

Copy link

review-notebook-app bot commented Nov 15, 2023

View / edit / reply to this conversation on ReviewNB

dennisbader commented on 2023-11-15T09:30:53Z
----------------------------------------------------------------

tradeoff between accuracy and speed (about 85x faster than ARIMA for similar accuracy).

->

tradeoff between accuracy and speed.


Copy link

review-notebook-app bot commented Nov 15, 2023

View / edit / reply to this conversation on ReviewNB

dennisbader commented on 2023-11-15T09:30:54Z
----------------------------------------------------------------

Line #18.            # change this one to "gpu" if your notebook does run in a GPU environment:

same here with "auto"


Copy link

review-notebook-app bot commented Nov 15, 2023

View / edit / reply to this conversation on ReviewNB

dennisbader commented on 2023-11-15T09:30:55Z
----------------------------------------------------------------

The forecasting step with N-BEATS is ~350x faster than the fit-predict step we needed with ARIMA, and about 4x faster than the fit-predict step of linear regression.

->

The forecasting step with N-BEATS is more than 1000x faster than the fit-predict step we needed with ARIMA, and also faster than with linear regression.


Copy link

review-notebook-app bot commented Nov 15, 2023

View / edit / reply to this conversation on ReviewNB

dennisbader commented on 2023-11-15T09:30:56Z
----------------------------------------------------------------

The part about regression being slower can be removed now, as this was optimized


Copy link

review-notebook-app bot commented Nov 15, 2023

View / edit / reply to this conversation on ReviewNB

dennisbader commented on 2023-11-15T09:30:57Z
----------------------------------------------------------------

ARIMA performs best but is about 170x slower than N-BEATS...

now it's around 1k times slower

----

Note that two models out of the 3 most accurate (Exponential Smoothing and Kalman Filter) did not perform so well when used on the air passengers series

--->

Note also that Exponential Smoothing and Kalman Filter now perform much better than when we used them on the air passengers series.


Copy link
Collaborator

@dennisbader dennisbader left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, thanks a lot @madtoinou 🚀

Had some suggestions here and there mainly about updating the plot explanations with the new numbers

Copy link
Collaborator Author

I updated the config, I could try to run the notebook on a more powerful config if you think that it would be more meaningful for users.


View entire conversation on ReviewNB

Copy link
Collaborator Author

I don't know if it's caused by the hardware or a change in the implementation since darts 0.18.0. Another difference is that the series are slightly longer on average with this version of the passengers per carrier.


View entire conversation on ReviewNB

Copy link
Collaborator Author

I forced "cpu" because with the Apple Silicon macbook, the gpu (mps) acceleration does not support float64 and would require all the datasets to be converted to float32. Let me know what you think is the best.


View entire conversation on ReviewNB

Copy link
Collaborator

All good like this, thanks


View entire conversation on ReviewNB

Copy link
Collaborator

Ah okay, than it's alright


View entire conversation on ReviewNB

@dennisbader dennisbader merged commit a7f5d09 into master Nov 16, 2023
9 checks passed
@dennisbader dennisbader deleted the fix/broken_datasets branch November 16, 2023 07:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Released
Development

Successfully merging this pull request may close these issues.

[BUG] dataset link is broken
3 participants