Skip to content

Class Tune #1200

Merged
merged 18 commits into from
May 21, 2023
Merged

Class Tune #1200

merged 18 commits into from
May 21, 2023

Conversation

GooseIt
Copy link
Contributor

@GooseIt GooseIt commented Apr 2, 2023

Before submitting (must do checklist)

  • Did you read the contribution guide?
  • Did you update the docs? We use Numpy format for all the methods and classes.
  • Did you write any new necessary tests?
  • Did you update the CHANGELOG?

Proposed Changes

See #1024

Closing issues

closes #1024

@GooseIt
Copy link
Contributor Author

GooseIt commented Apr 2, 2023

Here is a list of details of my implementation that I think should be stated:

  1. Typing of BaseMixin.set_params() was changed. Now the method returns type of object, from which it was called (instead of always returning BaseMixin). In Python 3.11 this can be implemeted easily using Typing.Self. So when Etna updates Python version it operates on, I think this implementation should be changed as well.
  2. The implementation of BaseMixin.set_params() helper-function BaseMixin._update_nested_dict_with_flat_dict() altered to make it able to process .params_to_tune() output (original implementation was built to operate only with .to_dict() output). It is made possible by allowing list parsing during param_nesting-processing cycle, this option is currently only used when indexing "i"th element of transforms list during handling of "transforms.i.x" input.
  3. (!!!) Current implementation of Tune is base on assumption that params_to_tune yields transforms in format "transforms.i.x" instead of "transform.i.x". This decision of mine also affects test_can_handle_transforms in test/test_auto/test_tune.py. There are my reasons:
    • "transforms" is the name by which list of transforms is passed when constructing object from .to_dict() method output. I think there should be consistency around the code in dict elements naming, having it "transforms" in one place and "transform" in others could confuse contributors.
    • "transforms.i.x" notation is interpretable - it is understandable that there are multiple transforms, and i refers to number of concrete transform. On the other hand, it is unclear whether i in "transform.i.x" refers to indice of transform or to indice of something inside one transform object.
    • implementation-wise i think it is better to have the same processing pipeline for transforms coming from .to_dict() and from .params_to_tune(). I don't think that checking if coming arguments are "transform" or "transforms" and then directing them into one of two distinct pipelines is a good idea.

If you think that my decision (or any of the arguments) to prefer "transforms.i.x" notation is wrong, please explain why. I'm ready to re-write my code if you provide your vision on implementation of Tune based on keeping "transform.i.x" notation.

@codecov-commenter
Copy link

codecov-commenter commented Apr 2, 2023

Codecov Report

❗ No coverage uploaded for pull request base (automl-2.0@3c42bad). Click here to learn what that means.
The diff coverage is n/a.

❗ Your organization is not using the GitHub App Integration. As a result you may experience degraded service beginning May 15th. Please install the Github App Integration for your organization. Read more.

@@              Coverage Diff              @@
##             automl-2.0    #1200   +/-   ##
=============================================
  Coverage              ?   88.34%           
=============================================
  Files                 ?      186           
  Lines                 ?    11132           
  Branches              ?        0           
=============================================
  Hits                  ?     9835           
  Misses                ?     1297           
  Partials              ?        0           

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@GooseIt
Copy link
Contributor Author

GooseIt commented Apr 2, 2023

There is also 1 problem:
As you can see, I've declared dict_of_distrs right inside objective function.
I don't think that's good, as it creates dict_of_distrs every time objective is called, yet I don't know a better place to put dict_of_distrs declaration.
Please provide with suggestion where it should be put.

@Mr-Geekman
Copy link
Contributor

  1. Ok, it is also possible to import Self from typing_extensions, but our current mypy version doesn't support it, as I understand.
  2. I fixed this in this PR: Teach set_params to work with nested list and tuple #1201.
  3. I agree about transforms, but don't really see where it was said to use transform.

etna/auto/auto.py Outdated Show resolved Hide resolved
etna/auto/auto.py Outdated Show resolved Hide resolved
etna/auto/auto.py Outdated Show resolved Hide resolved
etna/core/mixins.py Outdated Show resolved Hide resolved
CHANGELOG.md Outdated
@@ -7,6 +7,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

## Unreleased
### Added
- Forecast decomposition for `SeasonalMovingAverageModel`([#1180](https://github.com/tinkoff-ai/etna/pull/1180))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll update automl-2.0 branch to hide this updates.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do I have to remove it, or will you alter it during merge into automl-2.0?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it will be better to remove changes not related to the task from this branch.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changelog shouldn't be like this. In actual changelog for automl-2.0 we have already added 2.0 release section. You should fix this.
May be the easiest way is to copy actual changelog and add only your changes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same for any other change for other classes that is unrelated to the task.

etna/auto/auto.py Show resolved Hide resolved
etna/auto/auto.py Outdated Show resolved Hide resolved
etna/auto/auto.py Outdated Show resolved Hide resolved
etna/auto/auto.py Outdated Show resolved Hide resolved
@Mr-Geekman
Copy link
Contributor

Some tests are also failing. Probably, they can be debugged locally.

@Mr-Geekman
Copy link
Contributor

Hi, @GooseIt! Are there any news about fixes of the PR?

@GooseIt
Copy link
Contributor Author

GooseIt commented Apr 6, 2023

I'll take my time fixing this PR on this weekend, and I'll report no later than Sunday 23:59.

tests/test_auto/test_tune.py Outdated Show resolved Hide resolved
tests/test_auto/test_tune.py Outdated Show resolved Hide resolved
tests/test_auto/test_tune.py Show resolved Hide resolved
tests/test_auto/test_tune.py Show resolved Hide resolved
tests/test_auto/test_tune.py Show resolved Hide resolved
tests/test_auto/test_tune.py Outdated Show resolved Hide resolved
@GooseIt
Copy link
Contributor Author

GooseIt commented Apr 9, 2023

I'm sorry, I've overesrimated my coding pacing. I'm looking into the issue, but it will take one or two more days to fix everything

@GooseIt
Copy link
Contributor Author

GooseIt commented Apr 11, 2023

I get the message below when running pytest tests/test_auto/test_auto.py inside poetry shell.
E RuntimeError: The runtime optuna version 2.10.1 is no longer compatible with the table schema (set up by optuna 3.1.0). Please try updating optuna to the latest version by $ pip install -U optuna.
The problem is, it is not exclusive to the branch I'm currently working in. I've checkout into master, and I there get the same error when running above mentioned command.
Do you get this error on your end as well?
If not, what do you think is the cause? Is there any problem with my virtual environment on your mind?

Copy link
Contributor

@Mr-Geekman Mr-Geekman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think logical part of the code looks good. Remaining tasks

  • Remove changes unrelated to the task.
  • Fix comments on tests.
  • Remove redundant 'print's.

CHANGELOG.md Outdated
@@ -7,6 +7,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

## Unreleased
### Added
- Forecast decomposition for `SeasonalMovingAverageModel`([#1180](https://github.com/tinkoff-ai/etna/pull/1180))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it will be better to remove changes not related to the task from this branch.

etna/auto/auto.py Outdated Show resolved Hide resolved
Copy link
Contributor

@Mr-Geekman Mr-Geekman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I missed the button.

@GooseIt
Copy link
Contributor Author

GooseIt commented Apr 12, 2023

@Mr-Geekman, could you please reply to my E RuntimeError: The runtime optuna version 2.10.1 is no longer compatible with the table schema (set up by optuna 3.1.0). Please try updating optuna to the latest version by $ pip install -U optuna. error message?

This manifestation makes it hard for me to debug tests locally, as all tests dependent on optuna.storages.RDBStorage throw errors

@Mr-Geekman
Copy link
Contributor

Have you seen somewhere in the tests the file with a name test.db? I have a suspicion that error is due to this file schema.

@GooseIt
Copy link
Contributor Author

GooseIt commented Apr 12, 2023

Yes, I believe that the file is relevant, all the tests fail on yield RDBStorage("sqlite:///test.db") from optuna_storage fixture

@Mr-Geekman
Copy link
Contributor

Have you tried to remove it? Or it isn't present on the file system?

@Mr-Geekman
Copy link
Contributor

I think this file is created somewhere in test_auto.py or somewhere around the fixture that creates it.

@Mr-Geekman
Copy link
Contributor

@GooseIt , how are you doing with the task?

@GooseIt
Copy link
Contributor Author

GooseIt commented Apr 20, 2023

@Mr-Geekman, can this task wait for two weeks?
I think that this time will be more than enough for me to resolve my current schedule and have nothing else on my hands.

@Mr-Geekman
Copy link
Contributor

We moved the task into the Hold status. Please, write here in the comments when you will be ready to continue.

@GooseIt
Copy link
Contributor Author

GooseIt commented May 15, 2023

It seems for me like everything except CHANGELOG.md is fixed.
Please review

@Mr-Geekman Mr-Geekman self-requested a review May 15, 2023 18:48
Copy link
Contributor

@Mr-Geekman Mr-Geekman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should remove changes, unrelated to the task. It can even be done manually by copy-pasting.

There are some issues with tests.

If you are to tired of this task we can finish on removing changes from other code and I'll continue from there.

etna/auto/auto.py Outdated Show resolved Hide resolved
CHANGELOG.md Outdated
@@ -7,6 +7,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

## Unreleased
### Added
- Forecast decomposition for `SeasonalMovingAverageModel`([#1180](https://github.com/tinkoff-ai/etna/pull/1180))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changelog shouldn't be like this. In actual changelog for automl-2.0 we have already added 2.0 release section. You should fix this.
May be the easiest way is to copy actual changelog and add only your changes.

CHANGELOG.md Outdated
@@ -7,6 +7,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

## Unreleased
### Added
- Forecast decomposition for `SeasonalMovingAverageModel`([#1180](https://github.com/tinkoff-ai/etna/pull/1180))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same for any other change for other classes that is unrelated to the task.

tests/test_auto/test_auto.py Show resolved Hide resolved
tests/test_auto/test_auto.py Show resolved Hide resolved
@GooseIt
Copy link
Contributor Author

GooseIt commented May 16, 2023

I'll muster the willpower and finish the task, but thank you for proposing resignation.
I'll report this evening.

@GooseIt
Copy link
Contributor Author

GooseIt commented May 16, 2023

This is not final version, I will remove prints and make lint.

@Mr-Geekman Mr-Geekman self-requested a review May 16, 2023 20:14
@GooseIt
Copy link
Contributor Author

GooseIt commented May 16, 2023

Last commit is ready to being reviewed

Copy link
Contributor

@Mr-Geekman Mr-Geekman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I gave one comment. You can make it or decide to finish and I'll do that after merge.


def __init__(
self,
pipeline: Pipeline,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can try to replace Pipeline here with AbstractPipeline. Because this class isn't specific for the Pipeline. Other methods should be also updated to work with AbstractPipeline.

CHANGELOG.md Outdated Show resolved Hide resolved
@GooseIt
Copy link
Contributor Author

GooseIt commented May 17, 2023

I'll do the replacement, will report tomorrow.

@GooseIt
Copy link
Contributor Author

GooseIt commented May 18, 2023

For hyperparameter suggestion logic in _objective to work, having set_params method is mandatory.

set_params is defined for class BaseMixin in etna/core/mixins.py. Nor AbstractPipeline nor BasePipeline support it.

Should I inherit one of these classes from BaseMixin, and if yes, which one?

@Mr-Geekman
Copy link
Contributor

As I can see, BasePipeline already inherits BaseMixin.

@GooseIt
Copy link
Contributor Author

GooseIt commented May 19, 2023

The PR is ready to be reviewed.

etna/auto/auto.py Outdated Show resolved Hide resolved
etna/auto/auto.py Outdated Show resolved Hide resolved
etna/auto/auto.py Outdated Show resolved Hide resolved
Copy link
Contributor

@Mr-Geekman Mr-Geekman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Look at comments above.

@GooseIt
Copy link
Contributor Author

GooseIt commented May 20, 2023

I've moved all classes in auto.py from Pipeline class to BasePipeline class.

I think that both Tune and Auto classes should be able to work with any user-defined pipeline structure, not only instances of Pipeline class.

@Mr-Geekman
Copy link
Contributor

I'm merging it.

@Mr-Geekman Mr-Geekman merged commit f4b50fe into tinkoff-ai:automl-2.0 May 21, 2023
@Mr-Geekman Mr-Geekman mentioned this pull request May 21, 2023
@GooseIt
Copy link
Contributor Author

GooseIt commented May 22, 2023

Thank you for bearing with me during this issue's completion.

I'm grateful for your assistance and reviewing.

I'll take a break from committing into Etna for a while. I don't know if or when I'll come back, but if I do, I will be happy to work with you once again.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants