Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix dependency installation for topic/machine-learning/automl #366

Merged
merged 4 commits into from
Mar 14, 2024

Conversation

hammerhead
Copy link
Member

Summary of the changes / Why this is an improvement

Google Colab currently fails running pip install. pycaret[analysis] pulls in pyyaml==5.3.1 (from 2020!), which no longer installs on Google Colab:

Collecting pyyaml==5.3.1 (from pycaret[analysis,models,parallel,test,tuner]==3.3.0->-r https://raw.githubusercontent.com/crate/cratedb-examples/main/topic/machine-learning/automl/requirements.txt (line 5))
  Using cached PyYAML-5.3.1.tar.gz (269 kB)
  error: subprocess-exited-with-error
  
  × python setup.py egg_info did not run successfully.
  │ exit code: 1
  ╰─> See above for output.
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
  Preparing metadata (setup.py) ... error
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

I couldn't find any explicit documentation from PyCaret on what the analysis extra does. On master, PyCaret already updated PyYAML to 5.4.1 (from 2021 🙄). Can anyone confirm if the extra is actually used?

There was also the tuner extra used which doesn't exist (any more) as well as a few direct dependencies I could not find any usages of.

Checklist

  • Link to issue this PR refers to (if applicable):
  • CLA is signed

Copy link
Member

@amotl amotl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for resolving an issue with dependencies we have not been aware of. We fully trust your assessment on this topic, as we also only added them on a best-effort basis.

While we can't trust CI fully, because it just runs the notebooks to completion, without actually validating them, we think it is at least a soft indicator that CI succeeds, that the change will not cause any other havoc. As it demonstrates the pieces also work in a non-Colab environment, we think the change will be good to go?

@hammerhead
Copy link
Member Author

Right, I didn't have the chance yet to run through both of the Jupyter Notebooks. That should still be done prior to merging this, ensuring it's not failing on any missing dependencies with this change.

@amotl
Copy link
Member

amotl commented Mar 14, 2024

We think it will be good, because CI usually tests all the notebooks within the corresponding folder. Any missing dependency would be expected to raise a runtime error there, so it should not go unnoticed. Coming from this assessment, we think it is safe to merge this patch.

However, sure enough, testing manually once again would decrease the chance of edge case failures, like »better safe than sorry«.

@hammerhead hammerhead merged commit 79a8788 into main Mar 14, 2024
1 check passed
@hammerhead hammerhead deleted the hammerhead/werkzeug branch March 14, 2024 17:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants