Fix dependency installation for `topic/machine-learning/automl` #366

hammerhead · 2024-03-13T19:23:34Z

Summary of the changes / Why this is an improvement

Google Colab currently fails running pip install. pycaret[analysis] pulls in pyyaml==5.3.1 (from 2020!), which no longer installs on Google Colab:

Collecting pyyaml==5.3.1 (from pycaret[analysis,models,parallel,test,tuner]==3.3.0->-r https://raw.githubusercontent.com/crate/cratedb-examples/main/topic/machine-learning/automl/requirements.txt (line 5))
  Using cached PyYAML-5.3.1.tar.gz (269 kB)
  error: subprocess-exited-with-error
  
  × python setup.py egg_info did not run successfully.
  │ exit code: 1
  ╰─> See above for output.
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
  Preparing metadata (setup.py) ... error
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

I couldn't find any explicit documentation from PyCaret on what the analysis extra does. On master, PyCaret already updated PyYAML to 5.4.1 (from 2021 🙄). Can anyone confirm if the extra is actually used?

There was also the tuner extra used which doesn't exist (any more) as well as a few direct dependencies I could not find any usages of.

Checklist

Link to issue this PR refers to (if applicable):
CLA is signed

amotl

Thank you for resolving an issue with dependencies we have not been aware of. We fully trust your assessment on this topic, as we also only added them on a best-effort basis.

While we can't trust CI fully, because it just runs the notebooks to completion, without actually validating them, we think it is at least a soft indicator that CI succeeds, that the change will not cause any other havoc. As it demonstrates the pieces also work in a non-Colab environment, we think the change will be good to go?

hammerhead · 2024-03-14T17:06:33Z

Right, I didn't have the chance yet to run through both of the Jupyter Notebooks. That should still be done prior to merging this, ensuring it's not failing on any missing dependencies with this change.

amotl · 2024-03-14T17:37:57Z

We think it will be good, because CI usually tests all the notebooks within the corresponding folder. Any missing dependency would be expected to raise a runtime error there, so it should not go unnoticed. Coming from this assessment, we think it is safe to merge this patch.

However, sure enough, testing manually once again would decrease the chance of edge case failures, like »better safe than sorry«.

hammerhead added 4 commits March 13, 2024 19:44

Remove unused werkzeug dependency

5db9245

Remove pycaret[tuner] dependency, it doesn't exist (any more)

aefa537

Remove unused pycaret[analysis] dependency

d86b066

Remove unused tqdm dependency

7070dba

amotl approved these changes Mar 14, 2024

View reviewed changes

hammerhead merged commit 79a8788 into main Mar 14, 2024
1 check passed

hammerhead deleted the hammerhead/werkzeug branch March 14, 2024 17:55

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix dependency installation for `topic/machine-learning/automl` #366

Fix dependency installation for `topic/machine-learning/automl` #366

hammerhead commented Mar 13, 2024

amotl left a comment •

edited

Loading

hammerhead commented Mar 14, 2024

amotl commented Mar 14, 2024 •

edited

Loading

Fix dependency installation for topic/machine-learning/automl #366

Fix dependency installation for topic/machine-learning/automl #366

Conversation

hammerhead commented Mar 13, 2024

Summary of the changes / Why this is an improvement

Checklist

amotl left a comment • edited Loading

Choose a reason for hiding this comment

hammerhead commented Mar 14, 2024

amotl commented Mar 14, 2024 • edited Loading

Fix dependency installation for `topic/machine-learning/automl` #366

Fix dependency installation for `topic/machine-learning/automl` #366

amotl left a comment •

edited

Loading

amotl commented Mar 14, 2024 •

edited

Loading