Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MNT: updates to HF Hub, manage fairlearn deprecation #392

Conversation

BenjaminBossan
Copy link
Collaborator

Some maintenance work to make skops future proof:

  1. Ignore FutureWarning stemming from fairlearn: New pandas versions give FutureWarning for applymap. We have to wait for fairlearn to update this.
  2. Make the widget attribute an array: Starting from HF Hub 0.18, it has to be an array.
  3. InferenceApi is deprecated move to InferenceClient: API changed a little bit, had to be adjusted.

Note: DON'T MERGE THIS YET. The min version of HF Hub has been increased to 0.18 to test the changes but it actually should be lowered to 0.17 once CI is green for 0.18.

New pandas versions give FutureWarning for applymap. We have to wait for
fairlearn to update this.
Starting from HF Hub 0.18, it has to be an array.
API changed a little bit, had to be adjusted.
This SHOULD NOT BE MERGED. Final min version will be 0.17 once the CI is
green.
@BenjaminBossan
Copy link
Collaborator Author

BenjaminBossan commented Oct 6, 2023

@adrinjalali This should be ready to review. CI is partly red, but that just seems to be due to rate limit. I'll restart the tests later.

Edit: Seems to be a daily quota, so I guess this will have to wait a bit longer.

Edit: No longer red. Ready for review. If accepted, we just need to lower the hf hub requirement and then it's ready to be merged.

Copy link
Member

@adrinjalali adrinjalali left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Otherwise LGTM.

)
client = InferenceClient(token=token)
res_bytes = client.post(json={"inputs": inputs}, model=repo_id)
res = json.loads(res_bytes.decode("utf-8"))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't we have a test for this?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I forgot that we require [CI inference] in the commit message to trigger it. This is done now but of course we run into the rate limit again...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we remove the functionality then since the rate limit is so low?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's mostly an issue for our CI, right? Which is probably why we made the test opt-in in the first place. I'd say we leave it, in case some users actually use the feature. If we want to remove it later on, we can do it properly with a deprecation cycle.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can also add a random sleep between 1 and 100 hours to avoid hitting the limit ;-P

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, not happy with a functionality that even our CI hits the limits pretty much every time. But we can merge and figure that out in a separate PR.

@@ -13,7 +13,7 @@
dependent_packages = {
"scikit-learn": ("0.24", "install", None),
"scikit-learn-intelex": ("2021.7.1", "docs", None),
"huggingface_hub": ("0.10.1", "install", None),
"huggingface_hub": ("0.18.0.rc0", "install", None),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this change before merge?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the actual min version is 0.17. I can change it now or we can first ensure that all tests pass for 0.18 too -- which may take some time ;-)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, let me know when they pass and then we can merge.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I'll let you know in a month or two.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it really that bad? So it's not something we can maintain. Let's fix the version merge here, then we deal with the deprecation

Copy link
Collaborator Author

@BenjaminBossan BenjaminBossan Oct 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, the error now says that the daily quota has been exceeded. I don't remember it being this tight, not sure if it was reduced. We may be able to avoid it by using a HF employee token, but then we have the old issue of hiding it while still allowing it to be used in PRs, right?

We could perhaps also restrict the inference test to only run in 1 or 2 settings, instead of the full matrix, to avoid the rate limit.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the end of the day, it's not a free service, and it doesn't match the vibe of this library anymore I would say. We added it because everybody could simply use it, but with the new rate limits, I don't see why we would keep supporting it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, the rate limit is pretty aggressive (still couldn't make all tests pass), so for practical purposes, it's not a free service anymore. I think we have a few options, I'll leave the decision up to you:

  1. Still merge this PR, as I'm pretty sure it works (after adjusting the min version of course). Then create a new PR to deprecate the feature.
  2. Don't merge this PR. Deprecate the feature.
  3. Remove without deprecation.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be happy to merge this PR, then I'll open another PR to deprecate the feature.

It was 0.18.0.rc0 for testing only.
@adrinjalali adrinjalali merged commit 933edbe into skops-dev:main Oct 11, 2023
8 of 14 checks passed
@jot-s-bindra
Copy link

Its not related to this commit
I just wanted your help in setting up environment in local machine

git clone git@github.com:jot-s-bindra/skops.git
git init
git checkout -b new-feature
conda create -c conda-forge -n skops python=3.10
conda activate skops
python -m pip install -e ".[tests,docs]"
conda install -c conda-forge pre-commit
pre-commit install
pytest

during pytest i am recieving variable declare errors , modules not installed errors like
ERROR examples/plot_california_housing.py - KeyError: 'HF_HUB_TOKEN'
ERROR examples/plot_hf_hub.py - KeyError: 'HF_HUB_TOKEN'
ERROR examples/plot_intelex.py - KeyError: 'HF_HUB_TOKEN'
ERROR scripts/clean_skops.py - OSError: pytest: reading from stdin while output is captured! Consider using -s.
ERROR spaces/skops_model_card_creator/app.py - ModuleNotFoundError: No module named 'streamlit'
ERROR spaces/skops_model_card_creator/create.py - ModuleNotFoundError: No module named 'streamlit'
ERROR spaces/skops_model_card_creator/edit.py - ModuleNotFoundError: No module named 'streamlit'
ERROR spaces/skops_model_card_creator/gethelp.py - ModuleNotFoundError: No module named 'streamlit'
ERROR spaces/skops_model_card_creator/start.py - ModuleNotFoundError: No module named 'streamlit'
ERROR spaces/skops_model_card_creator/tasks.py - ModuleNotFoundError: No module named 'streamlit'

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants