-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Eliminate short-circuiting for loading from local #3600
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would actually be good for us to add a test for this in test_llm.py
, which downloads weights to local, then uses that path as the base_model name, and then just makes sure the there are no errors when we do ModelConfig.from_dict
. Equally, I wouldn't be opposed to the test instead just training for 3-4 steps. I'd be okay with both, but a test will be good
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you rebase, you should be able to get a clean CI
if os.path.isdir(model_name): | ||
return model_name |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, wondering if we should add some more checks here, the most basic one being that the directory should not be empty. In an ideal world, we also add validation to ensure that the model config exists in this directly and it can be initialized correctly from this directory using the same code block from line 57, what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do think we can do this in a fast-follow, but I do think for completeness these additional checks are important. What do you think? @Infernaught @justinxzhao
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool. Justin and I expect HF to give us a failure message if the model objects are bad, so it's probably not that bad if we don't have these checks for the time being. I'm going to merge this for now, but I definitely think we should keep thinking about how to properly verify this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay with the merge for now as well, but I do think we should follow-up with custom validation here and raise clear resolution methods for what went wrong and how to fix it.
I do think we should add these checks now because they're honestly so trivial to add and we want to fail fast, always. There's nothing wrong with leaving it up to HF, but I guarantee that we are going to get Ludwig users messaging us asking us why they're seeing cryptic import error or not found errors from HF without them when there is a user-side error. The reason I personally really like these kinds of validation checks is that it lets us get ahead and provide really clear and crisp resolution methods to unblock/self-serve yourself, and that is what I would lean towards.
Previously, we would raise an error if a local path was passed base_model in our config. This PR adds a check that verifies if the local path is valid before checking if it is a valid HF path.