Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dry-run + ability to specify models + minor edits #39

Merged
merged 4 commits into from
Sep 18, 2021

Conversation

yash-fn
Copy link
Contributor

@yash-fn yash-fn commented Sep 15, 2021

I changed some of things @shasheene changed back to the way I had it originally just to ensure they weren't unintentional omissions. If they were intentional then just drop those commits. I explain below and in commit messages.

So 3 commits:

First with the minor edits:

  1. Call vs Run?

    • On my original PR you mentioned that we should use subprocess.run instead of subprocess.call because the latter was older api. But then I saw that you had it changed back to the call variant. Just wanted to ensure if this was intentional.
    • Also, I saw that you switched from using os.path functions to just directly parsing the string yourself using split('/'). I favor using the os.path api so that we can just leverage others work to ensure the paths are parsed correctly across platforms and etc. We can just trust it will always work versus having to parse it ourselves and relying that we correctly implemented the splitting part ourselves. But your call on which you think is best.
  2. Flatter nests

    • I saw that you changed main.py's inference loop to go back to checking for relevant files and running inference all in one loop. I switched it back to the way I had with multiple loops because I think it just looks clearer with it being flatter rather than nested too much. Please read the commit message for more details.
    • Also, if it was switched because the changes didn't work, I actually discovered that I had made an error on the first version of it. I was removing elements from an array while also actively reading it, which caused it to skip elements and otherwise not work correctly. I switched to appending to a new array and then deleting the old one instead, so it should work correctly now.

And the slightly larger edit:

  1. Dry-run and models
    • Instead of rearranging code as we did before to instantiate the models before checking for argument validity to get the useful side effect of being able instantiate cuda cache for reusable docker images, I just formalized it into --dry-run option to achieve the same thing, but now regular users with incorrect arguments don't have to wait for instantiation just to learn their arguments were incorrect. Best of both worlds.
    • Also, now users can specify which pbmm and scorer models they wish to use if there are multiple. It defaults back to just searching the local directory for one if one isn't specified. Also, if more than one is found or zero are found, it will throw the appropriate errors to notify the user.

@shasheene
Copy link
Contributor

Looks good to me.

Yeah I kept subprocess.call() so that the earlier commit did one thing only: provide the command as a list to fix only the path error issue. I wasn't sure whether the change to the subprocess.run() high level API could cause any negative side effects that I hadn't considered. Like the semantics of stdout/stderr. Switching to subprocess.run() is fine.

Yep, using basename() over split() was a change I intending to keep. It's very easy to fix things like that on top (like you have) now that everything has been merged, so thanks for making that change.

The filter stuff LGTM. As does the --dry-run option.

I am fine with this PR being merged.

The only thing for future reference is the commit messages are definitely not to my taste. But I'm not the owner of this git repository, and all other commits don't follow Chris Beam's great post linked earlier. (I highly recommend reading that blog post though!)

@yash-fn
Copy link
Contributor Author

yash-fn commented Sep 16, 2021

Sorry. New to git in general, but good point regarding verbosity. Reworded commit messages to be concise.

@abhirooptalasila
Copy link
Owner

Ok, looks good guys. Thanks!
And yes, I'll give that blog a read.

@abhirooptalasila abhirooptalasila merged commit 33fe2bc into abhirooptalasila:master Sep 18, 2021
@shasheene
Copy link
Contributor

And yes, I'll give that blog a read.

Your one-line commit messages are already pretty good for high-velocity development. I don't necessarily recommend writing a detailed commit message body unless the change warrants it.

But yes, the blog post is still worth reading.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants