Dry-run + ability to specify models + minor edits #39

yash-fn · 2021-09-15T15:21:11Z

I changed some of things @shasheene changed back to the way I had it originally just to ensure they weren't unintentional omissions. If they were intentional then just drop those commits. I explain below and in commit messages.

So 3 commits:

First with the minor edits:

Call vs Run?
- On my original PR you mentioned that we should use subprocess.run instead of subprocess.call because the latter was older api. But then I saw that you had it changed back to the call variant. Just wanted to ensure if this was intentional.
- Also, I saw that you switched from using os.path functions to just directly parsing the string yourself using split('/'). I favor using the os.path api so that we can just leverage others work to ensure the paths are parsed correctly across platforms and etc. We can just trust it will always work versus having to parse it ourselves and relying that we correctly implemented the splitting part ourselves. But your call on which you think is best.
Flatter nests
- I saw that you changed main.py's inference loop to go back to checking for relevant files and running inference all in one loop. I switched it back to the way I had with multiple loops because I think it just looks clearer with it being flatter rather than nested too much. Please read the commit message for more details.
- Also, if it was switched because the changes didn't work, I actually discovered that I had made an error on the first version of it. I was removing elements from an array while also actively reading it, which caused it to skip elements and otherwise not work correctly. I switched to appending to a new array and then deleting the old one instead, so it should work correctly now.

And the slightly larger edit:

Dry-run and models
- Instead of rearranging code as we did before to instantiate the models before checking for argument validity to get the useful side effect of being able instantiate cuda cache for reusable docker images, I just formalized it into --dry-run option to achieve the same thing, but now regular users with incorrect arguments don't have to wait for instantiation just to learn their arguments were incorrect. Best of both worlds.
- Also, now users can specify which pbmm and scorer models they wish to use if there are multiple. It defaults back to just searching the local directory for one if one isn't specified. Also, if more than one is found or zero are found, it will throw the appropriate errors to notify the user.

shasheene · 2021-09-16T13:06:00Z

Looks good to me.

Yeah I kept subprocess.call() so that the earlier commit did one thing only: provide the command as a list to fix only the path error issue. I wasn't sure whether the change to the subprocess.run() high level API could cause any negative side effects that I hadn't considered. Like the semantics of stdout/stderr. Switching to subprocess.run() is fine.

Yep, using basename() over split() was a change I intending to keep. It's very easy to fix things like that on top (like you have) now that everything has been merged, so thanks for making that change.

The filter stuff LGTM. As does the --dry-run option.

I am fine with this PR being merged.

The only thing for future reference is the commit messages are definitely not to my taste. But I'm not the owner of this git repository, and all other commits don't follow Chris Beam's great post linked earlier. (I highly recommend reading that blog post though!)

yash-fn · 2021-09-16T19:39:02Z

Sorry. New to git in general, but good point regarding verbosity. Reworded commit messages to be concise.

abhirooptalasila · 2021-09-18T17:55:50Z

Ok, looks good guys. Thanks!
And yes, I'll give that blog a read.

shasheene · 2021-09-19T01:13:50Z

And yes, I'll give that blog a read.

Your one-line commit messages are already pretty good for high-velocity development. I don't necessarily recommend writing a detailed commit message body unless the change warrants it.

But yes, the blog post is still worth reading.

yash-fn added 4 commits September 16, 2021 14:33

subprocess.run + basename path splitting

c9da745

Code Reorganization: Flatter nests

cbbfd2b

Dry-run/instantiate mode + ability to specify training models

9612973

README + GPU build script changes for dry-run feature

ab0a71d

yash-fn force-pushed the master branch from 2d6e50e to ab0a71d Compare September 16, 2021 19:37

abhirooptalasila merged commit 33fe2bc into abhirooptalasila:master Sep 18, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Dry-run + ability to specify models + minor edits #39

Dry-run + ability to specify models + minor edits #39

yash-fn commented Sep 15, 2021

shasheene commented Sep 16, 2021

yash-fn commented Sep 16, 2021

abhirooptalasila commented Sep 18, 2021

shasheene commented Sep 19, 2021

Dry-run + ability to specify models + minor edits #39

Dry-run + ability to specify models + minor edits #39

Conversation

yash-fn commented Sep 15, 2021

shasheene commented Sep 16, 2021

yash-fn commented Sep 16, 2021

abhirooptalasila commented Sep 18, 2021

shasheene commented Sep 19, 2021