add 1st pytorch example #1180

Open · wants to merge 7 commits into main from add_pytorch_DLRM_example
Conversation

@radekosmulski (Contributor) commented Jul 5, 2023

This adds the first PyTorch example 🥳

Additionally, I propose that we rename dim to embedding_dim in DLRMModel. This aligns us with the TF API and, probably more importantly, is more informative to the reader: embedding_dim is the term used in the paper, whereas dim alone makes it a bit confusing what the dimension refers to -- is it the output of the model? Of the MLP hidden layers?
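
For illustration, a hypothetical before/after of the constructor call. The exact merlin.models.torch signature may differ; schema, the MLPBlock sizes, and the other argument names are placeholders rather than the notebook's code:

```python
import merlin.models.torch as mm

# Current (hypothetical) call: `dim` does not say which dimension it controls.
model = mm.DLRMModel(schema, dim=64, bottom_block=mm.MLPBlock([128, 64]))

# Proposed rename: `embedding_dim` mirrors the TF API and the DLRM paper's terminology.
model = mm.DLRMModel(schema, embedding_dim=64, bottom_block=mm.MLPBlock([128, 64]))
```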

@github-actions bot commented Jul 5, 2023

Documentation preview

https://nvidia-merlin.github.io/models/review/pr-1180

@radekosmulski marked this pull request as draft July 5, 2023 03:25
@radekosmulski marked this pull request as ready for review July 5, 2023 05:12
@review-notebook-app bot: Check out this pull request on ReviewNB to see visual diffs and provide feedback on Jupyter Notebooks.

@@ -0,0 +1,364 @@
{
@bschifferer (Contributor) commented Jul 5, 2023

Why do we propose with Loader(train, batch_size=1024) as loader:, which is different from our TensorFlow examples?



Contributor reply:

This is something @edknv suggested; it ensures that the background thread gets removed. I am working on a way to see if we can move this inside our model/trainer code, because I think the context manager approach works for single GPU, but I don't think it will work in a multi-GPU setting.
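
For reference, a minimal sketch of the pattern under discussion, assuming merlin.dataloader.torch.Loader and a merlin.io.Dataset; the path, the loop body, and the batch handling are placeholders rather than the notebook's exact code:

```python
from merlin.dataloader.torch import Loader
from merlin.io import Dataset

train = Dataset("train/*.parquet")  # hypothetical path

# Using the loader as a context manager shuts down its background
# data-fetching thread when the block exits.
with Loader(train, batch_size=1024) as loader:
    for batch in loader:
        ...  # training step on the batch
```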

@@ -0,0 +1,364 @@
{
@bschifferer (Contributor) commented Jul 5, 2023

I am not sure if we have the next notebook available in PyTorch -- we might need to reference the TensorFlow one, OR remove the next steps, OR link to the other training examples.



@radekosmulski (Author) replied:

Good point! Removed the cell for now and will add it once we have more examples.

"id": "23d9bf34",
"metadata": {},
"source": [
"<img src=\"https://developer.download.nvidia.com/notebooks/dlsw-notebooks/merlin_models_01-getting-started/nvidia_logo.png\" style=\"width: 90px; float: right;\">\n",
Contributor comment:

we might need to update the logo?

@radekosmulski (Author) replied:

Yes, absolutely! Good point, created a new tracking logo.

@@ -0,0 +1,348 @@
{
@rnyak (Contributor) commented Jul 5, 2023

Line #4.    model.initialize(train_loader)

Can we add some explanation of why we need the model.initialize() step?



@radekosmulski (Author) replied:

Added a note on the functionality of initialize.
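
For context, a rough sketch of where the step quoted above might sit. What initialize() actually does is an assumption here (a one-batch dry run to build data-dependent layers), and the surrounding code is illustrative rather than the notebook's:

```python
from merlin.dataloader.torch import Loader

# Continuing the hypothetical sketch above: `train` is a merlin.io.Dataset and
# `model` a merlin.models.torch model; neither is the notebook's exact code.
with Loader(train, batch_size=1024) as train_loader:
    # Assumed purpose of initialize(): run a single batch through the model so
    # that layers whose shapes depend on the data schema are built before training.
    model.initialize(train_loader)
    # ... the training loop would follow here
```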

@@ -0,0 +1,348 @@
{
@rnyak (Contributor) commented Jul 5, 2023

Do we need this entire block again?



@radekosmulski (Author) replied:

I am not sure -- I was just copying over what we have on the TF side; we follow the same pattern there.

@radekosmulski force-pushed the add_pytorch_DLRM_example branch from 7dfa115 to 7368970 July 6, 2023 08:30
@radekosmulski requested a review from rnyak July 6, 2023 08:35