Update example notebooks to create schema object from merlin core #650

Merged (6 commits) on Mar 17, 2023


@rnyak rnyak commented Mar 15, 2023

This PR updates the schema-creation line in the example notebooks, following the merged PR #642.

Blocked by an error in the multi-GPU training example: #651

@rnyak rnyak added examples chore Maintenance for the repository labels Mar 15, 2023
@rnyak rnyak added this to the Merlin 23.03 milestone Mar 15, 2023
@@ -69,7 +69,18 @@
"execution_count": 2,
@bschifferer bschifferer Mar 16, 2023

I think we run the fit and transform twice: first above, where we call .fit and .transform and then .compute,

and again here, where we call workflow.fit_transform().to_parquet.

So we do the calculation twice.

I think we should change the process above to a single fit_transform().to_parquet call and then read the data with cudf.read_parquet() here.
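
To illustrate the suggested pattern (compute once, persist, then read the saved result back), here is a minimal sketch that uses only the Python standard library. `CountingWorkflow`, `write_rows`, and `read_rows` are hypothetical stand-ins, not NVTabular or cuDF APIs; in the notebook they would roughly correspond to `workflow.fit_transform(...).to_parquet(...)` followed by `cudf.read_parquet(...)`, and CSV is used here only because it needs no extra dependencies.

```python
import csv
import tempfile
from pathlib import Path


class CountingWorkflow:
    """Hypothetical stand-in for a fit/transform workflow; counts passes over the data."""

    def __init__(self):
        self.passes = 0

    def fit_transform(self, rows):
        # One combined pass: "fit" builds a vocabulary, "transform" encodes with it.
        self.passes += 1
        vocab = {v: i for i, v in enumerate(sorted({r["item"] for r in rows}))}
        return [{"item": vocab[r["item"]]} for r in rows]


def write_rows(rows, path):
    # Stand-in for persisting the transformed dataset (Parquet in the notebook).
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["item"])
        writer.writeheader()
        writer.writerows(rows)


def read_rows(path):
    # Stand-in for reading the persisted result instead of recomputing it.
    with open(path, newline="") as f:
        return [{"item": int(r["item"])} for r in csv.DictReader(f)]


raw = [{"item": "b"}, {"item": "a"}, {"item": "b"}]
workflow = CountingWorkflow()

with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp) / "processed.csv"
    write_rows(workflow.fit_transform(raw), out)  # compute and persist once
    processed = read_rows(out)                    # later cells read from disk
```

The point of the sketch is that any later cell that needs the processed data reads it from disk, so the workflow makes a single pass over the data.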



Contributor Author

Sure, I can change that.

@bschifferer bschifferer left a comment

I approve, provided we change the NVTabular workflow to fit/transform only once instead of twice.

@rnyak rnyak changed the title [WIP] Update example notebooks to create schema object from merlin core Update example notebooks to create schema object from merlin core Mar 16, 2023
@bschifferer bschifferer merged commit a293d8c into main Mar 17, 2023