Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add unit tests for SOKEmbedding #980

Merged

Conversation

edknv
Copy link
Contributor

@edknv edknv commented Feb 13, 2023

This PR adds unit tests for SOKEmbedding, proposed in #863. Note: this PR is for the fea-sok-integration-wj branch, not the main branch.

@edknv edknv added the chore Maintenance for the repository label Feb 13, 2023
@edknv edknv self-assigned this Feb 13, 2023
tox.ini Outdated
@@ -33,6 +33,7 @@ commands =
conda env create --prefix {envdir}/env --file requirements/horovod-cpu-environment.yml --force
{envdir}/env/bin/python -m pip install horovod --no-cache-dir
{envdir}/env/bin/horovodrun --check-build
{envdir}/env/bin/python -m pip install merlin-sok
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is merlin-sok the correct package to install?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, let me revise the code.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@edknv I just realize the branch is on your repo, So I cannot modify it. the correct way to install sok is pip install sparse_operation_kit

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed in f4fef6d.

weights = np.random.rand(10, 16)
embedding = SOKEmbedding.from_pretrained(16, [weights])
inputs = [tf.ragged.constant([[0, 1, 0], [1, 0]])]
combiners = ["sum"]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would also make sense to have a sensible default for combiners and make this an optional parameter.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@edknv Make sense. I will modify it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. We can merge this PR to the main PR and apply the changes there.

print("[SOK INFO] diff:", diff)
assert diff < 1e-6
def test_sok_embedding_basic(self):
embedding = SOKEmbedding(16, self.sample_column_schema, vocab_sizes=[10])
Copy link
Contributor Author

@edknv edknv Feb 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can vocab_sizes be determined from the input schema?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can get from the input schema. But I don't know the right way to parse it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can also skip it for now in this version, and iterate on it later in the next version.

@rnyak rnyak added this to the Merlin 23.02 milestone Feb 15, 2023
@edknv
Copy link
Contributor Author

edknv commented Feb 15, 2023

@WonderingWJ I'll go ahead and merge this PR onto your PR since you can't make changes to this one.

@edknv edknv merged commit d7531f6 into NVIDIA-Merlin:fea-sok-integration-wj Feb 15, 2023
@WonderingWJ
Copy link
Contributor

@edknv Thank you!

edknv added a commit that referenced this pull request Mar 21, 2023
* new sok class

* new sok class

* test sok dynamic variable

* test sok dynamic variable

* bug fix comma

* add some comments and test distributed var

* format the comments

* assert condition in sok lookup sparse

* Move SOKEmbedding to a separate file

* Clean up

* Clean up

* fix some import and param bug

* remove some unused variable

* remove intial vals

* fix import

* reorder the import

* fix import error in test embedding

* format the code

* change the way of import

* support sp_weights in lookup

* Add unit tests for SOKEmbedding (#980)

* Add unit tests for SOKEmbedding

* pip install sparse_operation_kit

* remove sok from ci since no gpu

* lint

* pip install sparse_operation_kit in tox.ini

* fix spelling

* resolve init method issue

* fix schema issue

* add indices and weights in from_pretrained

* schema issue in from_pretrained

* init method in DET

* tensor type in sok assing

* resolve config passing

* lint

* install sok in tox

---------

Co-authored-by: zhuwenjing <zhuwenjing@360.cn>
Co-authored-by: wenjingz <wenjingz@nvidia.com>
Co-authored-by: edknv <edwardk@nvidia.com>
Co-authored-by: Marc Romeyn <marcromeyn@gmail.com>
Co-authored-by: edknv <109497216+edknv@users.noreply.github.com>
Co-authored-by: rnyak <ronayak@hotmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/tests chore Maintenance for the repository
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants