Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added integration test for four model types #89

Merged
merged 16 commits into from
Jan 26, 2024
Merged

Conversation

x-tabdeveloping
Copy link
Collaborator

Integration test for:

  • SBERT
  • E5
  • Translate -> E5
  • FastText

@x-tabdeveloping
Copy link
Collaborator Author

Addresses #84

@x-tabdeveloping x-tabdeveloping linked an issue Jan 23, 2024 that may be closed by this pull request
@x-tabdeveloping
Copy link
Collaborator Author

Ah crap I forgot DKHate was gated

Copy link
Owner

@KennethEnevoldsen KennethEnevoldsen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One change otherwise looks good

tests/cli/test_cli.py Outdated Show resolved Hide resolved
@x-tabdeveloping
Copy link
Collaborator Author

So problem: It simply takes too long to download all models and then run them, even if we only take the four that are currently in the test. The Fasttext model is especially heavyweight, which is a bummer because I just figured out that it had a bug. The translate E5 also has a very hefty translation model, so I'm frankly not sure how we could pull this off, especially on Github. I will try to think of a smart solution. Any thoughts? @KennethEnevoldsen

@x-tabdeveloping
Copy link
Collaborator Author

I think for one, we could drop all-MiniLM-L6-v2, and the regular E5, since translate-e5 depends on the same functionality, so we technically hit three birds with one stone. (though if a pull request changes these dependencies then this would no longer be the case)

@x-tabdeveloping
Copy link
Collaborator Author

With FastText I'm at a bit of a loss. I think maybe we could add the wiki models to the test as they have a smaller vocabulary??

@KennethEnevoldsen
Copy link
Owner

KennethEnevoldsen commented Jan 24, 2024

I would exclude the translation model and potentially remove the fasttext (though ideally download a smaller one). One thing you can also do is have it reuse the cache dir of the models (but that is mostly to speed things up).

@x-tabdeveloping
Copy link
Collaborator Author

I believe it should run in a reasonable timeframe now.

@x-tabdeveloping
Copy link
Collaborator Author

Hmm very weird behaviour. Pybind is clearly installed and the fasttext install still doesn't recognize it, and I can't reproduce this on my own machine, something fishy going on.

@x-tabdeveloping
Copy link
Collaborator Author

Maybe adding an actual task (not just dummy) with all-MiniLM could be a good idea, but I'm not entirely sure cause right now it tests most types of models and runs relatively fast, what do you think?

@KennethEnevoldsen
Copy link
Owner

Maybe adding an actual task (not just dummy) with all-MiniLM could be a good idea, but I'm not entirely sure cause right now it tests most types of models and runs relatively fast, what do you think?

For the integration test it would be nice if it is an MTEB task.
However, one option is to mark the test as "slow" and only run it in the GitHub tests suite (so you don't have to run it locally). I don't think one task is too bad though. Especially if we pick LCC.

@x-tabdeveloping
Copy link
Collaborator Author

Okay, how about we keep the dummy task for the heftier models (so that we can test whether they run at all). And then add LCC for a lighter model (like MiniLM)?

@KennethEnevoldsen
Copy link
Owner

Add a comment for it.

Hmm very weird behaviour. Pybind is clearly installed and the fasttext install still doesn't recognize it, and I can't reproduce this on my own machine, something fishy going on.

If this is still a problem I would remove fasttext from the integration test for now and then get this merged in and create an issue.

@x-tabdeveloping
Copy link
Collaborator Author

It does work now fortunately, I fixed it by using the fasttext-wheel package from PyPI instead, which uses the old ways to install the thingy. It's a bit of a nasty workaround though.

@KennethEnevoldsen
Copy link
Owner

KennethEnevoldsen commented Jan 24, 2024

It does work now fortunately, I fixed it by using the fasttext-wheel package from PyPI instead, which uses the old ways to install the thingy. It's a bit of a nasty workaround though.

as long as it is an optional dependency I don't have too many issues.

@x-tabdeveloping
Copy link
Collaborator Author

@KennethEnevoldsen is this fine with you as it stands or do you want me to change something?

Copy link
Owner

@KennethEnevoldsen KennethEnevoldsen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

only a few minor things, but otherwise feel free to merge

makefile Outdated Show resolved Hide resolved
tests/test_integration.py Outdated Show resolved Hide resolved
tests/test_integration.py Outdated Show resolved Hide resolved
@x-tabdeveloping
Copy link
Collaborator Author

@KennethEnevoldsen tests are failing because on an assertion. Do you have an explanation?

@KennethEnevoldsen
Copy link
Owner

KennethEnevoldsen commented Jan 26, 2024

It is because of the score not being correct:

is_approximately_equal(0.40919975266183484, 0.423)

It is due to an incorrect merge I believe. I will mark the sections

@x-tabdeveloping x-tabdeveloping merged commit f427a6d into main Jan 26, 2024
4 of 6 checks passed
@x-tabdeveloping x-tabdeveloping deleted the stuff_runs_tests branch January 26, 2024 12:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Integration tests for all model types.
2 participants