[BUG]: cupy, and by extension cudf changes the system's preferred encoding to ANSI_X3.4-1968 #859
Closed
2 tasks done
Labels
bug
Comments
This is a known issue with NVRTC (see the comment on cupy/cupy#7514). Currently there are only two known work-arounds:
rapids-bot (bot) pushed a commit that referenced this issue on Apr 28, 2023:
This PR creates at least one test for each example containing custom stages. It currently covers only those examples which do not require additional packages. Part of #849.

* Moves the BERT vocabulary files to the `morpheus/data` dir, so they no longer need to be fetched from LFS and are available to unit tests.
* Fixes type hints and removes a redundant method in `examples/log_parsing/inference.py`.
* Removes redundant copies of the `bert-base-cased-hash.txt` and `bert-base-uncased-hash.txt` files, replacing them with symlinks to the files in the `morpheus/data` dir; fixes #850.
* Explicitly sets `encoding='UTF-8'` in `examples/log_parsing/postprocessing.py` as a work-around for issue #859.
* Adds `py::kw_only` to the Python bindings for `TensorMemory` and its subclasses to ensure parity with the Python impls.
* Sets `repr=False` for the `tensors` field of `TensorMemory`, which avoids a bug when printing caused by the value being assigned to `self._tensors`.
* Seeds cupy's random number generator in the `manual_seed` method.
* Fixes usage of the `reload_modules` fixture: requesting a reload of multiple modules should be done with `@pytest.mark.reload_modules([mod1, mod2])`, not by calling `reload_modules` twice.
* New test data in `tests/tests_data/log_parsing` is based on the first 5 rows of data from `models/datasets/validation-data/log-parsing-validation-data-input.csv`.

Authors:
- David Gardner (https://github.com/dagardner-nv)

Approvers:
- Michael Demoret (https://github.com/mdemoret-nv)

URL: #885
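The explicit-encoding work-around mentioned in the commit message can be sketched roughly as follows. This is only an illustration, not the code from `examples/log_parsing/postprocessing.py`; the helper names `read_text_utf8` and `write_text_utf8` are hypothetical. It assumes the files involved are known to be UTF-8.

```python
import tempfile
from pathlib import Path


def write_text_utf8(path: Path, text: str) -> None:
    """Hypothetical helper: write text as UTF-8 regardless of the locale."""
    with open(path, "w", encoding="UTF-8") as fh:
        fh.write(text)


def read_text_utf8(path: Path) -> str:
    """Hypothetical helper: read text as UTF-8 regardless of the locale."""
    # Passing encoding= explicitly means open() does not fall back to the
    # process-wide preferred encoding, so a changed locale cannot affect it.
    with open(path, encoding="UTF-8") as fh:
        return fh.read()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        sample = Path(tmp) / "sample.txt"
        text = "caf\u00e9"  # non-ASCII sample, written with an escape
        write_text_utf8(sample, text)
        assert read_text_utf8(sample) == text
        print("round trip ok")
```

Because the encoding is stated at every call site, the result should not depend on whether a cupy or cudf call has already run in the same process.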
Version
23.07
Which installation method(s) does this occur on?
Docker, Conda, Source
Describe the bug.
Calling various cupy and cudf methods changes the system's preferred encoding from UTF-8 to ANSI_X3.4-1968.
cupy/cupy#7514
rapidsai/cudf#13085
The problem is that any code called afterwards that reads a UTF-8 data source without explicitly setting the encoding will fail.
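The failure mode described above can be illustrated without cupy or cudf. The sketch below is not the reporter's reproducer; it only shows, using the standard library, that Python treats `ANSI_X3.4-1968` as ASCII and that UTF-8 bytes containing non-ASCII characters do not decode under it. The helper name `decodes_under` is hypothetical.

```python
import codecs
import locale

# Non-ASCII sample encoded as UTF-8 (the escape keeps this source ASCII-only).
SAMPLE = "caf\u00e9".encode("utf-8")


def decodes_under(encoding: str, payload: bytes) -> bool:
    """Hypothetical helper: True if payload decodes cleanly with the codec."""
    try:
        payload.decode(encoding)
    except UnicodeDecodeError:
        return False
    return True


# Locale-dependent default that text-mode open() typically uses when
# encoding= is omitted. Per the report, this is the value that changes.
print("preferred encoding:", locale.getpreferredencoding(False))

# ANSI_X3.4-1968 resolves to Python's ASCII codec.
print("codec:", codecs.lookup("ANSI_X3.4-1968").name)

print("decodes as utf-8:", decodes_under("utf-8", SAMPLE))
print("decodes as ANSI_X3.4-1968:", decodes_under("ANSI_X3.4-1968", SAMPLE))
```

Printing `locale.getpreferredencoding(False)` before and after the suspect calls is one way to check whether a given environment is affected.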
Minimum reproducible example
Relevant log output
Full env printout
No response
Other/Misc.
No response
Code of Conduct