tests fail when under non UTF8 LANG setting #1199

Closed
cmacdonald opened this issue May 15, 2020 · 2 comments

Comments

@cmacdonald

Failed tests:   testNonEnglishTopics(io.anserini.search.topicreader.TopicReaderTest): expected:<[?????????????????]> but was:<[《千里走单骑》和张艺谋是什么关系?]>
  testNonEnglishTopics_TopicIdsAsStrings(io.anserini.search.topicreader.TopicReaderTest): expected:<[?????????????????]> but was:<[《千里走单骑》和张艺谋是什么关系?]>

My LANG is en_GB.
With LANG set to en_GB.UTF-8, the tests pass.

@lintool
Member

lintool commented May 15, 2020

Ah, good call. We had related issues before, thought I whacked all the moles. Will go back to whacking more... The crux is UTF-8, since tests have non-Roman script.
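
As a rough illustration of why a non-UTF-8 setting can produce the row of `?` seen in the failing assertions: when text is encoded with a charset that cannot represent its characters, Java substitutes `?` for each one. The sketch below is not taken from the Anserini code base; the class name `CharsetRoundTripDemo` and the helper `roundTrip` are hypothetical, and the sample string is the first five characters of the topic title (千里走单骑) written as Unicode escapes so that the source file itself does not depend on the compiler's encoding.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Anserini.
class CharsetRoundTripDemo {

    // Encode the text to bytes and decode it again with the same charset.
    // Characters the charset cannot represent are replaced by '?' on encoding.
    static String roundTrip(String text, Charset charset) {
        byte[] bytes = text.getBytes(charset);
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // Five CJK characters, written as escapes to keep this file pure ASCII.
        String topic = "\u5343\u91CC\u8D70\u5355\u9A91";

        // On JDKs before 18 this is derived from the environment (e.g. LANG).
        System.out.println("default charset: " + Charset.defaultCharset());

        // UTF-8 can represent every character, so the round trip is lossless.
        System.out.println(roundTrip(topic, StandardCharsets.UTF_8).equals(topic));

        // ISO-8859-1 cannot represent these characters; each becomes '?'.
        System.out.println(roundTrip(topic, StandardCharsets.ISO_8859_1));
    }
}
```

Passing an explicit `StandardCharsets.UTF_8` wherever text is read or written is one way to make the result independent of the environment; whether that applies to the failing tests here is an assumption.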

@lintool
Member

lintool commented Apr 5, 2021

Another mole whacked.

For future reference, since I had to "re-discover" the solution:

```java
import java.util.Locale;
...

Locale.setDefault(Locale.US);
```
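
The snippet above elides where the call is made. As a minimal sketch of how pinning the default locale changes locale-sensitive behavior, the example below sets the default, formats a number, and then restores the previous default. The class `LocalePinDemo` and the method `formatWith` are hypothetical names for illustration and are not from the Anserini tests. Note that `Locale.setDefault` affects locale-sensitive operations such as formatting; it does not change `Charset.defaultCharset()`.

```java
import java.util.Locale;

// Hypothetical demo class, not part of Anserini.
class LocalePinDemo {

    // Format a value with the given locale installed as the JVM default,
    // then put the previous default back so other code is unaffected.
    static String formatWith(Locale locale, double value) {
        Locale saved = Locale.getDefault();
        try {
            Locale.setDefault(locale);
            // String.format without a Locale argument uses the default locale.
            return String.format("%.2f", value);
        } finally {
            Locale.setDefault(saved);
        }
    }

    public static void main(String[] args) {
        System.out.println(formatWith(Locale.US, 1234.5));      // 1234.50
        System.out.println(formatWith(Locale.GERMANY, 1234.5)); // 1234,50
    }
}
```

In a test class, the same call would typically sit in a setup method that runs before each test; that placement is an assumption, since the original comment does not show it.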
