-
Notifications
You must be signed in to change notification settings - Fork 120
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Chinese parser for NE and fix a few edge cases #1961
Conversation
@nvkelso Did you mention that we still miss something for it to work? I got the unit tests work, but the integration test 1955-chinese-parser failed the NE test case with the following error:
|
Let me look into the missing step, but in the meantime some PR comments above... |
after adding the source to the input in the integration, the integration tests work. |
}) | ||
) | ||
|
||
# the min_zoom is 2.7 so it should appear at zoom 3 to zoom 7 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@nvkelso please help confirm this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Testing at zoom 3 is fine... it should show up also at zoom 2... but if you're testing simply the Chinese language stuff then sometimes it's better to keep your test small... else it begins to be a "prod QA release" test rather than an simple "integration" test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you want to test in the range, then I'd start at 2 (should work!).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@nvkelso when I test zoom 2 this test failed with this error
FAIL: test_ne_san_francisco (integration-test.1955-chinese-parser.ChineseNameTest)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/Users/pli2/Snapchat/Dev/vector-datasource/integration-test/1955-chinese-parser.py", line 135, in test_ne_san_francisco
'source': u'naturalearthdata.com',
File "/Users/pli2/Snapchat/Dev/vector-datasource/integration-test/__init__.py", line 1489, in assert_has_feature
self.test_instance.assert_has_feature(z, x, y, layer, props)
File "integration-test/__init__.py", line 1322, in assert_has_feature
self.assertions.assert_has_feature(z, x, y, layer, props)
File "integration-test/__init__.py", line 1151, in assert_has_feature
"layer %r was empty)" % (properties, layer))
AssertionError: Did not find feature including properties {'kind': 'locality', 'name': 'San Francisco', 'name:zh-Hant': u'\u820a\u91d1\u5c71', 'name:zh-Hans': u'\u65e7\u91d1\u5c71', 'source': u'naturalearthdata.com', 'id': 26819236} (because layer 'places' was empty)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Huh. That's configured here:
Which should work but moving on... not important for this particular test so keep at 3.
@nvkelso would you please also help check this line
I manually change it to |
For the Tilezen data fixture there are only a handful of valid fully qualified "web domain"
The OpenStreetMap node confusingly also has a We need to set the correct source in the test data fixture, so the Tilezen software knows when and how to export it in tiles. |
@peitili Please fix your PEP8 problems with Python formatting so I can see the output of the CircleCI build tests, please. EG:
|
vectordatasource/transform.py
Outdated
|
||
# only select one of the options if the field is separated by "/" | ||
# for example if the field is "旧金山市县/三藩市市縣/舊金山市郡" only the first | ||
# one 旧金山市县 will be preserved | ||
properties['name:zh'] = properties['name:zh'].split('/')[0].strip() | ||
properties['name:zht'] = properties['name:zht'].split('/')[0].strip() | ||
properties['name:zh-Hans'] = properties['name:zh-Hans'].split('/')[0].strip() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hahahaha look at https://www.wikidata.org/wiki/Q374404 and Simplified Chinese value \菲利普斯县
. Truth is stranger than fiction.
There are 2 related test failures in Circle CI to fix, relating to the change of name:zh:
|
fixed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One final ask... please document the new Chinese languages codes here: https://github.com/tilezen/vector-datasource/blob/024909ed8245a4ad4a25c908413ba3602de6c335/docs/SEMANTIC-VERSIONING.md#common-languages
Also marking the existing name:zh
as deprecated.
I filed #1962 for the breaking v2 work later.
Add Chinese parser for NE and change the output property name from "zh" to "zh-hans" and "zht" to "zh-hant"
Also fixes a few edge cases for OSM and WOF Chinese fields.
#1955