Add Chinese parser for NE and fix a few edge cases #1961

peitili · 2021-07-30T00:22:23Z

Update tests
Update docs

Add Chinese parser for NE and change the output property name from "zh" to "zh-hans" and "zht" to "zh-hant"
Also fixes a few edge cases for OSM and WOF Chinese fields.

#1955

peitili · 2021-07-30T00:23:56Z

@nvkelso Did you mention that we are still missing something for this to work? The unit tests pass, but the integration test 1955-chinese-parser fails the NE test case with the following error:

FAIL: test_ne_san_francisco (integration-test.1955-chinese-parser.ChineseNameTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/pli2/Snapchat/Dev/vector-datasource/integration-test/1955-chinese-parser.py", line 132, in test_ne_san_francisco
    'name:zh-hant': u'舊金山'
  File "/Users/pli2/Snapchat/Dev/vector-datasource/integration-test/__init__.py", line 1489, in assert_has_feature
    self.test_instance.assert_has_feature(z, x, y, layer, props)
  File "integration-test/__init__.py", line 1322, in assert_has_feature
    self.assertions.assert_has_feature(z, x, y, layer, props)
  File "integration-test/__init__.py", line 1158, in assert_has_feature
    (properties, closest['properties'], misses))
AssertionError: Did not find feature including properties {'name:zh-hant': u'\u820a\u91d1\u5c71', 'name:zh-hans': u'\u65e7\u91d1\u5c71', 'id': 26819236, 'name': 'San Francisco'}. The closest match was {'kind': 'locality', 'collision_rank': 377, 'name': 'San Francisco', 'min_zoom': 3.0, 'id': 26819236, 'population_rank': 0}: missed {'name:zh-hant': "None != '\\xe8\\x88\\x8a\\xe9\\x87\\x91\\xe5\\xb1\\xb1'", 'name:zh-hans': "None != '\\xe6\\x97\\xa7\\xe9\\x87\\x91\\xe5\\xb1\\xb1'"}.

docs/layers.md

integration-test/1955-chinese-parser.py

nvkelso · 2021-07-30T00:58:59Z

Let me look into the missing step, but in the meantime some PR comments above...

peitili · 2021-07-30T01:37:09Z

Let me look into the missing step, but in the meantime some PR comments above...

After adding the source to the input in the integration test, the integration tests pass.

peitili · 2021-07-30T01:39:38Z

integration-test/1955-chinese-parser.py

+            })
+        )
+
+        # the min_zoom is 2.7 so it should appear at zoom 3 to zoom 7


@nvkelso please help confirm this.

Testing at zoom 3 is fine... it should show up also at zoom 2... but if you're testing simply the Chinese language stuff then sometimes it's better to keep your test small... else it begins to be a "prod QA release" test rather than a simple "integration" test.

If you want to test in the range, then I'd start at 2 (should work!).

@nvkelso when I test at zoom 2, the test fails with this error:

FAIL: test_ne_san_francisco (integration-test.1955-chinese-parser.ChineseNameTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/pli2/Snapchat/Dev/vector-datasource/integration-test/1955-chinese-parser.py", line 135, in test_ne_san_francisco
    'source': u'naturalearthdata.com',
  File "/Users/pli2/Snapchat/Dev/vector-datasource/integration-test/__init__.py", line 1489, in assert_has_feature
    self.test_instance.assert_has_feature(z, x, y, layer, props)
  File "integration-test/__init__.py", line 1322, in assert_has_feature
    self.assertions.assert_has_feature(z, x, y, layer, props)
  File "integration-test/__init__.py", line 1151, in assert_has_feature
    "layer %r was empty)" % (properties, layer))
AssertionError: Did not find feature including properties {'kind': 'locality', 'name': 'San Francisco', 'name:zh-Hant': u'\u820a\u91d1\u5c71', 'name:zh-Hans': u'\u65e7\u91d1\u5c71', 'source': u'naturalearthdata.com', 'id': 26819236} (because layer 'places' was empty)

Huh. That's configured here:

https://github.com/tilezen/vector-datasource/blob/master/queries/ne-places.jinja2#L41

Which should work but moving on... not important for this particular test so keep at 3.

peitili · 2021-07-30T01:52:57Z

@nvkelso would you please also help check this line

vector-datasource/integration-test/1955-chinese-parser.py

Line 124 in 39b34fa

u"source": u"openstreetmap.org",

I manually changed it to openstreetmap.org earlier to make the test pass. However, if you check the source on OSM https://www.openstreetmap.org/node/424317935, it says the source is Wikipedia. Is that OK? If we put Wikipedia in the integration test, it fails because the source Wikipedia is unrecognized.

nvkelso · 2021-07-30T04:53:24Z

I manually changed it to openstreetmap.org earlier to make the test pass. However, if you check the source on OSM https://www.openstreetmap.org/node/424317935, it says the source is Wikipedia. Is that OK? If we put Wikipedia in the integration test, it fails because the source Wikipedia is unrecognized.

For the Tilezen data fixture there are only a handful of valid fully qualified "web domain" source values:

naturalearthdata.com
openstreetmap.org
osmdata.openstreetmap.de
whosonfirst.org

The OpenStreetMap node confusingly also has a source property... which indicates that something in the property list was sourced from Wikipedia. But that doesn't have any bearing on Tilezen.

We need to set the correct source in the test data fixture, so the Tilezen software knows when and how to export it in tiles.
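As a rough illustration of the point above, the four source values listed in this comment can be treated as an allow-list when writing a test fixture. The helper name `is_valid_fixture_source` is hypothetical and is not part of the Tilezen code base; this is only a sketch of the check, assuming the list above is complete.

```python
# Source values listed in the comment above. The set and the helper are
# illustrative assumptions, not code taken from vector-datasource.
VALID_SOURCES = frozenset([
    'naturalearthdata.com',
    'openstreetmap.org',
    'osmdata.openstreetmap.de',
    'whosonfirst.org',
])


def is_valid_fixture_source(source):
    """Return True when `source` is one of the recognized data sources."""
    return source in VALID_SOURCES
```

With this sketch, `is_valid_fixture_source('openstreetmap.org')` is true, while the OSM node's own `source` tag value (Wikipedia) is not accepted, which matches the behavior described in the question.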

nvkelso · 2021-07-30T04:59:10Z

@peitili Please fix the PEP8 formatting problems in your Python so I can see the output of the CircleCI build tests.

E.g.:

./vectordatasource/transform.py:691:49: E127 continuation line over-indented for visual indent
./vectordatasource/transform.py:693:50: E127 continuation line over-indented for visual indent
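For context, E127 is reported when a continuation line is indented further than the opening bracket it visually lines up with. The snippet below is a generic illustration with a made-up helper (`pick`), not the code at transform.py lines 691 and 693.

```python
def pick(primary, fallback):
    """Return primary when it is set, otherwise fallback."""
    return primary if primary is not None else fallback


# Over-indented for visual indent; pycodestyle would flag the second line
# with E127 (kept as a comment so this block itself stays lint-clean):
#
#     name = pick(None,
#                     u'fallback')

# Aligned with the character after the opening bracket, which avoids E127:
name = pick(None,
            u'fallback')
```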

nvkelso · 2021-07-30T06:01:58Z

vectordatasource/transform.py


    # only select one of the options if the field is separated by "/"
    # for example if the field is "旧金山市县/三藩市市縣/舊金山市郡" only the first
    # one 旧金山市县 will be preserved
-    properties['name:zh'] = properties['name:zh'].split('/')[0].strip()
-    properties['name:zht'] = properties['name:zht'].split('/')[0].strip()
+    properties['name:zh-Hans'] = properties['name:zh-Hans'].split('/')[0].strip()


Hahahaha look at https://www.wikidata.org/wiki/Q374404 and Simplified Chinese value \菲利普斯县. Truth is stranger than fiction.
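The slash handling in the diff above can be sketched as a small standalone helper. The names `first_slash_option` and `normalize_chinese_names` are hypothetical and only illustrate the `split('/')[0].strip()` step; they are not the functions in vectordatasource/transform.py.

```python
# -*- coding: utf-8 -*-


def first_slash_option(value):
    """Keep only the first '/'-separated option, with whitespace stripped.

    None is passed through unchanged so callers can use dict.get().
    """
    if value is None:
        return None
    return value.split('/')[0].strip()


def normalize_chinese_names(properties,
                            keys=('name:zh-Hans', 'name:zh-Hant')):
    """Apply first_slash_option to each Chinese name key that is present."""
    for key in keys:
        if key in properties:
            properties[key] = first_slash_option(properties[key])
    return properties
```

One edge case worth keeping in mind: with this approach a value that begins with `/` collapses to an empty string, because the first option is then empty.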

nvkelso · 2021-07-30T06:11:42Z

There are 2 related test failures in Circle CI to fix, relating to the change of name:zh:

======================================================================
FAIL: test_jerusalem (integration-test.418-wof-l10n_name.WofL10nName)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/circleci/project/integration-test/418-wof-l10n_name.py", line 64, in test_jerusalem
    'name:zh-yue': None,
  File "/home/circleci/project/integration-test/__init__.py", line 1489, in assert_has_feature
    self.test_instance.assert_has_feature(z, x, y, layer, props)
  File "integration-test/__init__.py", line 1322, in assert_has_feature
    self.assertions.assert_has_feature(z, x, y, layer, props)
  File "integration-test/__init__.py", line 1158, in assert_has_feature
    (properties, closest['properties'], misses))
AssertionError: Did not find feature including properties {'name:zh-yue': None, 'name:zh': None, 'id': 29090735, 'name:zh-min-nan': None}. The closest match was {'name:cy': 'Jeriwsalem', 'name:ko': '\xec\x98\x88\xeb\xa3\xa8\xec\x82\xb4\xeb\xa0\x98', 'name:cv': '\xd0\x98\xd0\xb5\xd1\x80\xd1\x83\xd1\x81\xd0\xb0\xd0\xbb\xd0\xb8\xd0\xbc', 'name:cu': '\xd0\x87\xd1\x94\xd1\x80\xd0\xbe\xd1\x83\xd1\x81\xd0\xb0\xd0\xbb\xd0\xb8\xd0\xbc\xd1\x8a', 'name:cs': 'Jeruzal\xc3\xa9m', 'name:tt': '\xd0\x98\xd0\xb5\xd1\x80\xd1\x83\xd1\x81\xd0\xb0\xd0\xbb\xd0\xb8\xd0\xbc', 'name:kv': '\xd0\x98\xd0\xb5\xd1\x80\xd1\x83\xd1\x81\xd0\xb0\xd0\xbb\xd0\xb8\xd0\xbc', 'name:ku': 'Or\xc5\x9fel\xc3\xaem', 'name:tl': 'Herusalem', 'name:ta': '\xe0\xae\xaf\xe0\xaf\x86\xe0\xae\xb0\xe0\xaf\x82\xe0\xae\x9a\xe0\xae\xb2\xe0\xae\xae\xe0\xaf\x8d', 'name:tg': '\xd0\xa3\xd1\x80\xd1\x88\xd0\xb0\xd0\xbb\xd0\xb8\xd0\xbc', 'name:te': '\xe0\xb0\x9c\xe0\xb1\x86\xe0\xb0\xb0\xe0\xb1\x82\xe0\xb0\xb8\xe0\xb0\xb2\xe0\xb1\x87\xe0\xb0\x82', 'name:ka': '\xe1\x83\x98\xe1\x83\x94\xe1\x83\xa0\xe1\x83\xa3\xe1\x83\xa1\xe1\x83\x90\xe1\x83\x9a\xe1\x83\x98\xe1\x83\x9b\xe1\x83\x98', 'name:hr': 'Jeruzalem', 'name:da': 'Jerusalem', 'name:de': 'Jerusalem', 'source': 'openstreetmap.org', 'name:kn': '\xe0\xb2\x9c\xe0\xb3\x86\xe0\xb2\xb0\xe0\xb3\x81\xe0\xb2\xb8\xe0\xb2\xb2\xe0\xb3\x86\xe0\xb2\x82', 'name:be': '\xd0\x86\xd0\xb5\xd1\x80\xd1\x83\xd1\x81\xd0\xb0\xd0\xbb\xd1\x96\xd0\xbc', 'name:dv': '\xde\xa4\xde\xaa\xde\x8b\xde\xaa\xde\x90\xde\xb0', 'name:lv': 'Jeruz\xc4\x81leme', 'name:lt': 'Jeruzal\xc4\x97', 'name:uz': 'Quddus', 'name:kk': '\xd3\x98\xd0\xbb-\xd2\x9a\xd2\xb1\xd0\xb4\xd1\x8b\xd1\x81', 'name:ur': '\xd8\xa8\xdb\x8c\xd8\xaa \xd8\xa7\xd9\x84\xd9\x85\xd9\x82\xd8\xaf\xd8\xb3', 'name:la': 'Hierosolyma', 'name:uk': '\xd0\x84\xd1\x80\xd1\x83\xd1\x81\xd0\xb0\xd0\xbb\xd0\xb8\xd0\xbc', 'name:ug': '\xd9\x8a\xdb\x90\xd8\xb1\xdb\x87\xd8\xb3\xd8\xa7\xd9\x84\xdb\x90\xd9\x85', 'name:li': 'Jeruzalem', 'name:ln': 'Yerusal\xc3\xa9mi', 'name:el': 
'\xce\x99\xce\xb5\xcf\x81\xce\xbf\xcf\x85\xcf\x83\xce\xb1\xce\xbb\xce\xae\xce\xbc', 'name:eo': 'Jerusalemo', 'name:en': 'Jerusalem', 'name': '\xd7\x99\xd7\xa8\xd7\x95\xd7\xa9\xd7\x9c\xd7\x99\xd7\x9d', 'name:tk': 'I\xc3\xbderusalim', 'name:th': '\xe0\xb9\x80\xe0\xb8\xa2\xe0\xb8\xa3\xe0\xb8\xb9\xe0\xb8\x8b\xe0\xb8\xb2\xe0\xb9\x80\xe0\xb8\xa5\xe0\xb8\xa1', 'name:et': 'Jeruusalemm', 'name:es': 'Jerusal\xc3\xa9n', 'name:az': 'Jerusalem', 'name:id': 'Yerusalem', 'name:ar': '\xd8\xa7\xd9\x84\xd9\x82\xd8\xaf\xd8\xb3', 'name:io': 'Ierusalem', 'name:is': 'Jer\xc3\xbasalem', 'name:am': '\xe1\x8a\xa5\xe1\x8b\xa8\xe1\x88\xa9\xe1\x88\xb3\xe1\x88\x8c\xe1\x88\x9d', 'name:it': 'Gerusalemme', 'name:an': 'Cherusalem', 'name:ru': '\xd0\x98\xd0\xb5\xd1\x80\xd1\x83\xd1\x81\xd0\xb0\xd0\xbb\xd0\xb8\xd0\xbc', 'name:rw': 'Yerusalemu', 'kind_detail': 'city', 'name:so': 'Qudus', 'name:sm': 'Ierusalema', 'name:sl': 'Jeruzalem', 'name:sk': 'Jeruzalem', 'name:ja': '\xe3\x82\xa8\xe3\x83\xab\xe3\x82\xb5\xe3\x83\xac\xe3\x83\xa0', 'name:sh': 'Jeruzalem', 'name:sc': 'Gerusalemme', 'name:br': 'Jeruzalem', 'name:bn': '\xe0\xa6\x9c\xe0\xa7\x87\xe0\xa6\xb0\xe0\xa7\x81\xe0\xa6\xb8\xe0\xa6\xbe\xe0\xa6\xb2\xe0\xa7\x87\xe0\xa6\xae', 'name:bo': '\xe0\xbd\x87\xe0\xbd\xba\xe0\xbc\x8b\xe0\xbd\xa2\xe0\xbd\xb4\xe0\xbc\x8b\xe0\xbd\xa6\xe0\xbc\x8b\xe0\xbd\xa3\xe0\xbd\xba\xe0\xbd\x98\xe0\xbc\x8d', 'name:sw': 'Yerusalemu', 'name:sv': 'Jerusalem', 'name:su': 'Yerusalem', 'name:bg': '\xd0\x99\xd0\xb5\xd1\x80\xd1\x83\xd1\x81\xd0\xb0\xd0\xbb\xd0\xb8\xd0\xbc', 'name:sr': '\xd0\x88\xd0\xb5\xd1\x80\xd1\x83\xd1\x81\xd0\xb0\xd0\xbb\xd0\xb8\xd0\xbc', 'name:sq': 'Jeruzalemi', 'name:jv': 'Y\xc3\xa9rusalem', 'name:pt': 'Jerusal\xc3\xa9m', 'name:ps': '\xd8\xa8\xd9\x8a\xd8\xaa \xd8\xa7\xd9\x84\xd9\x85\xd9\x82\xd8\xaf\xd8\xb3', 'name:oc': 'Jerusal\xc3\xa8m', 'min_zoom': 8.0, 'name:ro': 'Ierusalim', 'name:zh-min-nan': 'I\xc3\xa2-l\xc5\x8d\xcd\x98-sat-l\xc3\xa9ng', 'name:os': 
'\xd0\x98\xd0\xb5\xd1\x80\xd1\x83\xd1\x81\xd0\xb0\xd0\xbb\xd0\xb8\xd0\xbc', 'name:pl': 'Jerozolima', 'admin_level': '2', 'collision_rank': 342, 'name:tr': 'Kud\xc3\xbcs', 'name:hi': '\xe0\xa4\xaf\xe0\xa4\xb0\xe0\xa5\x81\xe0\xa4\xb6\xe0\xa4\xb2\xe0\xa4\xae', 'name:he': '\xd7\x99\xd7\xa8\xd7\x95\xd7\xa9\xd7\x9c\xd7\x99\xd7\x9d', 'name:hy': '\xd4\xb5\xd6\x80\xd5\xb8\xd6\x82\xd5\xbd\xd5\xa1\xd5\xb2\xd5\xa5\xd5\xb4', 'country_capital': True, 'name:hu': 'Jeruzs\xc3\xa1lem', 'name:qu': 'Yerushalayim', 'population': 780200, 'kind': 'locality', 'population_rank': 11, 'alt_name:is': 'J\xc3\xb3rsalir;J\xc3\xb3rsalaborg', 'wikidata_id': 'Q1218', 'name:yi': '\xd7\x99\xd7\xa8\xd7\x95\xd7\xa9\xd7\x9c\xd7\x99\xd7\x9d', 'name:yo': 'Jer\xc3\xbas\xc3\xa1l\xe1\xba\xb9\xcc\x81m\xc3\xb9', 'name:ms': 'Baitulmuqaddis', 'name:mr': '\xe0\xa4\x9c\xe0\xa5\x87\xe0\xa4\xb0\xe0\xa5\x81\xe0\xa4\xb8\xe0\xa4\xb2\xe0\xa5\x87\xe0\xa4\xae', 'name:my': '\xe1\x80\x82\xe1\x80\xbb\xe1\x80\xb1\xe1\x80\x9b\xe1\x80\xaf\xe1\x80\x86\xe1\x80\x9c\xe1\x80\x84\xe1\x80\xba\xe1\x80\x99\xe1\x80\xbc\xe1\x80\xad\xe1\x80\xaf\xe1\x80\xb7', 'int_name': 'Jerusalem', 'name:ml': '\xe0\xb4\x9c\xe0\xb5\x86\xe0\xb4\xb1\xe0\xb5\x81\xe0\xb4\xb8\xe0\xb4\xb2\xe0\xb5\x87\xe0\xb4\x82', 'name:vi': 'Jerusalem', 'name:mn': '\xd0\x98\xd0\xb5\xd1\x80\xd1\x83\xd1\x81\xd0\xb0\xd0\xbb\xd0\xb8\xd0\xbc', 'id': 29090735, 'name:mk': '\xd0\x95\xd1\x80\xd1\x83\xd1\x81\xd0\xb0\xd0\xbb\xd0\xb8\xd0\xbc', 'name:vo': 'Hierusalem', 'name:fa': '\xd8\xa7\xd9\x88\xd8\xb1\xd8\xb4\xd9\x84\xdb\x8c\xd9\x85', 'name:fi': 'Jerusalem', 'name:fj': 'Jerusalemi', 'name:fo': 'Jer\xc3\xbasalem', 'alt_name:vi': 'Gi\xc3\xaarusalem', 'name:fr': 'J\xc3\xa9rusalem', 'name:fy': 'Jeruzalim', 'name:nl': 'Jeruzalem', 'name:wa': 'Djeruzalem', 'name:ga': 'Iar\xc3\xbasail\xc3\xa9im', 'name:gd': 'Ierusalem', 'name:zh-Hant': '\xe8\x80\xb6\xe8\xb7\xaf\xe6\x92\x92\xe5\x86\xb7', 'name:zh-Hans': '\xe8\x80\xb6\xe8\xb7\xaf\xe6\x92\x92\xe5\x86\xb7', 'name:gn': 'Herusal\xe1\xba\xbd', 
'name:gl': 'Xerusal\xc3\xa9n'}: missed {'name:zh-yue': 'missing', 'name:zh': 'missing'}.

======================================================================
FAIL: test_san_francisco_osm (integration-test.418-wof-l10n_name.WofL10nName)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/circleci/project/integration-test/418-wof-l10n_name.py", line 42, in test_san_francisco_osm
    'name:zh': None})
  File "/home/circleci/project/integration-test/__init__.py", line 1489, in assert_has_feature
    self.test_instance.assert_has_feature(z, x, y, layer, props)
  File "integration-test/__init__.py", line 1322, in assert_has_feature
    self.assertions.assert_has_feature(z, x, y, layer, props)
  File "integration-test/__init__.py", line 1158, in assert_has_feature
    (properties, closest['properties'], misses))
AssertionError: Did not find feature including properties {'kind': 'locality', 'name': 'San Francisco', 'kind_detail': 'city', 'source': 'openstreetmap.org', 'name:zh': None, 'id': 26819236}. The closest match was {'name:pt': 'S\xc3\xa3o Francisco', 'collision_rank': 344, 'name:ko': '\xec\x83\x8c\xed\x94\x84\xeb\x9e\x80\xec\x8b\x9c\xec\x8a\xa4\xec\xbd\x94', 'name:kn': '\xe0\xb2\xb8\xe0\xb2\xbe\xe0\xb2\xa8\xe0\xb3\x8d \xe0\xb2\xab\xe0\xb3\x8d\xe0\xb2\xb0\xe0\xb2\xbe\xe0\xb2\xa8\xe0\xb3\x8d\xe0\xb2\xb8\xe0\xb2\xbf\xe0\xb2\xb8\xe0\xb3\x8d\xe0\xb2\x95\xe0\xb3\x8a', 'min_zoom': 8.0, 'name:ru': '\xd0\xa1\xd0\xb0\xd0\xbd-\xd0\xa4\xd1\x80\xd0\xb0\xd0\xbd\xd1\x86\xd0\xb8\xd1\x81\xd0\xba\xd0\xbe', 'name:ta': '\xe0\xae\xb8\xe0\xae\xbe\xe0\xae\xa9\xe0\xaf\x8d \xe0\xae\xaa\xe0\xaf\x8d\xc2\xb2\xe0\xae\xb0\xe0\xae\xbe\xe0\xae\xa9\xe0\xaf\x8d\xe0\xae\xb8\xe0\xae\xbf\xe0\xae\xb8\xe0\xaf\x8d\xe0\xae\x95\xe0\xaf\x8a', 'id': 26819236, 'name:fa': '\xd8\xb3\xd8\xa7\xd9\x86 \xd9\x81\xd8\xb1\xd8\xa7\xd9\x86\xd8\xb3\xdb\x8c\xd8\xb3\xda\xa9\xd9\x88', 'kind_detail': 'city', 'name:hi': '\xe0\xa4\xb8\xe0\xa5\x88\xe0\xa4\xa8 \xe0\xa4\xab\xe0\xa5\x8d\xe0\xa4\xb0\xe0\xa4\xbe\xe0\xa4\x82\xe0\xa4\xb8\xe0\xa4\xbf\xe0\xa4\xb8\xe0\xa5\x8d\xe0\xa4\x95\xe0\xa5\x8b', 'name:de': 'San Francisco', 'source': 'openstreetmap.org', 'name:ja': '\xe3\x82\xb5\xe3\x83\xb3\xe3\x83\x95\xe3\x83\xa9\xe3\x83\xb3\xe3\x82\xb7\xe3\x82\xb9\xe3\x82\xb3', 'short_name': 'SF', 'population_rank': 11, 'population': 864816, 'kind': 'locality', 'name': 'San Francisco', 'name:zh-Hant': '\xe6\x97\xa7\xe9\x87\x91\xe5\xb1\xb1', 'name:zh-Hans': '\xe6\x97\xa7\xe9\x87\x91\xe5\xb1\xb1', 'wikidata_id': 'Q62', 'name:eu': 'San Francisco'}: missed {'name:zh': 'missing'}.
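Reading the two failures above, an expected value of `None` appears to mean "this key must be present, with any value": `name:zh-min-nan` is expected as `None`, exists on the feature, and is not reported, while `name:zh` is reported as `missing`. The sketch below reproduces that matching rule as inferred from the error output; it is an assumption about the behavior and not a copy of integration-test/__init__.py, and `find_misses` is a hypothetical name.

```python
def find_misses(expected, actual):
    """Compare expected properties with a feature's actual properties.

    Inferred rule: an expected value of None only requires the key to
    exist; any other expected value must match exactly.
    """
    misses = {}
    for key, want in expected.items():
        if want is None:
            if key not in actual:
                misses[key] = 'missing'
        elif actual.get(key) != want:
            misses[key] = '%r != %r' % (actual.get(key), want)
    return misses
```

Under this reading, both tests fail simply because the output no longer carries a `name:zh` property after the rename, so the expectations in 418-wof-l10n_name.py need to reference the new keys.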

peitili · 2021-07-30T16:20:17Z

There are 2 related test failures in Circle CI to fix, relating to the change of name:zh:

fixed

nvkelso

One final ask... please document the new Chinese language codes here: https://github.com/tilezen/vector-datasource/blob/024909ed8245a4ad4a25c908413ba3602de6c335/docs/SEMANTIC-VERSIONING.md#common-languages
Also mark the existing name:zh as deprecated.

I filed #1962 for the breaking v2 work later.

Extends work done in #1961 and #1963

Add Chinese paser for NE

370ad38

peitili requested a review from nvkelso July 30, 2021 00:24

iandees approved these changes Jul 30, 2021

update unit test method name

0d7ab41