UnicodeDecodeError when producing query cursors on dev_appserver #47

Closed
sadovnychyi opened this issue Jan 31, 2022 · 4 comments

Comments

@sadovnychyi

Expected Behavior

No exception is thrown.

Actual Behavior

Traceback (most recent call last):
  File "<string>", line 6, in <module>
  File "/var/folders/yh/l2c69w5j6q71gzhvs454p7s80000gn/T/tmpxVxXxD/lib/python3.7/site-packages/google/appengine/ext/ndb/utils.py", line 182, in positional_wrapper
    return wrapped(*args, **kwds)
  File "/var/folders/yh/l2c69w5j6q71gzhvs454p7s80000gn/T/tmpxVxXxD/lib/python3.7/site-packages/google/appengine/ext/ndb/query.py", line 1266, in fetch
    return self.fetch_async(limit, **q_options).get_result()
  File "/var/folders/yh/l2c69w5j6q71gzhvs454p7s80000gn/T/tmpxVxXxD/lib/python3.7/site-packages/google/appengine/ext/ndb/tasklets.py", line 397, in get_result
    self.check_success()
  File "/var/folders/yh/l2c69w5j6q71gzhvs454p7s80000gn/T/tmpxVxXxD/lib/python3.7/site-packages/google/appengine/ext/ndb/tasklets.py", line 394, in check_success
    six.reraise(self._exception.__class__, self._exception, self._traceback)
  File "/var/folders/yh/l2c69w5j6q71gzhvs454p7s80000gn/T/tmpxVxXxD/lib/python3.7/site-packages/six.py", line 719, in reraise
    raise value
  File "/var/folders/yh/l2c69w5j6q71gzhvs454p7s80000gn/T/tmpxVxXxD/lib/python3.7/site-packages/google/appengine/ext/ndb/tasklets.py", line 441, in _help_tasklet_along
    value = gen.throw(exc.__class__, exc, tb)
  File "/var/folders/yh/l2c69w5j6q71gzhvs454p7s80000gn/T/tmpxVxXxD/lib/python3.7/site-packages/google/appengine/ext/ndb/query.py", line 1043, in _run_to_list
    batch = yield rpc
  File "/var/folders/yh/l2c69w5j6q71gzhvs454p7s80000gn/T/tmpxVxXxD/lib/python3.7/site-packages/google/appengine/ext/ndb/tasklets.py", line 527, in _on_rpc_completion
    result = rpc.get_result()
  File "/var/folders/yh/l2c69w5j6q71gzhvs454p7s80000gn/T/tmpxVxXxD/lib/python3.7/site-packages/google/appengine/api/apiproxy_stub_map.py", line 648, in get_result
    return self.__get_result_hook(self)
  File "/var/folders/yh/l2c69w5j6q71gzhvs454p7s80000gn/T/tmpxVxXxD/lib/python3.7/site-packages/google/appengine/datastore/datastore_query.py", line 2949, in __query_result_hook
    self._batch_shared.conn.check_rpc_success(rpc)
  File "/var/folders/yh/l2c69w5j6q71gzhvs454p7s80000gn/T/tmpxVxXxD/lib/python3.7/site-packages/google/appengine/datastore/datastore_rpc.py", line 1365, in check_rpc_success
    rpc.check_success()
  File "/var/folders/yh/l2c69w5j6q71gzhvs454p7s80000gn/T/tmpxVxXxD/lib/python3.7/site-packages/google/appengine/api/apiproxy_stub_map.py", line 614, in check_success
    self.__rpc.CheckSuccess()
  File "/var/folders/yh/l2c69w5j6q71gzhvs454p7s80000gn/T/tmpxVxXxD/lib/python3.7/site-packages/google/appengine/api/apiproxy_rpc.py", line 149, in CheckSuccess
    raise self.exception
  File "/var/folders/yh/l2c69w5j6q71gzhvs454p7s80000gn/T/tmpxVxXxD/lib/python3.7/site-packages/google/appengine/runtime/default_api_stub.py", line 266, in _CaptureTrace
    f(**kwargs)
  File "/var/folders/yh/l2c69w5j6q71gzhvs454p7s80000gn/T/tmpxVxXxD/lib/python3.7/site-packages/google/appengine/runtime/default_api_stub.py", line 262, in _SendRequest
    self.response.ParseFromString(parsed_response.response)
  File "/var/folders/yh/l2c69w5j6q71gzhvs454p7s80000gn/T/tmpxVxXxD/lib/python3.7/site-packages/google/protobuf/message.py", line 199, in ParseFromString
    return self.MergeFromString(serialized)
  File "/var/folders/yh/l2c69w5j6q71gzhvs454p7s80000gn/T/tmpxVxXxD/lib/python3.7/site-packages/google/protobuf/internal/python_message.py", line 1128, in MergeFromString
    if self._InternalParse(serialized, 0, length) != length:
  File "/var/folders/yh/l2c69w5j6q71gzhvs454p7s80000gn/T/tmpxVxXxD/lib/python3.7/site-packages/google/protobuf/internal/python_message.py", line 1195, in InternalParse
    pos = field_decoder(buffer, new_pos, end, self, field_dict)
  File "/var/folders/yh/l2c69w5j6q71gzhvs454p7s80000gn/T/tmpxVxXxD/lib/python3.7/site-packages/google/protobuf/internal/decoder.py", line 732, in DecodeField
    if value._InternalParse(buffer, pos, new_pos) != new_pos:
  File "/var/folders/yh/l2c69w5j6q71gzhvs454p7s80000gn/T/tmpxVxXxD/lib/python3.7/site-packages/google/protobuf/internal/python_message.py", line 1195, in InternalParse
    pos = field_decoder(buffer, new_pos, end, self, field_dict)
  File "/var/folders/yh/l2c69w5j6q71gzhvs454p7s80000gn/T/tmpxVxXxD/lib/python3.7/site-packages/google/protobuf/internal/decoder.py", line 681, in DecodeField
    pos = value._InternalParse(buffer, pos, end)
  File "/var/folders/yh/l2c69w5j6q71gzhvs454p7s80000gn/T/tmpxVxXxD/lib/python3.7/site-packages/google/protobuf/internal/python_message.py", line 1195, in InternalParse
    pos = field_decoder(buffer, new_pos, end, self, field_dict)
  File "/var/folders/yh/l2c69w5j6q71gzhvs454p7s80000gn/T/tmpxVxXxD/lib/python3.7/site-packages/google/protobuf/internal/decoder.py", line 597, in DecodeField
    field_dict[key] = _ConvertToUnicode(buffer[pos:new_pos])
  File "/var/folders/yh/l2c69w5j6q71gzhvs454p7s80000gn/T/tmpxVxXxD/lib/python3.7/site-packages/google/protobuf/internal/decoder.py", line 559, in _ConvertToUnicode
    value = str(byte_str, 'utf-8')
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 27: 'utf-8' codec can't decode byte 0x80 in position 27: invalid start byte in field: apphosting_datastore_v3_bytes.CompiledQuery.PrimaryScan.index_name
byte_str == b'\n\x13PROJECTIDXXXXXXXXXX\x1a\x04Test\x80\x01\xff\xff\xff\xff\x07\xb8\x01\xff\xff\xff\xff\x07\xc8\x01\x01'
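The failure can be reproduced without the SDK by decoding the reported bytes directly. This is a minimal sketch using only the `byte_str` value from the traceback above; the helper name `first_invalid_utf8` is made up for illustration. It shows that the value stored in `index_name` is not valid UTF-8, so any strict string decoder will reject it at the same offset.

```python
# byte_str is copied verbatim from the traceback above.
byte_str = (b'\n\x13PROJECTIDXXXXXXXXXX\x1a\x04Test'
            b'\x80\x01\xff\xff\xff\xff\x07\xb8\x01\xff\xff\xff\xff\x07\xc8\x01\x01')


def first_invalid_utf8(data):
    """Return (offset, reason) of the first UTF-8 decoding error, or None."""
    try:
        data.decode('utf-8')
    except UnicodeDecodeError as exc:
        return exc.start, exc.reason
    return None


# Offset 27 is the 0x80 byte right after b'Test'.
print(first_invalid_utf8(byte_str))  # (27, 'invalid start byte')
```

The offset matches the "position 27" in the exception, which suggests the field holds raw binary data rather than text.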

Steps to Reproduce the Problem

from google.appengine.ext.ndb import Model

class Test(Model):
  pass

Test.query().fetch(produce_cursors=True)

Specifications

  • Version: 0.3.1
  • Platform: macOS, Python 3.7, ARM64

This used to produce only log warnings on a non-ARM platform. I'm not sure yet whether the difference is due to ARM or to a newer dependency on the new system (all Python deps are pinned and identical, but maybe they still depend on a system-wide protobuf, and that one got bumped?)

golang/appengine#136 -- maybe related?

@sadovnychyi
Author

What worked for me was manually building a wheel for protobuf and forcing compilation of the C++ implementation:

brew install protobuf
python3.7 -m pip wheel --wheel-dir=wheels --no-binary=protobuf --global-option="--cpp_implementation" protobuf==3.19.0

Then add this to requirements.txt (the link can also point at GCS with some index file):

--find-links=wheels

By default, protobuf falls back to its pure-Python implementation, which causes that error/warning. The compiled implementation works fine. This workaround won't be needed once correct wheels are published for protobuf.
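To check which implementation an environment actually picked up, protobuf exposes the active backend through `google.protobuf.internal.api_implementation`. The snippet below is a small sketch; the wrapper function `protobuf_backend` is a hypothetical helper, and since the module is internal to protobuf, its behavior may differ between releases.

```python
def protobuf_backend():
    """Return the active protobuf backend name, or None if protobuf is missing."""
    try:
        # Internal protobuf module; reports which implementation was loaded.
        from google.protobuf.internal import api_implementation
    except ImportError:
        return None
    return api_implementation.Type()


# Typically 'python' for the pure-Python fallback or 'cpp' for the compiled
# extension; newer protobuf releases may report 'upb'.
print(protobuf_backend())
```

If this prints `python` on the affected machine, the installed wheel did not include the compiled extension, which matches the situation described above.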

protocolbuffers/protobuf#9397

@flash716

Thanks for the note on your fix @sadovnychyi, I killed a lot of time trying to figure out what was corrupted in my datastore before seeing this.

@embray
Contributor

embray commented Nov 17, 2022

I'm getting a similar error except instead of an exception from the Python implementation I'm getting an error from libprotobuf:

[libprotobuf ERROR google/protobuf/wire_format_lite.cc:618] String field 'apphosting_datastore_v3_bytes.CompiledQuery.PrimaryScan.index_name' contains invalid UTF-8 data when parsing a protocol buffer. Use the 'bytes' type if you intend to send raw bytes.

The query seems to work though so maybe it's less an error than a warning? I don't really know where it's coming from because it's somewhere quite deep and a couple hours of debugging hasn't revealed anything obvious.

@embray
Contributor

embray commented Nov 17, 2022

Hmm, after upgrading protobuf the error message goes away. The error in the descriptor for CompiledQuery.PrimaryScan still persists, but it doesn't seem to matter, since that isn't even used by the client in any way I can discern 🤷‍♀️
