InStream.get_channels_buffer_size uses wrong encoding #620

Open
bkeryan opened this issue Jul 31, 2024 · 1 comment
bkeryan commented Jul 31, 2024

InStream.get_channels_buffer_size and OutStream.get_channels_buffer_size read the Task.channel_names property to calculate the maximum buffer size for a fault condition status property such as overcurrent_chans.

The problem is that these functions are calculating the length of the channel_names property after decoding from bytes to str. If Python represents str as UTF-16 or UTF-32, the number of characters in a str may be smaller than the number of UTF-8 bytes used to return that str from the DAQmx C API. When this happens, reading the fault condition property will likely return a "buffer too small" error when there is a fault.
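The mismatch can be reproduced without DAQmx at all. This sketch (the channel-name string is made up for illustration) shows that a buffer sized from len() of the decoded str undercounts the UTF-8 bytes the C API would need whenever a name contains non-ASCII characters:

```python
# Hypothetical channel-name list as it might be returned
# (UTF-8 encoded) by the DAQmx C API, then decoded to str.
channel_names = "cDAQ1Mod1/ai0, señal_é"

# What the Python code currently computes: Unicode code points.
buffer_size_from_str = len(channel_names)

# What the C API actually needs: UTF-8 bytes.
bytes_needed = len(channel_names.encode("utf-8"))

# The str length undercounts by one byte per two-byte character
# (here: ñ and é), so the allocated buffer is too small.
assert buffer_size_from_str < bytes_needed
print(buffer_size_from_str, bytes_needed)
```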

Some potential solutions:

  • Use DAQmxGetTaskChannels(task, NULL, 0) to query the size of the TaskChannels property in the C API's native encoding (UTF-8 or MBCS). This would also be a small optimization because it would query the size of the TaskChannels property without querying the actual value.
  • Convert the task channels back to the C API's native encoding. This is a hack, so the first solution is better.
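The second workaround can be sketched in pure Python. The helper name, the ", " separator, and the UTF-8 default are illustrative assumptions, not the package's actual code; the real implementation would use the C API's configured native encoding:

```python
def _channels_buffer_size(channel_names: list[str],
                          encoding: str = "utf-8") -> int:
    # Re-encode the channel list in the C API's native encoding and
    # count bytes, not code points. The ", " separator mirrors the
    # comma-separated list format of DAQmxGetTaskChannels (assumed),
    # and one extra byte covers the NUL terminator.
    joined = ", ".join(channel_names)
    return len(joined.encode(encoding)) + 1

# A buffer sized this way is always large enough for the encoded form.
names = ["cDAQ1Mod1/ai0", "señal_é"]
size = _channels_buffer_size(names)
assert size > len(", ".join(names).encode("utf-8"))
```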
bkeryan commented Aug 2, 2024

If Python represents str as UTF-16 or UTF-32, the number of characters in a str may be smaller than the number of UTF-8 bytes used to return that str from the DAQmx C API.

Correction: it doesn't matter what the internal representation of str is. What matters is that len() on a str counts Unicode code points, not UTF-8 bytes:

>>> x="é"
>>> x
'é'
>>> len(x)
1
>>> x.encode('utf-8')
b'\xc3\xa9'
>>> y="\u2603"
>>> y
'☃'
>>> len(y)
1
>>> y.encode('utf-8')
b'\xe2\x98\x83'
>>> z="\U00010437"
>>> z
'𐐷'
>>> len(z)
1
>>> z.encode('utf-8')
b'\xf0\x90\x90\xb7'
