crypto api - update cipher operation returning a different length #3823
Comments
Can you give an example with a specific algorithm where this is a problem?
I believe this problem is systemic and not algorithm dependent. Systemic in the sense that the OP-TEE stack expects every crypto update operation to complete all the bytes provided on input, when in fact the underlying hardware might choose to do differently and wait for more data before it starts generating output. As an example, when running regression_4000.c and executing the aes_ctr case, the regression software will:
the output will show:
You can see the first 13 bytes being accepted as valid when in fact the device was simply waiting to receive more; but since the API could only return success, the receiving end assumed that those 13 bytes had been placed in the buffer when in fact they weren't. IMO the interface just needs a *dst_len field.
From a (GP) API perspective,
Yes, maybe.
Good, so shall I just wait (if there is a better alternative to dst_len)? Also, AFAICS reporting back dst_len should have no impact on how the caller needs to continue with the transfer (so no need to resend bytes): it just informs the caller how to reassemble the received data.
Another option is to let the hardware produce the missing bytes too. From an algorithm point of view the interface is enough. AES-CTR is a stream cipher, so it is possible to match each input byte with an output byte. Holding back output can be inconvenient for the caller.
There was one issue a few years ago where someone was expecting AES-CTR to return as many bytes as supplied. Can't remember it right now though.
This one perhaps? #1580
From looking at the code and the comments above, I believe (please do correct me if I am wrong) that you are saying OP-TEE, in tee_svc_cipher_update_helper, must ignore the information provided by the hardware about how many bytes it has placed in the destination buffer, because that number is always known in advance, independently of the hardware used. IMO, and from what I can see, I am not sure those statements hold for hardware on slow buses that applies its own performance optimizations to its algorithms. In any case, the way I see it, it would be much cleaner if tee_do_cipher_update got the dst_len and passed it up to the caller, and the caller did the necessary checks with real data.
Yes, thanks @jforissier. @ldts, I'd rather summarize it as: stream ciphers must not be buffered and block ciphers must only be supplied with full blocks. If those rules are followed
Yes, the way you put it is better, thanks @jenswi-linaro. The obvious question for me now is whether there is a plan (or interest?) to support hardware that does not comply with those requirements (i.e., use the data provided by the hardware instead of assuming what the hardware will produce).
One option is to handle it as in optee_os/lib/libmbedtls/core/aes_ctr.c, line 128 (at commit e7d7d25).
If the caller uses odd-sized buffers it's clearly not performance critical, so we can deal with it in software. I think we should avoid different behavior in the API depending on which implementation of the algorithm is used.
Thanks. I'll add a new API to the NXP stack to bypass any buffering and always put the requested number of bytes on the bus (I was trying to avoid changing the NXP API for maintainability reasons, but maybe this is indeed required).
@jenswi-linaro nah, it didn't make any difference. On AES-CTR the SE050 did consume the 13 bytes we put on the wire, but it generated no output and no errors indicating that it is waiting for more data before it starts processing. So the buffering occurs internally in the chip, not under software control. I am not sure I fully understand why you are pushing back on using the data that the hardware reports, though. Why is it such a big deal? I'll look into mbedtls_aes_crypt_ctr as well... but it doesn't seem right having to do this when all we need to do is listen to the hardware's progress.
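For reference, a minimal sketch (illustration only; the key/nonce values and the helper name are placeholders) of how mbedtls_aes_crypt_ctr keeps byte granularity across calls: the nc_off/stream_block pair carries the leftover keystream, so a 13-byte update followed by a 19-byte update produces exactly 13 and then 19 bytes of output:

```c
#include <mbedtls/aes.h>

int ctr_odd_sized_updates(void)
{
	mbedtls_aes_context aes;
	unsigned char key[16] = { 0 };           /* placeholder key */
	unsigned char nonce_counter[16] = { 0 }; /* placeholder nonce/counter */
	unsigned char stream_block[16];
	size_t nc_off = 0;                       /* keystream offset, preserved across calls */
	unsigned char in[32] = { 0 };            /* placeholder plaintext */
	unsigned char out[32];
	int ret;

	mbedtls_aes_init(&aes);
	ret = mbedtls_aes_setkey_enc(&aes, key, 128); /* CTR always uses the enc schedule */
	if (ret)
		goto out;

	/* 13 bytes in -> 13 bytes out; leftover keystream kept in stream_block/nc_off */
	ret = mbedtls_aes_crypt_ctr(&aes, 13, &nc_off, nonce_counter,
				    stream_block, in, out);
	if (ret)
		goto out;

	/* the next 19 bytes continue the very same keystream */
	ret = mbedtls_aes_crypt_ctr(&aes, 19, &nc_off, nonce_counter,
				    stream_block, in + 13, out + 13);
out:
	mbedtls_aes_free(&aes);
	return ret;
}
```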
The NXP stack you're mentioning, is that open source so I can have a look?
It is an API change of sorts you're proposing, and there are use cases which may break because of it. The problem is that we expect AES-CTR to behave as a stream cipher, while hardware (and optimized software) usually implements it as a block cipher, that is, it only operates on complete blocks. This is good when optimizing throughput for larger chunks of data, but it's not very useful if you only want to encrypt 13 bytes right now and maybe more later. Some software trick is needed to handle this. In the function I linked to above we handle it by encrypting a block of zeroes and doing the XORing in software for the odd-sized data. Since this has to be done somewhere, it's better done here than in each caller which expects a stream cipher implementation but finds a block cipher implementation.
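To make that concrete, here is a minimal sketch of the trick (not the actual OP-TEE aes_ctr.c code; the names soft_ctr, soft_ctr_init and soft_ctr_update are invented, and mbedtls ECB stands in for a block-only hardware engine): one keystream block is produced at a time, and the odd-sized data is XORed against the cached keystream in software.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <mbedtls/aes.h>

#define BLK 16

struct soft_ctr {
	mbedtls_aes_context aes;   /* stand-in for a block-only engine */
	uint8_t counter[BLK];      /* CTR counter block */
	uint8_t keystream[BLK];    /* E(counter) for the current block */
	size_t used;               /* keystream bytes already consumed (0 = none cached) */
};

static int soft_ctr_init(struct soft_ctr *c, const uint8_t key[16],
			 const uint8_t nonce[BLK])
{
	memset(c, 0, sizeof(*c));
	memcpy(c->counter, nonce, BLK);
	mbedtls_aes_init(&c->aes);
	return mbedtls_aes_setkey_enc(&c->aes, key, 128);
}

static void ctr_inc(uint8_t ctr[BLK])
{
	for (int i = BLK - 1; i >= 0; i--)
		if (++ctr[i])
			break;
}

/* Handles any length: full blocks and the odd-sized tail alike */
static void soft_ctr_update(struct soft_ctr *c, const uint8_t *src,
			    uint8_t *dst, size_t len)
{
	while (len--) {
		if (c->used == 0) {
			/*
			 * Block-only step: produce one keystream block.
			 * (Feeding a block of zeroes through a CTR engine
			 * yields the same keystream block.)
			 * Return value check omitted in this sketch.
			 */
			mbedtls_aes_crypt_ecb(&c->aes, MBEDTLS_AES_ENCRYPT,
					      c->counter, c->keystream);
			ctr_inc(c->counter);
		}
		/* the per-byte XOR is done in software */
		*dst++ = *src++ ^ c->keystream[c->used];
		c->used = (c->used + 1) % BLK;
	}
}
```

Since the keystream cache lives in the context, a 13-byte update followed by more data continues seamlessly, which is exactly the stream-cipher behaviour callers expect.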
I see. OK, I will revisit that as soon as I am done with RSA. Yes (thanks for looking into this), you can download it from this link: it is the EdgeLock SE050 development kit (you might need to create an account, but that is all). I have a work-in-progress branch https://github.com/ldts/optee_os/commits/se050 which I am still working on and need to clean/tidy up, but it passes the regression tests for those crypto operations (it is in flux, so maybe some of the missing fixes are still local on my office server; for instance, the block size I am using is 16 bytes but this branch was still tested with a commit changing it to 8... need to revert that). The original stack I used as a reference is there as well (so you could pick it up from https://github.com/ldts/optee_os/tree/se050/lib/libnxpse050/docs).
Note, I can't find anything in the GP spec mandating that some output shall always be generated whatever the input data size for stream ciphers (or block ciphers used in modes that effectively turn them into stream ciphers). The GP test suite (TEE_Initial_Configuration-Test_Suite_v1_1_0_4-2014_11_07) does not have meaningful test cases for this either. But I agree we should stick to the current behavior (and maybe document it clearly somewhere). |
needs further work (see) OP-TEE#3823
Got:      00:00:00:00 00:00:00:00 00:00:00:00 00:D2:DD:11 ................
          A8:F7:B0:AE 55:BE:61:7A E6:A1:6C:79 F4:62:51:7B ....U.az..ly.bQ{
          E9:7C:A0:31 0C:24:15:70 7F:47:37:69 E0:24:C3:29 .|.1.$.p.G7i.$.)
Expected: D2:DD:11:A8 F7:B0:AE:55 BE:61:7A:E6 A1:6C:79:F4 .......U.az..ly.
          62:51:7B:E9 7C:A0:31:0C 24:15:70:7F 47:37:69:E0 bQ{.|.1.$.p.G7i.
          24:C3:29:CD F2:26:69:FF 72:0E:3C:D1 A1:2F:5D:33 $.)..&i.r.<../]3
I just pushed an update to my working branch that does some minor cleaning and clarifies the AES CTR handling (as per this issue)... um, now it seems GitHub notifies here every time I force-push to my tree (I will probably remove the reference to this issue from the commit on the next update).
OK, so RSA is now functional (interesting what we need to do for the nopad case). One issue is that the SE050 will not provide the keypair back to the user, only a handle to it, so I might need to extend the keypair definition with a handle for the SE050 (not just the bignum struct). Coming back to this issue, could you please elaborate on your suggestions for a fix?
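For what it's worth, a hypothetical illustration of what such an extension could look like when the private key never leaves the secure element (the struct and field names are invented; this is not the actual OP-TEE struct rsa_keypair):

```c
#include <stdint.h>

struct bignum;                   /* opaque big-number type, as in OP-TEE */

struct se050_rsa_keypair {
	struct bignum *e;        /* public exponent, still visible to software */
	struct bignum *n;        /* modulus, still visible to software */
	uint32_t se050_key_id;   /* handle to the private key kept inside the chip */
};
```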
The function
Thanks. I implemented this workaround against the current API, and with the level of testing offered by the regression tests (4000) it looks good, so I will close this issue.
Use case:
Some crypto devices cache the input data and, during a crypto update operation, only process part of the requested number of bytes; only the processed bytes are placed in the destination buffer.
Problem:
The current crypto API assumes that every crypto update processes the full length of the source data, and therefore that the destination buffer contains exactly the number of bytes requested.
Solution:
The crypto API should allow the device to report the number of bytes actually placed in the destination buffer (see the hypothetical signature sketch after the question below).
Question: is there any ongoing work to address this situation?
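For illustration, a hypothetical sketch of the proposed shape (the function names and signatures are invented and do not match current OP-TEE prototypes): the driver reports how many bytes it really produced instead of the framework assuming produced == consumed.

```c
#include <stddef.h>
#include <stdint.h>

/* Implemented by the driver; not provided here (hypothetical interface) */
int hw_cipher_update(void *ctx, const uint8_t *src, size_t src_len,
		     uint8_t *dst, size_t *dst_len);

/* Caller side: append only the bytes the driver reports */
static int collect(void *ctx, const uint8_t *src, size_t src_len,
		   uint8_t *out, size_t *out_bytes)
{
	size_t produced = 0;
	int ret = hw_cipher_update(ctx, src, src_len, out + *out_bytes,
				   &produced);

	if (!ret)
		*out_bytes += produced; /* may grow by less than src_len */
	return ret;
}
```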