-
Notifications
You must be signed in to change notification settings - Fork 178
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Tests are failing on arm and aarch64 #1057
Comments
Thanks for that. My suspicion is it might be timing related(?). I tried to make those tests all use deferreds, i.e. each step of the process waits for the previous to complete, but it doesn't currently work like that for the websocket test (checkNotifs is called after a 0.1 second wait). I'm not sure when I'll get round to looking at it, but maybe you could edit this line to wait for say 1 second, to see if it's about timing, just to check my hypothesis. |
Does not help, tried to increase that number even to 10 seconds. |
Same happens also on 32-bit ARM (Raspbian). Both are based on RaspiBolt tutorial. |
Same happens on x86_64.
However, unlike @kristapsk, I was able to get the test to pass by increasing the delays. However, adding delays to make a unit test pass is an anti-pattern and is not the correct answer here. The speed and load of the system should affect only how long a program takes to execute, not whether it executes correctly. |
Reviewing this again in view of recent test failures (including #1421 ): if we look at the relevant portion of the test code: joinmarket-clientserver/jmclient/test/test_wallet_rpc.py Lines 168 to 185 in c503394
... the sequence of events is: 1/ schedule the firing of the notification, then 2/ add a callback after that notification firing event, which waits again before calling Is there some reason we could imagine that the message over the network gets substantially delayed, in some situations? There is some intrinsic timing dependence here, the logic is "the test passes if the notification is received, after some time" - so perhaps I just have to allow the test to repeatedly try to receive the notification, if there is some weird situation where simple localhost test code experiences a substantial network delay? Best I can think of is to just try in a loop, maybe 10 times, over 10 seconds; we have to give up at some point but obviously usually there should be no meaningful delay. Note that this is websocket code using autobahn. I cannot currently reproduce the error on my machine, and I currently don't know what circumstances could cause it, and can't really debug without the ability to reproduce, I can only guess. This is an example of the timings I see (code lines are as above):
The first two are basically instant of course; we just scheduled a future call on line 168, so it takes no time. Then there is the expect 0.1 delay before the notification gets "fired" and sent over the wire. Then there is a less predictable delay before the callback from that sending code returns - it's about 0.06 seconds here and seems to stay around that area, on my machine, in this test, but the whole point of it being a callback is it shouldn't matter. The only issue is the delta between when 'text message received' arrives and when the call to checkNotifs occurs, which is currently 0.1 seconds after the call to sendNotification occurs. I'll build a PR based around that looping idea (but with no additional delay by default) and just see if it passes. |
I'm still having this on Odroid HC4 (arm64) running Ubuntu Server (focal) / RaspiBolt:
|
Solved with #1534? |
This may or may not fix JoinMarket-Org#1057 and other test failures. Since the network delay in receiving the message is unknown (but expected to be very small, usually), we keep trying to receive notification of the transaction message in the websocket test 10 times for a total of 2 seconds (on the assumption that if it takes longer than that, something else is wrong).
Tests are passing for me on 64-bit ARM now, cannot reproduce anymore. |
The text was updated successfully, but these errors were encountered: