Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Standardize connection errors #823

Merged
merged 3 commits into from
Mar 2, 2020
Merged

Conversation

sssoleileraaa
Copy link
Contributor

@sssoleileraaa sssoleileraaa commented Feb 26, 2020

Description

Resolves #798
Resolves #811

Test Plan

for #811:

server connect error

  1. run the client against staging
  2. pause the staging server vm
  3. send a reply and see "The SecureDrop server cannot be reached. Trying to reconnect..." without a retry link
  4. send another reply and see the same message without a retry link
  5. unpause the staging server vm
  6. see the error message go away

request timeout error

  1. apply this diff so a reply will timeout:
--- a/securedrop_client/api_jobs/uploads.py
+++ b/securedrop_client/api_jobs/uploads.py
@@ -96,7 +96,7 @@ class SendReplyJob(ApiJob):
         # TODO: Once https://github.com/freedomofpress/securedrop-client/issues/648, we will want to
         # pass the default request timeout to reply_source instead of setting it on the api object
         # directly.
-        api_client.default_request_timeout = 5
+        api_client.default_request_timeout = 0.01
         return api_client.reply_source(sdk_source, encrypted_reply, self.reply_uuid)
  1. run the client
  2. send a reply and see "The SecureDrop server cannot be reached. Trying to reconnect..." without a retry link
  3. when then next sync succeeds, see the error message go away (note: the error message will show up again because the reply with a 0.01s timeout will try again and always timeout)

for #798:

server connect error

  1. run the client against staging
  2. pause the staging server vm
  3. do not send any replies and wait for the next sync (the sync should fail) and see "The SecureDrop server cannot be reached. Trying to reconnect..."
  4. unpause the staging server vm
  5. see the error message go away

request timeout error

  1. apply this diff so a sync will timeout:
--- a/securedrop_client/api_jobs/sync.py
+++ b/securedrop_client/api_jobs/sync.py
@@ -36,7 +36,7 @@ class MetadataSyncJob(ApiJob):
         # TODO: Once https://github.com/freedomofpress/securedrop-client/issues/648, we will want to
         # pass the default request timeout to api calls instead of setting it on the api object
         # directly.
-        api_client.default_request_timeout = 40
+        api_client.default_request_timeout = 0.01
         remote_sources, remote_submissions, remote_replies = get_remote_data(api_client)
 
         update_local_storage(session,
  1. run the client
  2. the first sync should fail and you should see "The SecureDrop server cannot be reached. Trying to reconnect..."

Checklist

If these changes modify code paths involving cryptography, the opening of files in VMs or network (via the RPC service) traffic, Qubes testing in the staging environment is required. For fine tuning of the graphical user interface, testing in any environment in Qubes is required. Please check as applicable:

  • I have tested these changes in the appropriate Qubes environment
  • I do not have an appropriate Qubes OS workstation set up (the reviewer will need to test these changes)
  • These changes should not need testing in Qubes

@sssoleileraaa
Copy link
Contributor Author

This is blocked until #820 is fixed.

@sssoleileraaa sssoleileraaa marked this pull request as ready for review February 26, 2020 01:59
@sssoleileraaa sssoleileraaa force-pushed the standardize-connection-errors branch from 88764b6 to 16b30f1 Compare February 27, 2020 23:50
@sssoleileraaa sssoleileraaa force-pushed the standardize-connection-errors branch from 16b30f1 to 94b6bb6 Compare February 27, 2020 23:56
tests/test_logic.py Outdated Show resolved Hide resolved
@redshiftzero
Copy link
Contributor

redshiftzero commented Feb 28, 2020

OK so I started looking at this but didn't finish, manual testing thus far:

#811, server connection error - I did not see the error go away after reconnecting my staging server. this might be due to something wrong in my local environment, but didn't get further here.
#811, request timeout error
#798, server connection error - did not test yet
#798, request timeout error

@creviera can you reconfirm in qubes that all looks good? then I'll (or anyone else who beats me to it) take a spin tomorrow

@sssoleileraaa
Copy link
Contributor Author

server connect error

  1. run the client against staging
  2. pause the staging server vm
  3. send a reply and see "The SecureDrop server cannot be reached. Trying to reconnect..." without a retry link
  4. send another reply and see the same message without a retry link
  5. unpause the staging server vm
  6. see the error message go away

I'm also having trouble testing the server connection error with these steps (unpausing vms is finicky), so here are alternative steps to test server connect errors while I simultaneously look into why unpausing doesn't bring the server back up (why we can't even get to the journalist interface):

for #811:

  1. apply this diff:
--- a/securedrop_client/api_jobs/uploads.py
+++ b/securedrop_client/api_jobs/uploads.py
@@ -94,6 +94,11 @@ class SendReplyJob(ApiJob):
                 uuid=self.reply_uuid, e=e))
 
     def _make_call(self, encrypted_reply: str, api_client: API) -> sdclientapi.Reply:
+        import random
+        if random.choice([0, 1]):
+            self.remaining_attempts = 0;
+            raise ServerConnectionError()
+
         sdk_source = sdclientapi.Source(uuid=self.source_uuid)
  1. run the client
  2. send a reply or a few until you see it stay in pending with the error message "The SecureDrop server cannot be reached. Trying to reconnect..."
  3. wait until sync (sync will be able to connect to the server and will unpause the queue so the reply job gets retried)
  4. see the error message go away (unless the random choice code in the diff raises another ServerConnectionError, in that case wait until the error is not raised again after sync)

for #798:

  1. apply this diff:
--- a/securedrop_client/sync.py
+++ b/securedrop_client/sync.py
@@ -116,6 +116,12 @@ class ApiSyncBackgroundTask(QObject):
             self.sync_started.emit()
             session = self.session_maker()
             self.job.remaining_attempts = 2
+
+            import random
+            if random.choice([0, 1]):
+                self.job.failure_signal.emit(ServerConnectionError())
+                return
+
             self.job._do_call_api(self.api_client, session)
         except ApiInaccessibleError as e:
             self.job.failure_signal.emit(e)  # the job's failure signal is not emitted in base
  1. run the client
  2. 50% likely the first sync when the client starts up will fail, otherwise keep waiting for the next sync to fail
  3. See "The SecureDrop server cannot be reached. Trying to reconnect..." and see it go away when the next sync succeeds

@sssoleileraaa
Copy link
Contributor Author

i'm going to squash some of these commits

@sssoleileraaa sssoleileraaa force-pushed the standardize-connection-errors branch from 2d2278e to d848942 Compare February 28, 2020 18:54
@sssoleileraaa sssoleileraaa force-pushed the standardize-connection-errors branch from d848942 to ab7465f Compare March 2, 2020 19:40
Copy link
Contributor

@emkll emkll left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @creviera functional testing and visual review looks good to me. Tested in Qubes with a physical instance (I used sudo service apache2 stop/start to simulate a ServerConnectionError, and stopping Tor to simulate a RequestTimeoutError), and also using the diff you've provided (including #823 (comment)

Test Plan

for #811:

server connect error

  1. run the client against prod server
  2. disable apache2
  3. ✔️ send a reply and see "The SecureDrop server cannot be reached. Trying to reconnect..." without a retry link
  4. ✔️ send another reply and see the same message without a retry link
  5. enable apache2
  6. ✔️ see the error message go away. The message is not sent due to Replies get stuck in pending state #825 . restarting the client, the messages appear as sent

Also reproduced this with the diff you've provided in #823 (comment)

request timeout error

  1. apply diff so a reply will timeout:
  2. run the client
  3. ✔️ send a reply and see "The SecureDrop server cannot be reached. Trying to reconnect..." without a retry link
  4. ✔️ when then next sync succeeds, see the error message go away (note: the error message will show up again because the reply with a 0.01s timeout will try again and always timeout)

for #798:

server connect error

  1. run the client against prod server
  2. apache2 stop
  3. ✔️ do not send any replies and wait for the next sync (the sync should fail) and see "The SecureDrop server cannot be reached. Trying to reconnect..."
  4. apache2 start
  5. ✔️ see the error message go away

request timeout error

  1. apply this diff so a sync will timeout:
  2. run the client
  3. ✔️ the first sync should fail and you should see "The SecureDrop server cannot be reached. Trying to reconnect..."

@emkll emkll merged commit 19bda64 into master Mar 2, 2020
@emkll emkll deleted the standardize-connection-errors branch March 2, 2020 20:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants