Retry once (or more?) on any TransportError #18

Closed
simonw opened this issue Feb 18, 2022 · 12 comments
Labels
bug Something isn't working

Comments

simonw commented Feb 18, 2022

Got this exception:

  File "/Users/simon/Dropbox/Development/google-drive-to-sqlite/google_drive_to_sqlite/utils.py", line 79, in get
    response = httpx.get(url, params=params, headers=headers, timeout=self.timeout)
  File "/Users/simon/.local/share/virtualenvs/google-drive-to-sqlite-Wr1nXkpK/lib/python3.10/site-packages/httpx/_api.py", line 189, in get
    return request(
  File "/Users/simon/.local/share/virtualenvs/google-drive-to-sqlite-Wr1nXkpK/lib/python3.10/site-packages/httpx/_api.py", line 100, in request
    return client.request(
  File "/Users/simon/.local/share/virtualenvs/google-drive-to-sqlite-Wr1nXkpK/lib/python3.10/site-packages/httpx/_client.py", line 802, in request
    return self.send(request, auth=auth, follow_redirects=follow_redirects)
  File "/Users/simon/.local/share/virtualenvs/google-drive-to-sqlite-Wr1nXkpK/lib/python3.10/site-packages/httpx/_client.py", line 889, in send
    response = self._send_handling_auth(
  File "/Users/simon/.local/share/virtualenvs/google-drive-to-sqlite-Wr1nXkpK/lib/python3.10/site-packages/httpx/_client.py", line 917, in _send_handling_auth
    response = self._send_handling_redirects(
  File "/Users/simon/.local/share/virtualenvs/google-drive-to-sqlite-Wr1nXkpK/lib/python3.10/site-packages/httpx/_client.py", line 954, in _send_handling_redirects
    response = self._send_single_request(request)
  File "/Users/simon/.local/share/virtualenvs/google-drive-to-sqlite-Wr1nXkpK/lib/python3.10/site-packages/httpx/_client.py", line 990, in _send_single_request
    response = transport.handle_request(request)
  File "/Users/simon/.local/share/virtualenvs/google-drive-to-sqlite-Wr1nXkpK/lib/python3.10/site-packages/httpx/_transports/default.py", line 217, in handle_request
    with map_httpcore_exceptions():
  File "/Users/simon/.pyenv/versions/3.10.0/lib/python3.10/contextlib.py", line 153, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/Users/simon/.local/share/virtualenvs/google-drive-to-sqlite-Wr1nXkpK/lib/python3.10/site-packages/httpx/_transports/default.py", line 77, in map_httpcore_exceptions
    raise mapped_exc(message) from exc
httpx.RemoteProtocolError: Server disconnected without sending a response.

Would be good to retry once if this happens.

simonw added the bug label Feb 18, 2022

simonw commented Feb 19, 2022

https://www.python-httpx.org/exceptions/#the-exception-hierarchy shows the exception hierarchy:

[image: diagram of the httpx exception hierarchy]

simonw commented Feb 19, 2022

I want to retry on any form of TransportError, I think. There's no point retrying a DecodingError or a TooManyRedirects error.
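A minimal sketch of what "retry on any TransportError" could look like. The exception classes below are local stand-ins that mirror the relevant part of the httpx hierarchy so the example runs without httpx installed, and `call_with_retry` is a hypothetical helper name, not the code that ended up in the project. In real code the `retry_on` argument would be `(httpx.TransportError,)`.

```python
import time


# Local stand-ins mirroring part of the httpx exception hierarchy.
# Assumption: real code would catch httpx.TransportError instead.
class RequestError(Exception):
    pass


class TransportError(RequestError):
    pass


class RemoteProtocolError(TransportError):
    pass


class DecodingError(RequestError):
    # Not a TransportError, so it is never retried below.
    pass


def call_with_retry(fn, retry_on=(TransportError,), retries=1, delay=0.0):
    """Call fn(); if it raises one of retry_on, try again up to `retries` times."""
    attempt = 0
    while True:
        try:
            return fn()
        except retry_on:
            if attempt >= retries:
                raise  # out of retries: surface the original error
            attempt += 1
            time.sleep(delay)
```

Because `RemoteProtocolError` subclasses `TransportError`, the error from the traceback above would be retried, while `DecodingError` propagates on the first failure.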

simonw commented Feb 19, 2022

Sadly it looks like httpx itself has decided not to implement retry logic, so I need to build this myself:

simonw changed the title from "httpx.RemoteProtocolError error" to "Retry once (or more?) on any TransportError" Feb 19, 2022

simonw commented Feb 19, 2022

While testing this I'm going to want to see if any transport errors have occurred - I think I'll add a -v/--verbose flag to the google-drive-to-sqlite files command.
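A rough sketch of a `-v/--verbose` flag that reports transport errors. This uses `argparse` purely as an assumption to keep the example standard-library only; the thread does not say how the real command defines its options, and `build_parser` and `make_logger` are hypothetical names. Messages go to stderr so they stay out of any output redirected from stdout.

```python
import argparse
import sys


def build_parser():
    """Hypothetical parser with only the verbose flag discussed here."""
    parser = argparse.ArgumentParser(prog="files-sketch")
    parser.add_argument(
        "-v", "--verbose", action="store_true",
        help="Report transport errors and retries on stderr",
    )
    return parser


def make_logger(verbose, stream=None):
    """Return a log(message) function that is a no-op unless verbose is set."""
    def log(message):
        if verbose:
            target = stream if stream is not None else sys.stderr
            print(message, file=target)
    return log
```

A retry loop could then call something like `log("TransportError, retrying")` each time it catches an error, which makes it visible during a long run whether any retries actually happened.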

simonw commented Feb 19, 2022

I'm only going to retry GET, I won't retry POST.
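The usual reasoning is that a GET can be repeated safely, while a POST may already have been processed by the server before the connection dropped. A small sketch of gating retries on the method, under some assumptions: `TransportError` is a local stand-in for `httpx.TransportError`, `send` and `do_request` are hypothetical names, and `allow_retry` borrows the name from the commit message on this issue without claiming to match how that commit uses it.

```python
class TransportError(Exception):
    """Stand-in for httpx.TransportError so the sketch runs without httpx."""


def send(method, do_request, retries=1):
    """Run do_request(); retry transport errors only for GET requests."""
    allow_retry = method.upper() == "GET"
    attempts = (1 + retries) if allow_retry else 1
    for attempt in range(attempts):
        try:
            return do_request()
        except TransportError:
            if attempt == attempts - 1:
                raise  # last permitted attempt: let the error propagate
```

With `retries=1`, a GET that fails once gets a second attempt, and a POST that fails raises immediately.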

simonw added a commit that referenced this issue Feb 20, 2022
Because I need to use allow_retry for #18
simonw closed this as completed in 0a419f5 Feb 20, 2022

simonw commented Feb 20, 2022

Now manually testing this by running:

google-drive-to-sqlite files --folder 1E6Zg2X2bjjtPzVfX8YqdXZDCoB3AVA7i --nl --verbose > all-files.json-nl.txt

And keeping an eye on it while it runs with:

watch 'wc -l all-files.json-nl.txt && ls -lah all-files.json-nl.txt'

Started it running at 4:31pm.

simonw commented Feb 20, 2022

It's at 37,223 lines in all-files.json-nl.txt and 49MB now, 25 minutes after starting.

simonw commented Feb 20, 2022

That actually worked! It produced a 162MB file, with no errors.

simonw commented Feb 20, 2022

Now running this to see what happens:

 time google-drive-to-sqlite files all-files.db --import-nl all-files.json-nl.txt
43.24s user 94.07s system 71% cpu 3:13.06 total

Produced an 80MB SQLite file, presumably thanks to the owners data being de-duplicated.

simonw commented Feb 20, 2022

image

I'm suspicious of the 14,100 rows in the drive_users table.

simonw commented Feb 20, 2022

Confirmed, something went very wrong there:

image

88 rows where permissionId is not null, 14,012 rows where permissionId is null.
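A check like this can be reproduced with a single query. The sketch below builds a tiny in-memory table to stay self-contained; the `drive_users` table and `permissionId` column come from this thread, while the `displayName` column and the sample rows are made up for illustration. Against the real file you would connect to `all-files.db` instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# displayName and these rows are illustrative assumptions, not real data.
conn.execute("create table drive_users (permissionId text, displayName text)")
conn.executemany(
    "insert into drive_users values (?, ?)",
    [("p1", "A"), ("p2", "B"), (None, "C"), (None, "C"), (None, "D")],
)

# count(column) skips nulls, so the difference from count(*) is the null count.
with_id, without_id = conn.execute(
    "select count(permissionId), count(*) - count(permissionId) from drive_users"
).fetchone()
print(with_id, without_id)
```

A large number of rows with a null `permissionId` would be the same symptom described above.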

simonw commented Feb 20, 2022

Fixed that bug:

image

simonw added a commit that referenced this issue Feb 20, 2022