[CT-1849] [Bug] Unhandled exception in dbt deps with a bad/truncated download #6653

Closed
2 tasks done
barberscott opened this issue Jan 19, 2023 · 2 comments · Fixed by #8182
Labels: bug (Something isn't working), deps (dbt's package manager)

@barberscott

Is this a new bug in dbt-core?

  • I believe this is a new bug in dbt-core
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

During a GitHub outage, dbt deps failed, presumably because codeload.github.com sent us a truncated download of a package gzip. We do not appear to handle a failure during gzip decompression, so deps fails outright instead of retrying the download.

Expected Behavior

Deps would retry the download per its normal retry behavior rather than crashing with an unhandled exception.

Steps To Reproduce

Intermittent -- not reproducible on demand.

Relevant log output

Traceback (most recent call last):
  File "/usr/src/app/sinter/clients/dbt.py", line 1417, in call
    dbt_main.handle(command + extra_args)
  File "/usr/local/lib/python3.8/dist-packages/dbt/main.py", line 152, in handle
    res, success = handle_and_check(args)
  File "/usr/local/lib/python3.8/dist-packages/dbt/main.py", line 192, in handle_and_check
    task, res = run_from_args(parsed)
  File "/usr/local/lib/python3.8/dist-packages/dbt/main.py", line 246, in run_from_args
    results = task.run()
  File "/usr/local/lib/python3.8/dist-packages/dbt/task/deps.py", line 67, in run
    package.install(self.config, renderer)
  File "/usr/local/lib/python3.8/dist-packages/dbt/deps/registry.py", line 83, in install
    connection_exception_retry(download_untar_fn, 5)
  File "/usr/local/lib/python3.8/dist-packages/dbt/utils.py", line 608, in _connection_exception_retry
    return fn()
  File "/usr/local/lib/python3.8/dist-packages/dbt/deps/registry.py", line 95, in download_and_untar
    system.untar_package(tar_path, deps_path, package_name)
  File "/usr/local/lib/python3.8/dist-packages/dbt/clients/system.py", line 489, in untar_package
    tarball.extractall(dest_dir)
  File "/usr/lib/python3.8/tarfile.py", line 2028, in extractall
    self.extract(tarinfo, path, set_attrs=not tarinfo.isdir(),
  File "/usr/lib/python3.8/tarfile.py", line 2069, in extract
    self._extract_member(tarinfo, os.path.join(path, tarinfo.name),
  File "/usr/lib/python3.8/tarfile.py", line 2141, in _extract_member
    self.makefile(tarinfo, targetpath)
  File "/usr/lib/python3.8/tarfile.py", line 2190, in makefile
    copyfileobj(source, target, tarinfo.size, ReadError, bufsize)
  File "/usr/lib/python3.8/tarfile.py", line 247, in copyfileobj
    buf = src.read(bufsize)
  File "/usr/lib/python3.8/gzip.py", line 292, in read
    return self._buffer.read(size)
  File "/usr/lib/python3.8/_compression.py", line 68, in readinto
    data = self.read(len(byte_view))
  File "/usr/lib/python3.8/gzip.py", line 498, in read
    raise EOFError("Compressed file ended before the "
EOFError: Compressed file ended before the end-of-stream marker was reached


Environment

No response

Which database adapter are you using with dbt?

other (mention it in "Additional Context")

Additional Context

Observed in dbt Cloud.
@barberscott barberscott added bug Something isn't working triage labels Jan 19, 2023
@github-actions github-actions bot changed the title [Bug] Unhandled exception in dbt deps with a bad/truncated download [CT-1849] [Bug] Unhandled exception in dbt deps with a bad/truncated download Jan 19, 2023
@jtcohen6
Contributor

Thanks @barberscott! We already wrap this in retry logic:

download_untar_fn = functools.partial(
    self.download_and_untar, download_url, str(tar_path), deps_path, package_name
)
connection_exception_retry(download_untar_fn, 5)

def download_and_untar(self, download_url, tar_path, deps_path, package_name):
    """
    Sometimes the download of the files fails and we want to retry. Sometimes the
    download appears successful but the file did not make it through as expected
    (generally due to a github incident). Either way we want to retry downloading
    and untarring to see if we can get a success. Call this within
    `_connection_exception_retry`
    """
    system.download(download_url, tar_path)
    system.untar_package(tar_path, deps_path, package_name)

I think the relevant detail here is that _connection_exception_retry only retries on RequestException or ReadError:

dbt-core/core/dbt/utils.py

Lines 608 to 620 in 99f27de

def _connection_exception_retry(fn, max_attempts: int, attempt: int = 0):
    """Attempts to run a function that makes an external call, if the call fails
    on a Requests exception or decompression issue (ReadError), it will be tried
    up to 5 more times. All exceptions that Requests explicitly raises inherit from
    requests.exceptions.RequestException. See https://github.com/dbt-labs/dbt-core/issues/4579
    for context on this decompression issues specifically.
    """
    try:
        return fn()
    except (
        requests.exceptions.RequestException,
        ReadError,
    ) as exc:

And this is an EOFError! Simplest thing here is to add it to the except list above.

A more thorough approach would see if there are any other exceptions that might crop up along the way there (from tarfile.extractall through gzip.read).
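
One way to probe which exceptions can surface along that path is to simulate a truncated download locally. The sketch below is not taken from dbt-core: `make_tarball` and `extraction_error` are hypothetical helpers that build a small `.tar.gz` in memory, cut it in half, and report whatever `tarfile.extractall` raises. On interpreters like the one in the traceback this is expected to be `EOFError`, but the exact type may depend on the Python version and on where the archive is cut, so treat the result as an observation rather than a guarantee.

```python
import io
import os
import tarfile
import tempfile


def make_tarball(truncate: bool, payload_size: int = 200_000) -> bytes:
    """Build a .tar.gz in memory; optionally cut it in half to mimic a partial download."""
    payload = os.urandom(payload_size)  # random bytes barely compress, so the cut lands in file data
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        info = tarfile.TarInfo(name="pkg/data.bin")
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    data = buf.getvalue()
    return data[: len(data) // 2] if truncate else data


def extraction_error(data: bytes):
    """Extract the archive into a temp dir and return the exception raised, or None."""
    # Pass an extraction filter only where this Python supports one.
    kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
    with tempfile.TemporaryDirectory() as dest:
        try:
            with tarfile.open(fileobj=io.BytesIO(data), mode="r:gz") as tar:
                tar.extractall(dest, **kwargs)
        except Exception as exc:  # deliberately broad: the goal is to see what comes out
            return exc
    return None


if __name__ == "__main__":
    err = extraction_error(make_tarball(truncate=True))
    print(type(err).__name__, err)
```

Varying the cut point (for example, truncating inside the first 512-byte header instead of the file body) is a cheap way to check whether anything other than `ReadError` and `EOFError` shows up.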

@jtcohen6 jtcohen6 added Team:Language deps dbt's package manager and removed triage labels Jan 19, 2023
@jtcohen6
Contributor

We haven't seen other exception types in the wild. Let's apply the narrower fix for now: adding EOFError to the set of retryable exception types in _connection_exception_retry.
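
For illustration, the narrower fix might look roughly like the sketch below. This is a simplified stand-in, not the dbt-core implementation: `retry_call`, `RETRYABLE`, and the `delay` parameter are hypothetical names, and `requests.exceptions.RequestException` is left out so the example runs with the standard library alone. The point is only that `EOFError` sits next to `ReadError` in the tuple of exceptions that trigger another attempt.

```python
import tarfile
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")

# Assumption: the real function also retries on requests.exceptions.RequestException.
RETRYABLE: Tuple[Type[BaseException], ...] = (tarfile.ReadError, EOFError)


def retry_call(
    fn: Callable[[], T],
    max_attempts: int,
    delay: float = 0.0,
    retryable: Tuple[Type[BaseException], ...] = RETRYABLE,
) -> T:
    """Call fn(); on a retryable exception, try again up to max_attempts more times."""
    attempt = 0
    while True:
        try:
            return fn()
        except retryable:
            attempt += 1
            if attempt > max_attempts:
                raise  # out of retries: surface the last error
            time.sleep(delay)


if __name__ == "__main__":
    calls = {"n": 0}

    def flaky_download_and_untar() -> str:
        # Fails twice with the error from the traceback, then succeeds.
        calls["n"] += 1
        if calls["n"] < 3:
            raise EOFError("Compressed file ended before the end-of-stream marker was reached")
        return "ok"

    print(retry_call(flaky_download_and_untar, 5), "after", calls["n"], "calls")
```

Any exception outside the tuple still propagates immediately, which keeps the change as narrow as the comment above proposes.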
