downloadcmd fails during large concurrent downloads #94

ericearl · 2024-04-03T16:13:53Z

Description

When I run 7,000+ jobs that call downloadcmd in parallel (so the timing of each downloadcmd call is unpredictable), I think they collide on either ~/NDA/nda-tools/downloadcmd/packages/MYPACKAGEID/.download-progress/download-job-manifest.csv or ~/NDA/nda-tools/downloadcmd/packages/MYPACKAGEID/.download-progress/SOME-SPECIAL-HASH/download-progress-report.csv. This causes the majority of downloads to fail with the error below, and my downstream data conversions fail as a result.

Suggestions

Perhaps downloadcmd could provide an option to skip the download-job-manifest.csv, the download-progress-report.csv, or both? I already have to purge these files in order to retry failed conversion jobs that include downloadcmd in the pipeline.
Alternatively, could the downloadcmd routine have a way to keep only temporary download-job-manifest.csv and download-progress-report.csv files?
Keep a lock on the CSV files while they are being written to, and check that the lock is not active before trying to write. Then send a message back to the user if they hit the lock.
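
To make the locking suggestion concrete, here is a minimal sketch of what I have in mind. It is not taken from NDATools: the names `manifest_lock`, `append_row`, and `ManifestLocked` are hypothetical, and it assumes a sidecar `.lock` file next to the CSV is acceptable. A crashed process would leave a stale lock behind, so a real implementation would need some cleanup or timeout on top of this.

```python
import csv
import os
from contextlib import contextmanager


class ManifestLocked(RuntimeError):
    """Raised when another process already holds the lock (hypothetical name)."""


@contextmanager
def manifest_lock(csv_path):
    """Hold an exclusive sidecar lock file for the duration of the block."""
    lock_path = csv_path + ".lock"
    try:
        # O_CREAT | O_EXCL makes creation fail if the lock file already exists.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        raise ManifestLocked(
            f"{csv_path} is locked by another process; retry later"
        ) from None
    try:
        yield
    finally:
        # Always release the lock, even if the write fails.
        os.close(fd)
        os.remove(lock_path)


def append_row(csv_path, row):
    """Append one row to the CSV while holding the lock (hypothetical helper)."""
    with manifest_lock(csv_path):
        with open(csv_path, "a", newline="") as f:
            csv.writer(f).writerow(row)
```

A caller that hits `ManifestLocked` could print the message and exit, or sleep and retry, which would cover the "send a message back to the user" part of the suggestion.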

Error

Running NDATools Version 0.2.26
Traceback (most recent call last):
  File "/home/earlea/.local/bin/downloadcmd", line 8, in <module>
    sys.exit(main())
  File "/home/earlea/.local/lib/python3.10/site-packages/NDATools/clientscripts/downloadcmd.py", line 185, in main
    s3Download = Download(config, args)
  File "/home/earlea/.local/lib/python3.10/site-packages/NDATools/Download.py", line 173, in __init__
    self.download_progress_report_file_path = self.initialize_verification_files()
  File "/home/earlea/.local/lib/python3.10/site-packages/NDATools/Download.py", line 685, in initialize_verification_files
    job_record = self.find_matching_download_job(download_job_manifest_path)
  File "/home/earlea/.local/lib/python3.10/site-packages/NDATools/Download.py", line 650, in find_matching_download_job
    if is_job_match(job):
  File "/home/earlea/.local/lib/python3.10/site-packages/NDATools/Download.py", line 644, in is_job_match
    return all(map(test_match, must_match))
  File "/home/earlea/.local/lib/python3.10/site-packages/NDATools/Download.py", line 632, in test_match
    val1 = Utils.convert_to_abs_path(val1)
  File "/home/earlea/.local/lib/python3.10/site-packages/NDATools/Utils.py", line 231, in convert_to_abs_path
    return os.path.abspath(os.path.expanduser(os.path.expandvars(file_name)))
  File "/usr/local/Anaconda/envs/py3.10/lib/python3.10/posixpath.py", line 287, in expandvars
    path = os.fspath(path)
TypeError: expected str, bytes or os.PathLike object, not NoneType
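
One possible reading of this traceback, which I have not confirmed against the NDATools source: if the manifest is parsed with `csv.DictReader` and one process reads it while another is partway through writing a row, the truncated row comes back with `None` for the missing columns, and that `None` then reaches `os.path.expandvars`. The sketch below reproduces the same `TypeError` in isolation; the column names are made up for illustration and are not the real manifest columns.

```python
import csv
import io
import os

# Hypothetical manifest contents; the second data row is cut off mid-write.
manifest = io.StringIO(
    "job_id,download_directory\n"
    "1,/tmp/a\n"
    "2\n"
)

rows = list(csv.DictReader(manifest))
truncated = rows[1]

# DictReader fills columns missing from a short row with restval (None by default).
missing_value = truncated["download_directory"]

try:
    os.path.expandvars(missing_value)
    error = None
except TypeError as exc:
    # Same exception type and message as at the bottom of the traceback.
    error = str(exc)

print(missing_value, error)
```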

j-tseng · 2024-07-17T19:05:18Z

@gregmagdits - FYI this post might have the details on the file collisions you were looking for on issue #101.

gregmagdits mentioned this issue Jul 18, 2024

Allow configuration of the NDA_ORGINIZATION_ROOT_FOLDER #101

Open
