Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

zarr re-upload might still be suffering from "A header you provided implies functionality that is not implemented" #1408

Closed
yarikoptic opened this issue Feb 16, 2024 · 17 comments · Fixed by #1444
Assignees
Labels
cmd-upload HIFNI Zarr uploads failing with "A header you provided implies functionality that is not implemented" released zarr

Comments

@yarikoptic
Copy link
Member

I was trying to produce 2nd version of a sample zarr under https://dandiarchive.org/dandiset/000029/draft/files?location=sub-randomzarrlike&page=1
https://dandi.centerforopenneuroscience.org/zarrs/fd6/ab3/fd6ab3ea-cff6-4006-a9bf-acfa5d983985/
which is also freshly created https://github.com/dandizarrs/fd6ab3ea-cff6-4006-a9bf-acfa5d983985/ .

Was created using python3 mktree.py /home/yoh/proj/dandi/dandisets/000029/sub-randomzarrlike/sub-randomzarrlike_junk.zarr layouts/zarr128-smallfiles.json and now "modified" using python3 mktree.py /home/yoh/proj/dandi/dandisets/000029/sub-randomzarrlike/sub-randomzarrlike_junk.zarr layouts/zarr64-smallfiles.json (after rm -rf prior existing subfolders) so that checksum went from af1b4b5849cd2b79a29762d716863379-37480--38379520 to 422bd4b0fa7a06d2655d3c1f67fd4a6f-263522--269846528.

During re-upload screen was showing

❯ dandi upload --validation ignore -J 5:200 ./sub-randomzarrlike_junk.zarr
2024-02-16 15:16:27,754 [    INFO] Found 2 files to consider
PATH                                            SIZE       ERRORS     UPLOAD STATUS                MESSAGE                  
dandiset.yaml                                   3.0 kB                       skipped               should be edited online  
sub-randomzarrlike/sub-randomzarrlike_junk.zarr               1              producing asset       exists - reuploading     

and that "producing asset" made me worry since AFAIK we should have just modified existing zarr, not generate a new one... but it talks about "asset", not "zarr", so hope remained...

Then dandi CLI was showing no progress or anything with 100% CPU in comparing against remote Zarr exists - reuploading state for a few minutes but then upload commenced. I have used -J 5:200 which resulted in dandi CLI somehow taking over 400% CPU according to top... I decided to look into the log file (on my laptop) 20240216201625Z-3182147.log and saw those errors search for which brought me to

2024-02-16T15:16:25-0500 [INFO    ] dandi 3182147:140109695737920 dandi v0.59.1+2.g7ca670b9, hdmf v3.12.0, pynwb v2.5.0, h5py v3.10.0
...
2024-02-16T15:37:17-0500 [WARNING ] dandi 3182147:140099501135552 Will retry: Error 501 while sending PUT request to https://dandiarchive.s3.amazonaws.com/zarr/fd6ab3ea-cff6-4006-a9bf-acfa5d983985/0/0/d0/d2/d26/f1.dat?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=SENSORED240216%2Fus-east-2%2Fs3%2Faws4_request&X-Amz-Date=20240216T203627Z&X-Amz-Expires=600&X-Amz-SignedHeaders=content-md5%3Bhost%3Bx-amz-acl&X-Amz-Signature=SENSOREDabc18760fa95f5a1d68d81b26cde2092acc8d49: <?xml version="1.0" encoding="UTF-8"?>
<Error><Code>NotImplemented</Code><Message>A header you provided implies functionality that is not implemented</Message><Header>Transfer-Encoding</Header><RequestId>ZJ4G67HNV35TE36A</RequestId><HostId>epCo3E+XFJuxZoX15VXH9bHprpU36w8U9GHzrNePQWtJ3CuOq+B9xABYDDfizeojW1+KSztpXU4=</HostId></Error>
2024-02-16T15:37:29-0500 [DEBUG   ] urllib3.connectionpool 3182147:140099501135552 Resetting dropped connection: dandiarchive.s3.amazonaws.com
2024-02-16T15:37:29-0500 [DEBUG   ] urllib3.connectionpool 3182147:140099501135552 https://dandiarchive.s3.amazonaws.com:443 "PUT /zarr/fd6ab3ea-cff6-4006-a9bf-acfa5d983985/0/0/d0/d2/d26/f1.dat?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=SENSORED240216%2Fus-east-2%2Fs3%2Faws4_request&X-Amz-Date=20240216T203627Z&X-Amz-Expires=600&X-Amz-SignedHeaders=content-md5%3Bhost%3Bx-amz-acl&X-Amz-Signature=SENSOREDabc18760fa95f5a1d68d81b26cde2092acc8d49 HTTP/1.1" 501 None
2024-02-16T15:37:29-0500 [DEBUG   ] charset_normalizer 3182147:140099501135552 Encoding detection: utf_8 is most likely the one.
2024-02-16T15:37:29-0500 [DEBUG   ] charset_normalizer 3182147:140099501135552 Encoding detection: utf_8 is most likely the one.
2024-02-16T15:37:29-0500 [ERROR   ] dandi 3182147:140099501135552 HTTP request failed repeatedly: Error 501 while sending PUT request to https://dandiarchive.s3.amazonaws.com/zarr/fd6ab3ea-cff6-4006-a9bf-acfa5d983985/0/0/d0/d2/d26/f1.dat?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=SENSORED240216%2Fus-east-2%2Fs3%2Faws4_request&X-Amz-Date=20240216T203627Z&X-Amz-Expires=600&X-Amz-SignedHeaders=content-md5%3Bhost%3Bx-amz-acl&X-Amz-Signature=SENSOREDabc18760fa95f5a1d68d81b26cde2092acc8d49: <?xml version="1.0" encoding="UTF-8"?>
<Error><Code>NotImplemented</Code><Message>A header you provided implies functionality that is not implemented</Message><Header>Transfer-Encoding</Header><RequestId>TMYXYY9Z0KK9MPP8</RequestId><HostId>TE19VW1d0HD+zXm8jUdRWkcpfi0UnUkOK8ml04mZGUo+t375JtA3ZzcX2xFdDLZWiuFDDxRPSME=</HostId></Error>
2024-02-16T15:37:29-0500 [DEBUG   ] dandi 3182147:140107868591808 Error uploading zarr: HTTPError: 501 Server Error: Not Implemented for url: https://dandiarchive.s3.amazonaws.com/zarr/fd6ab3ea-cff6-4006-a9bf-acfa5d983985/0/0/d0/d2/d26/f1.dat?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=SENSORED240216%2Fus-east-2%2Fs3%2Faws4_request&X-Amz-Date=20240216T203627Z&X-Amz-Expires=600&X-Amz-SignedHeaders=content-md5%3Bhost%3Bx-amz-acl&X-Amz-Signature=SENSOREDabc18760fa95f5a1d68d81b26cde2092acc8d49
2024-02-16T15:37:29-0500 [DEBUG   ] dandi 3182147:140107868591808 Cancelling upload
2024-02-16T15:37:29-0500 [ERROR   ] dandi 3182147:140107868591808 Error uploading /home/yoh/proj/dandi/dandisets/000029/sub-randomzarrlike/sub-randomzarrlike_junk.zarr:
Traceback (most recent call last):
  File "/home/yoh/proj/dandi/dandi-cli-master/dandi/upload.py", line 357, in process_path
    for r in dfile.iter_upload(
  File "/home/yoh/proj/dandi/dandi-cli-master/dandi/files/zarr.py", line 486, in iter_upload
    size = fut.result()
           ^^^^^^^^^^^^
  File "/usr/lib/python3.11/concurrent/futures/_base.py", line 449, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
  File "/usr/lib/python3.11/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yoh/proj/dandi/dandi-cli-master/dandi/files/zarr.py", line 561, in _upload_zarr_file
    storage_session.put(
  File "/home/yoh/proj/dandi/dandi-cli-master/dandi/dandiapi.py", line 315, in put
    return self.request("PUT", path, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yoh/proj/dandi/dandi-cli-master/dandi/dandiapi.py", line 199, in request
    for i, attempt in enumerate(
  File "/home/yoh/proj/dandi/dandi-cli-master/venvs/dev3.11/lib/python3.11/site-packages/tenacity/__init__.py", line 347, in __iter__
    do = self.iter(retry_state=retry_state)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yoh/proj/dandi/dandi-cli-master/venvs/dev3.11/lib/python3.11/site-packages/tenacity/__init__.py", line 325, in iter
    raise retry_exc.reraise()
          ^^^^^^^^^^^^^^^^^^^
  File "/home/yoh/proj/dandi/dandi-cli-master/venvs/dev3.11/lib/python3.11/site-packages/tenacity/__init__.py", line 158, in reraise
    raise self.last_attempt.result()
          ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/concurrent/futures/_base.py", line 449, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
  File "/home/yoh/proj/dandi/dandi-cli-master/dandi/dandiapi.py", line 235, in request
    result.raise_for_status()
  File "/home/yoh/proj/dandi/dandi-cli-master/venvs/dev3.11/lib/python3.11/site-packages/requests/models.py", line 1021, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 501 Server Error: Not Implemented for url: https://dandiarchive.s3.amazonaws.com/zarr/fd6ab3ea-cff6-4006-a9bf-acfa5d983985/0/0/d0/d2/d26/f1.dat?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=SENSORED240216%2Fus-east-2%2Fs3%2Faws4_request&X-Amz-Date=20240216T203627Z&X-Amz-Expires=600&X-Amz-SignedHeaders=content-md5%3Bhost%3Bx-amz-acl&X-Amz-Signature=SENSOREDabc18760fa95f5a1d68d81b26cde2092acc8d49

upload did after all fail with

❯ dandi upload --validation ignore -J 5:200 ./sub-randomzarrlike_junk.zarr
2024-02-16 15:16:27,754 [    INFO] Found 2 files to consider
PATH                                            SIZE        ERRORS          UPLOAD STATUS                          MESSAGE                                                                                                                                         
dandiset.yaml                                   3.0 kB                             skipped                         should be edited online                                                                                                                         
sub-randomzarrlike/sub-randomzarrlike_junk.zarr 269.8 MB       1               19% ERROR                           501 Server Error: Not Implemented for url: https://dandiarchive.s3.amazonaws.com/zarr/fd6ab3ea-cff6-4006-a9bf-acfa5d983985/0/0/d0/d2/d26/f1.d...
Summary:                                        269.8 MB 1 with errors   42.0 kB/s 1 skipped                       1 should be edited online                                                                                                                       
                                                                                   1 ERROR                         1 501 Server Error: Not Implemented for url: https://dandiarchive.s3.amazonaws.com/zarr/fd6ab3ea-cff6-4006-a9bf-acfa5d983985/0/0/d0/d2/d26/f1...
2024-02-16 15:37:30,390 [    INFO] Logs saved in /home/yoh/.local/state/dandi-cli/log/20240216201625Z-3182147.log
Error: 501 Server Error: Not Implemented for url: https://dandiarchive.s3.amazonaws.com/zarr/fd6ab3ea-cff6-4006-a9bf-acfa5d983985/0/0/d0/d2/d26/f1.dat?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAUBRWC5GAEKH3223E%2F20240216%2Fus-east-2%2Fs3%2Faws4_request&X-Amz-Date=20240216T203627Z&X-Amz-Expires=600&X-Amz-SignedHeaders=content-md5%3Bhost%3Bx-amz-acl&X-Amz-Signature=249ff2f8f4a98480a09b11160abc18760fa95f5a1d68d81b26cde2092acc8d49
@jwodder
Copy link
Member

jwodder commented Feb 16, 2024

@yarikoptic Do the logs contain any messages about super_len() or stat()?

@yarikoptic
Copy link
Member Author

I don't think so:

❯ zgrep -e super_len  -e stat 20240216201625Z-3182147.log.gz
    do = self.iter(retry_state=retry_state)
    result.raise_for_status()
  File "/home/yoh/proj/dandi/dandi-cli-master/venvs/dev3.11/lib/python3.11/site-packages/requests/models.py", line 1021, in raise_for_status
2024-02-16T15:37:30-0500 [INFO    ] dandi 3182147:140109695737920 Logs saved in /home/yoh/.local/state/dandi-cli/log/20240216201625Z-3182147.log

full log (compressed) now available from dandi@drogon:20240216201625Z-3182147.log.gz

@jwodder
Copy link
Member

jwodder commented Feb 16, 2024

@yarikoptic The errors all seem to be associated with the file 0/0/d0/d2/d26/f1.dat. Is that file empty or somehow odd in any way?

@jwodder
Copy link
Member

jwodder commented Feb 16, 2024

@yarikoptic Can you try redoing the upload with DANDI_DEVEL_INSTRUMENT_REQUESTS_SUPERLEN=1 set?

@yarikoptic
Copy link
Member Author

looks "regular" to me...
❯ ls sub-randomzarrlike_junk.zarr/0/0/d0/d2/d26/f1.dat -l
-rw------- 1 yoh yoh 1024 Feb 16 15:14 sub-randomzarrlike_junk.zarr/0/0/d0/d2/d26/f1.dat
❯ md5sum -l
md5sum: invalid option -- 'l'
Try 'md5sum --help' for more information.
❯ md5sum sub-randomzarrlike_junk.zarr/0/0/d0/d2/d26/f1.dat
4bd9951479a6a7b2b36b6c8b80f1e5f5  sub-randomzarrlike_junk.zarr/0/0/d0/d2/d26/f1.dat
❯ cat sub-randomzarrlike_junk.zarr/0/0/d0/d2/d26/f1.dat
��
  �Կ�[�\������#;r�1�>��G/1{���J�/�ճ]��,L-z����%AT�J��驇c�S��%�
�7f�����,����$�,�<@��⍩&�P��ߑ�����...

seems to be a new file since there is no d26 under https://github.com/dandizarrs/fd6ab3ea-cff6-4006-a9bf-acfa5d983985/tree/0.231017.2004%2Bzarr1/0/0/d0 which should reflect prior version .

@jwodder
Copy link
Member

jwodder commented Feb 19, 2024

@yarikoptic What if you set the envvar DANDI_DEVEL_INSTRUMENT_REQUESTS_SUPERLEN=1 while uploading?

@yarikoptic
Copy link
Member Author

that re-upload succeeded without error

❯ DANDI_DEVEL_INSTRUMENT_REQUESTS_SUPERLEN=1 dandi upload --validation ignore -J 5:200 ./sub-randomzarrlike_junk.zarr
2024-02-16 17:58:38,651 [    INFO] Found 2 files to consider
PATH                                            SIZE        ERRORS          UPLOAD STATUS                          MESSAGE                  
dandiset.yaml                                   3.0 kB                             skipped                         should be edited online  
sub-randomzarrlike/sub-randomzarrlike_junk.zarr 269.8 MB       1              100% done                            exists - reuploading     
Summary:                                        269.8 MB 1 with errors   41.5 kB/s 1 skipped                       1 should be edited online
                                                                                   1 done                          1 exists - reuploading   
2024-02-16 19:25:44,541 [    INFO] Logs saved in /home/yoh/.local/state/dandi-cli/log/20240216225836Z-3262469.log
DANDI_DEVEL_INSTRUMENT_REQUESTS_SUPERLEN=1 dandi upload --validation ignore -  10097.05s user 294.08s system 198% cpu 1:27:08.90 total

log is at dandi@drogon.dartmouth.edu:20240216225836Z-3262469.log.gz .

@yarikoptic
Copy link
Member Author

uploaded also 20240223192140Z-306046.log.gz from another successful upload. I now wonder if that instrumentation is what makes it pass. grep for anything diagnostic due to super_len seems to be not there:

grep -e 'super_len() report size' -e '- ' 20240216201625Z-3182147.log.gz 20240216225836Z-3262469.log.gz 20240223192140Z-306046.log.gz

@yarikoptic
Copy link
Member Author

@jwodder - please try (script) a few uploads like that. May be against staging instance. Original description provides all the commands needed. Would be nice to catch this and figure out why AWS wasn't happy to not make users surprised later on.

@jwodder
Copy link
Member

jwodder commented Apr 5, 2024

@yarikoptic As I have stated previous times this has come up, the "A header you provided implies functionality that is not implemented" error from AWS occurs when an upload is made using "chunked" transfer-encoding, and requests uses "chunked" whenever (a) the data being uploaded is zero-length or (b) a file is being uploaded and requests fails to determine its size. What exactly do you want me to do about this?

@yarikoptic
Copy link
Member Author

Figure out either it is a. or b. or some other c. and in the end/accordingly make sure that we do not error out when we upload.

@jwodder
Copy link
Member

jwodder commented May 22, 2024

@yarikoptic I'm currently uploading a randomly-generated Zarr to staging from my MacBook Pro with DANDI_DEVEL_INSTRUMENT_REQUESTS_SUPERLEN=1 set. No errors have occurred yet (aside from a "too many open files" error at the start which was apparently fixed by lowering the number of upload threads per asset). However, this is taking a long time, so I'm now running the following on my account on smaug:

meta-run.sh:

#!/bin/bash
set -x

cd "$(dirname "$0")"
mkdir -p logs
for i in {1..10}
do
    DANDI_DEVEL_INSTRUMENT_REQUESTS_SUPERLEN=1 \
        ./run.sh |& tee logs/"$(date -u +%Y.%m.%d.%H.%M.%SZ)"-spy.log
    ./run.sh |& tee logs/"$(date -u +%Y.%m.%d.%H.%M.%SZ)"-nospy.log
done

run.sh:

#!/bin/bash
export DANDI_API_KEY=---REDACTED---
export DANDI_DEVEL=1

set -ex
cd "$(dirname "$0")"
. venv/bin/activate

cd 214256
for layout in zarr128-smallfiles zarr64-smallfiles
do
    rm -rf random.zarr
    echo Generating random.zarr from $layout ...
    chronic python3 ../zarr-digest-timings/mktree.py \
        random.zarr \
        ../zarr-digest-timings/layouts/$layout.json
    echo "Now: $(date)"
    if ! dandi upload --devel-debug -i dandi-staging --validation ignore -J 5:200 random.zarr
    then echo "Failed at: $(date)"
         exit 1
    fi
    echo "Now: $(date)"
done

@jwodder
Copy link
Member

jwodder commented May 22, 2024

Update: I just got the following errors on smaug:

  • In a run with DANDI_DEVEL_INSTRUMENT_REQUESTS_SUPERLEN=1 set:

    Error uploading zarr: RuntimeError: requests.utils.super_len() reported size of 0 for '/home/jwodder/dandi-1408/214256/random.zarr/0/0/d0/d10/d7/f3.dat', but os.stat() reported size 1024 bytes 1 tries later
    
  • In a run without the envvar set, trying to upload a certain Zarr entry failed with the "header implies functionality not implemented" message.

According to df -T, the /home directory on smaug is btrfs.

EDIT: And now I've gotten "Error: requests.utils.super_len() reported size of 0 for '/Users/jwodder/dartmouth/tmp/dandi-1408/214255/random.zarr/1/0/d0/d1/d9/f34.dat', but os.stat() reported size 1024 bytes 1 tries later" on macOS (filesystem type: apfs).

@yarikoptic
Copy link
Member Author

"fun!" .

  • we can't trust super_len value of 0
  • worth reporting to requests or even Python itself (iirc it is some python library provided value)?

@jwodder
Copy link
Member

jwodder commented May 22, 2024

@yarikoptic I'm going to try editing the source code of super_len() to add logging statements to see exactly what's going on, and then I'll likely (hopefully) end up filing an issue with requests.

@jwodder jwodder added the HIFNI Zarr uploads failing with "A header you provided implies functionality that is not implemented" label May 22, 2024
@jwodder
Copy link
Member

jwodder commented May 22, 2024

@yarikoptic See #1444.

Copy link

🚀 Issue was released in 0.62.1 🚀

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cmd-upload HIFNI Zarr uploads failing with "A header you provided implies functionality that is not implemented" released zarr
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants