Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Uploads with TUS should return the new fileid and etag #216

Closed
butonic opened this issue Apr 22, 2020 · 15 comments
Closed

Uploads with TUS should return the new fileid and etag #216

butonic opened this issue Apr 22, 2020 · 15 comments
Assignees

Comments

@butonic
Copy link
Member

butonic commented Apr 22, 2020

The clients need them to update the metadata of the local file tree.

See https://github.com/cs3org/reva/pull/674/files/4d072d0601acc9e21d5e774f1f714cf5ac9fc6f0#r413153186
Related #214

@PVince81
Copy link
Contributor

as discussed, for now OCIS will return the etag and the file id on the last PATCH request, the one where the last bit of the file is sent and before internally finishing the upload.

since we currently don't support the concatenation extension, it's not relevant.
once we think about the concatenation extension, the etag would be returned in the final concatenation request.

@PVince81
Copy link
Contributor

@TheOneRing FYI

@PVince81 PVince81 self-assigned this May 26, 2020
@PVince81
Copy link
Contributor

ideally I'd only want to expose headers call "OC-Etag" and "OC-Fileid" on the OCDAV layer, as the prefix is highly dependent on OC.

however, because TUS directs the client to send chunks directly to the storage provider, it means the headers must also be returned by the storage provider, which means every storage will need to implement returning those.

I'll need to dig deeper to confirm this.

@PVince81
Copy link
Contributor

I just had a quick try in the dataprovider.go and tried to call GetMD() after an upload, but this fails because we don't have the actual path to the file:

reva-storage-oc-data.log:2020-05-28T12:00:50+02:00 ERR error sending grpc GetMD request after upload error="error: not found: /886c7584-6636-43c8-9a81-939dd07b54d7" pkg=rhttp service=reva traceid=49ef24c51fa2de2a3ac43ce35be26289

not sure if there's a way to resolve it from there but it seems it would imply that the data provider needs access to the storage provider, it might be too deep in the layers

@butonic
Copy link
Member Author

butonic commented Jun 2, 2020

I evaluated the upload scenarios with @PVince81. We currently see two that we need to support initially:

  1. small file upload
  2. huge file chunked upload

small file upload

  • should use a single request to reduce latency and request overhead
  • uses POST according to the creation-with-upload extension

huge file chunked upload

  • should use a PROPFIND to discover the fileid and etag of the uploaded file
  • uses multiple PATCH requests
  • the final upload may trigger a longer runnig assembly or workflow, etag and fileid discovery thus should either use (long) polling or a websocket
  • for (long) polling we could add a new property to the PROPFIND response that exposes the workflow progress
  • for a websocket we could send the workflow progress as events

FAQ

Q: Can't we implment only one type of upload for now?
A: We can start with the creation-with-upload and send large files in one POST request. The POST is handled using our ocdav implementation on the server side, which will internally stat the uploaded file and return fileid and etag. Be aware that this currently runs into a 10 sec timeout, which we still need to remove.

@PVince81
Copy link
Contributor

PVince81 commented Jun 3, 2020

for the record, we looked into returning metadata from the PATCH request but as this one is fully handled by the tusd Golang library, there is no mechanism there to allow us to access the response object before it is sent out, so we cannot return additional headers there.

Possible approaches:

  1. Send an upstream patch to the tusd library which provides a callback with the response object to let us add additional headers there: here no guarantee how long it might take to have this merged there, whether it will be accepted, and might risk needing to work with a fork (and sync upstream) for a while
    or
  2. Hack: wrap the response object using decorator pattern and have it inject the headers just before being sent out
    or
  3. Hack: wrap the tusd handler and reimplement the method that sends out the response to inject headers there

If we are to take the route of returning etag/fileid in the PATCH request, I'd be more in favor of the upstream patch...

@PVince81
Copy link
Contributor

PVince81 commented Jun 3, 2020

or:
4. implement the TUS server side logic completely or partially on our own and not rely on libraries: error prone

@PVince81
Copy link
Contributor

PVince81 commented Jun 3, 2020

here's the beginning of my PR where I'm stuck due to the above issue: cs3org/reva#797

@butonic
Copy link
Member Author

butonic commented Jun 4, 2020

@PVince81 the ResponseWriter interface is not too big. we could wrap it, see cs3org/reva#797 (comment)

@PVince81
Copy link
Contributor

PVince81 commented Jun 5, 2020

we could tackle the X-OC-Mtime header as well: owncloud/ocis-reva#174 as wellm we need to return X-OC-MTime: accepted when the mtime was accepted.
The same challenges apply like for the other headers.

@PVince81
Copy link
Contributor

PVince81 commented Jun 9, 2020

for creation-with-upload, here's a PR cs3org/reva#813 that will return the metadata if the upload completed within the POST request.

@PVince81
Copy link
Contributor

PVince81 commented Jun 9, 2020

seems there's already a feature request for custom headers here tus/tusd#388 but got denied for now

@PVince81
Copy link
Contributor

PVince81 commented Jun 9, 2020

oh, there's a handler.finishUploadIfComplete that calls a hook handler.CompleteUploads but sadly only contains the request object, not the response

@PVince81
Copy link
Contributor

we had a discussion to exclude the PATCH case as it's too complex to realize and also might not be that useful once we start thinking of async uploads, where a client will need to poll anyway

@PVince81
Copy link
Contributor

PVince81 commented Sep 3, 2020

owncloud/product#78 is closed

@PVince81 PVince81 closed this as completed Sep 3, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants