Implement async progress reporting for upload / assembly #214

Closed
butonic opened this issue Apr 21, 2020 · 9 comments
Labels
Category:Research (Research is needed), Interaction:Needs-Concept, Interaction:Needs-help (Asking some hints to engineering when the issue can't be reproduced)

Comments

@butonic
Member

butonic commented Apr 21, 2020

When finishing a large upload, the assembly of e.g. a 40 GB file might take a long time. Currently, tus has no extension for reporting progress during this phase, and the concatenation extension explicitly states:

The response to a HEAD request for a final upload SHOULD NOT contain the Upload-Offset header unless the concatenation has been successfully finished. After successful concatenation, the Upload-Offset and Upload-Length MUST be set and their values MUST be equal. The value of the Upload-Offset header before concatenation is not defined for a final upload.

Transloadit mentions the /assembly endpoint (https://github.com/tus/tusd/wiki/Uploading-to-Transloadit-using-tus) and also a socket.io server that can be used to track the progress: https://github.com/tus/tusd/wiki/Uploading-to-Transloadit-using-tus#getting-progress-and-more-events-over-web-sockets

The tusd handler has a hook mechanism that can be used to expose progress.

Or we could let clients make HEAD requests against the upload and return a read-only progress property in the metadata ... as a quick hack ...
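To make the quick hack concrete, here is a minimal client-side sketch in Go. It assumes the server would expose a `progress` key (a percentage) in the `Upload-Metadata` response header of a HEAD request; that key is hypothetical and not part of tus. The header format itself (comma-separated `key base64(value)` pairs) is the one tus defines for `Upload-Metadata`. The function names are made up for illustration.

```go
package main

import (
	"encoding/base64"
	"errors"
	"fmt"
	"net/http"
	"strconv"
	"strings"
)

// parseUploadMetadata decodes a tus Upload-Metadata header value, which is a
// comma-separated list of "key base64(value)" pairs. Keys without a value
// are stored with an empty string.
func parseUploadMetadata(header string) (map[string]string, error) {
	meta := map[string]string{}
	for _, pair := range strings.Split(header, ",") {
		fields := strings.Fields(pair)
		if len(fields) == 0 {
			continue
		}
		value := ""
		if len(fields) > 1 {
			raw, err := base64.StdEncoding.DecodeString(fields[1])
			if err != nil {
				return nil, fmt.Errorf("metadata key %q: %w", fields[0], err)
			}
			value = string(raw)
		}
		meta[fields[0]] = value
	}
	return meta, nil
}

// assemblyProgress sends a HEAD request to the upload URL and returns the
// value of the hypothetical "progress" metadata key as an integer percentage.
func assemblyProgress(client *http.Client, uploadURL string) (int, error) {
	req, err := http.NewRequest(http.MethodHead, uploadURL, nil)
	if err != nil {
		return 0, err
	}
	req.Header.Set("Tus-Resumable", "1.0.0")
	resp, err := client.Do(req)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	meta, err := parseUploadMetadata(resp.Header.Get("Upload-Metadata"))
	if err != nil {
		return 0, err
	}
	p, ok := meta["progress"]
	if !ok {
		return 0, errors.New("no progress property in Upload-Metadata")
	}
	return strconv.Atoi(p)
}

func main() {
	// Decode a sample header value without talking to a server.
	meta, err := parseUploadMetadata("filename Zm9vLmJpbg==,progress NDI=")
	if err != nil {
		panic(err)
	}
	fmt.Println(meta["filename"], meta["progress"])
}
```

A client would call `assemblyProgress` in a loop with a delay until the server reports completion, which is the pull/polling fallback.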

@butonic
Member Author

butonic commented Apr 21, 2020

thx @michaelstingl for asking about this

@butonic
Member Author

butonic commented Apr 22, 2020

@felixboehm I wonder if we can use websockets for clients and how this integrates with a push server ... maybe related appleboy/gorush#194

@michaelstingl
Contributor

I think as a fallback, we'll always need a pull/polling-based approach. But it would be cool to have the fancier options too.

@butonic
Member Author

butonic commented May 19, 2020

AFAICT an upload has multiple phases:

  • transfer (bring bytes to the storage, but not yet accessible to the user)
  • assembly (optional, if multiple chunks need to be assembled)
  • preprocessing (optional, e.g. virus scanning, content manipulation)
  • finishing (make available to the user)
  • postprocessing (content indexing, thumbnail generation ...)

The tus protocol only covers the first and second phase, transfer and assembly.
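The phases listed above could be modelled roughly like this in Go. This is only a sketch: the type and function names are made up for illustration, and which phases are optional follows the list above.

```go
package main

import "fmt"

// Phase is one step in the lifetime of an upload.
type Phase int

const (
	Transfer Phase = iota
	Assembly
	Preprocessing
	Finishing
	Postprocessing
)

var phaseNames = [...]string{
	"transfer", "assembly", "preprocessing", "finishing", "postprocessing",
}

// String returns the lowercase phase name used in the list above.
func (p Phase) String() string { return phaseNames[p] }

// Plan returns the ordered phases a single upload passes through.
// Assembly is only included for chunked uploads, preprocessing only
// when a step such as virus scanning is configured.
func Plan(chunked, preprocess bool) []Phase {
	phases := []Phase{Transfer}
	if chunked {
		phases = append(phases, Assembly)
	}
	if preprocess {
		phases = append(phases, Preprocessing)
	}
	return append(phases, Finishing, Postprocessing)
}

func main() {
	// A chunked upload without preprocessing.
	fmt.Println(Plan(true, false))
}
```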

When no preprocessing happens, we can return the etag and new fileid immediately.
While preprocessing is in progress, we want to show the progress when listing the containing folder. This might be a custom PROPFIND request or property? During preprocessing the etag and fileid won't have changed, but we might return a progress property, similar to the async / lazy ops header, which returns a location that clients can use to poll for changes.

AFAICT we need to either extend the FinishUpload method of the tus backend implementations or register hooks that can start the preprocessing properly, keep track of progress and finalize the upload by making the file available to the user / moving it to the correct destination.

Postprocessing should not produce anything the clients need.

But for all of this we need a workflow. AFAICT the client may not get a new etag until the upload has passed the finishing phase.

@PVince81
Contributor

As discussed, we'd need to invent a new tus extension for async.
The extension would return extra headers to signal that it's going into async mode (like Lazy-Ops).

@settings settings bot removed the discussion label Sep 23, 2020
@refs
Member

refs commented Jan 12, 2021

@butonic want to keep this open?

@refs refs changed the title implement async progress reporting for upload / assembly Implement async progress reporting for upload / assembly Jan 13, 2021
@refs refs added Category:Research Research is needed Interaction:Needs-Concept Estimation:XXXL(21) Interaction:Needs-help Asking some hints to engineering when the issue can't be reproduced labels Jan 13, 2021
@butonic
Member Author

butonic commented Mar 8, 2021

cc @dragotin This boils down to a state machine for uploads. Then file listings could return the state of a file as well: transfer, assembly, preprocessing, finishing, postprocessing. See #214 (comment)

@stale

stale bot commented Jun 6, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 10 days if no further activity occurs. Thank you for your contributions.

@fschade
Contributor

fschade commented Jun 4, 2024

Please open again if the ticket is still relevant

@fschade fschade closed this as completed Jun 4, 2024