Summarize Big File Upload for clients with TUS #2627

Open
dragotin opened this issue Oct 14, 2021 · 8 comments

@dragotin
Contributor

The purpose of this ticket is to summarize the way all clients can and should use TUS to upload big files to oCIS for the MVP/GA version of oCIS. This is a cross-component topic between oCIS and all clients.

@pmaier1 @michaelstingl @felix-schwarz @TheOneRing @kulmann, please review the following text, which will later go into the spec documentation, and help to complete it.

We also need to clarify which functionality is still missing on the clients.


Big File Upload Protocols

Over time, several protocols have been used to upload big files to EFSS systems.

Chunking V1

Chunking V1 was introduced in a very early version of ownCloud and is still used with some older clients.

Basically, it uploads the file in chunks of a defined size, one chunk per request. Once all chunks are uploaded, the server assembles them into the final file.
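
The split-and-assemble idea can be sketched as follows. This is only a minimal illustration of the principle: the function names are hypothetical, and the sketch does not reproduce the actual Chunking V1 request format or chunk naming.

```python
def split_into_chunks(data: bytes, chunk_size: int) -> list[bytes]:
    """Client side: cut the file content into chunks of at most chunk_size bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def assemble(chunks: list[bytes]) -> bytes:
    """Server side: join the chunks, in upload order, into the final file."""
    return b"".join(chunks)


content = b"abcdefghij"
chunks = split_into_chunks(content, 4)  # each chunk would be sent in its own request
restored = assemble(chunks)
```

Only the last chunk may be shorter than the defined chunk size.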

Chunking NG

Chunking NG is a further development of Chunking V1 and gives the client more control over the process. It uploads the chunks into a directory on the server that is exclusive to the upload. After all chunks are uploaded, the client initiates a move of the directory to the final destination, which triggers the file assembly.

By sending PROPFIND requests to the upload directory, the client can always find out the state of a certain upload.
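
The request sequence described above could look roughly like the following plan. This is a sketch under assumptions: the paths are supplied by the caller, the chunk names (plain indices) are hypothetical, and the `.file` name used as the move source is recalled from the ownCloud Chunking NG documentation and should be verified there.

```python
def chunking_ng_plan(upload_dir: str, destination: str, chunk_count: int,
                     assembly_name: str = ".file"):
    """Return the WebDAV requests of one Chunking NG upload as
    (method, path, headers) tuples. Nothing is sent over the network."""
    if chunk_count < 1:
        raise ValueError("at least one chunk is required")
    # Create the directory that is exclusive to this upload.
    plan = [("MKCOL", upload_dir, {})]
    # Upload every chunk into that directory (chunk names are an assumption).
    for index in range(chunk_count):
        plan.append(("PUT", f"{upload_dir}/{index}", {}))
    # Optional: ask the server which chunks it already has.
    plan.append(("PROPFIND", upload_dir, {"Depth": "1"}))
    # The move to the final destination triggers the assembly.
    plan.append(("MOVE", f"{upload_dir}/{assembly_name}",
                 {"Destination": destination}))
    return plan


# Hypothetical paths, for illustration only.
plan = chunking_ng_plan("/uploads/demo/42", "/files/demo/big.bin", chunk_count=2)
```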

TUS

TUS is an open protocol for resumable file uploads. It is a standard protocol defined at [tus.io](https://tus.io). The code is hosted on GitHub.

Reva Big File Support

CERNBox Implementation with EOS Backend

For CERNBox there is no TUS at the moment, as EOS works with Chunking V1 directly.

WebDAV Handling of TUS Posts

In Reva, all file uploads are started as TUS requests: a POST request is sent with the appropriate TUS headers.

By default, the maximum size of an upload is unlimited. This is subject to change, see #2584.

To configure the maximum chunk size that oCIS accepts, set the environment variable STORAGE_FRONTEND_UPLOAD_MAX_CHUNK_SIZE to a value in bytes.
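
As a sketch, the headers of such a creation POST can be built as shown below. The header names and the base64 encoding of metadata values come from the TUS core protocol and its creation extension; the metadata key `filename` and the example value are assumptions, since the keys that oCIS expects are not listed here.

```python
import base64

TUS_VERSION = "1.0.0"


def encode_upload_metadata(metadata: dict[str, str]) -> str:
    """Encode metadata as comma-separated 'key base64(value)' pairs."""
    pairs = []
    for key, value in metadata.items():
        encoded = base64.b64encode(value.encode("utf-8")).decode("ascii")
        pairs.append(f"{key} {encoded}")
    return ",".join(pairs)


def creation_headers(file_size: int, metadata: dict[str, str]) -> dict[str, str]:
    """Headers for the POST request that creates a TUS upload."""
    return {
        "Tus-Resumable": TUS_VERSION,
        "Upload-Length": str(file_size),
        "Upload-Metadata": encode_upload_metadata(metadata),
    }


# Hypothetical file name, for illustration only.
headers = creation_headers(1048576, {"filename": "report.pdf"})
```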

With that parameter set to a specific value, server and client limit the size of each uploaded part accordingly. A file that is bigger than the limit is transferred in multiple requests using the TUS protocol: a flexible sequence of PATCH requests uploads the parts of the file. All TUS-related processing information is transferred in HTTP headers.
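
To illustrate, a limit of 10 MiB would be given as 10485760 bytes, and a client could plan its PATCH requests as in the sketch below. The header names follow the TUS core protocol; the function name is hypothetical, and how the client learns the limit is not covered here.

```python
def plan_patch_requests(file_size: int, max_chunk_size: int) -> list[dict[str, str]]:
    """Return the headers of each PATCH request needed to upload file_size
    bytes when no single request may carry more than max_chunk_size bytes."""
    if file_size < 0:
        raise ValueError("file_size must not be negative")
    if max_chunk_size <= 0:
        raise ValueError("max_chunk_size must be positive")
    requests = []
    offset = 0
    while offset < file_size:
        length = min(max_chunk_size, file_size - offset)
        requests.append({
            "Tus-Resumable": "1.0.0",
            "Upload-Offset": str(offset),  # where this part starts in the file
            "Content-Type": "application/offset+octet-stream",
            "Content-Length": str(length),
        })
        offset += length
    return requests


# A 25-byte file with a 10-byte limit needs three requests.
patches = plan_patch_requests(25, 10)
```

Using the largest allowed part size keeps the number of requests as small as possible.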

For more details on the TUS protocol please refer to the TUS core protocol.

In its response to the last PATCH request, the server indicates that all data was received and returns a list of headers that the clients can use to optimize the synchronization process.

In particular, in the ownCloud ocdav implementation in Reva, these are:

  • OC-FileId: The file ID of the stored file.
  • OC-ETag and ETag: The ETag of the file after the new content was stored.
  • OC-Perm: The permission flags in the "single character format".
  • Last-Modified: The last modification date of the file.
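
A client could collect these headers from the final response as in the following sketch. The class and function names are hypothetical, and the header values in the example are made up for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from email.utils import parsedate_to_datetime
from typing import Mapping, Optional


@dataclass
class UploadResult:
    file_id: Optional[str]
    etag: Optional[str]
    permissions: Optional[str]
    last_modified: Optional[datetime]


def parse_upload_result(headers: Mapping[str, str]) -> UploadResult:
    """Extract the sync-relevant headers from the last PATCH response."""
    # HTTP header names are case-insensitive, so normalize them first.
    lowered = {name.lower(): value for name, value in headers.items()}
    raw_date = lowered.get("last-modified")
    return UploadResult(
        file_id=lowered.get("oc-fileid"),
        etag=lowered.get("oc-etag") or lowered.get("etag"),
        permissions=lowered.get("oc-perm"),
        last_modified=parsedate_to_datetime(raw_date) if raw_date else None,
    )


# Made-up example values.
result = parse_upload_result({
    "OC-FileId": "example-file-id",
    "OC-ETag": '"abc123"',
    "ETag": '"abc123"',
    "OC-Perm": "RDNVW",
    "Last-Modified": "Thu, 14 Oct 2021 10:00:00 GMT",
})
```

With these values the client can update its local state without sending an extra PROPFIND for the new file.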

Client Implementation Considerations

These are the client considerations for the oCIS MVP that we want to deliver for GA. The guiding principle is to provide a robust, defensive way to upload big files, meaning that it works by default, without extra configuration, in most of the infrastructures we find at users' sites. The number of requests should be as small as possible for performance reasons.

  • Every upload is done as a TUS request, i.e. the client sends requests as described in the TUS core protocol documentation.
  • The clients have to be able to deal with limited chunk sizes and send a bigger file in chunks.
  • For the oCIS MVP the clients cannot pre-set a fileID of a new file they upload. However, oCIS returns a valid fileID in the reply to the last request. Ticket# ?
  • Headers reflecting the file owner ID and the owner display name should be added to the list of reply headers if possible on the oCIS side.
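
The defensive client behaviour can be sketched against an in-memory stand-in for a TUS endpoint. `FakeTusServer` is purely hypothetical and is not the oCIS implementation; it only mimics the offset bookkeeping of the TUS core protocol so that the client loop can be shown without a network.

```python
class FakeTusServer:
    """In-memory stand-in for a TUS endpoint, for illustration only."""

    def __init__(self, max_chunk_size: int):
        self.max_chunk_size = max_chunk_size
        self.uploads = {}  # upload id -> (declared length, received bytes)

    def post(self, upload_length: int) -> str:
        """Create an upload and return its id."""
        upload_id = f"upload-{len(self.uploads) + 1}"
        self.uploads[upload_id] = (upload_length, bytearray())
        return upload_id

    def head(self, upload_id: str) -> int:
        """Return the current offset, i.e. the number of bytes already stored."""
        return len(self.uploads[upload_id][1])

    def patch(self, upload_id: str, offset: int, body: bytes) -> int:
        """Append body at offset and return the new offset."""
        length, data = self.uploads[upload_id]
        if offset != len(data):
            raise ValueError("offset does not match the server state")
        if len(body) > self.max_chunk_size:
            raise ValueError("chunk exceeds the maximum chunk size")
        if offset + len(body) > length:
            raise ValueError("upload exceeds the declared length")
        data.extend(body)
        return len(data)


def upload(server: FakeTusServer, upload_id: str, content: bytes,
           chunk_size: int) -> int:
    """Send the missing part of content in chunks; return the PATCH count."""
    offset = server.head(upload_id)  # resume from what the server already has
    requests = 0
    while offset < len(content):
        chunk = content[offset:offset + chunk_size]
        offset = server.patch(upload_id, offset, chunk)
        requests += 1
    return requests


server = FakeTusServer(max_chunk_size=4)
content = b"abcdefghij"
upload_id = server.post(len(content))
server.patch(upload_id, 0, content[:4])  # first chunk sent, then the client was interrupted
patch_count = upload(server, upload_id, content, chunk_size=4)  # resumes at offset 4
```

Asking for the current offset before sending data costs one extra request, but it lets an interrupted upload continue instead of starting over.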

Bug reports in that area that are not yet considered here:

#214
#2626

@kulmann
Member

kulmann commented Oct 15, 2021

Web currently uses the tus-js-client lib in version 1.8.0 for TUS uploads. TUS uploads work with oCIS so far.
There were some breaking changes in 2.x (current version: 2.3.0) so we need to schedule some time for updating it.

creation-with-upload is supported server side (already sending data in the initial POST, if I'm not mistaken) but with version 1.8.0 of the js client lib it's not supported client side (see owncloud/web#3436 (comment)). If we want to utilize that we need to schedule some time to upgrade to 2.3.0 in Web. That is tracked in owncloud/web#5371 - we also wanted to refactor the upload business logic in web. Debatable (but a good idea, because it's a mess).

IMO it would also be worth looking into js libs that handle uploads as a whole. @butonic brought up https://uppy.io and I still think that it's a good idea to try that out. That part is unrelated to TUS.

@TheOneRing
Member

Looks good.

@pmaier1
Contributor

pmaier1 commented Oct 15, 2021

Thanks a lot! Looks good 👍

One thing: Not sure whether this is really relevant here but I found the checksum feature (file integrity checking) missing.

@wkloucek
Contributor

@dragotin could you please have a look at #1343 and see how it fits in here?

@stale

stale bot commented Dec 24, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 10 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the Status:Stale label Dec 24, 2021
@michaelstingl
Contributor

> @dragotin could you please have a look at #1343 and see how it fits in here?

I'd say it's unrelated?

@stale stale bot removed the Status:Stale label Dec 27, 2021
@stale

stale bot commented Feb 25, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 10 days if no further activity occurs. Thank you for your contributions.

@butonic
Member

butonic commented Jun 28, 2022

@dragotin how is this actionable? Move your description to devdocs and close?
