Any way to avoid copying and compressing files when creating new task? #204

Paperone80 · 2018-11-21T01:28:34Z

Hi,

Is there any way to avoid copying and compressing images when a new task is created with Source: Shared? The source is also bound read-only, so it shouldn't overwrite anything.
I'm not sure what the reason is, but I have some high-resolution imagery and I am losing the important details needed to set proper attributes and polygon masks. The copying also takes up unnecessary time and disk space. Thanks.

I am using the CVAT GitHub version from 2018-11-19.

gzvulon · 2019-01-02T09:03:59Z

Same here. I have a lot of HD images on a network share and S3,
and I'd like to map them to the /share dir and use only links, without copying any data.

vfdev-5 · 2019-09-13T08:26:42Z

Any updates on this?

nmanovic · 2019-09-13T09:30:58Z

@vfdev-5, we are going to reimplement the way we serve data from the server (https://github.com/opencv/cvat/tree/az/video_stream). I hope to see that functionality merged in a month or so. After it is merged, we will think about how to implement this feature. It will probably be possible to provide data in a pre-defined format. I hope to see the feature in v1.0.0, but we cannot promise.

nmanovic · 2019-12-24T10:24:03Z

Some notes for future reference.

Pipeline to use original data:

Prepare the data in a format that CVAT can understand and put it on your remote storage:
- A directory with images.
- Use a CVAT script (will be provided) to prepare the data in the right format (chunks).
- Use a CVAT script (will be provided) to prepare the data for "protected" access (e.g. S3 with credentials). Instead of the original data, the user uploads text files with links to the original data and meta information about it.
CVAT will use the remote storage as is and will not try to copy files internally. There are three main use cases:
- For a directory with images, CVAT will convert them into its "own format" on the fly and cache the result using DiskCache (http://www.grantjenks.com/docs/diskcache/) in a temporary directory, so the next access should be fast and storage size will be limited.
- Serve the original data as is if it was prepared using the CVAT script, i.e. as remote links to the required data. The user has already prepared the original data for us, so nothing else needs to be done. For compressed data, however, we need to compress it on the fly and cache it using DiskCache (http://www.grantjenks.com/docs/diskcache/) in a temporary directory, so the next access should be fast and storage size will be limited. Because the original data was prepared as small chunks, preparing the compressed data will be fast enough.
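To illustrate the "compress on the fly and cache in a temporary directory" idea, here is a minimal sketch. It is not CVAT code and it does not use DiskCache itself: `BoundedDiskCache` and `compressed_frame` are hypothetical names, the cache is a small standard-library stand-in for the role DiskCache would play, and `zlib` stands in for whatever image compression CVAT actually applies.

```python
import hashlib
import os
import zlib
from pathlib import Path
from typing import Callable


class BoundedDiskCache:
    """Tiny on-disk cache that evicts the least recently used entries
    once the total size exceeds `size_limit` bytes (stand-in for DiskCache)."""

    def __init__(self, directory: str, size_limit: int) -> None:
        self.directory = Path(directory)
        self.directory.mkdir(parents=True, exist_ok=True)
        self.size_limit = size_limit

    def _path(self, key: str) -> Path:
        # Hash the key so that any string maps to a safe file name.
        return self.directory / hashlib.sha256(key.encode()).hexdigest()

    def get_or_create(self, key: str, producer: Callable[[], bytes]) -> bytes:
        path = self._path(key)
        if path.exists():
            os.utime(path)  # refresh mtime: mark the entry as recently used
            return path.read_bytes()
        data = producer()  # cache miss: do the expensive work once
        path.write_bytes(data)
        self._evict(keep=path)
        return data

    def _evict(self, keep: Path) -> None:
        # Oldest entries first; never evict the entry that was just written.
        entries = sorted(self.directory.iterdir(), key=lambda p: p.stat().st_mtime)
        total = sum(p.stat().st_size for p in entries)
        for entry in entries:
            if total <= self.size_limit:
                break
            if entry == keep:
                continue
            total -= entry.stat().st_size
            entry.unlink()


def compressed_frame(cache: BoundedDiskCache, source: Path) -> bytes:
    """Return a compressed copy of `source`; the original file is only
    read on a cache miss and is never copied or modified."""
    return cache.get_or_create(
        str(source), lambda: zlib.compress(source.read_bytes())
    )
```

The original file on the share is only ever read, which matches the read-only mount described in the issue, and the size limit keeps the temporary directory from growing without bound.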
cvat-data will be responsible for accepting the prepared data and transforming it into actual images with meta information:
- chunkN.zip with images will be unzipped.
- chunkN.mp4 with video frames will be decoded.
- chunkN.txt with links to data will be converted to images if credentials are provided by the client.
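The dispatch on chunk type described above could look roughly like the sketch below. This is only an illustration in Python, not the cvat-data implementation: `load_chunk`, `read_zip_chunk`, `read_link_chunk` and the `fetch` callback are hypothetical names, the one-link-per-line layout of chunkN.txt is an assumption, and video decoding is left out because it needs a decoder outside the standard library.

```python
import zipfile
from pathlib import Path
from typing import Callable, List, Optional

# Hypothetical callback: downloads one link using the client's credentials.
Fetcher = Callable[[str, dict], bytes]


def read_zip_chunk(path: Path) -> List[bytes]:
    """Unzip a chunk and return the image bytes in file-name order."""
    with zipfile.ZipFile(path) as archive:
        return [archive.read(name) for name in sorted(archive.namelist())]


def read_link_chunk(path: Path, fetch: Fetcher, credentials: Optional[dict]) -> List[bytes]:
    """Resolve a text chunk (assumed: one link per line) into image bytes."""
    if credentials is None:
        raise PermissionError("links can only be resolved with client credentials")
    links = [line.strip() for line in path.read_text().splitlines() if line.strip()]
    return [fetch(link, credentials) for link in links]


def load_chunk(
    path: Path,
    fetch: Optional[Fetcher] = None,
    credentials: Optional[dict] = None,
) -> List[bytes]:
    """Turn one prepared chunk into a list of images, based on its extension."""
    suffix = path.suffix.lower()
    if suffix == ".zip":
        return read_zip_chunk(path)
    if suffix == ".txt":
        if fetch is None:
            raise ValueError("a fetch callback is required for link chunks")
        return read_link_chunk(path, fetch, credentials)
    if suffix == ".mp4":
        # Decoding video frames needs an external decoder; not sketched here.
        raise NotImplementedError("video chunks need a decoder")
    raise ValueError("unsupported chunk type: " + suffix)
```

Keeping the download step behind a callback means the chunk files themselves never have to contain credentials, which is the point of the "protected" access case.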

bsekachev · 2021-02-16T10:41:31Z

@Marishka17 Do you think the issue was implemented in #2377?

Marishka17 · 2021-02-16T12:24:40Z

@bsekachev, I think yes.

nmanovic added this to the Backlog milestone Jan 21, 2019

nmanovic self-assigned this Jan 21, 2019

nmanovic added the enhancement New feature or request label Jan 21, 2019

KaiLicht mentioned this issue Aug 5, 2019

No duplicate storage for image only jobs #621

Closed

nmanovic modified the milestones: Backlog, 1.0.0 - Beta Sep 13, 2019

sankhakarfa mentioned this issue Sep 23, 2019

Add information about the original URL for an image if it is known #728

Open

nmanovic assigned azhavoro and unassigned nmanovic Nov 26, 2019

nmanovic assigned nmanovic and unassigned azhavoro Dec 24, 2019

nmanovic modified the milestones: 1.0.0 - Beta, 1.1.0 - Alpha Dec 24, 2019

anderflash mentioned this issue Apr 7, 2020

(develop branch) Symbolic links for raw, original and compressed when using NFS share #1374

Closed

2 tasks

nmanovic modified the milestones: 1.1.0-alpha, 1.1.0-release Jun 22, 2020

azhavoro assigned Marishka17 and unassigned nmanovic Sep 30, 2020

bsekachev mentioned this issue Dec 2, 2020

Share without copying & mount cloud storages #2377

Merged

11 tasks

Marishka17 linked a pull request Feb 16, 2021 that will close this issue

Share without copying & mount cloud storages #2377

Merged

11 tasks

bsekachev closed this as completed Feb 16, 2021

iraadit mentioned this issue Feb 24, 2021

Add possibility to not copy files when using share files and cli #2862

Closed

2 tasks

shortcipher3 mentioned this issue Aug 11, 2021

Add option to not copy files when using share files and web ui/cli #3542

Closed

2 tasks
