(develop branch) Symbolic links for raw, original and compressed when using NFS share #1374
Closed
2 tasks done
Labels
enhancement
New feature or request
My actions before raising this issue
Expected Behaviour
Reduce storage by using links on raw, original, and compressed subfolders when using NFS share.
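A minimal sketch of what this could look like for the raw subfolder only, assuming the share is mounted at a path the server can read. The function name `link_tree` and the parameters `share_dir` and `raw_dir` are hypothetical and are not taken from engines/task.py.

```python
import os
from pathlib import Path


def link_tree(share_dir, raw_dir):
    """Mirror share_dir into raw_dir using symlinks instead of copies.

    Both paths are hypothetical. Directories are created for real;
    every file becomes a symbolic link pointing back at the share.
    """
    share_dir = Path(share_dir).resolve()
    raw_dir = Path(raw_dir)
    linked = []
    for root, _dirs, files in os.walk(share_dir):
        rel = Path(root).relative_to(share_dir)
        (raw_dir / rel).mkdir(parents=True, exist_ok=True)
        for name in sorted(files):
            target = Path(root) / name
            link = raw_dir / rel / name
            # lexists() is also true for a dangling link, so reruns are safe.
            if not os.path.lexists(link):
                os.symlink(target, link)
            linked.append(link)
    return linked
```

With this approach task/raw would hold only links, so a 2 GB video would cost a few bytes there instead of a full copy. The links break if the share is unmounted or moved.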
Current Behaviour
As far as I understand from engines/task.py (develop), data is copied from the share to task/raw, then grouped into chunks and zipped into task/original, and finally compressed into task/compressed, so the same data ends up stored four times (share, raw, original, and compressed).
Possible Solution
Is it possible to add an option to use only symbolic links? This is crucial when we have GBs of images and videos. Some related issues are #1210 and #204. Since the task/original and task/compressed folders contain zipped chunks, they might be more difficult to handle, but for task/raw it should be possible. The zip command has an argument called --symlinks, but I don't know whether Python's ZipFile class supports an equivalent. The compressed folder will be harder still, since the images are modified there (the color range is scaled and converted to 8 bits before compression).
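On the ZipFile question: as far as I know, `ZipFile.write` follows symlinks and stores the file contents, and there is no direct counterpart to `--symlinks`. A link entry can still be written by hand through a `ZipInfo` with Unix mode bits. The sketch below shows that technique; the helper name `add_symlink` is hypothetical.

```python
import os
import stat
import zipfile


def add_symlink(zf, link_path, arcname):
    """Store link_path in the archive as a symlink entry, not as file data."""
    info = zipfile.ZipInfo(arcname)
    info.create_system = 3  # 3 = Unix, so the mode bits below are honoured
    # The upper 16 bits of external_attr carry the Unix st_mode.
    info.external_attr = (stat.S_IFLNK | 0o777) << 16
    # The payload of a symlink entry is the link target path.
    zf.writestr(info, os.readlink(link_path))
```

Note that this only helps if whatever reads the chunk restores links. Info-ZIP's unzip does, but to my knowledge `ZipFile.extractall` writes such an entry as a regular file containing the target path, so the reading side would need changes too.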
Steps to Reproduce (for bugs)
n/a
Context
Each video is around 2 GB, and we have dozens of them. Each will be annotated by several students. We expect multiple users to work on the same dataset so that their annotations can be compared.
Your Environment
git log -1: commit 76fc8e4 (HEAD -> develop, origin/develop)
docker version: Docker version 19.03.5, build 633a0ea838