feat: Remove original file stem from filenames in the blobstorage (#4309) #4583

iequidoo · 2023-07-27T20:05:41Z

Based on #5495

r10s · 2023-08-04T18:30:42Z

thinking it over again, forcing random file names in the blob-directory will worsen UX, at least on deltachat-desktop and at least currently:

on desktop, currently, e.g. a pdf-reader is opened without the file being copied, so the pdf-reader shows the blob filename - and there, something like one-sheet.pdf or one-sheet-12345678.pdf is better than 1234567812345678.pdf. compare e.g. the following pdf viewer on macos - the title would become something totally random with this pr:

i know, the idea is that all files should be copied before being accessed from external apps; however, we're not there yet on any UI. it seems to be most visible on desktop, cc @Simon-Laux

so, as the filename in the blob-directory does not have a meaning anyway and we have no consistency there anyway, it would remove pressure from UIs and simplify getting this pr to stable if we keep filename generation as is (or maybe always add the random number so that it is more visible where the "wrong" filenames are used).

this pr could then only add the save-function, use correct names on forwarding, and whatever else is done here (@iequidoo i find pr descriptions like "some discussions here and there" without some additional words hard for reviewers, it would be good to always have at least a handful of sentences in your own words about reasoning, effects, what it succeeds, etc.)

as a side-effect, this would keep debugging blob-dir simple :)

when the UIs are adapted to the new API, we can reconsider using completely random file names.

link2xt · 2023-08-04T18:43:41Z

The main tracking issue for this PR is #4309

This PR is not meant to be a next step at this point, see my comment at #4309 (comment)
We indeed first need to adapt the UIs to avoid opening the files directly from the blobstore. Desktop is currently not even using the filename for the Save As dialog: deltachat/deltachat-desktop#3330

iequidoo · 2023-08-05T03:14:26Z

> @iequidoo i find pr descriptions as "some discussions here and there" without some additional words hard for reviewers, it would be good to always have at least a handful sentences in your own words about reasonings, effects, successor of etc

Ok, i will improve the commit messages instead; i prefer putting everything there and using PR descriptions just for additional info.

As for dc_msg_get_file(), i can implement what @link2xt suggested:

> dc_msg_get_file API should create a temporary directory inside the blobdir, save a copy with unmangled filename, and return its path. In the housekeeping we regularly clear these directories. This API is kept for compatibility, but deprecated.

but then we need to add some new call like dc_msg_get_filedata_path() which returns the path of the file with the mangled name (but with the original extension) to avoid unnecessary copying.
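A minimal sketch of the copy-on-access idea quoted above, assuming only `std::fs`. The helper name `export_with_original_name`, its signature and the temporary subdirectory name are hypothetical and not part of the core API:

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical helper: copies `blob` to `<blobdir>/<tmp_subdir>/<original_name>`
/// and returns the path of the copy.
fn export_with_original_name(
    blobdir: &Path,
    tmp_subdir: &str,
    blob: &Path,
    original_name: &str,
) -> io::Result<PathBuf> {
    // Keep only the last path component, so that a name like "../../x"
    // cannot place the copy outside of the temporary directory.
    let safe_name = Path::new(original_name)
        .file_name()
        .ok_or_else(|| io::Error::new(io::ErrorKind::InvalidInput, "no usable filename"))?;

    // The temporary directory lives inside the blobdir; a housekeeping job
    // (not shown here) would be expected to remove it later.
    let dir = blobdir.join(tmp_subdir);
    fs::create_dir_all(&dir)?;

    let target = dir.join(safe_name);
    fs::copy(blob, &target)?;
    Ok(target)
}
```

Every call costs a full file copy, which is why a separate cheap accessor returning the mangled path is discussed alongside it.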

iequidoo · 2023-08-05T03:33:17Z

Also i can just remove the last commit so that only the suffixes are random, not the whole filenames. Anyway, even now dc_msg_get_file() can return files with random suffixes in case of filename duplication

iequidoo · 2024-01-24T02:51:17Z

Removed the last commit to unblock merging this

r10s

a high-level comment:

making dc_msg_get_file() return the unmangled name will improve some things in the UI. at the same time, as this requires copying the file, it converts the function from a cheap one into an expensive one, potentially worsening things - UIs have used dc_msg_get_file() more like a get_state() call: it is always fast, so there was no need to consider that it can be slow or that results should be saved.

to address this, UIs have to check all usages of dc_msg_get_file() - to avoid side effects, but also because dc_msg_get_file() would be deprecated anyway

but if UIs need to check this anyway, then it seems better to leave the functionality of dc_msg_get_file() as is, so that it becomes the dc_msg_get_filedata_path() (which it mostly is already; the function description is mostly the same). UI-wise, things may get slightly worse, as the filename is always “bad” - but UIs would need to do the checks anyway, and with this approach:

- no unexpected slowness or worsening can happen - there are lots of calls to dc_msg_get_file(), some in already complex areas such as "share", so i expect things to be overlooked
- no new deprecated API
- things are working as before (apart from mangled names appearing more often)
- it is very visible to devs/users what is missing: if e.g. external apps show the mangled names, one needs to transfer things via a BLOB+dc_msg_get_filename() or use dc_msg_save_file()
- finally, but only as a side note on an implementation detail: not creating additional subdirectories in blobdir may have maintenance benefits in the future

it might be that i voted against that approach before, idk, but thinking it over, it looks better to me, also given the behaviour change dc_msg_get_file() would otherwise undergo

Recently there was an incident with a chatbot whose avatar, set from the command line, was replaced with an unrelated avatar of a contact. Both the `selfavatar` setting and the contact avatar `i` param pointed to `$BLOBDIR/avatar.png` at the time it was detected. How this happened is unclear, but it is possible that `avatar.png` was removed, unmounted or otherwise not detected by the core, and the core stored the avatar received from the contact as `avatar.png`, while the `selfavatar` config still pointed to `$BLOBDIR/avatar.png`. Such bugs are unavoidable even if the core itself has no bugs, as we cannot rely on the blobdir not residing on a faulty network filesystem, not being incorrectly backed up and restored, etc., so we should assume that files may be randomly removed. Then there may be dangling `$BLOBDIR/...` references in the database which may accidentally point to unrelated files, possibly even to an `avatar.png` file sent to the bot in private. To prevent such bugs, we add random filename suffixes for the blobdir objects. Thanks to the added `Param::Filename`, these random suffixes aren't sent over the network.
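The suffix scheme described in this commit message might look roughly like the sketch below. It assumes only the standard library; `random_hex` and `blob_name` are hypothetical names, and `RandomState` is a stand-in source of non-cryptographic randomness, not necessarily what the core uses:

```rust
use std::collections::hash_map::RandomState;
use std::hash::{BuildHasher, Hasher};
use std::path::Path;

/// Returns 16 hex digits derived from the randomly seeded std hasher.
/// This is a stand-in for a proper random number generator.
fn random_hex() -> String {
    let value = RandomState::new().build_hasher().finish();
    format!("{:016x}", value)
}

/// Builds a blobdir name of the form `<stem>-<suffix>.<ext>`, so that two
/// files both called `avatar.png` never end up under the same name.
fn blob_name(original: &str, suffix: &str) -> String {
    let path = Path::new(original);
    // Fall back to a fixed stem if the original name has none.
    let stem = path.file_stem().and_then(|s| s.to_str()).unwrap_or("file");
    match path.extension().and_then(|e| e.to_str()) {
        Some(ext) => format!("{}-{}.{}", stem, suffix, ext),
        None => format!("{}-{}", stem, suffix),
    }
}
```

With such names, a dangling database reference to a removed blob is very unlikely to match a file stored later under the same original name, because the later file gets a different suffix.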

This way filenames in the blobstorage are just random hex numbers, which also allows us to get rid of the `sanitize-filename` dependency. It requires `Param::Filename` to be set to "debug_logging*.xdc" for messages containing logging webxdc-s, otherwise they are not detected properly. This is done in "fix: Message::set_file_from_bytes(): Set Param::Filename", so don't forget to update senders as well.
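Once the blob name is a random hex number, detecting a logging webxdc has to rely on the stored original filename. A sketch of that idea follows; the function name is hypothetical and the exact matching rule used by the core is an assumption here, derived only from the "debug_logging*.xdc" pattern mentioned above:

```rust
/// Hypothetical check: does the original filename (as kept in
/// `Param::Filename`) match the pattern "debug_logging*.xdc"?
fn is_debug_logging_xdc(original_filename: &str) -> bool {
    original_filename.starts_with("debug_logging") && original_filename.ends_with(".xdc")
}
```

Applied to a blob name such as `0123456789abcdef.xdc` the check fails, which illustrates why the original filename must be set on the message.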

…pace (#5338) Previously, file extensions were also limited to 32 chars, but extra chars at the beginning were just cut off, e.g. "file.with_lots_of_characters_behind_point_and_double_ending.tar.gz" was considered to have the extension "d_point_and_double_ending.tar.gz". It is better to take only "tar.gz" then. Also, don't include parts containing whitespace in extensions; file extensions generally don't contain whitespace.
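The rule from this commit message can be sketched as follows. This is an illustration under stated assumptions rather than the actual implementation: the function name `extension` and the constant `MAX_EXT_LEN` are hypothetical, and only whole dot-separated parts are accepted, counted from the end of the name:

```rust
/// Upper bound on the extension length, taken from the commit message.
const MAX_EXT_LEN: usize = 32;

/// Returns the extension of `filename` without the leading dot, or an empty
/// string if there is none.
fn extension(filename: &str) -> String {
    // The part before the first dot is the stem and never belongs to the extension.
    let parts: Vec<&str> = filename.split('.').collect();
    let mut ext = String::new();
    // Walk the remaining parts from the end and prepend whole parts only.
    for part in parts[1..].iter().rev() {
        // Stop at empty parts ("a..gz") and at parts containing whitespace.
        if part.is_empty() || part.chars().any(char::is_whitespace) {
            break;
        }
        // Length of the extension if this part (plus a dot) were prepended.
        let new_len = part.chars().count()
            + if ext.is_empty() { 0 } else { 1 + ext.chars().count() };
        if new_len > MAX_EXT_LEN {
            break;
        }
        ext = if ext.is_empty() {
            part.to_string()
        } else {
            format!("{}.{}", part, ext)
        };
    }
    ext
}
```

For the filename from the commit message this yields "tar.gz", because prepending the long middle part would exceed 32 chars and is therefore dropped as a whole instead of being cut in the middle.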

iequidoo · 2024-10-08T21:44:48Z

Opened #6033 instead

iequidoo force-pushed the iequidoo/blob-random-filename branch from 05a5904 to f8133a1 Compare July 28, 2023 03:16

iequidoo marked this pull request as ready for review July 28, 2023 04:08

iequidoo requested review from link2xt and r10s July 28, 2023 04:09

iequidoo force-pushed the iequidoo/blob-random-filename branch 3 times, most recently from d14434c to 1c8fa0d Compare July 30, 2023 05:46

iequidoo changed the title ~~feat: Use random filename suffixes for blobstorage (#4309)~~ feat: Use random filenames for blobstorage (#4309) Jul 30, 2023

iequidoo force-pushed the iequidoo/blob-random-filename branch from 1c8fa0d to 48edc0c Compare August 3, 2023 19:02

iequidoo force-pushed the iequidoo/blob-random-filename branch from 48edc0c to b79b2e1 Compare August 28, 2023 22:30

link2xt deleted the branch iequidoo/blob-random-file-suffixes October 25, 2023 21:22

link2xt closed this Oct 25, 2023

link2xt reopened this Oct 25, 2023

iequidoo changed the base branch from master to main October 26, 2023 01:13

iequidoo force-pushed the iequidoo/blob-random-filename branch from b79b2e1 to 0776811 Compare October 28, 2023 03:13

iequidoo marked this pull request as draft October 28, 2023 03:24

iequidoo force-pushed the iequidoo/blob-random-filename branch from 0776811 to eadd138 Compare October 28, 2023 08:03

iequidoo marked this pull request as ready for review October 28, 2023 08:06

link2xt force-pushed the main branch from ca190eb to d7aecab Compare November 10, 2023 17:56

iequidoo force-pushed the iequidoo/blob-random-filename branch from eadd138 to bb48814 Compare November 14, 2023 07:32

iequidoo force-pushed the iequidoo/blob-random-filename branch from bb48814 to 53f01ca Compare December 2, 2023 20:04

iequidoo force-pushed the iequidoo/blob-random-filename branch from 53f01ca to f2bae94 Compare January 7, 2024 01:46

iequidoo force-pushed the iequidoo/blob-random-filename branch from f2bae94 to 7358c33 Compare January 24, 2024 02:47

iequidoo changed the title ~~feat: Use random filenames for blobstorage (#4309)~~ feat: Use random filename suffixes for blobstorage (#4309) Jan 24, 2024

link2xt force-pushed the main branch from 50ac1f7 to cd9f6c3 Compare March 1, 2024 20:28

iequidoo changed the base branch from main to iequidoo/dc_msg_save_file March 13, 2024 04:42

iequidoo force-pushed the iequidoo/blob-random-filename branch from 49b0e82 to aaffb9d Compare March 18, 2024 05:32

iequidoo force-pushed the iequidoo/dc_msg_save_file branch from a1bf4a9 to e635108 Compare March 18, 2024 05:33

iequidoo force-pushed the iequidoo/dc_msg_save_file branch 2 times, most recently from fbceb01 to 435ff59 Compare April 24, 2024 02:32

iequidoo force-pushed the iequidoo/blob-random-filename branch from aaffb9d to 27dc07e Compare April 24, 2024 02:51

Base automatically changed from iequidoo/dc_msg_save_file to main April 24, 2024 19:38

iequidoo force-pushed the iequidoo/blob-random-filename branch from 27dc07e to 25f1eec Compare April 25, 2024 02:12

iequidoo mentioned this pull request Apr 25, 2024

feat: Store blobs in subdirs with random names (#4309) #5495

Open

iequidoo changed the base branch from main to iequidoo/blob-random-file-suffixes April 25, 2024 02:20

r10s reviewed Apr 25, 2024

iequidoo force-pushed the iequidoo/blob-random-file-suffixes branch from b248bbf to ed33f30 Compare April 26, 2024 00:19

iequidoo added 3 commits April 27, 2024 01:02

feat: MsgId::get_info(): Report original filename as well

7391c8d

iequidoo force-pushed the iequidoo/blob-random-filename branch from 25f1eec to 562c4f8 Compare April 27, 2024 04:17

iequidoo requested a review from r10s April 27, 2024 04:19

iequidoo force-pushed the iequidoo/blob-random-file-suffixes branch 3 times, most recently from deca65d to 0610b7d Compare May 23, 2024 21:51

iequidoo force-pushed the iequidoo/blob-random-file-suffixes branch 4 times, most recently from 8a315e8 to d2e299d Compare October 5, 2024 18:47

iequidoo marked this pull request as draft October 6, 2024 21:23

iequidoo changed the title ~~feat: Use random filenames for blobstorage (#4309)~~ fix: Assume file extensions are 32 chars max and don't contain whitespace (#5338) Oct 6, 2024

iequidoo closed this Oct 8, 2024

iequidoo changed the title ~~fix: Assume file extensions are 32 chars max and don't contain whitespace (#5338)~~ feat: Remove original file stem from filenames in the blobstorage (#4309) Oct 8, 2024
