-
-
Notifications
You must be signed in to change notification settings - Fork 4.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix TypeError on large folder downloads #27562
Fix TypeError on large folder downloads #27562
Conversation
cc @ChristophWurst @juliushaertl @nickvergessen for feedback on this |
FYI I do mostly auth, APIs and dependencies in the server repo :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code looks good to me
A bit of testing would be nice :)
@skjnldsv I'm more than happy to help, but I have not written PHP since high school (must have been PHP4 or early PHP5 - can't really remember). So I'm not sure if I am more help than hazard here 😄 I am currently running the "smaller" fix using |
/rebase |
What is missing here? |
@szaimen I wrote this fix off the top of my head - and I have not written anything substantial in PHP for about a decade. Also, the code is untested. I created the PR as a base for discussion, as requested in #15117. If you guys think the code is good enough to be merged, I will obviously not object. But I advice extra caution 😄 |
This comment has been minimized.
This comment has been minimized.
@j-be can you rebase please? |
There is an issue when downloading large folders (i.e. bigger than 2GB) as archives. This commit tries to improve the situation. Signed-off-by: Juri Berlanda <juriberlanda@hotmail.com>
Signed-off-by: Juri Berlanda <juriberlanda@hotmail.com>
3bf788a
to
2b351aa
Compare
@skjnldsv here you go 😃 |
Test failure unrelated |
sorry for the delay. We discussed this internally and decided that it should not be done like this. There should rather be a config option which allows the admin to specify the archive format. cc @skjnldsv @PVince81 @nickvergessen |
@szaimen no worries 😃 That sounds like a good idea. Feel free to close this PR if it is not needed anymore. |
@szaimen Sounds good. |
Probably not relevant anymore due to: nextcloud/documentation#9071 But I like the idea to have an option for specifying the archive format. zip is really the worst possible option, as long as you don't run one of these Windows systems without any proper archiver installed. |
Closing then |
@skjnldsv For the record: the issue is still present on 32 Bit machines. But taken this is now over 2 and a half years old and I see there is no interest in tackling this, I am beyond caring. To close some honest feedback: Trying to contribute to the project sucks. My working PR was declined for an alternative solution, which never materialized leaving the actual issue present to this day. |
@j-be are you sure it is still an issue? I just successfully downloaded a 2.4 GiB directory from my NC 28.0.1, hosted on a Raspberry Pi 2 ARMv7 with 32-bit Linux OS. No error message was triggered anywhere and the downloaded archive was complete. |
@MichaIng Not entirely sure as I am applying my patch (i.e. this PR) on every update of Nextcloud. Did you try to download a folder, that will lead to a Zip bigger than 4GB from non Mac (it will correctly fall back to Tar only on MacOSX, not on Windows and Linux)? I feel like you should end up with either an error, or an invalid Zip file. |
I just downloaded a 4.3 GiB archive, packaged as 4.1 GiB sized zip archive. It is all complete. There have been a number of 32-bit bit related fixes, after it broke completely with NC25 (IIRC), after which 32-bit support was aimed to be removed, and then the decision was reverted, with fixed 32-bit support in NC26. This included fixes for old 32-bit issues, like downloads and trashbin. |
@MichaIng That is very weird. If you look at the comment in the constructor of https://github.com/nextcloud/server/blob/master/lib/private/Streamer.php it would indicate, that a Zip32 file with more than 4GB should not be valid. But assuming I seem to remember I saw somewhere, that the used library cannot create Zip64 files on 32Bit machines (not sure why, never cared), which is why the Tar thingy is there to begin with. Am I missing something, or how is it possible this code produces a valid Zip archive with more than 4GB content on a 32Bit machine? |
Hmm, strange indeed. This interface is definitely used? The condition to set root@micha:~# php -r 'echo PHP_INT_SIZE . PHP_EOL;'
4 I could add some debug code to just verify that this particular condition was met/code line reached, just to be sure. And is there a way to check the format of the downloaded zip? I'll just re-download it and see. I find this "32-bit workaround" strange anyway. I mean if the size is above 4 GB (first condition is false), and 32-bit zips are really not able to contain >4 GiB data, then NC shouldn't try to generate that zip in the first place. But it just generates a 32-bit zip then. So either 32-bit zips are able to contain data >4 GiB size, or this is complete nonsense. The comments above the code indicate that it is complete nonsense, Actually I do not understand either why a 32-bit zip is used when the size is below 4 GB, as even the comments state that it can fail when the "central directory" is larger (whatever this is). Is there any benefit of 32-bit zip over 64-bit zip? Otherwise, on non-mac (what about Linux clients?), why not use 64-bit zip always and disable it for 32-bit servers only? EDIT: Ah okay, I checked the zip falsely last time. Indeed the data ending within the zip archive is at point 4 GiB. I did not recognise at is was some DVD VOB files, and I accidentally checked the 2nd last one last time, which was complete, as every file has exactly 1 GiB. But the 5th is an additional 110 MiB and is broken. The archive itself still has the whole 4.11 GiB size. Checking back at the PR: Does the TAR streamer do any compression or just uncompressed TAR? Because if it does no compression, then I indeed would prefer zip whenever possible. Actually But the actual issue is the broken >4 GiB archives with 32-bit servers. And it does not help much to use TAR in some more cases (Linux), but ist must be simply ruled out that a >4 GiB zip archive is even tried to be generated on a 32-bit server, but instead a proper error message must be presented that this is (currently) not possible. The 32-bit vs 64-bit decision should/can then be done based on the architecture, ignoring the anyway not failsafe size check, at least as long as there are no other downsides of 64-bit zips. As it is the default in this ZipStreamer, I assume there are none. A quick other approach, if >4 GiB tarballs are possible, would be of course to stream TAR on all 32-bit servers. Like any download is better than no download? |
I just tested a tarball download, and that one was complete. So we can use that as fallback for larger directories on 32-bit servers. Not sure about the max number of files there, but that should be a rare case compared to the max size. So taking everything together, unless someone finds time for proper GUI adjustments (GUI error message, archive type selector etc) NOW, I suggest this quick "fix": // Use TAR if preferred, it does support > 4 GiB on 32-bit servers
if ($request->isUserAgent($this->preferTarFor)) {
$this->streamerInstance = new TarStreamer();
// 32-bit server
} elseif (PHP_INT_SIZE === 4) {
// Use 32-bit zip if size and numberOfFiles allow it
if ($size > 0 && $size < 4 * 1000 ** 3 && $numberOfFiles < 65536) {
$this->streamerInstance = new ZipStreamer(['zip64' => false]);
// Use TAR as fallback, it does support > 4 GiB on 32-bit servers
} else {
"log info that zip archives with > 4 GiB data or > 65535 files are not supported on 32-bit systems, and an uncompressed TAR archive is downloaded instead"
$this->streamerInstance = new TarStreamer();
}
// 64-bit server: Use 64-bit zip, respectively whatever the streamer does by default in this case
} else {
$this->streamerInstance = new ZipStreamer();
} Ah, I see that this PR is doing effectively the same. However, I would avoid the math on 64-bit servers and just serve 64-bit zips right away, respectively just what the streamer serves by default, regardless which size or number of files the archive has. Since the tarball is completely uncompressed, I would not add further client to prefer it. Not sure whether zip is really so badly supported on Apple systems, but we might even want to prefer zip in every case, and use the raw tarball really only as fallback for large directories on 32-bit systems? For smaller directories it is already the case now, so this would mean a change only for large directories, where the compression is even more reasonable. Not sure how to implement the log right now. Would need to check some other files to find the right interface for this. Not sure whether log level info or warning is appropriate. We do not want to spam higher severity entries into the log on a regular basis, if e.g. larger downloads are more common on this server, but we want admins to at least recognise it once, to avoid related questions popping up all the time. Since there is an admin panel warning and link to docs already on 32-bit servers, we might add the information here and keep the log level low: https://docs.nextcloud.com/server/latest/admin_manual/installation/system_requirements.html#cpu-architecture-and-os And just to make that point clear: I totally agree with @j-be that we should fix this now (it is a bug, not an enhancement) instead of waiting for someone to implement a proper GUI choice, GUI error or such. The above code should assure that a valid archive is downloaded in every case, while currently on 32-bit servers, a EDIT: Ah, the broken 32-bit zips are not generated silently. There is an overflow exception generated: {
"reqId": "0nOwimsvOqohDi8aelBr",
"level": 3,
"time": "2024-02-26T14:50:18+01:00",
"remoteAddr": "192.168.1.1",
"user": "Micha",
"app": "no app in context",
"method": "GET",
"url": "/nextcloud/apps/files/ajax/download.php?dir=%2FVideos%2FDVD%20Hochzeit%20Zeremonie&files=%5B%22VIDEO_TS_1%22%5D&downloadStartSecret=3ococfijiya",
"message": "Exception thrown: OverflowException",
"userAgent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/123.0.0.0 Safari/537.36 OPR/109.0.0.0 (Edition developer)",
"version": "28.0.1.1",
"exception": {
"Exception": "OverflowException",
"Message": "Count64 object limited to 32 bit (overflow)",
"Code": 0,
"Trace": [
{
"file": "/var/www/nextcloud/3rdparty/deepdiver/zipstreamer/src/ZipStreamer.php",
"line": 222,
"function": "add",
"class": "ZipStreamer\\Lib\\Count64_32",
"type": "->"
},
{
"file": "/var/www/nextcloud/lib/private/Streamer.php",
"line": 172,
"function": "addFileFromStream",
"class": "ZipStreamer\\ZipStreamer",
"type": "->"
},
{
"file": "/var/www/nextcloud/lib/private/Streamer.php",
"line": 139,
"function": "addFileFromStream",
"class": "OC\\Streamer",
"type": "->"
},
{
"file": "/var/www/nextcloud/lib/private/legacy/OC_Files.php",
"line": 216,
"function": "addDirRecursive",
"class": "OC\\Streamer",
"type": "->"
},
{
"file": "/var/www/nextcloud/apps/files/ajax/download.php",
"line": 77,
"function": "get",
"class": "OC_Files",
"type": "::"
},
{
"file": "/var/www/nextcloud/lib/private/Route/Route.php",
"line": 155,
"args": [
"/var/www/nextcloud/apps/files/ajax/download.php"
],
"function": "require_once"
},
{
"function": "OC\\Route\\{closure}",
"class": "OC\\Route\\Route",
"type": "->",
"args": [
"*** sensitive parameters replaced ***"
]
},
{
"file": "/var/www/nextcloud/lib/private/Route/Router.php",
"line": 324,
"function": "call_user_func"
},
{
"file": "/var/www/nextcloud/lib/base.php",
"line": 1069,
"function": "match",
"class": "OC\\Route\\Router",
"type": "->"
},
{
"file": "/var/www/nextcloud/index.php",
"line": 39,
"function": "handleRequest",
"class": "OC",
"type": "::"
}
],
"File": "/var/www/nextcloud/3rdparty/deepdiver/zipstreamer/src/Lib/Count64_32.php",
"Line": 82,
"CustomMessage": "Exception thrown: OverflowException"
}
} But an automatic fallback to TAR is still better. |
As requested in #15117, here a PR for discussion.
Additionally, I propose to add Linux to the
preferTarFor
user agent matches.Please note the
TODO
: I would not consider this PR ready to be merged until that is fixed.