7z performance #4

Open
selmf opened this issue Aug 21, 2015 · 1 comment

Comments

selmf commented Aug 21, 2015

Hi Zeniko,

It's me again. This time it's not about a bug or a suggestion regarding Linux/Unix, but a technical question. I have been experimenting with adding .7z support to my CMake build script for unarr, using what you do for SumatraPDF as a blueprint, and I've mostly succeeded. However, because I habitually test with large files, I have run into a problem that is something of a deal-breaker for me. Apparently, the ANSI C 7z code that unarr uses for 7z support first decodes a whole block of the 7z archive before it lets you access any of the files inside that block. On a small archive this isn't a problem, but with larger archives, from several hundred megabytes up to a few gigabytes in the extreme case, the first page of a comic or the first file of the archive only becomes available after a serious delay.
Since my use case involves extracting the covers of whole libraries of comics, this is a serious problem. Also, I don't feel confident enough in my programming skills to fix this limitation myself. So my question to you is: do you have any idea how to proceed on this problem? Is it worth working on at all, or is it a lost cause? 7z upstream suggests using the C++ code, by the way, but as far as I know it's tied too heavily to the Windows API to be of any use to me.

Best Regards,

Selmf

Owner

zeniko commented Aug 25, 2015

This is a known issue for 7z archives using solid compression. A proper fix would require reimplementing SzArEx_Extract and all of 7zDec.c so that they behave like unarr's RAR decompressor, which only decompresses up to the required file and keeps only the decompression state in memory instead of all the uncompressed data.
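
To illustrate the difference, here is a minimal sketch of the "decompress only up to the required file" approach. It is not unarr or 7z code: the `toy_*` names are hypothetical, and a trivial delta codec (each output byte is the running sum of the input bytes) stands in for LZMA, so the decoder state is a single byte rather than a dictionary plus range-coder state.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical streaming decoder for one solid block. Only the decoder
 * state is kept in memory, never the whole uncompressed block. */
typedef struct {
    const uint8_t *src; /* compressed solid block */
    size_t src_len;
    size_t src_pos;     /* compressed bytes consumed so far */
    size_t out_pos;     /* uncompressed bytes produced so far */
    uint8_t prev;       /* toy decoder state carried between calls */
} toy_stream;

void toy_init(toy_stream *s, const uint8_t *src, size_t len)
{
    assert(s != NULL && src != NULL);
    s->src = src;
    s->src_len = len;
    s->src_pos = 0;
    s->out_pos = 0;
    s->prev = 0;
}

/* Decode up to `count` bytes. If `dst` is NULL the bytes are decoded and
 * discarded, which is how data before the wanted file is skipped. */
size_t toy_read(toy_stream *s, uint8_t *dst, size_t count)
{
    size_t done = 0;
    while (done < count && s->src_pos < s->src_len) {
        s->prev = (uint8_t)(s->prev + s->src[s->src_pos++]);
        if (dst)
            dst[done] = s->prev;
        done++;
        s->out_pos++;
    }
    return done;
}

/* Extract the file occupying [offset, offset + size) of the uncompressed
 * block. Returns 1 on success, 0 on failure. Only forward access works;
 * going backwards would require restarting the stream. */
int toy_extract(toy_stream *s, size_t offset, size_t size, uint8_t *dst)
{
    size_t skip;
    if (offset < s->out_pos)
        return 0;
    skip = offset - s->out_pos;
    if (toy_read(s, NULL, skip) != skip)
        return 0;
    return toy_read(s, dst, size) == size;
}
```

With this shape, extracting the first file of a block (a cover page, say) consumes only as much input as that file needs, while later files cost a decode-and-discard pass over everything before them.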

I'll look into it should I ever get sufficiently bored or annoyed by the limitation (unlikely). Patches would be welcome, though, and I might be able to help a bit with the implementation, should you or anybody else feel up to it.

BTW: Using solid compression is not recommended for comic books anyway, since it can significantly slow down random access to files (unless you're willing to cache a few MB of decompression state per file, which may also be prohibitive for large archives).
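
The caching trade-off mentioned here can be sketched as per-file checkpoints: one sequential pass records the decoder state at the start of each file, and later random access resumes from the saved state. Again this is only an illustration under assumptions: all `toy_*` names are hypothetical, and the toy delta codec keeps two fields of state where a real LZMA decoder would have to snapshot its dictionary as well, which is where the "few MB per file" cost comes from.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_FILES 16

/* Hypothetical decoder state; tiny here, several MB for a real codec. */
typedef struct {
    size_t src_pos; /* next compressed byte to read */
    uint8_t prev;   /* last decoded byte (toy delta codec) */
} toy_state;

typedef struct {
    const uint8_t *src;                  /* compressed solid block */
    size_t src_len;
    size_t file_count;
    size_t file_size[TOY_MAX_FILES];     /* files are stored back to back */
    toy_state checkpoint[TOY_MAX_FILES]; /* decoder state at each file start */
} toy_archive;

/* Decode up to `count` bytes from state `st`; a NULL `dst` discards them. */
static size_t toy_decode(const toy_archive *ar, toy_state *st,
                         uint8_t *dst, size_t count)
{
    size_t done = 0;
    while (done < count && st->src_pos < ar->src_len) {
        st->prev = (uint8_t)(st->prev + ar->src[st->src_pos++]);
        if (dst)
            dst[done] = st->prev;
        done++;
    }
    return done;
}

/* One sequential pass that saves the state before each file. */
int toy_build_checkpoints(toy_archive *ar)
{
    toy_state st = { 0, 0 };
    assert(ar->file_count <= TOY_MAX_FILES);
    for (size_t i = 0; i < ar->file_count; i++) {
        ar->checkpoint[i] = st;
        if (toy_decode(ar, &st, NULL, ar->file_size[i]) != ar->file_size[i])
            return 0;
    }
    return 1;
}

/* Random access: resume from the checkpoint instead of the block start. */
int toy_extract_at(const toy_archive *ar, size_t index, uint8_t *dst)
{
    toy_state st;
    if (index >= ar->file_count)
        return 0;
    st = ar->checkpoint[index];
    return toy_decode(ar, &st, dst, ar->file_size[index]) == ar->file_size[index];
}
```

Memory use grows with the number of files times the size of one saved state, so for an archive with hundreds of pages the checkpoint table could rival the size of the uncompressed data it is meant to avoid holding.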

2 participants