Proxy errors #9

Closed
1223334444abc opened this issue May 27, 2023 · 22 comments
Labels
bug (Something isn't working), enhancement (New feature or request)

Comments

@1223334444abc

There seem to be some errors in the proxy handling. When I specify a proxy server, for example --proxy http://127.0.0.1:1080, I still get unreachable errors. With --proxy set, I can see a connection being established to kemono.party, but various connection errors are still reported.

But when I routed all network traffic through a virtual network adapter, the errors no longer occurred and all downloads proceeded normally. The virtual network adapter and the proxy server use the same upstream server, so I suspect that some network connections are not covered by the proxy setting.

Here are some error messages I have encountered:

Error getting favorites: Get https://kemono.party/api/favorites?type=user: dial tcp 199.59.148.209:443: connectex: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.

HTTP: EOF (I forgot the exact message)

@1223334444abc
Author

I have tried HTTP, HTTPS, and SOCKS5 proxies, but none of them solved the problem.

@1223334444abc
Author

Most of the time the error occurs when fetching the favorites list, but sometimes the image downloads start and then show no speed at all.

@elvis972602 added the bug label on May 27, 2023
@elvis972602
Owner

elvis972602 commented May 27, 2023

It seems that some requests are not covered by the proxy.
I will try to fix it.
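
For illustration only: one common way requests end up bypassing a proxy in Go is that some code paths use `http.Get` or a separately constructed client instead of the client that was configured with the proxy. Whether that was the cause here is not stated in the thread. A minimal sketch, assuming a single shared client is built from the `--proxy` value (the helper names `newProxiedClient` and `proxyFor` are hypothetical and not taken from the project):

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
)

// newProxiedClient returns a client whose transport sends every request
// through proxyAddr. Hypothetical helper, not the project's actual code.
func newProxiedClient(proxyAddr string) (*http.Client, error) {
	proxyURL, err := url.Parse(proxyAddr)
	if err != nil {
		return nil, fmt.Errorf("parse proxy address: %w", err)
	}
	transport := &http.Transport{Proxy: http.ProxyURL(proxyURL)}
	return &http.Client{Transport: transport}, nil
}

// proxyFor reports which proxy the client would pick for targetURL,
// or "" if the request would go out directly.
func proxyFor(client *http.Client, targetURL string) (string, error) {
	req, err := http.NewRequest(http.MethodGet, targetURL, nil)
	if err != nil {
		return "", err
	}
	t, ok := client.Transport.(*http.Transport)
	if !ok || t.Proxy == nil {
		return "", nil
	}
	u, err := t.Proxy(req)
	if err != nil || u == nil {
		return "", err
	}
	return u.String(), nil
}

func main() {
	client, err := newProxiedClient("http://127.0.0.1:1080")
	if err != nil {
		panic(err)
	}
	p, err := proxyFor(client, "https://kemono.party/api/favorites?type=user")
	if err != nil {
		panic(err)
	}
	fmt.Println("proxy in use:", p)
}
```

Passing this one client to every component that makes requests (API calls and file downloads alike) is what keeps a stray request from going out directly.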

@1223334444abc
Author

1223334444abc commented May 28, 2023

unexpected EOF
download post error: failed to write file: unexpected EOF
download post: xxxxxxx

Here is a new problem. When a file download hits the error above, it is not retried automatically; the file is skipped, and an incomplete .tmp file is left behind.

And there is another small issue.
09m27.54s Download 70.3% 800 B/s 634.08 KB 16.png
(Most files download at 200-300 KB/s.)
Due to network problems, some file downloads can stall for a long time. Could there be a mechanism to handle this? For example, a download timeout based on average download speed and file size, or an automatic retry after the speed stays below 1 KB/s for a set number of seconds?
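
The second idea (retry once the speed has stayed below a threshold for some number of seconds) could be sketched roughly as follows. This is only an illustration of the suggestion, not the tool's behavior; the type `stallDetector`, its fields, and the 10-second grace period are all assumptions.

```go
package main

import "fmt"

// stallDetector flags a transfer that stays below a minimum speed for
// several consecutive seconds. Hypothetical type for illustration.
type stallDetector struct {
	minBytesPerSec int64 // speed below this counts as "slow"
	graceSeconds   int   // consecutive slow seconds tolerated
	slowSeconds    int   // current run of slow seconds
}

// tick records the bytes received during the last second and reports
// whether the transfer has now been slow for graceSeconds in a row.
func (d *stallDetector) tick(bytesThisSecond int64) bool {
	if bytesThisSecond >= d.minBytesPerSec {
		d.slowSeconds = 0 // a fast second resets the run
		return false
	}
	d.slowSeconds++
	return d.slowSeconds >= d.graceSeconds
}

func main() {
	d := &stallDetector{minBytesPerSec: 1024, graceSeconds: 10}
	for second := 1; second <= 60; second++ {
		if d.tick(800) { // 800 B/s, like the 16.png line in the log
			fmt.Printf("slow for %d s in a row, abort and retry\n", second)
			break
		}
	}
}
```

A downloader would call `tick` once per second with the byte count from its progress tracker and cancel the request when it returns true.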

@elvis972602 added the enhancement label on May 28, 2023
@1223334444abc
Author

1223334444abc commented May 28, 2023

I also have some feature suggestions (taking this page as an example):

  1. Save the text of the "Content" section to a txt file. Some pages contain key information there, such as a Google Drive download link for the full version.

  2. Save the files in the "Downloads" section under their original file names. At present they all seem to be replaced with serial numbers.

  3. Please add the name of the source website, such as fanbox/fantia, before or after <ks:creator>. For example: [Fanbox]xxx

@elvis972602
Owner

Thank you for your advice!
I will try to add some of these features.
Also, does the unexpected EOF happen on a specific post, or is it random?

@1223334444abc
Author

EOF errors occur randomly. They usually do not appear when I download the same file again.
My international network connection is quite unstable, and I have to route everything through a proxy server.

@elvis972602
Owner

I understand. I will try to add automatic re-download and clean up the temporary files.
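
As a rough illustration of what "re-download and clear the temporary files" can look like: write each attempt to a `.tmp` file, delete it on failure, and rename it to the final name only after a complete write. This is a sketch under assumptions, not the fix that was actually shipped; `downloadWithRetry`, `writeOnce`, and `runDemo` are hypothetical names, and the fetch function is simulated.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// downloadWithRetry calls fetch up to maxAttempts times. Data goes to
// dest+".tmp" and is renamed to dest only after a complete write, so a
// failed attempt never leaves a partial file behind.
func downloadWithRetry(dest string, maxAttempts int, fetch func(w io.Writer) error) error {
	if maxAttempts < 1 {
		maxAttempts = 1
	}
	tmp := dest + ".tmp"
	var lastErr error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		if lastErr = writeOnce(tmp, fetch); lastErr == nil {
			return os.Rename(tmp, dest)
		}
		os.Remove(tmp) // discard the incomplete temporary file
	}
	return fmt.Errorf("giving up after %d attempts: %w", maxAttempts, lastErr)
}

// writeOnce runs a single attempt, writing the fetched data to path.
func writeOnce(path string, fetch func(w io.Writer) error) error {
	f, err := os.Create(path)
	if err != nil {
		return err
	}
	if err := fetch(f); err != nil {
		f.Close()
		return err
	}
	return f.Close()
}

// runDemo simulates a download that fails twice with an unexpected EOF
// and then succeeds. It returns the attempt count, the final content,
// and whether a .tmp file was left over.
func runDemo() (int, string, bool, error) {
	dir, err := os.MkdirTemp("", "retry-demo")
	if err != nil {
		return 0, "", false, err
	}
	defer os.RemoveAll(dir)

	dest := filepath.Join(dir, "13.png")
	attempts := 0
	fetch := func(w io.Writer) error {
		attempts++
		if attempts < 3 {
			w.Write([]byte("partial"))
			return io.ErrUnexpectedEOF
		}
		_, werr := w.Write([]byte("complete file"))
		return werr
	}
	if err := downloadWithRetry(dest, 5, fetch); err != nil {
		return attempts, "", false, err
	}
	data, err := os.ReadFile(dest)
	if err != nil {
		return attempts, "", false, err
	}
	_, statErr := os.Stat(dest + ".tmp")
	return attempts, string(data), statErr == nil, nil
}

func main() {
	attempts, content, tmpLeft, err := runDemo()
	fmt.Println(attempts, content, tmpLeft, err)
}
```

A real downloader would probably also wait between attempts and cap the total number of retries.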

@elvis972602
Owner

elvis972602 commented May 28, 2023

The file name is replaced with the file's hash value, which makes it quicker to confirm that a file has already been downloaded and is complete, since file names on the site appear to change.
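
To illustrate why a hash-based name helps with completeness checks: if the expected hash is known, a downloaded file can be verified by hashing its bytes and comparing. The comment does not say which hash algorithm is used; the sketch below assumes SHA-256 purely for illustration, and `sha256Hex` and `isComplete` are hypothetical helpers.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"strings"
)

// sha256Hex streams r through SHA-256 and returns the lowercase hex digest.
func sha256Hex(r io.Reader) (string, error) {
	h := sha256.New()
	if _, err := io.Copy(h, r); err != nil {
		return "", err
	}
	return hex.EncodeToString(h.Sum(nil)), nil
}

// isComplete reports whether data hashes to wantHex (case-insensitive).
// A real downloader would pass an opened file instead of a string.
func isComplete(data, wantHex string) bool {
	got, err := sha256Hex(strings.NewReader(data))
	return err == nil && strings.EqualFold(got, wantHex)
}

func main() {
	// SHA-256 of the ASCII string "hello".
	const want = "2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824"
	fmt.Println(isComplete("hello", want)) // full content matches
	fmt.Println(isComplete("hell", want))  // truncated content does not
}
```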

@1223334444abc
Author

1223334444abc commented May 28, 2023

For the page linked above, I would like the videos in the Downloads section to be saved as "xxx.mov", with the images numbered sequentially in the same folder. This is more convenient for organizing and managing a database.

[Fanbox]xxxx
[20211111] [111111] aaaaaaaaaaaa

xxx.mov
and then 0.png 1.png 2.png ..........
and Content.txt

I also just noticed that after an EOF error, the remaining images in the same post that come after the failed file are not downloaded.

@elvis972602
Owner

I see what you mean. This is a good suggestion, and this naming convention also seems more reasonable.

@1223334444abc
Author

1223334444abc commented May 28, 2023

> It seems that some requests are not covered by the proxy. I will try to fix it.

Thank you for your work. After testing, the new version works with --proxy.

......
5.08s Success ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100.0% 1.29 MB/s 6.53 MB 7.png
5.25s Success ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100.0% 1.24 MB/s 6.50 MB 6.png
302.82ms Failed ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0.0% 0 B/s 0 B 13.png
download failed
6.50s Success ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100.0% 1.00 MB/s 6.53 MB 10.png
2.10s Success ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100.0% 3.13 MB/s 6.56 MB 11.png
......

During the test run, this file reported an error but was skipped rather than retried. I would like every file to be retried on download errors until it succeeds. Perhaps a retry should be forced on any error? (For a large database, finding and filling in gaps afterwards is even more painful.)

Another small question: downloading more than three files simultaneously in proxy mode results in 429 errors. Does this mean that 'max download parallel' needs to be set below 3?

......
1.41s Failed ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0.0% 0 B/s 564 B 6.mp4
http 429
request too many times, retry after 1.0 seconds...
2.22s Failed ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0.0% 0 B/s 564 B 6.mp4
http 429
request too many times, retry after 1.0 seconds...
55.17s Success ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100.0% 3.53 MB/s 194.76 MB 5.mp4
1.50s Failed ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0.0% 0 B/s 564 B 6.mp4
http 429
request too many times, retry after 1.0 seconds...
57.77s Success ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100.0% 4.63 MB/s 267.27 MB 1.mp4
01m48.84s Success ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100.0% 2.62 MB/s 284.65 MB 2.mp4
01m3.03s Success ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100.0% 3.74 MB/s 235.98 MB 6.mp4
......

@elvis972602
Owner

Yes, if you keep encountering HTTP 429, reducing max download parallel may be a good option.
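
For readers curious how a "max download parallel" style limit typically works in Go, here is a minimal sketch using a buffered channel as a semaphore. It is an illustration under assumptions, not the project's implementation; `runLimited` and `demoPeak` are hypothetical names, and each job is a stand-in for one file download.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// runLimited runs every job with at most limit running at once and
// returns the highest concurrency it observed.
func runLimited(jobs []func(), limit int) int {
	if limit < 1 {
		limit = 1
	}
	sem := make(chan struct{}, limit)
	var wg sync.WaitGroup
	var running, peak int32
	for _, job := range jobs {
		wg.Add(1)
		sem <- struct{}{} // blocks while limit jobs are already running
		go func(job func()) {
			defer wg.Done()
			defer func() { <-sem }() // free the slot for the next job
			n := atomic.AddInt32(&running, 1)
			for { // record a new peak if this job raised it
				p := atomic.LoadInt32(&peak)
				if n <= p || atomic.CompareAndSwapInt32(&peak, p, n) {
					break
				}
			}
			job()
			atomic.AddInt32(&running, -1)
		}(job)
	}
	wg.Wait()
	return int(atomic.LoadInt32(&peak))
}

// demoPeak runs n short sleeping jobs under the given limit.
func demoPeak(limit, n int) int {
	jobs := make([]func(), n)
	for i := range jobs {
		jobs[i] = func() { time.Sleep(10 * time.Millisecond) }
	}
	return runLimited(jobs, limit)
}

func main() {
	fmt.Println("peak concurrency:", demoPeak(2, 8))
}
```

Lowering the limit reduces how many requests hit the server at the same moment, which is why it tends to help with HTTP 429 responses.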

@1223334444abc
Author

1223334444abc commented May 28, 2023

I had a rough look at the "59a979f" branch (I can't program, I just skimmed through it). Perhaps .pdf (multi-page manga) and [.psd .psb .sai .pntr .clip] (drawing source files) also need to be considered.
It would be even better to have a category for files other than images.

I checked my fanbox and fantia databases, and these are probably the file formats that are missing.

@elvis972602
Owner

I was also wondering which category to put these graphics files in. Maybe a separate category would be better?

@1223334444abc
Author

1223334444abc commented May 28, 2023

PDF files should preferably be in a separate category, while source files [.psd .psb .sai .pntr .clip] should be in the same category.

This may be redundant, but please also note that the file name for the psd file mentioned above should be "xxx.psd", not "Download xxx.psd".
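
The grouping and file-name cleanup described here could be sketched as below. This only illustrates the request; the category names, the image and video extension lists, and the functions `categorize` and `cleanName` are assumptions and do not reflect how the tool classifies files.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// categorize maps a file name to a category by its extension.
// The source-file extensions come from the comment above; the other
// lists are illustrative guesses.
func categorize(name string) string {
	switch strings.ToLower(filepath.Ext(name)) {
	case ".psd", ".psb", ".sai", ".pntr", ".clip":
		return "source"
	case ".pdf":
		return "pdf"
	case ".png", ".jpg", ".jpeg", ".gif":
		return "image"
	case ".mp4", ".mov", ".m4v":
		return "video"
	default:
		return "other"
	}
}

// cleanName strips a leading "Download " label and surrounding spaces
// from a link text so that only the file name remains.
func cleanName(linkText string) string {
	trimmed := strings.TrimSpace(linkText)
	return strings.TrimSpace(strings.TrimPrefix(trimmed, "Download "))
}

func main() {
	name := cleanName("Download xxx.psd ")
	fmt.Println(name, categorize(name))
}
```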

@1223334444abc
Author

I just remembered a problem from when I passed parameters on the command line: swapping the order of the parameters made it impossible to obtain 'creator'. Since I have switched to a .yaml file, I no longer remember the exact error message, but the problem does exist.

@elvis972602
Owner

I think the current categories are sufficient. You can use the default --template to set the general naming convention and --image-template for the images.
Example:

template: "[<ks:service>] <ks:creator>/<ks:post>/<ks:filename><ks:extension>"
image-template: "[<ks:service>] <ks:creator>/<ks:post>/<ks:index><ks:extension>"
video-template: "[<ks:service>] <ks:creator>/<ks:post>/video/<ks:filename><ks:extension>"

The result will be something like: 0.jpg, 1.jpg, 2.jpg, 3.jpg, xxxx.pdf, video/xxxx.m4v.
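
To make the placeholder substitution concrete, here is a small sketch of how `<ks:...>` tokens in such a template could be expanded into a path. It is an illustration only: `expandTemplate` is a hypothetical function, the sample values are made up, and the assumption that `<ks:extension>` includes the leading dot is inferred from the example output above.

```go
package main

import (
	"fmt"
	"strings"
)

// expandTemplate substitutes every <ks:name> placeholder in tmpl with
// the matching entry from values. Unknown placeholders are left as-is.
func expandTemplate(tmpl string, values map[string]string) string {
	pairs := make([]string, 0, len(values)*2)
	for name, value := range values {
		pairs = append(pairs, "<ks:"+name+">", value)
	}
	return strings.NewReplacer(pairs...).Replace(tmpl)
}

func main() {
	imageTemplate := "[<ks:service>] <ks:creator>/<ks:post>/<ks:index><ks:extension>"
	path := expandTemplate(imageTemplate, map[string]string{
		"service":   "fanbox", // sample values, not real data
		"creator":   "xxx",
		"post":      "aaaaaaaaaaaa",
		"index":     "0",
		"extension": ".jpg",
	})
	fmt.Println(path)
}
```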

@1223334444abc
Author

1223334444abc commented May 30, 2023

I think this is feasible. For files in 'Downloads', normal file names can generally be obtained, so they can indeed be handled together.

If convenient, please provide an exe to try. I am waiting for a new release.

@elvis972602
Owner

Sorry, I forgot.
You can now download it from the releases page.

@1223334444abc
Author

1223334444abc commented May 30, 2023

Congratulations, the program has been running continuously for an hour, and all download errors have been retried. Every file is downloaded correctly according to the rules.
Surprisingly, today I was able to use 'max download parallel: 10' without any problems.
I will continue to run it and observe the situation.

@elvis972602
Owner

Thank you so much for your feedback and advice!
I will close this issue for now. If there are any other problems, you are very welcome to open a new issue.
