A post from AAAAA was not parsable #157

Closed
wacher74 opened this issue Jun 10, 2021 · 7 comments
wacher74 commented Jun 10, 2021

When I start a scan, I get this message:
"A post from AAAAA was not parsable. Setting 'posts per page' to 1 in the Details might increase the downloadable content."
If I hover over that message with the mouse, I get:
"Internal error description
Expecting element 'root' from namespace ''.. Encountered 'None' with name '', namespace ''."

Log details:
20210610 11:08:04.924 Inf IsLongPathSupported: True
20210610 11:08:19.103 Vrb TumblrLikedByCrawler.Crawl:Start
20210610 11:08:20.438 Err AbstractCrawler:ConvertJsonToClass: Could not parse data
20210610 11:08:20.454 Vrb TumblrLikedByCrawler:CrawlPageAsync: System.ArgumentNullException: Value cannot be null.
Parameter name: source
at System.Linq.Enumerable.FirstOrDefault[TSource](IEnumerable`1 source, Func`2 predicate)
at TumblThree.Applications.Crawler.AbstractTumblrCrawler.RetrieveOriginalImageUrl(String url, Int32 width, Int32 height) in C:\projects\Tumblthree\src\TumblThree\TumblThree.Applications\Crawler\AbstractTumblrCrawler.cs:line 360
at TumblThree.Applications.Crawler.AbstractTumblrCrawler.AddTumblrPhotoUrl(String text, Post post) in C:\projects\Tumblthree\src\TumblThree\TumblThree.Applications\Crawler\AbstractTumblrCrawler.cs:line 237
at TumblThree.Applications.Crawler.TumblrLikedByCrawler.AddPhotoUrlToDownloadList(String document) in C:\projects\Tumblthree\src\TumblThree\TumblThree.Applications\Crawler\TumblrLikedByCrawler.cs:line 285
at TumblThree.Applications.Crawler.TumblrLikedByCrawler.d__12.MoveNext() in C:\projects\Tumblthree\src\TumblThree\TumblThree.Applications\Crawler\TumblrLikedByCrawler.cs:line 226
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at TumblThree.Applications.Crawler.TumblrLikedByCrawler.d__7.MoveNext() in C:\projects\Tumblthree\src\TumblThree\TumblThree.Applications\Crawler\TumblrLikedByCrawler.cs:line 109

Desktop (please complete the following information):

  • TumblThree version: 1.4 - 1.6
  • OS: Windows 10 Pro

What can I do?
By the way, there is no 'posts per page' setting in Details.

@thomas694
Contributor

Hello,
thanks for reporting the bug.
We already found the problem: Tumblr changed one of its message formats. The newest version can download these posts again.

@thomas694
Contributor

The issue has been fixed and closed. You can still comment. Feel free to ask for reopening the issue if needed.

@wacher74
Author

wacher74 commented Jun 10, 2021

It's not working.
"A post from AAAAA was not parsable. Setting 'posts per page' to 1 in the Details might increase the downloadable content."
If I hover over that message with the mouse, I get:

Accessed JArray values with invalid key value: "0". Int32 array index expected.

20210611 00:09:34.031	Err 	AbstractTumblrCrawler:RetrieveOriginalImageUrl: System.ArgumentException: Accessed JArray values with invalid key value: "0". Int32 array index expected.
   at Newtonsoft.Json.Linq.JArray.get_Item(Object key)
   at TumblThree.Applications.Crawler.AbstractTumblrCrawler.DeserializeImageResponse(String s) in C:\projects\Tumblthree\src\TumblThree\TumblThree.Applications\Crawler\AbstractTumblrCrawler.cs:line 341
   at TumblThree.Applications.Crawler.AbstractTumblrCrawler.RetrieveOriginalImageUrl(String url, Int32 width, Int32 height) in C:\projects\Tumblthree\src\TumblThree\TumblThree.Applications\Crawler\AbstractTumblrCrawler.cs:line 376

I got this 76 times.

@thomas694
Contributor

Yes, unfortunately they use at least two different formats, depending on the condition/blog. We have to look again.

thomas694 reopened this Jun 11, 2021
thomas694 added a commit that referenced this issue Jun 12, 2021
- Tumblr changed the format of the image response JSON, so the parsing no longer worked. Additionally, it returns two different JSON formats depending on the image or blog.
- Adapted the parsing of the returned JSON.
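
The "Accessed JArray values with invalid key value" error reported above is the kind of failure that occurs when code expects a JSON object but receives a JSON array (or the reverse). As a rough illustration of parsing that tolerates both shapes, here is a minimal sketch in Python rather than the project's C#. It is not TumblThree's actual fix: the function name `pick_largest_image_url`, the field names `url` and `width`, and both payload shapes are assumptions made up for this example, since the thread does not show the real response formats.

```python
import json
from typing import Any, Optional


def _width(entry: dict) -> float:
    """Return the entry's numeric width, or 0 if it is missing or not a number."""
    w = entry.get("width")
    return w if isinstance(w, (int, float)) else 0


def pick_largest_image_url(payload: Optional[str]) -> Optional[str]:
    """Return the URL of the widest image in a JSON response, or None.

    Two hypothetical shapes are accepted (both are assumptions):
      1. a JSON array:   [{"url": ..., "width": ...}, ...]
      2. a JSON object:  {"0": {"url": ..., "width": ...}, "1": {...}}
    """
    try:
        data: Any = json.loads(payload)
    except (TypeError, ValueError):
        # Not a string, or not valid JSON: report "nothing found" instead of raising.
        return None

    # Normalize both shapes into a plain list of entries.
    if isinstance(data, list):
        entries = data
    elif isinstance(data, dict):
        entries = list(data.values())
    else:
        return None

    # Keep only well-formed entries so a missing field cannot raise later.
    candidates = [e for e in entries if isinstance(e, dict) and "url" in e]
    if not candidates:
        return None
    return max(candidates, key=_width)["url"]
```

Checking the shape before indexing, and returning `None` for anything unexpected, avoids both failure modes seen in the logs: indexing an array with a string key, and passing a null collection on to a later lookup.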
@thomas694
Contributor

Please check the new version.

@thomas694
Contributor

General note:
Depending on your blogs and whether you use the image size "best", it may be worth crawling the blogs that contain larger images once more with the "force rescan" option, because previously not all posts were downloaded properly.

@wacher74
Author

Thank you very much, it is working again.
