[Feature Request] Download theme/css, mirror website look, create browsable content #296

ZaCloud · 2018-12-04T08:52:55Z

Hello. Is this how the program is supposed to function: It only downloads media (sound clips, images, etc), and not the posts themselves? In the directory I chose for it to download to, all there is is media (and it seems to cut off, not including anything under a "Read More"), and no way to view the actual posts (.html or any such files). Unless there's supposed to be a way to open .tumblr files? Am I doing things wrong or is this by design?

If this IS by design, then consider the ability to open the blog itself (even without stylesheets), and the ability to download content under a "Read More" link, as a feature suggestion. But if not, then sorry I'm a noob, lol. Thanks.

johanneszab · 2018-12-04T19:28:08Z

Turn on the Download *** meta and/or Download *** post-options in the details pane. You might want to change the metadata format to json if you want to parse it further.

If that's still not enough, enable the dump crawler data-option.

Everything under the Details pane (on the right side of the application, after selecting a blog).

MrEldritch · 2018-12-05T23:06:37Z

Hmm... given just how much data is in that "dump crawler data" dump, I'm wondering how difficult it would be to put together a bare-bones viewable-as-blog skin like tumblr-utils does. The JSON from the crawler dump contains the html for each individual post, so it should "just"* be a matter of stringing them together, swapping out the Tumblr image URLs for the locally-downloaded image files, and applying some default backup CSS.
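As a rough illustration of those three steps (this is not code from TumblThree or tumblr-utils), here is a minimal Python sketch. The post keys `id` and `body_html`, the `media_dir` parameter, and the assumption that each downloaded image keeps the last path segment of its original URL as its file name are all hypothetical; the real crawler dump may be laid out differently.

```python
import html
import re
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Hypothetical fallback stylesheet; any minimal CSS would do.
DEFAULT_CSS = (
    "body { max-width: 40em; margin: auto; font-family: sans-serif; } "
    "img { max-width: 100%; } article { border-bottom: 1px solid #ccc; }"
)

# Matches the src attribute of an <img> tag (double-quoted values only).
IMG_SRC = re.compile(r'(<img\b[^>]*?\bsrc=")([^"]+)(")', re.IGNORECASE)


def localize_images(post_html, media_dir="."):
    """Point each <img src> at a same-named file inside media_dir.

    Assumption: the local file name equals the last segment of the remote URL.
    """
    def swap(match):
        filename = PurePosixPath(urlparse(match.group(2)).path).name
        if not filename:
            return match.group(0)  # nothing to map to; leave the tag untouched
        return f"{match.group(1)}{media_dir}/{filename}{match.group(3)}"

    return IMG_SRC.sub(swap, post_html)


def build_page(posts, title="Blog backup", media_dir="."):
    """String the posts together into one HTML page with default CSS."""
    articles = "\n".join(
        '<article id="post-{}">\n{}\n</article>'.format(
            html.escape(str(post["id"]), quote=True),
            localize_images(post["body_html"], media_dir),
        )
        for post in posts
    )
    return (
        "<!DOCTYPE html>\n"
        '<html><head><meta charset="utf-8">'
        f"<title>{html.escape(title)}</title>"
        f"<style>{DEFAULT_CSS}</style></head>\n"
        f"<body>\n{articles}\n</body></html>\n"
    )
```

A regular expression is only good enough for a sketch like this; a real implementation would probably want a proper HTML parser and a lookup of which files were actually downloaded.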

ZaCloud · 2018-12-05T23:45:04Z

> Turn on the Download *** meta and/or Download *** post-options in the details pane. You might want to change the metadata format to json if you want to parse it further.
>
> If that's still not enough, enable the dump crawler data-option.
>
> Everything under the Details pane (on the right side of the application, after selecting a blog).

Thanks, but it still didn't work. The only change from turning on the 'meta' options was the addition of .txt files containing copy/pastes of the text portions of posts, questions/answers, url text, etc., each in their own respective .txt files. There's still no way to open the posts themselves, including the images in context. No .html/.xml/.pdf or any such files that reconstruct the posts with the media. Just pictures and .txt files. Changing to JSON format did nothing new either.

MrEldritch · 2018-12-06T00:50:54Z

ZaCloud, if you turn on "Dump Crawler Data", then each post will also be saved as its own .json, which carries a very large amount of metadata - including the HTML for the post!
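For anyone wondering what to do with those files, a minimal Python sketch of collecting the post HTML from a folder of per-post dumps might look like the following. The key name `body` is purely an assumption for illustration (I have not checked which field of the dump actually holds the HTML), so it is a parameter you would need to adjust.

```python
import json
from pathlib import Path


def load_post_html(dump_dir, html_key="body"):
    """Return {file stem: post HTML} for every readable .json file in dump_dir.

    html_key is a hypothetical field name; pass whichever key holds the
    HTML in the actual crawler dump.
    """
    posts = {}
    for path in sorted(Path(dump_dir).glob("*.json")):
        try:
            data = json.loads(path.read_text(encoding="utf-8"))
        except (OSError, json.JSONDecodeError):
            continue  # skip unreadable or malformed dumps
        if isinstance(data, dict) and isinstance(data.get(html_key), str):
            posts[path.stem] = data[html_key]
    return posts
```

The resulting dictionary could then be fed into whatever step stitches the posts into a viewable page.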

ZaCloud · 2018-12-06T03:12:35Z

Well, I don't know what I'm supposed to do with a huge pile of individual .json files full of code. This still doesn't get me any closer to having a simplified replication of opening a tumblr blog in an easily readable format. :/

MrEldritch · 2018-12-06T05:45:38Z

Oh, sorry, I misunderstood. ZaCloud, currently TumblThree doesn't have that functionality. tumblr-utils, however, can do pretty much exactly what you're asking (although it's got its own shortcomings).

johanneszab · 2018-12-06T06:45:48Z

> Hello. Is this how the program is supposed to function: It only downloads media (sound clips, images, etc), and not the posts themselves?

First: It obviously does download the actual posts. As you say yourself, in text or json format. It's just not in the format you want.

I was never interested in mirroring the exact tumblr website structure, nor the theme. I simply didn't see the gain in opening the posts in this bloated, heavy JavaScript site.

There probably already is an open issue/request for mirroring the theme/css/website. Since no one was interested in implementing it, it's not here. But as @MrEldritch said, most of the necessary parts are already filtered and somewhere in the code. Someone (still) has to implement the theme/css grabbing and path redirecting parts.

ZaCloud · 2018-12-06T14:53:33Z

Ahh, I see. Well, thank you everyone. @johanneszab, as @MrEldritch pointed out, tumblr-utils does indeed make the blogs viewable, with a slim, simplified format that doesn't emulate the bloated themes, and that's honestly fine. While I was able to get tumblr-utils to do what I wanted, I'm sure many are a bit intimidated by the thought of using command lines, so it'd still be nice to see this utility have a similar capability.

The remaining problem now is that media under a "Read More" seems to still not be downloaded, and I'm sure that'd also be a feature of interest for most.

MrEldritch · 2018-12-06T16:29:22Z

Yeah; I concur - I would also be extremely interested in a way to download your blog in a blog format, but I do not care about theme/css mirroring - the minimal, simplified form that tumblr-utils mirrors blogs in would be entirely sufficient. (In fact, given how many blogs are actually quite painful to read in their own theme, css mirroring might actually be a downside)

MrEldritch · 2018-12-08T00:37:26Z

Honestly, I'd be happy to help with this myself, but I don't know .NET / C#. I already spent a few hours trying to do the opposite - figure out how TumblThree accessed hidden blogs and see if I could replace tumblr-utils' crawler with it, because tumblr-utils is Python and I do know that - but I just couldn't quite get the trick with the cookies to work. And it's clear that TumblThree is just a much more powerful crawler, in general, than tumblr-utils; the trick is just in the final reprocessing step.

Still .... tumblr-utils is only a thousand lines of Python, and much of that is for the scraping and json processing that TumblThree already does. The actual meat might not be that complicated, maybe simple enough that I could actually figure out how to do it and quickly learn enough .NET to integrate it.

santa-man · 2018-12-15T12:46:56Z

@MrEldritch @ZaCloud

I made a script that converts files downloaded by TumblThree into html files.

You can find the script here: tumblr_generate_html_files

It would be great to include this functionality in the existing program but for now this does the trick.

johanneszab · 2018-12-15T13:11:24Z

Thanks a lot, @santa-man!

I'll take a look at it after the holidays, and maybe together we can integrate this or something similar into TumblThree, in case you are interested and functionality like this is still needed. Well, maybe Tumblr will even recognize a few weeks from now that they made a mistake, or things aren't going to be as bad as people think...

johanneszab mentioned this issue Dec 4, 2018

Not downloading external links. #297

Closed

johanneszab changed the title ~~No Way To View Blog Posts?~~ [Feature Request] Download theme/css, mirror website Dec 6, 2018

johanneszab added enhancement help wanted labels Dec 7, 2018

johanneszab changed the title ~~[Feature Request] Download theme/css, mirror website~~ [Feature Request] Download theme/css, mirror website look, create browsable content Dec 7, 2018

This was referenced Dec 10, 2018

Is Mac currently supported? Are themes currently supported? #333

Closed

Turn on the "meta options" for texts of photos, videos a audio posts #347

Open
