Build system 0.0.1 #31

Merged
merged 11 commits into from
Feb 9, 2014

Conversation

michealbenedict
Contributor

Addresses Issue #8

[Dependencies: Pandoc and TeXLive|MacTex]
Quick heads up: I used gulp instead of grunt to play with it. I am willing to change back to grunt if required.

This PR adds a primitive build system to generate the book in PDF and HTML formats. Thanks to the use of pandoc, the book can be exported to various other formats by adding simple gulp tasks. I have also listed the requirements and constraints I assumed for my proposed solution. Refer to README.md for instructions on how to use the gulp tasks to build the book. Happy to iterate based on feedback.

Requirements

  1. Ability to add and set order for topics and chapters easily.
  2. Ability to export the book to various file types.

Constraints:

  1. Ability to add a new ad hoc directory structure for new chapters or topics. The build system must cater to these as well.

Proposed solution:

  1. Centralized file (chapters/toc.md) which lists the order of how the content is structured in the book (granularity: individual topics)

This could have been a simple JSON file as well, since the main purpose is to define how the content should be exported (as PDF, HTML, or any other file format), but I chose markdown to be consistent with the content. I also felt that having a toc.md would make it easy to generate a well-linked TOC portion of the book.
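To make the preprocessing step concrete, here is a minimal sketch of how a markdown TOC could be turned into an ordered list of files. It assumes a hypothetical layout for toc.md (a nested bullet list of markdown links, two spaces per nesting level); the actual chapters/toc.md in this PR may be structured differently, and `parseToc` and the sample paths are illustrative names only.

```javascript
// Hypothetical toc.md layout: a nested list of markdown links.
// The real chapters/toc.md in this PR may be structured differently.
const sampleToc = [
  '- [Introduction](introduction/README.md)',
  '- [Build Systems](build-systems/README.md)',
  '  - [Modern tools vs shell scripts](build-systems/modern-tools-vs-shell-scripts.md)',
].join('\n');

// Return entries in document order; depth is derived from indentation
// (assumed to be two spaces per level).
function parseToc(markdown) {
  const linkItem = /^(\s*)[-*+]\s+\[([^\]]+)\]\(([^)]+)\)\s*$/;
  const entries = [];
  for (const line of markdown.split(/\r?\n/)) {
    const match = linkItem.exec(line);
    if (!match) continue; // skip headings, blank lines, and plain text
    entries.push({
      depth: Math.floor(match[1].length / 2),
      title: match[2],
      file: match[3],
    });
  }
  return entries;
}

const orderedFiles = parseToc(sampleToc).map((entry) => entry.file);
console.log(orderedFiles);
```

The resulting array could then be handed to `gulp.src()` so that files are read in TOC order rather than filesystem order.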

I used pandoc since I felt that it addressed the need to export to various file types as required.

Pros
  • Easy to
  • Easy to add/remove chapters (and/or topics)
Cons
  • Can grow to be large (and could potentially be hard to maintain)
  • May limit flexibility in terms of how the book can be generated. For example, How can I generate a rich HTML site with custom design?

args: []
}))
.pipe(gulp.dest('./dist'));
});
Contributor

no need for pandoc here. just use gulp-markdown

Contributor Author

Yup, I was wondering whether I should go ahead with pandoc (since I am using it) or use gulp-markdown (heh, it's still there in the package.json I have included). My motivation was that since I am using pandoc to generate the PDF for now, I might as well use it to generate HTML from markdown, simply to keep things consistent.

Member

As per my later comment, we should think about how we're going to handle ePub etc.

@dashed
dashed commented Feb 2, 2014

On this point:

May limit flexibility in terms of how the book can be generated. For example, How can I generate a rich HTML site with custom design?

Pandoc has a css and template flag.

@dashed
dashed commented Feb 2, 2014

IMO, I think toc.md should be a JSON file, which makes preprocessing so much easier by slimming up gulpfile.js (i.e. no need for all that parsing).

If you need to, you can easily generate the markdown from the JSON as needed.

@sindresorhus
Contributor

May limit flexibility in terms of how the book can be generated. For example, How can I generate a rich HTML site with custom design?

gulp-markdown-pdf supports custom CSS: https://github.com/alanshaw/markdown-pdf#optionscsspath

IMO, I think toc.md should be a JSON file, which makes preprocessing so much easier by slimming up gulpfile.js (i.e. no need for all that parsing).

#31 (comment)

@michealbenedict
Contributor Author

Summarizing the discussion around pandoc and non-pandoc solution (npm modules) to help come to a consensus.

AFAIK, the formats we are looking to export this book are

  1. PDF
  2. HTML
  3. EPUB
  4. MOBI

Pandoc

@sindresorhus is concerned about speed and ease of installation.
My qualm with pandoc is that it requires TeXLive/MacTex to generate PDFs. Apart from that, pandoc is a very well supported tool that works in a predictable manner across all three platforms (Windows, OS X, and Linux).

Non-Pandoc solution

I am particularly concerned about long-term support if we opt for npm modules.
  • HTML (node-markdown): the project has a large community and is well supported
  • PDF (markdown-pdf): seems to be updated frequently
I was unable to find well-established projects for exporting to the EPUB and MOBI formats. Any recommendations?

I am leaning a bit towards pandoc.

@michealbenedict
Contributor Author

IMO, I think toc.md should be a JSON file, which makes preprocessing so much easier by slimming up gulpfile.js (i.e. no need for all that parsing).

If you need to, you can easily generate the markdown from the JSON as needed.

@dashed Do you have any thoughts around moving to semantic file naming as suggested in #21 and based off @sindresorhus #31 (comment)? I'll soon be submitting a patch for this.

@sindresorhus
Contributor

@rowoot I would prefer both: pandoc for EPUB/MOBI, and gulp-markdown-pdf/gulp-markdown for PDF/HTML. That way we can skip the TeX dependency while still being able to produce all the wanted formats.

@dashed
dashed commented Feb 3, 2014

@rowoot I think semantic file naming may be the way to go -- though I don't think it matters much when you consider something like the following for JSON:

[
  {
    "name": "Introduction",
    "file": "path/to/file"
  },
  {
    "name": "Build Systems",
    "toc": [
      {
        "name": "Modern tools vs shell scripts",
        "file": "build-systems/modern-tools-vs-shell-scripts.md"
      }
    ]
  }
]
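As a rough illustration of why this shape keeps the gulpfile small, the sketch below flattens the nested structure above into a build order with a depth-first walk. `flattenToc` is a hypothetical helper, not code from this PR, and `path/to/file` is the placeholder from the example above.

```javascript
// Same shape as the JSON above; "path/to/file" is the placeholder from the comment.
const toc = [
  { name: 'Introduction', file: 'path/to/file' },
  {
    name: 'Build Systems',
    toc: [
      {
        name: 'Modern tools vs shell scripts',
        file: 'build-systems/modern-tools-vs-shell-scripts.md',
      },
    ],
  },
];

// Depth-first walk: emit an entry's own file (if any), then its children.
function flattenToc(entries) {
  const files = [];
  for (const entry of entries) {
    if (entry.file) files.push(entry.file);
    if (Array.isArray(entry.toc)) files.push(...flattenToc(entry.toc));
  }
  return files;
}

const buildOrder = flattenToc(toc);
console.log(buildOrder);
```

No parsing is needed beyond `JSON.parse` (or `require`) of the file itself, which is the trade-off being weighed against the readability of a markdown list.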

If it's possible to have metadata in the markdown files, then you can just outsource the name property to the respective file. Then with this and the semantic file naming, you wouldn't need the toc.json anymore since you can just recursively traverse the directory tree, and inspect the metadata of each file.

So, something like https://gist.github.com/Dashed/8792825 would allow custom frontmatter in YAML for markdown files. All it does is extract frontmatter. You'll need another transform stream to parse them when building the toc.
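For readers who have not used frontmatter before, here is a minimal sketch of the extraction step. It is not the code from the gist; it is a simplified stand-in that only understands flat `key: value` pairs between `---` fences (real YAML would need a proper parser), and the `name` and `order` keys are assumptions chosen for illustration.

```javascript
// Split a markdown string into its frontmatter attributes and body.
// Only flat "key: value" pairs are handled; real YAML needs a proper parser.
function splitFrontmatter(source) {
  const match = /^---\r?\n([\s\S]*?)\r?\n---\r?\n?/.exec(source);
  if (!match) return { attributes: {}, body: source };

  const attributes = {};
  for (const line of match[1].split(/\r?\n/)) {
    const separator = line.indexOf(':');
    if (separator === -1) continue; // ignore lines that are not key/value pairs
    const key = line.slice(0, separator).trim();
    attributes[key] = line.slice(separator + 1).trim();
  }
  return { attributes, body: source.slice(match[0].length) };
}

// Hypothetical topic file with a "name" and an "order" attribute.
const sampleTopic = [
  '---',
  'name: Modern tools vs shell scripts',
  'order: 1',
  '---',
  '# Modern tools vs shell scripts',
  '',
].join('\n');

const parsedTopic = splitFrontmatter(sampleTopic);
console.log(parsedTopic.attributes.name);
```

A directory walk could call this on every markdown file and build the TOC from the collected attributes, which is what would make a separate toc.json unnecessary.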

@michealbenedict
Contributor Author

I tried both the approaches, and here is my feedback.

  • Semantic naming based approach
    • Naming file with numbers makes sorting simple (programmatically), but introduces pain for a user when adding a new topic or a chapter. This can lead to commits with bulk diffs :/
    • This approach would require custom frontmatter (as @dashed suggested) to store the metadata associated with a chapter or topic. This is somewhat good (despite the need for custom frontmatter) since files can be ad hoc and do not require an "entry" in something like toc.json.
  • TOC based approach (JSON)
    • File specific meta data can live in this one file (but is an extra entry by the user)
    • This can grow fairly large and can become hard to navigate compared to the current markdown-based TOC. I personally prefer the markdown-based TOC approach, which looks much cleaner despite the added overhead of preprocessing in gulp.
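To illustrate the numbered-file variant from the first bullet, here is a small sketch of the programmatic sort. The file names are hypothetical; the point is that the comparison must use the numeric prefix, because a plain lexicographic sort would place `10-` before `2-`.

```javascript
// Hypothetical file names; the numeric prefix encodes the intended order.
const topicFiles = ['10-deployment.md', '2-build-systems.md', '1-introduction.md'];

// Compare by the leading integer so that "10-" sorts after "2-".
function byNumericPrefix(a, b) {
  return parseInt(a, 10) - parseInt(b, 10);
}

// slice() keeps the original array untouched.
const sortedTopics = topicFiles.slice().sort(byNumericPrefix);
console.log(sortedTopics);
```

Inserting a new topic between `1-` and `2-` means renaming every later file, which is where the bulk diffs mentioned above come from.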

After re-evaluating, I am leaning towards the TOC based approach because of its simplicity (from the perspective of a user who is contributing to the book and the folks who would review the PR). Thoughts?

@dashed
dashed commented Feb 4, 2014

Just curious, what does your JSON object look like for the TOC based approach?

@dashed
dashed commented Feb 4, 2014

Looks like the JSON file can get gnarly really fast. What if you store it in YAML format? Then convert to JSON when importing it. I think either JSON or YAML would still be verbose.

Looking at the alternative solutions, in my opinion the markdown format (as a list) looks to be the way to go -- in terms of readability and maintainability. At this point, it comes down to a choice between maintaining a custom markdown parser and accepting the verbosity that the JSON approach requires.

I feel that having a table-of-contents metadata file should somehow be unnecessary -- but it wouldn't be a big deal if it's easy to maintain.

@michealbenedict
Contributor Author

What if you store it in YAML format?

Readability would still not match the markdown approach.

At this point, it comes down to a choice between maintaining a custom markdown parser and accepting the verbosity that the JSON approach requires.

My focus is to make the process of contributing content to the book super simple. Thus I am leaning towards using markdown (over JSON) for the TOC approach, even though it requires some minor preprocessing (parsing the markdown file) in gulp.

I feel that having a table-of-contents metadata file should somehow be unnecessary -- but it wouldn't be a big deal if it's easy to maintain.

Along with being a metadata file, it also helps in structuring the chapters/topics when exporting to different formats (no need for semantic naming).

@dashed
dashed commented Feb 4, 2014

My vote's on the markdown approach of TOC as is.

@michealbenedict
Contributor Author

Changelog:

  • Added EPUB gulptask which uses pandoc underneath (/cc @addyosmani)
  • Moved the markdown based TOC parser (decided to stick with this, refer to above discussion) to a separate gulp-plugin directory
  • Added build-assets folder (which contains the pdf.css to help generate pdf files)
  • Moved to using gulp-markdown and gulp-markdown-pdf instead of pandoc for generating HTML and PDF (/cc @sindresorhus)

Usage:

$ gulp
[gulp] Running 'default'...
[gulp] Available Tasks:
[gulp]   generate:html
[gulp]   generate:pdf
[gulp]   generate:epub
[gulp] Finished 'default' in 1.5 ms

@michealbenedict
Contributor Author

(ping) .. does this look good to merge?

@dashed
dashed commented Feb 8, 2014

I noticed there is no task for individual HTML pages. I only see a concatenated, conglomerate version.

@michealbenedict
Contributor Author

I did think about it, but descoped it for now until I can find better mechanisms (along with their pros and cons) for exporting the book as a site.

For now, removing the concat() call from the gulp task would export individual HTML pages.

@dashed
dashed commented Feb 8, 2014

Does the markdown-to-html plugin support templates?

@addyosmani
Member

This looks good to merge but will need a little rebase work. Once we're all green on this side I'd be happy for us to land this.

"gulp-util": "~2.2.12",
"gulp": "~3.5.0",
"gulp-markdown": "~0.1.2",
"through": "~2.3.4",
Contributor

through is bugged, use through2

Contributor Author

Moved to through2. Out of curiosity, why do you feel through is bugged? (gulp-concat uses through.)

Contributor

We used to use it with gulp and plugins, but it's a bit buggy (I don't remember the details) and it uses old streams. through2 is just a tiny abstraction over new streams. Look at its code. There's barely any.

Contributor Author

Sweet, thanks for that information! through2 looks good. I recently updated a gulp plugin (gulp-pandoc) with it.
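For context on what "new streams" means here, the sketch below shows an object-mode transform of the kind a gulp plugin implements. It deliberately uses Node's built-in `stream.Transform` rather than through2 so that it runs without dependencies; through2 is a thin wrapper around the same class. The `{ contents }` objects are simplified stand-ins for the file objects gulp passes through a pipeline, and `upperCaseContents` is a made-up example transform.

```javascript
const { Transform } = require('stream');

// Object-mode transform that upper-cases each chunk's `contents` field.
// The { contents } objects are stand-ins for the file objects gulp passes around.
function upperCaseContents() {
  return new Transform({
    objectMode: true,
    transform(file, encoding, callback) {
      // First callback argument is an error (none here); second is the output chunk.
      callback(null, { ...file, contents: file.contents.toUpperCase() });
    },
  });
}

const collected = [];
const stream = upperCaseContents();
stream.on('data', (file) => collected.push(file.contents));
stream.on('end', () => console.log(collected));

stream.write({ contents: 'intro' });
stream.end({ contents: 'build systems' });
```

With through2, the body of `transform` would be passed to `through2.obj(...)` instead of being wrapped in a `Transform` options object.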

@michealbenedict
Contributor Author

@dashed What do you mean by templates? The gulp markdown task uses the marked library (https://github.com/chjj/marked), if this helps.

Changelog:

  • Moved to using through2 over through (/cc @sindresorhus)
  • Merge with master (includes latest docs for brunch, grunt, gulp). Also removed some unwanted files in the process. Refer to commit 4634836

A draft PDF version of the book (currently not styled), https://dl.dropboxusercontent.com/u/32194349/book.pdf

addyosmani added a commit that referenced this pull request Feb 9, 2014
@addyosmani addyosmani merged commit cac05e7 into tooling:master Feb 9, 2014
@addyosmani
Member

Thanks for all your work on this @rowoot. Landed!

@michealbenedict
Contributor Author

Thanks @addyosmani @sindresorhus and @dashed for all the feedback! Just FYI, this PR addresses #21 and #8 (feel free to close the issues or let me know if there is something missing).

@michealbenedict michealbenedict deleted the feautre-gulp-pandoc branch February 24, 2014 17:54
4 participants