Generalize to other writers (LatexWriter) #18

Closed
goerz opened this issue Jul 14, 2023 · 6 comments · Fixed by #46
Labels
enhancement (New feature or request), help wanted (Extra attention is needed)

Comments

goerz (Member) commented Jul 14, 2023

Transferred from JuliaQuantumControl/QuantumCitations.jl#7

It might be better not to generate raw HTML for the bibliography. See the comments at JuliaDocs/Documenter.jl#1162 (comment)

goerz added the enhancement (New feature or request) label Jul 14, 2023
goerz added the help wanted (Extra attention is needed) label Sep 22, 2023
goerz changed the title from "Generalize to other writers" to "Generalize to other writers (LatexWriter)" Oct 9, 2023

goerz (Member, Author) commented Oct 9, 2023

There are two potential routes to implementing this (or citation rendering in general):

  1. Let DocumenterCitations render citation links and bibliographies into a general AST that then any writer can convert into the correct output format, or
  2. Let DocumenterCitations translate citation links and bibliographies into format-specific raw nodes

Approach 2 is probably easier, but I'll detail the advantages and disadvantages of both strategies below.

Currently we only support the HTML writer, but the strategy is a mix of the two approaches:

  • citation links ([key](@cite) and its variants) are translated into standard markdown links (MarkdownAST.Link elements)
  • bibliographies (@bibliography blocks) are rendered into a raw HTML node

So in any case, we'll have to change the internals to generalize to LatexWriter.

Approach 1 (universal AST)

Fundamentally, this does all the rendering within DocumenterCitations. To the LatexWriter, the citations/bibliographies are just text, with low-level hyperlinks. Specifically, LaTeX will not invoke bibtex to render bibliographies.

We'll not have to change anything related to format_citation, but we will have to change how expand_bibliography works: instead of generating a raw HTML node, it'll have to generate something higher level that describes the bibliography semantically in the MarkdownAST, and then methods both for HTMLWriter and LatexWriter will have to be implemented that translate that semantic AST to the final output format.

This approach is advocated for in the old JuliaDocs/Documenter.jl#1162 (comment). As pointed out there, there will probably have to be a BibliographyNode that works very similarly to Documenter.DocsNode.
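
A minimal sketch of this idea, assuming a hypothetical BibliographyNode and stand-in writer types. None of the names below are the actual Documenter, MarkdownAST, or DocumenterCitations API, the entry is a made-up placeholder, and escaping of special characters is left out:

```julia
# Hypothetical sketch: all names here are illustrative only.

# Semantic description of one bibliography entry.
struct BibEntry
    key::String        # BibTeX key, used as the link anchor
    label::String      # citation label, e.g. "[1]"
    reference::String  # formatted reference as plain text
end

# Format-independent node that the expander stage would place in the AST.
struct BibliographyNode
    entries::Vector{BibEntry}
end

# Stand-ins for the two output formats.
abstract type Writer end
struct HTMLOut <: Writer end
struct LaTeXOut <: Writer end

# Each writer translates the same node into its own format at the writing stage.
function render(::HTMLOut, node::BibliographyNode)
    items = ["<dt id=\"$(e.key)\">$(e.label)</dt><dd>$(e.reference)</dd>" for e in node.entries]
    return "<dl>" * join(items) * "</dl>"
end

function render(::LaTeXOut, node::BibliographyNode)
    # Plain text plus a hyperref anchor; bibtex is never invoked.
    items = ["\\hypertarget{$(e.key)}{$(e.label)} $(e.reference)" for e in node.entries]
    return join(items, "\n\n")
end

# Placeholder entry, not a real reference.
node = BibliographyNode([BibEntry("Doe2020", "[1]", "J. Doe, Example Journal 1, 1 (2020).")])
```

Calling `render(HTMLOut(), node)` and `render(LaTeXOut(), node)` on the same node gives the two outputs, so a third-party writer would only need to add one more `render` method.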

Advantages

  • Support for the full set of DocumenterCitations features (e.g., multiple, non-canonical bibliographies at arbitrary locations)
  • Identical rendering of citations in HTML and LaTeX/PDF
  • Very clean and future-proof: It would be easier for custom writers like DocumenterMarkdown to add support for bibliographies at some point in the future.
  • Requires no breaking changes to the API (see "Render citations links as arbitrary markdown AST" #43 (comment))

Disadvantages

  • Unclear if it needs changes in Documenter and/or MarkdownAST to define some kind of BibliographyNode. The Documenter.DocsNode is defined entirely in Documenter, so presumably BibliographyNode could be entirely in DocumenterCitations, just extending MarkdownAST.
  • Definitely more complicated to implement
  • On some level it feels silly to have implemented a citation system modeled after LaTeX to then just throw LaTeX's citation support away when actually rendering to LaTeX. But that's not too much of an argument: the LaTeX code that LatexWriter emits isn't exactly "clean" anyway, and we really only care about the PDF that results.

So, "more work to implement" is probably the only real disadvantage.

Approach 2 (raw nodes, exploit bibtex)

Advantages

  • Easier to implement

Disadvantages

  • Requires a "breaking" change to the internals: expand_bibliography and format_citation will need to know the output format
  • It's not entirely clear how to have inline raw LaTeX (\cite). May need some kind of workaround wrapping it in either a MarkdownAST.InlineMath or MarkdownAST.Code node.
  • Citations might be rendered slightly differently from the HTML: we'll be limited to available LaTeX citation styles (.bst files). Luckily, I've modeled the rendering after LaTeX (REVTeX, specifically), so that might not be that much of a problem.
  • There are some limitations on the .bib file to make it work well with DocumenterCitations (unicode, capitalization). Having the same .bib file work well with DocumenterCitations and bibtex should be possible (and is a design goal), but might require more work for the author of the .bib file
  • Unclear how multiple non-canonical bibliography blocks (at arbitrary inline locations!) can be supported. See https://www.overleaf.com/learn/latex/Questions/Creating_multiple_bibliographies_in_the_same_document, but I'm not sure any of those packages gives the exact set of features that we would need to fully support everything in DocumenterCitations. This is the most likely to be a dealbreaker.
  • The documentation of DocumenterCitations itself uses some very internal undocumented tricks to have multiple bibliography blocks with different styles in the same document (for the Gallery). There's absolutely no way that could ever be converted into LaTeX with anything other than Approach 1.
  • Unclear whether to use bibtex or the biblatex package (I have no personal experience with biblatex)
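
To make the first disadvantage concrete, here is a rough sketch of what a format-aware expander could look like. The RawBlock type, the function name, and the `bibfile` default are all hypothetical and do not correspond to the real expand_bibliography:

```julia
# Hypothetical sketch: these names are illustrative, not the DocumenterCitations API.

# A raw node carries already-rendered text that only one output format understands.
struct RawBlock
    format::Symbol  # :html or :latex
    text::String
end

# The expander has to be told the target format up front.
function expand_bibliography_raw(format::Symbol, citekeys::Vector{String}; bibfile="refs")
    if format == :html
        items = join("<li id=\"$k\">$k</li>" for k in citekeys)
        return RawBlock(:html, "<ul>" * items * "</ul>")
    elseif format == :latex
        # bibtex lists whatever was cited via \cite or \nocite; `citekeys` is ignored here
        return RawBlock(:latex, "\\bibliography{$bibfile}")
    else
        throw(ArgumentError("unsupported format: $format"))
    end
end
```

In the `:latex` branch the list of keys cannot be passed along, which is one way to see why bibliography blocks with a custom selection of entries are hard to support with this approach.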

So it seems to me like Approach 1 is much cleaner and safer. The more I think about it, the less likely it seems that Approach 2 can support all the features of DocumenterCitations. That might be a dealbreaker for Approach 2. But certainly, Approach 1 is a non-trivial project (and probably not one I would take on myself in the near future).

goerz (Member, Author) commented Oct 9, 2023

From @Seelengrab on Slack:

> Wild idea: is it enough to make CitationLink be a markdown element, have it expand appropriately based on HTMLWriter.html/LaTeXWriter.latex and then let multiple dispatch just handle it?

That might still be a good idea and would fit in with either approach!

goerz (Member, Author) commented Oct 9, 2023

That said, it would push a lot of the logic from the expander stage into the writer stage, and I'm not sure how well it fits with what I just did in #43, so I'll have to think about that more.

goerz (Member, Author) commented Oct 10, 2023

I've looked a bit more into Approach 1, and it turns out to be a lot more straightforward to implement than I first thought. So, none of the drawbacks actually turned out to be real, while all the advantages materialized. In fact, having the BibliographyNode as an intermediate element in the markdown AST, which is then translated to HTML only at the writing stage, is a much cleaner design even for the original HTML-only use case.

So at this point, Approach 1 is absolutely the way to go. I have a working prototype, which I'll clean up and push after #43 is merged.

Seelengrab (Contributor) commented Oct 10, 2023

> That might still be a good idea and would fit in with either approach!

Right; any third party writer plugin would have to overload their write_output function (html/latex in HTMLWriter/LaTeXWriter) based on the CitationLink node.

Another advantage is that the default Documenter pipeline already uses latexmk, which handles bibtex references by default:

[sukera@tower ~]$ latexmk --help
Latexmk 4.79: Automatic LaTeX document generation routine

Usage: latexmk [latexmk_options] [filename ...]

  Latexmk_options:
   -aux-directory=dir or -auxdir=dir 
                 - set name of directory for auxiliary files (aux, log)
                 - See also the -emulate-aux-dir option
   -bibtex       - use bibtex when needed (default)
   -bibtex-      - never use bibtex
   -bibtex-cond  - use bibtex when needed, but only if the bib file exists
   -bibtex-cond1 - use bibtex when needed, but only if the bib file exists;
                   on cleanup delete bbl file only if bib file exists
[...]

So from my naive (hah, I should learn to not be naive about difficulties with LaTeX 😂) POV, this should "just work" if the @bibliography node "just" outputs a \bibliography LaTeX element. Since the node is currently an empty block like

```@bibliography
```

it could even support multiple bibliographies by specifying their paths in that block.
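
As a rough illustration of that idea, the following sketch turns a list of .bib paths into the corresponding LaTeX commands. The helper name and the default style `unsrt` are assumptions made for the example, not part of Documenter or DocumenterCitations:

```julia
# Hypothetical helper: turn the .bib paths listed in a @bibliography block into LaTeX.
function latex_bibliography(paths::Vector{String}; style="unsrt")
    # \bibliography takes a comma-separated list of database names without the .bib extension
    names = [first(splitext(p)) for p in paths]
    return "\\bibliographystyle{$style}\n\\bibliography{" * join(names, ",") * "}"
end

print(latex_bibliography(["refs.bib", "extra/more.bib"]))
```

Note that with plain bibtex a single `\bibliography` command merges all listed databases into one reference list, so separate bibliographies at different locations would likely still need an additional package.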

Seelengrab (Contributor) commented

> So at this point, Approach 1 is absolutely the way to go. I have a working prototype, which I'll clean up and push after #43 is merged.

Awesome! You're a lifesaver!
