CSL support #2082
2024/07/14 "Stage 0" milestone: Successfully processed 1355 references
2024/07/20 "Stage 1" milestone: Successfully processed 1508 references,
Leaving for vacation soon, so here are some progress notes to myself, so that I remember:
That's a minimal set. A few features from the CSL spec would still be missing, but at least all Chicago styles would be covered fairly decently, and a first milestone would be passed.
Slowly back on track.
I hate names with particles, definitely. 🤣 -- Doh, it was hard for my tired brain. One checkbox ticked.
That is, you can of course "Move CSL support module unter bibtex package", though I don't know what CSL and bibtex have in common :D
Just that CSL is only used in connection with bibliographies, and the only (poorly named) implementation of that we have is our bibtex package. Keeping all the related utilities together makes it much easier to package and maintain. If/when we do have other packages we can consider whether abstracting the CSL stuff under more general utilities makes sense. Putting anything in the root of this project comes with some caveats for packaging and distribution, and I'd rather not deal with that unless I really understand why that namespacing choice is warranted.
Sure, but it also isn't that hard. I'm still seriously considering it for this module. I looked through the sources and it seems like the only real issue is the use of SILE's |
I understand.
Note that these files are licensed under CC-BY-SA 3.0 and are only included as a default minimal set for testing.
This should even be the default when generating a bibliography.
After citeproc-java, let's also check citeproc-lua.
Honor the page-range-delimiter from the locale.
We can use the bibtex.style setting to help switch implementations. We can also ensure printbibliography works with legacy citations. This will make deprecations and the transition easier.
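For the `page-range-delimiter` item above, here is a minimal sketch of the idea in Python (not the Lua code from this PR). The `locale_terms` dictionary and both function names are hypothetical. The only things assumed from CSL are that a locale can provide a `page-range-delimiter` term and that an en dash is the usual default.

```python
import re

EN_DASH = "\u2013"

# One or more hyphens or en dashes sitting between two digits,
# with optional surrounding spaces (e.g. "12--34", "12 - 34").
_RANGE_SEP = re.compile(r"(?<=\d)\s*[-" + EN_DASH + r"]+\s*(?=\d)")


def page_range_delimiter(locale_terms):
    """Return the locale's delimiter, falling back to an en dash.

    `locale_terms` is a hypothetical flat dict of locale terms.
    """
    return locale_terms.get("page-range-delimiter", EN_DASH)


def format_page_range(pages, locale_terms):
    """Normalize the separator of a page range to the locale delimiter."""
    delimiter = page_range_delimiter(locale_terms)
    # A callable replacement avoids backslash interpretation in the delimiter.
    return _RANGE_SEP.sub(lambda _match: delimiter, pages.strip())
```

Calling `format_page_range("123--145", {})` normalizes the BibTeX-style double hyphen to the default delimiter, while a value such as `"A-12"` is left alone because the hyphen is not between two digits.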
I think I'm going to go ahead with this namespacing (under bibtex) and we can refactor from there. I'd still consider extracting this to an external library, but to do that the interaction between SILE and the CSL engine interface should be much clearer, and potentially interchangeable with another implementation (e.g. a Rust one). A couple hundred kB of code we're still refactoring can live here until such a time as we have a clearly better plan, and also clearly moving ahead with this is better than what we have now. When we get the BCP-47 locale stuff straightened out the actual bundled styles can probably move to the language module, but that should be a non-breaking change later.
I understand the idea of potentially using a Rust implementation in the future, but I’d like to clarify whether it’s something that’s actively being pursued or just a long-term consideration?
I'm not actively working on this one (CSL), but the tooling is already in place for Rust, so when you look around to see if there are existing implementations we can leverage, don't just look at Lua. I haven't looked for CSL engines yet, but before doing more work on any topic, do check crates.io or other Rust sources too. If there are existing libraries that implement some function or type in either language then we can leverage them in SILE. Wrapping an existing Rust library in Lua bindings so we can expose it to our users is fairly trivial. We do want everything accessible from the Lua side (since user tinkering with internals is one of the standout features of SILE) but we have pretty robust bridging at our fingertips now.
I have - there's Typst Hayagriva. (tongue-in-cheek on) Typst has a huge active community, it also has a fairly good indexer, and plenty of advanced modules. (tongue-in-cheek off) Last time I checked this summer, entry sorting was not language-dependent ... checking ... yep, the issue is still open. Well they don't have ICU yet under the hood.
I hear you on the hype train. Don't get me wrong, Typst has done some things right and well. It's fast and does some things very well. But I was quite surprised at the amount of attention it got before it supported basics like footnotes. Even now it doesn't support flexible vertical spaces. Supporting that alone is the cause of much of our nasty pushback problems and a huge chunk of our time, but it's widely used in publishing for good reason. Anyhow I am more convinced than ever that there is room for more than one approach here. Typst is doubling down on being a sandboxed environment with no tinkering (unless you build your own) and only its own dedicated syntax, and SILE allows you access to call out to anything external and tinker with anything internal and process whatever input formats you want. Anyhooooo, yes Hayagriva also looks like it took the approach of managing all its own data in its own YAML format instead of existing bibliography data formats, so I'd view it more as an alternative backend we could support rather than a pathway to our primary support.
There is also https://github.com/zotero/citeproc-rs. It implements the complete feature set of the CSL spec, while Hayagriva only implements < 70% of it (I guess).
@zepinglee Indeed, thanks for pointing it out! Seeing the online demo again, I'm pretty sure I looked at it too this summer, albeit very briefly, and had some concerns.
Closes #2074
It already does nice things (see screenshots in the referenced issue).
In order to support CSL (Citation Style Language), we need to:
Regarding the conversion of BibTeX entries, the mappings are not straightforward, but there is some prior art that we can check... None of the implementations I checked did the exact same things, so it's likely a bit messy...
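To illustrate why the mapping gets messy, here is a rough Python sketch of a BibTeX-to-CSL conversion. It is not the mapping used in this PR: the tables are a small illustrative subset, the function name `bibtex_to_csl` and the input dict layout are hypothetical, and choices such as the fallback type are assumptions.

```python
# Illustrative subset only; real converters disagree on several of these.
TYPE_MAP = {
    "article": "article-journal",
    "book": "book",
    "inbook": "chapter",
    "incollection": "chapter",
    "inproceedings": "paper-conference",
    "phdthesis": "thesis",
    "mastersthesis": "thesis",
    "techreport": "report",
}

FIELD_MAP = {
    "title": "title",
    "publisher": "publisher",
    "address": "publisher-place",
    "volume": "volume",
    "pages": "page",
    "doi": "DOI",
    "url": "URL",
}

# Several BibTeX fields collapse into one CSL variable.
CONTAINER_FIELDS = ("journal", "booktitle")


def bibtex_to_csl(entry):
    """Convert a parsed BibTeX entry (hypothetical flat dict) to CSL-JSON."""
    bib_type = entry.get("type", "").lower()
    # Assumption: unknown types fall back to the generic "document" type.
    csl = {"type": TYPE_MAP.get(bib_type, "document")}
    for bib_field, csl_field in FIELD_MAP.items():
        if bib_field in entry:
            csl[csl_field] = entry[bib_field]
    for bib_field in CONTAINER_FIELDS:
        if bib_field in entry:
            csl["container-title"] = entry[bib_field]
            break
    # Context-dependent case: "number" is an issue for articles,
    # but a report number elsewhere.
    if "number" in entry:
        key = "issue" if bib_type == "article" else "number"
        csl[key] = entry["number"]
    if str(entry.get("year", "")).isdigit():
        csl["issued"] = {"date-parts": [[int(entry["year"])]]}
    return csl
```

Even in this small sketch, `number` needs the entry type to decide its target variable, which is the kind of case where existing converters tend to differ.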
Regarding the CSL engine, there are various existing implementations.
Yet, I had a look at them, and I am not really convinced by their code quality, so I went ahead and implemented the CSL 1.0.2 specification from scratch. Because it's fun, and SILE has the guts to do it. And because I think I can.
Additionally, this would also close several other items.
Closes #2024 = The CSL locales take care of it.
Closes #2022 = The CSL styles have appropriate fallbacks (substitutes, conditionals, etc.)
Closes #2027 = The CSL styles and locales define how to format localized dates in the selected citation or bibliography style.
Closes #2026 = Some CSL styles sort entries by citation order ("citation-number"), so keeping track of cited entries was needed anyhow.
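For the last item, keeping track of cited entries could look roughly like the Python sketch below. It is only an illustration of the idea, not the code in this PR: the `CitationTracker` class, its method names, and the entry dicts with an `id` key are all hypothetical.

```python
class CitationTracker:
    """Assign each entry a number in order of first citation."""

    def __init__(self):
        self._numbers = {}  # entry id -> citation number (1-based)

    def cite(self, key):
        """Record a citation and return its citation number."""
        if key not in self._numbers:
            self._numbers[key] = len(self._numbers) + 1
        return self._numbers[key]

    def sorted_by_citation_number(self, entries):
        """Return only the cited entries, ordered by first citation."""
        cited = [e for e in entries if e["id"] in self._numbers]
        return sorted(cited, key=lambda e: self._numbers[e["id"]])


tracker = CitationTracker()
for key in ["knuth1984", "lamport1994", "knuth1984"]:
    tracker.cite(key)

entries = [{"id": "lamport1994"}, {"id": "uncited2000"}, {"id": "knuth1984"}]
ordered = tracker.sorted_by_citation_number(entries)
```

Citing the same key twice keeps its original number, so a numeric style can reuse it for repeated citations, and uncited entries drop out of the generated bibliography.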