Snuggletex Integration #646
…ult error messages
…es the jabref properties for translations and excludes dom building errors
…to pass the error arguments to our own localization
…hanges to make the linter happy
Co-authored-by: Christoph <siedlerkiller@gmail.com>
In JabRef's localization, we keep the English keys equal to the English full text. This way, the code stays readable. Details at https://devdocs.jabref.org/code-howtos/localization.html. Regarding the SnuggleTeX errors, it would be OK to show "LaTeX cannot be parsed" and have the detail message in English only. These are all very technical terms, and I think no JabRef translator should be bothered to translate them.
To work around the method-too-large exception, I am trying to include our custom JDK build.
Code looks pretty readable.
Please add test cases - and then this should be good to go for a review at the jabref repo
src/main/java/org/jabref/logic/integrity/LatexIntegrityChecker.java
Thank you for having worked on this. Sorry for the late reply. I hope you will continue on this!
\text appears in math mode only, not outside math mode. Maybe it is available inside math mode (e.g.,
Which concrete case did you have?
Please unzip what you mean. Ah, I understand: It does not check for ampersands, wrong bibtex string concatenation, and percent signs. For BibTeX string concatenation, we already discussed. Difficult thing as this is JabRef custom logic, too.
Could be a fun exercise? With the possible outcome that the code is harder to read than our existing code?
Suggestion: Finish this PR with the current functionality. Then create a new branch (based on this branch) to rewrite the other checks. In this way, you can work in parallel. If the updated code turns out to be good, we can continue working forward to include it. If it turns out it is not that maintainable, the issue JabRef#8712 is still fixed.
Okay, just to make sure I understand you correctly, a message would for example look like this:
And we would only translate this part: So having Consequently we would only need one entry inside the properties file: and "feed it" the internal error messages.
Sounds great, I am looking forward to trying it out :^)
No worries, I was very busy anyway. I would be happy to continue :^) .
It really is not handled. I looked through the library with text search and tried it out. \text is an undefined command according to snuggletex.
Exactly, could have made this clearer, sorry for the confusion.
Just guessing, I'd say it won't make much of a difference. Perhaps we should just be happy with what we already have. I will do a case study regardless, so we can see the possible outcome before deciding.
Sounds good to me, I'll do it. :)
Yes. We should move forward in that way. Later, we should collect (somehow) the returned errors and translate them. We have telemetry infrastructure for that, but it is currently not working. It needs some updates...
The %0 needs to be on the left side, too. Key and value really need to be the same 😅
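To make the key/value rule concrete, here is a minimal sketch of the wrapping idea. It is not JabRef's actual localization code: the class LocalizationSketch, its lang method, the in-memory map standing in for the English properties file, and the detail text are all assumptions made for illustration.

```java
import java.util.Map;

final class LocalizationSketch {
    // Stand-in for the English properties file: the key is identical to the
    // value, and both sides carry the %0 placeholder.
    static final Map<String, String> ENGLISH = Map.of(
            "LaTeX Parsing Error: %0", "LaTeX Parsing Error: %0");

    // Hypothetical lookup: fetch the template for the key, then substitute
    // %0, %1, ... with the given parameters (highest index first, so that
    // %1 does not clobber %10).
    static String lang(String key, String... params) {
        String template = ENGLISH.getOrDefault(key, key);
        for (int i = params.length - 1; i >= 0; i--) {
            template = template.replace("%" + i, params[i]);
        }
        return template;
    }

    public static void main(String[] args) {
        // The detail text is made up; in the PR it would come from the parser
        // and stay in English.
        System.out.println(lang("LaTeX Parsing Error: %0", "some parser detail"));
    }
}
```

Only the wrapper string would need a translation; the detail message is passed through untouched as the %0 argument.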
Yes!
Looking forward 🤩
Maybe worth a PR for SnuggleTex? 😅
👍
Localization now uses a string from JabRef's properties to wrap the internal error messages. Local fields have been made static members in order to improve performance with large bib files. We no longer instantiate an Engine and Session per bib entry.
Co-authored-by: Oliver Kopp <kopp.dev@gmail.com>
Do you mean to translate them automatically, via a service?
Slip of the pen. 😅
Errors are now prefixed with "LaTeX Parsing Error:". I found that quite appealing, but we can change that back to "LaTeX cannot be parsed" if you like.
I can try, but do you think it will be merged? I saw your PR in the repo is still dangling around unmerged. 😅
Tests will be ready over the next week. :)
Thanks, I just did some performance-oriented refactoring regardless, but that should not have too big of an effect on readability. Engine and Session are now static members; prior to that, we instantiated them per bib entry. (Keeping the references around only adds ~1 KB of memory overhead.)
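For readers who want to see the shape of that refactoring, here is a rough sketch of the "one shared session instead of one per entry" pattern. It does not use the SnuggleTeX API: StubSession, its reset and parse methods, and the toy brace check are hypothetical stand-ins, chosen only so the example runs without the library.

```java
import java.util.ArrayList;
import java.util.List;

final class SharedParserSketch {
    /** Hypothetical stand-in for an expensive-to-create parser session. */
    static final class StubSession {
        static int created = 0;                 // counts how many sessions were built
        private final List<String> errors = new ArrayList<>();

        StubSession() {
            created++;
        }

        // Clears errors left over from the previous field.
        void reset() {
            errors.clear();
        }

        // Toy check: reports a field whose braces do not balance.
        List<String> parse(String field) {
            int depth = 0;
            for (char c : field.toCharArray()) {
                if (c == '{') {
                    depth++;
                } else if (c == '}') {
                    depth--;
                }
                if (depth < 0) {
                    break;
                }
            }
            if (depth != 0) {
                errors.add("Unbalanced braces in: " + field);
            }
            return List.copyOf(errors);
        }
    }

    // Created once and reused for every bib entry.
    private static final StubSession SESSION = new StubSession();

    static List<String> check(String field) {
        SESSION.reset();                        // do not carry errors over between entries
        return SESSION.parse(field);
    }

    public static void main(String[] args) {
        System.out.println(check("{A} title"));
        System.out.println(check("{Broken title"));
        System.out.println("sessions created: " + StubSession.created);
    }
}
```

One thing to keep in mind with this pattern: a shared mutable static session is not safe to use from several threads at once, so the sketch assumes the checks run sequentially.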
No ^^. Here my line of thought:
The translation will be done as usual using
Reads good!
@davemckain is the original developer on snuggletex. My bet is that he is happy if several people contribute to his repository - https://github.com/davemckain/snuggletex. Note to self: The other "maintained" fork seems to be https://github.com/rototor/snuggletex. I would, however, like to stick to the "original" one ^^.
…therefore for its encapsulated SnuggleTex Parser). Further, slightly adjusted the LatexIntegrityChecker to expose a static errorMessageFormatHelper method to increase maintainability.
Test cases have been added in this commit. Some that I expected to work did not (due to snuggletex); these are commented out for now.
Well that was unfortunate. I accidentally closed the PR in this comment, sorry.
Oh, okay, now I get it - I did not know you had telemetry infrastructure for that kind of thing. Thanks for letting me know. I agree, narrowing down the selection of messages that need translation in advance is the better choice.
This should now be a PR to JabRef's main repo. I resolved the merge conflicts at 4fb512e. Should go as one commit in the upstream repo - if possible.
Thank you, I am happy to hear that. I am excited to see how it will do in the wild! :d
Submitted JabRef#10376 - therefore closing this.
Snuggletex
Snuggletex is a library from the University of Edinburgh for converting LaTeX to XML, but it can be used for LaTeX parsing as well. It is extendible, easy to use and powerful, all whilst having almost no external dependencies.
In the future, it could become our main LaTeX parser for integrity checks.
What it does do
Takeaways
Some commands may be missing; for example, I found \text{} to be absent. To check which commands are supported by default, refer here: CorePackageDefinitions.java
Thankfully, enough of the package is exposed to be able to inject new commands, like so for example:
engine.getPackages().get(0).addComplexCommandOneArg("text", false, ALL_MODES, LR, StyleDeclarationInterpretation.NORMALSIZE, null, TextFlowContext.ALLOW_INLINE);
I have not checked if this is the correct way to represent the text command, but now it parses it correctly.
What it does not do
What we could do
Use the tokens provided by snuggletex to implement our own parser on top
Or
Keep our integrity checks for & and # and implement % like we used to
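For the second option, a hand-written check could look roughly like the sketch below. This is not the existing JabRef checker code: UnescapedCharSketch and countUnescaped are hypothetical names, and the only rule assumed here is that a character directly preceded by an unescaped backslash counts as escaped.

```java
final class UnescapedCharSketch {
    // Counts occurrences of target that are not escaped by a backslash.
    // A doubled backslash is treated as an escaped backslash, so the
    // character after it counts as unescaped.
    static int countUnescaped(String value, char target) {
        int count = 0;
        boolean escaped = false;
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (escaped) {
                escaped = false;   // this character is consumed by the preceding backslash
            } else if (c == '\\') {
                escaped = true;    // the next character is escaped
            } else if (c == target) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countUnescaped("Smith & Jones", '&'));
        System.out.println(countUnescaped("Smith \\& Jones", '&'));
        System.out.println(countUnescaped("100\\% of 5%", '%'));
    }
}
```

A caller would then decide what a non-zero count means per character, for example flagging any unescaped & or %, while treating # separately because of the BibTeX string concatenation logic.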
What I would like to do
I am really fascinated by this library: it is clean, well documented, thoughtfully built and extendible. I'd really like to do more with it. If you do not mind, I'd like to port our integrity checks to snuggletex, rather than writing them as we used to.
Mandatory checks
CHANGELOG.md described in a way that is understandable for the average user (if applicable)