Vendor the antlr4 runtime library (#487)
* Format the generated code at generation time

* Code to vendor the antlr runtime

* Vendor the antlr4 runtime library

* Remove the path from the generated code

* Move the antlr runtime outside of the parser directory, to avoid circular dependencies

* Use relative imports within the antlr4 runtime. This should be upstreamed

* Update the readme to explain more about ANTLR4 and the vendoring process

* Update cf_units/_udunits2_parser/README.md

Co-authored-by: Patrick Peglar <patrick.peglar@metoffice.gov.uk>

* Update cf_units/_udunits2_parser/README.md

Co-authored-by: Patrick Peglar <patrick.peglar@metoffice.gov.uk>

---------

Co-authored-by: Patrick Peglar <patrick.peglar@metoffice.gov.uk>
pelson and pp-mo authored Oct 30, 2024
1 parent 792ba1e commit 04a4350
Showing 71 changed files with 14,041 additions and 41 deletions.
2 changes: 2 additions & 0 deletions .gitattributes
@@ -1 +1,3 @@
.git_archival.txt export-subst
+cf_units/_udunits2_parser/parser/**/*.py linguist-generated=true
+cf_units/_udunits2_parser/_antlr4_runtime/**/*.py linguist-generated=true
23 changes: 19 additions & 4 deletions cf_units/_udunits2_parser/README.md
@@ -13,9 +13,10 @@ a number of convenient lexical elements.

Once the Jinja2 template has been expanded, the
[ANTLR Java library](https://github.com/antlr/antlr4) is used to
-compile the grammar into the targetted runtime language.
+compile the grammar into the targeted runtime language.

[A script](compile.py) is provided to automate this as much as possible.
+It has a dependency on pip, Jinja2, Java and ruff.

The compiled parser is committed to the repository for ease of
deployment and testing (we know it isn't ideal, but it really does make things easier).
@@ -24,9 +25,23 @@ changes to the grammar being proposed so that the two can remain in synch.

### Updating the ANTLR version

-The above script downloads a Java Jar which needs updating to the same version
-as antlr4-python3-runtime specified in the python requirements. Once these have
-been updated, run [the script](compile.py) to regenerate the parser.
+The [compile.py script](compile.py) copies the ANTLR4 runtime into the _antlr4_runtime
+directory, and this should be committed to the repository. This means that we do not
+have a runtime dependency on ANTLR4 (which was found to be challenging due to the
+fact that you need to pin to a specific version of the ANTLR4 runtime, and aligning
+this version with other libraries which also have an ANTLR4 dependency is impractical).
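
The copy step described above might look roughly like the following. This is a minimal sketch, not the code in compile.py: the function name `vendor_package` and its arguments are hypothetical, and it assumes the runtime is already available as a plain package directory on disk.

```python
import shutil
from pathlib import Path


def vendor_package(source_pkg: Path, target_dir: Path) -> list[Path]:
    """Copy a package tree into target_dir, replacing any earlier copy.

    Hypothetical helper: returns the vendored .py files, relative to
    target_dir, so that the caller can post-process them.
    """
    # Remove the previous vendored copy so that stale modules do not linger.
    if target_dir.exists():
        shutil.rmtree(target_dir)
    # Skip bytecode caches; only the source files belong in the repository.
    shutil.copytree(
        source_pkg,
        target_dir,
        ignore=shutil.ignore_patterns("__pycache__", "*.pyc"),
    )
    return sorted(p.relative_to(target_dir) for p in target_dir.rglob("*.py"))
```

Deleting the old copy first keeps the committed directory an exact mirror of the chosen runtime version, which matters when a later version removes or renames modules.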

+Since the generated code is committed to this repo, and the ANTLR4 runtime is also vendored into it, we won't ever need to run ANTLR4 unless the grammar changes.

+So, we will only change the ANTLR4 version if we need new features of the
+parser/lexer generators, or it becomes difficult to support the older version.

+Upgrading the ANTLR4 version is a simple matter of changing `ANTLR_VERSION` in the compile.py
+script, and then re-running it. This should re-generate the parser/lexer, and update
+the content in the _antlr4_runtime directory. One complexity may be that the imports
+of the ANTLR4 runtime need to be rewritten to support vendoring, and the code needed
+to do so may change from version to version. This topic is being followed upstream
+with the ANTLR4 project with the hope of making this easier and/or built-in to ANTLR4.
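
To illustrate the import rewriting mentioned above, here is a minimal sketch of turning absolute `antlr4` imports into relative ones. It is an assumption about the general approach, not the logic in compile.py: the name `relativise_imports` and the `depth` parameter are hypothetical, and only `from antlr4... import ...` statements are handled (a bare `import antlr4` is left untouched).

```python
import re

# Matches "from antlr4 import ..." and "from antlr4.a.b import ...",
# capturing the indentation and the dotted submodule path (if any).
_IMPORT_RE = re.compile(
    r"^([ \t]*)from[ \t]+antlr4((?:\.\w+)*)[ \t]+import[ \t]+",
    re.MULTILINE,
)


def relativise_imports(source: str, depth: int) -> str:
    """Rewrite absolute antlr4 imports as relative imports.

    depth is the number of sub-packages between the vendored package root
    and the module being rewritten (0 for a top-level module, 1 for a
    module inside a sub-package such as "error").
    """
    dots = "." * (depth + 1)

    def _replace(match: re.Match) -> str:
        indent, submodule = match.group(1), match.group(2)
        # submodule is "" for the package itself, or e.g. ".atn.ATN".
        return f"{indent}from {dots}{submodule.lstrip('.')} import "

    return _IMPORT_RE.sub(_replace, source)


if __name__ == "__main__":
    print(relativise_imports("from antlr4.Token import Token", depth=0))
    print(relativise_imports("from antlr4.atn.ATN import ATN", depth=1))
```

Relative imports let the vendored copy work regardless of where it sits in the host package, which is why the rewrite depends only on each module's depth and not on the name of the directory it is vendored into.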

### Testing the grammar

10 changes: 7 additions & 3 deletions cf_units/_udunits2_parser/__init__.py
@@ -5,10 +5,14 @@

import unicodedata

-from antlr4 import CommonTokenStream, InputStream
-from antlr4.error.ErrorListener import ErrorListener

from . import graph
+from ._antlr4_runtime import (
+    CommonTokenStream,
+    InputStream,
+)
+from ._antlr4_runtime.error.ErrorListener import (
+    ErrorListener,
+)
from .parser.udunits2Lexer import udunits2Lexer
from .parser.udunits2Parser import udunits2Parser
from .parser.udunits2ParserVisitor import udunits2ParserVisitor
309 changes: 309 additions & 0 deletions cf_units/_udunits2_parser/_antlr4_runtime/BufferedTokenStream.py