-
Notifications
You must be signed in to change notification settings - Fork 60
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Scan classpath for EDD subclasses/dataset types #206
Scan classpath for EDD subclasses/dataset types #206
Conversation
b170d3f
to
12c1c03
Compare
Need to do some testing tomorrow to verify that third-party EDD class discovery actually works, but here's the code for review. |
Ah, I missed the SaxHandler entirely. Looking at that now... |
Could we use an annotation instead of looking for a static method? I think that would simplify the check for in the reflection logic. It would also give an indication in the dataset file that the class is referenced by reflection (it can be difficult to differentiate dead code and code only referenced by reflection and I'd like to have some indication the code is not dead). Also I have a minor refactoring of the sax handlers in this pull that you might want to take a look at: #205 |
Before seeing your message I've been working on using an annotation on the EDD classes to indicate their Sax handler, so that's good confirmation. Are you saying we should use annotations for the non-sax parser |
Yeah, I'd prefer to indicate code that's only getting called through reflection with an annotation to make that explicit in the code. |
e672ee3
to
51421e8
Compare
Ok, I reimplemented this, adding support for both |
// appropriate one? | ||
Constructor<?> constructor = eddClassInfo.getSaxHandlerClass().get().getConstructors()[0]; | ||
|
||
// Constructors for handlers don't have uniform signatures so we have to do some investigation |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Invoking the constructor on the sax handler/State class is complex because the various State subclasses/Handlers don't have a uniform constructor signature.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I remember correctly there's essentially 2 different constructors for the dataset handlers. Many take 3 parameters (saxHandler, datasetId, completeState) and some take a 4th parameter (SaxParsingContext). It could make sense to have a dataset constructor param object that has all 4 of those and have all datasets take that object in their constructor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like there are currently three constructor signature variations with five total possible arguments, and the first three are always consistent.
$ git show main:WEB-INF/classes/gov/noaa/pfel/erddap/handlers/HandlerFactory.java | grep -o "saxHandler, datasetID,[^)]*" | sort | uniq -c
10 saxHandler, datasetID, completeState
9 saxHandler, datasetID, completeState, context
3 saxHandler, datasetID, completeState, datasetType
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I forgot the FromFilesHandler version. dataset type can be included in that param object.
Also, I think it should be fairly straightforward to adapt this to #205, maybe just changing the usage of |
We should probably file an issue to update the GenerateDatasetsXml.java (https://github.com/ERDDAP/erddap/blob/main/WEB-INF/classes/gov/noaa/pfel/erddap/GenerateDatasetsXml.java#L187) with a similar reflection approach. I think that will be quite a bit more complicated due to the varying parameters though. Unless you wanted to tackle that here? |
Good call, let's handle |
@ChrisJohnNOAA One thing I want to check on. With the next release are we going to set a hard requirement on Java 21+? Or continue to support both 17 and 21? This page strongly suggests using Java 21 but mentions that both 17 and 21 are tested and supported. https://github.com/ERDDAP/erddap/blob/main/DEPLOY_INSTALL.md On the changes page there's no mention of Java 21. Java 17 is currently noted as being the required version there. https://erddap.github.io/changes.html I ask because some of the class casting used in the new reflections code works in 21 and fails in 17. If we want to continue to support 17 I think I can make the generics unspecific. |
This change uses reflection, specifically the ClassGraph classpath scanner, to discover concrete EDD subclasses and associated class metadata (fromXml methods and Sax handlers for XML deseralization) at runtime. The scan is performed once and stored in a static map for use in XML deserialization methods. In addition to eliminating the need to register various EDD implementations in the ERDDAP codebase in EDD.fromXml and HandlerFactory.getHandlerFor, this change also allows loading of third party EDD implementations at runtime. The reflection based EDD class discovery is enabled by the ERDDAP setting useEddReflection. If true, the class map built using reflection is used to discover classes/methods for XML deserialization. If false, legacy hardcoded approaches are used. In local testing classpath scanning took about 1 second. This scan occurs only once at startup. Scanning is confined to EDD's package (gov.noaa.pfel.erddap.dataset) for performance reasons; scanning all packages took around 12 seconds. For this reason, all EDD implementations need to belong to the gov.noaa.pfel.erddap.dataset package to be detected. This adds a dependency on classgraph, which is very small (~560k).
51421e8
to
e4664e1
Compare
This seems to be working well. I made the reflection based discovery configurable with a |
<warSourceDirectory>${project.basedir}</warSourceDirectory> | ||
<warSourceExcludes>**/*.java, **/*.javaINACTIVE, **/*.javaInactive, **/*.java_NOT_YET, **/*.javaNOT_FINISHED, **/*.javaOLD, content/**, data/**, development/**, download_cache/**, /src/test/**, target/**, src/**, WEB-INF/classes/gov/noaa/pfel/coastwatch/log.*, WEB-INF/lib/**, WEB-INF/NetCheck.*, WEB-INF/*.web.xml, WEB-INF/DoubleCenterGrids.sh, WEB-INF/FileVisitorDNLS.sh, WEB-INF/FindDuplicateTime.*, WEB-INF/GenerateO*, WEB-INF/GenerateT*, WEB-INF/GridSaveAs*, WEB-INF/incompleteMainCatalog.xml, WEB-INF/iobis.m, WEB-INF/obis.m, WEB-INF/MapViewer.*, WEB-INF/QN2005193*, WEB-INF/ValidateDataSetProperties*, WEB-INF/fonts/**, WEB-INF/temp/**, *.*, .settings/**, .github/**, .vscode/**</warSourceExcludes> | ||
</configuration> | ||
</plugin> | ||
<plugin> | ||
<groupId>org.apache.maven.plugins</groupId> | ||
<artifactId>maven-source-plugin</artifactId> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding a source jar to the build so projects importing classes can examine source without using decompilers.
@@ -180,10 +180,25 @@ | |||
<artifactId>maven-war-plugin</artifactId> | |||
<version>3.4.0</version> | |||
<configuration> | |||
<attachClasses>true</attachClasses> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Build a jar containing ERDDAP classes for use in downstream projects.
I believe last release I upgraded to 21 shortly before the release was done so I had done a lot of testing with both 17 and 21. It'd be good if ERDDAP can continue to work on 17, however in my opinion the most important thing is to be compatible with the latest LTS version (currently 21 which has been available for about a year now). |
Description
This change uses reflection, specifically the ClassGraph classpath scanner, to discover concrete EDD subclasses and associated class metadata (
fromXml
methods and Sax handlers for XML deseralization) at runtime. The scan is performed once and stored in a static map for use in XML deserialization methods.In addition to eliminating the need to register various EDD implementations in the ERDDAP codebase in
EDD.fromXml
andHandlerFactory.getHandlerFor
, this change also allows loading of third party EDD implementations at runtime.The reflection based EDD class discovery is enabled by the ERDDAP setting
useEddReflection
. If true, the class map built using reflection is used to discover classes/methods for XML deserialization. If false, legacy hardcoded approaches are used.In local testing classpath scanning took about 1 second. This scan occurs only once at startup. Scanning is confined to EDD's package (
gov.noaa.pfel.erddap.dataset
) for performance reasons; scanning all packages took around 12 seconds. For this reason, all EDD implementations need to belong to thegov.noaa.pfel.erddap.dataset
package to be detected.This adds a dependency on classgraph, which is very small (~560k).
Type of change
Checklist before requesting a review