Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LibLinear and LibSVM have unmanaged global RNGs #172

Merged
merged 6 commits into from
Sep 10, 2021

Conversation

Craigacp
Copy link
Member

@Craigacp Craigacp commented Sep 6, 2021

Description

This change fixes Tribuo's LibLinear and LibSVM interfaces so they use trackable RNG states.

LibLinear has a protected global RNG with a reset method (to a fixed seed), so now Tribuo sequentialises all LibLinear training runs under a lock on LibLinearTrainer.class and resets the seed before each call to Linear.train. There's a new reproducibility test in the classification LibLinear module which fails without these changes. This fix may change depending on the resolution of bwaldvogel/liblinear-java#40.

LibSVM has a public global RNG so we use Tribuo's standard RNG idiom (a per train call localRNG split from the trainer's instance of SplittableRandom), LibSVM's global RNG's seed is set to a draw from localRNG and all LibSVM training runs under a lock on LibSVMTrainer.class. There's a new reproducibility test in the classification LibSVM module which fails without these changes. Unfortunately it's quite a chatty test as LibSVM prints a lot during this run.

Motivation

We need Tribuo models to be reproducible.

… it synchronized and resetting the RNG everywhere.
LibSVM has a global RNG which is used when making probailistic
predictions. This commit uses Tribuo's standard mechanisms to track RNG
state to ensure this is reproducible, at the cost of sequentializing all
LibSVM training runs to ensure that the RNG state is consistent with the
provenance.

Also added some more Javadoc.
@Craigacp Craigacp added Oracle employee This PR is from an Oracle employee squash-commits Squash the commits when merging this PR labels Sep 6, 2021
Copy link
Member

@pogren pogren left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The train methods in the trainers for LibLinear and LibSVM call Linear.resetRandom() svm.rand.setSeed(localRNG.nextLong()), respectively. The latter looks a little suspect at first glance but note that different instances of LibSVMTrainer will have identical localRNG instances. The respective train methods also synchronize on LibLinearTrainer.class and LibSVMTrainer.class. A couple of unit tests demonstrate that subsequent runs of the train method for both produce the same model parameters.

The trade-off between locking on classes Tribuo does not control (not done) and the risk of someone using Liblinear in the same jvm in a non-synchronized way and mucking things up is documented.

SVMAnomalyType.isClassification and .isAnomaly had the boolean return values flipped. oops! Corrected here.

Various and sundry documentation comments are sprinkled in the code

@Craigacp Craigacp merged commit 21cc450 into main Sep 10, 2021
@Craigacp Craigacp deleted the liblinear-libsvm-reproducibility branch September 10, 2021 21:39
Craigacp added a commit that referenced this pull request Oct 1, 2021
* Fixing a concurrency and reproducibility issue in liblinear by making it synchronized and resetting the RNG everywhere.

* Changing the lock so it's on LibLinearTrainer.class rather than the instance of LibLinearTrainer.

* Fixing a bug where SVMAnomalyType reported itself as classification not anomaly detection.

* Makes LibSVM training reproducible.

LibSVM has a global RNG which is used when making probailistic
predictions. This commit uses Tribuo's standard mechanisms to track RNG
state to ensure this is reproducible, at the cost of sequentializing all
LibSVM training runs to ensure that the RNG state is consistent with the
provenance.

Also added some more Javadoc.

* Updating comments after review. Adding an extra bit to TestLibSVM.testReproducibility to show the models differ in tracked ways.

* extended unit test to test that two subsequent models are the same

Co-authored-by: Philip Ogren <philip.ogren@oracle.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Oracle employee This PR is from an Oracle employee squash-commits Squash the commits when merging this PR
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants