Roc is software for generating and working with ROC (Receiver Operating Characteristic) and PR (Precision-Recall) curves. These curves are typically used to evaluate classification approaches in areas such as Machine Learning, Statistics, Medicine, and Epidemiology. Students, scientists, and researchers are the target audience of this software.
The goal of this project is to provide software for evaluating ROC and PR curves that correctly implements the traditional and recent approaches using languages allowing for flexibility in each investigator's environment.
Roc, the name of the software, is pronounced "rock" like its namesake, roc, an enormous, legendary bird of prey.
- Download Roc from the releases page.
- Download the latest code as a ZIP file or a TAR ball.
Roc is free, open source software. It is released under the BSD
2-Clause License (also known as the FreeBSD License). See the file
LICENSE.txt
in your distribution (or on
GitHub) for
details.
Roc is version 0.1.0.
This software is still in the alpha stages of design and development. However, it is mature enough for the authors to use it as part of their everyday workflow.
It is released as a library and as a command-line interface (CLI) front-end for the library. The table below contains a summary of features.
Features are S=stable, T=tested, I=implemented, P=planned, NP=not planned, ?=undecided, NA=not applicable. Languages are J=Java, P2=Python 2.x.
Feature Description Library Status CLI Status
------------------- -------------- ----------
ROC curves
. Points J:T P2:P J:T P2:P
. Area J:T P2:P J:T P2:P
. Maximum area (convex hull) J:T P2:P J:P P2:P
. Aggregation (averaging) J:P P2:P J:? P2:?
. Confidence bounds J:P P2:P J:? P2:?
. Clipping J:P P2:P J:? P2:?
PR curves
. Points J:T P2:P J:T P2:P
. Area J:T P2:P J:T P2:P
. Maximum area (convex hull) J:I P2:P J:P P2:P
. Aggregation (averaging) J:P P2:P J:? P2:?
. Confidence bounds J:P P2:P J:? P2:?
. Clipping J:P P2:P J:? P2:?
. Minimum awareness J:P P2:P J:? P2:?
Plotting J:NP P2:P J:NP P2:P
Inputs
. Ranking of labels J:T P2:P J:T P2:P
. Scores, labels J:T P2:P J:T P2:P
. Score-label pairs J:P P2:P J:T P2:P
. Example weights J:P P2:P J:P P2:P
Convenience
. File I/O J:P P2:P NA
Ranking Statistics
. Mann-Whitney-U J:T P2:P J:? P2:?
This software is designed and tested to support 1 million total examples. It probably works on many more, but the performance and accuracy have not been tested at such larger scales.
- Java 6 (or later)
If you want to develop this software, there are some additional requirements.
- Standard Linux core utilities
- GNU Make
- JUnit >= 4.6
- Hamcrest >= 1.3 (if not already included in your JUnit release)
Note that certain JUnit versions contain some Hamcrest classes and so may conflict with (override) those from Hamcrest. If you encounter missing Hamcrest symbols, try placing Hamcrest ahead of JUnit on the class path or updating JUnit.
The Java library provides an API for working with ROC and PR curves in your Java programs. It is distributed as a Java archive (JAR) containing source code, bytecode, and documentation. The JAR can be obtained on the releases page. To include the library in your Java project, just place the JAR in a convenient location and include it in your classpath. You can browse the documentation by extracting it from the JAR or by viewing the latest version on GitHub.
The JAR also contains the command-line interface which can be run like this:
java -jar roc-0.1.0.jar --help
Please search the existing documentation before contacting us. There is this README, the Javadoc, the wiki, and existing issues. Then, open an issue to report a bug or ask a question.
Copyright (c) 2014 Roc Project. This is free software. See LICENSE.txt for details.