Releases: apache/datasketches-java
Apache Release DataSketches Java 6.1.1
Update dependency DataSketches-Memory version to 3.0.2
Apache Release 6.1.0
What's Changed since 6.0.0
- Added new KLL Longs Sketch
- Optimized HLL Union for HLL-to-HLL merges
- Changed Memory dependency to version 3.0.1
- Added one exclusion to FindBugsExcludeFilter.xml
- Fixed bug in Memory that was preventing limited runtime operation with Java 17, see Memory PR#209. (Java 17 is not formally supported by Memory. See datasketches-memory README).
New Contributors
- @ZacBlanco made their first contribution in #556
- @Claudenw made their first contribution in #574
Full Changelog: 6.0.0...6.1.0
6.1.0-RC1
Apache Release 6.0.0
- New: quantiles T-Digest sketch
- New: BloomFilter
- New: Exact and Bounded Sampling Proportional to Size (EB-PPS) sketch
- Added Weighted Inputs to quantiles KllFloatsSketch, KllDoublesSketch and KllItemsSketch
- Added Vector Inputs to quantiles KllFloatsSketch and KllDoublesSketch
- Enhanced quantiles Sorted Views for KLL and Classic quantiles sketches.
- Enhanced Partitioning API.
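These notes do not spell out what a weighted input means. The following is a minimal sketch, assuming that an update with weight w is meant to count as w occurrences of the same item. It computes ranks exactly with a `TreeMap` instead of using a sketch, and the class `ExactWeightedQuantiles` and its methods are hypothetical names for illustration, not part of the library.

```java
import java.util.TreeMap;

// Hypothetical illustration only; this is an exact, non-sketch stand-in
// and not code from datasketches-java.
class ExactWeightedQuantiles {
  // Total weight seen for each distinct item, kept in sorted order.
  private final TreeMap<Double, Long> weights = new TreeMap<>();
  private long totalWeight = 0;

  // Weighted update: assumed to behave like 'weight' unit updates of 'item'.
  void update(double item, long weight) {
    if (weight < 1) {
      throw new IllegalArgumentException("weight must be >= 1");
    }
    weights.merge(item, weight, Long::sum);
    totalWeight += weight;
  }

  // Unit update, expressed as a weighted update of weight 1.
  void update(double item) {
    update(item, 1L);
  }

  // Inclusive normalized rank: fraction of total weight at items <= 'item'.
  double getRank(double item) {
    if (totalWeight == 0) {
      throw new IllegalStateException("no items have been added");
    }
    long cumulative = 0;
    for (long w : weights.headMap(item, true).values()) {
      cumulative += w;
    }
    return (double) cumulative / totalWeight;
  }
}
```

Under that assumption, calling `update(1.0, 3)` once and calling `update(1.0)` three times leave the structure in the same state, so both give the same rank for any query.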
5.0.2
This is a PATCH release. No new functionality has been introduced. There are a number of changes stemming from two issues:
- Issue 527: Properly use the comparator for sorting level 0 in the KllItemsSketch
- A new version of SpotBugs created a number of potential security warnings around Finalizer Attacks. Having done our best to look into the matter, we do not believe sketches are meaningfully vulnerable -- any data in the sketches is already available via reflection and there are no methods with special conditional access. Regardless, we felt that good code hygiene meant that we should prioritize fixing any issues found.
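To illustrate why Issue 527 matters, here is a minimal sketch of how sorting generic items with the caller-supplied comparator can give a different order than natural ordering. The class `ComparatorSortDemo` and the sample data are hypothetical; this is plain Java and not the KllItemsSketch code.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical illustration only; not code from datasketches-java.
class ComparatorSortDemo {

  // Sorts a copy of the items using the caller-supplied comparator,
  // so the result follows the ordering the caller asked for.
  static <T> T[] sortWithComparator(T[] items, Comparator<? super T> comparator) {
    T[] copy = Arrays.copyOf(items, items.length);
    Arrays.sort(copy, comparator);
    return copy;
  }

  public static void main(String[] args) {
    String[] items = {"pear", "Fig", "apple", "Kiwi"};
    // Natural String ordering places upper-case letters before lower-case ones.
    String[] natural = sortWithComparator(items, Comparator.naturalOrder());
    // A case-insensitive comparator orders the same items differently.
    String[] caseInsensitive = sortWithComparator(items, String.CASE_INSENSITIVE_ORDER);
    System.out.println(Arrays.toString(natural));
    System.out.println(Arrays.toString(caseInsensitive));
  }
}
```

If a generic container sorts part of its data by natural order while the rest of its logic uses the supplied comparator, later steps that assume one consistent ordering can return wrong results.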
Apache Release 5.0.1
5.0.1 fixed two issues:
- PR 482: The HLL Union::toString() method, which prints a simple diagnostic summary of the sketch, could change the internal state of the union. This was not intended.
- PR 485: KllItemsSketch<Boolean> was not serializing and deserializing the min and max values properly. This affects only the specific generic case of <Boolean>. This is a rather bizarre use case for quantiles -- but nonetheless it is fixed! :)
Apache Release 5.0.0
- A new Example Partitioner Tool is usable in its own right for partitioning medium-sized data sets of up to about 1E9 items. The same algorithm could also be used in a parallel environment to partition data sets many orders of magnitude larger.
- Lots of internal cleanup, plus a few API improvements, for example for consistency across the different quantile sketches. These API changes, although relatively minor, were the reason for moving to a major release.
- Fixed an integer overflow bug caught by Karan Kumar (via Druid), where the classic quantiles DoublesSketch::getPartitionBoundaries() would fail when partitioning very large datasets.
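The notes do not show which expression overflowed, so the following is only a generic illustration of this class of bug in Java: a product of two `int` values wraps around before it is widened to `long`. The class `OverflowDemo`, its method names, and the sample sizes are hypothetical.

```java
// Hypothetical illustration only; the expression that actually overflowed
// in the library is not given in these release notes.
class OverflowDemo {

  // Buggy variant: the product is computed in 32-bit int arithmetic and
  // wraps around before the result is widened to long.
  static long offsetWithIntMath(int partitionIndex, int itemsPerPartition) {
    return partitionIndex * itemsPerPartition;
  }

  // Fixed variant: widen one operand first so the product is computed in long.
  static long offsetWithLongMath(int partitionIndex, int itemsPerPartition) {
    return (long) partitionIndex * itemsPerPartition;
  }

  // Defensive variant: throws ArithmeticException instead of wrapping silently.
  static int offsetChecked(int partitionIndex, int itemsPerPartition) {
    return Math.multiplyExact(partitionIndex, itemsPerPartition);
  }

  public static void main(String[] args) {
    System.out.println(offsetWithIntMath(3, 1_000_000_000));  // wrapped, negative
    System.out.println(offsetWithLongMath(3, 1_000_000_000)); // 3000000000
  }
}
```

Bugs of this kind usually stay hidden until inputs approach 2^31, which is consistent with the failure appearing only for very large datasets.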
Apache DataSketches 4.2.0
- Added generic KLL quantile sketch
Apache Release 4.1.0
- This is a minor release that primarily fixed a problem where the Java KLL sketches could not read KLL images produced by C++.
- In addition, a number of code improvements were made to fix issues found by SpotBugs and CodeQL.
- Documentation improvements, both to internal documentation and to the Javadocs.
Apache Release 4.0.0
Major new features and enhancements
- Quantile Sketches
  - The major APIs for all the quantile sketches now derive from interfaces common to all the quantile sketches. This makes it much easier for the user to move from one quantile sketch to another with only very minor API changes.
  - All the quantile sketches now have a "SortedView", which is iterable and makes analysis of the quantile distribution even easier.
- HLL Sketches
  - Major speed performance improvements for HLL union/merge operations.
  - Major improvements to the HLL Javadocs.
- Theta Sketches
  - The Theta sketch has been enhanced with an optional compress operation that makes the serialized theta sketch smaller.
- TestNG has been updated to version 7.5.1 (works with Java 8), which includes the Zip Slip Vulnerability fix.