[NEW] Valkey-Bloom: BloomFilter support for Valkey. #407

KarthikSubbarao · 2024-04-30T22:13:32Z

The problem/use-case that the feature addresses

Bloom filters are a space-efficient probabilistic data structure that can be used to “check” whether an element exists in a set (with a defined false positive rate), and to “add” elements to a set. While checking whether an item exists, false positives are possible, but false negatives are not possible. https://en.wikipedia.org/wiki/Bloom_filter
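
To make the "no false negatives" property concrete, here is a minimal sketch of a Bloom filter in Python. It is illustrative only and is not the implementation proposed in this issue: the class name, the SHA-256 based double hashing, and the parameters are all assumptions chosen for brevity.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter: a bit array plus k hash positions per item."""

    def __init__(self, num_bits: int, num_hashes: int) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: str):
        # Derive k positions by double hashing over a single SHA-256 digest
        # (an assumption for this sketch, not the module's hash function).
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # keep the step odd
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item: str) -> None:
        # Set every bit the item maps to.
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def exists(self, item: str) -> bool:
        # True only if every mapped bit is set. Bits are never cleared,
        # so an added item can never be reported as missing.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )
```

An item that was never added can still map onto bits set by other items, which is where false positives come from.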

Description of the feature

Valkey-Bloom is a Valkey module written in Rust that brings a native, space-efficient probabilistic data type to Valkey. With it, users can create Bloom filters, add elements, “check” whether an element exists, query cardinality and other details (BF.CARD / BF.INFO), auto-scale their filters, reserve filters, perform RDB save and load operations, etc.

Valkey-Bloom is built using bloomfilter::Bloom (https://crates.io/crates/bloomfilter which has a BSD-2-Clause license).

It is compatible with the BloomFilter (BF.*) command APIs of redislabs/rebloom from Redis Ltd. which has over 10M image pulls on Docker and is compatible with several client libraries.

The following commands are supported.

BF.EXISTS
BF.ADD
BF.MEXISTS
BF.MADD
BF.CARD
BF.RESERVE
BF.INFO
BF.INSERT

We would like to bring Valkey-Bloom into the valkey-io project as an open source Valkey-Module that is free to use, contribute to, etc.

Alternatives you've considered

A bloom filter module does exist today for Redis - https://github.com/goodform/rebloom. However, it uses an AGPL-3.0 license, which has additional obligations that are difficult to meet for many of the active contributors who are looking to provide Valkey as a service. AGPL is also widely disallowed by company open source program offices (including Amazon). Given that this package has not been significantly modified since it was created six years ago, it seems likely that the license is part of the issue.

natoscott · 2024-04-30T22:53:19Z

@KarthikSubbarao we are continuing the goodform.io modules as native valkey modules too. Personally I don't think the lack of activity relates to the license - it's more that the code is essentially done and that all modules generally get little attention once mature - but we're just speculating here.

Can we find a way to co-exist? I have used naming like valkey-bloom (all lower case) and the module shared library valkeybloom.so for a simple transition for users (this module will be in Fedora soon with this naming convention as we transition away from Redis). This matches up with the other goodform.io modules like valkey-search, valkey-json, valkey-graph, and so on.

Would it be possible to name this new module in a way that highlights the differences perhaps? (e.g. Valkey-Bloom-Rust?)

madolson · 2024-04-30T23:13:16Z

Can we find a way to co-exist?

Given your precedent, I think we shouldn't overwrite your naming. If you want to translate the names to valkey-*, I think we should respect that.

Would it be possible to name this new module in a way that highlights the differences perhaps? (e.g. Valkey-Bloom-Rust?)

We could call it Val-Bloom or something, more similar to how Redis was naming. Or we could name it based on the probability. Based on reading the docs (I've been historically advised not to read AGPL code while working in an AWS capacity), rebloom only supports the Bloom data type and not any of the newer ones supported by Redis (like Top-K or Cuckoo). I don't know how popular any of those are though.

hpatro · 2024-04-30T23:14:35Z

Thanks @KarthikSubbarao for creating this.

This is one of the most popular modules, and I've seen users resort to various alternatives like Lua scripts or custom application logic around the BITSET command when the prior modules weren't accessible (due to licensing). I believe it would be good if the Valkey organization can make it part of the project.

Key questions :

How do we bundle modules? Should it be part of the binary/containers/release(s) by default?
Integration tests? Each module having its own testing framework might make maintenance difficult over the years. I would rather continue with TCL tests or introduce a new lightweight framework and use it for each module.

natoscott · 2024-04-30T23:34:30Z

@hpatro there is an existing python-based test framework (BSD licensed) from the early days that has been kept and used with all of the goodform modules. The earlier version is named 'rmtest' (Redis Modules Test) and I've been working on transitioning it to 'vkmtest' (ValKey Modules Test). Maybe it'll work for the Rust module testing too - you can find the initial version here: https://github.com/goodform/valkey-module-test

madolson · 2024-04-30T23:38:50Z

@natoscott That is something I am very interested in taking over (specifically because I want a python based testing framework for the main project) if you have any interest in offloading the maintenance of it. Ideally it could be re-usable across all projects that run Valkey (or Redis even).

madolson · 2024-04-30T23:39:23Z

How do we bundle modules? Should it be part of the binary/containers/release(s) by default?

This isn't the question we should answer here. Can you make a separate issue for it?

natoscott · 2024-04-30T23:47:52Z

@madolson happy to either work with you on it or have you take it over - I have a lot on my plate (as I'm sure you do!) but I can definitely still dedicate some time to it. This test framework is also packaged in Fedora and I'd like to upload it to PyPI for ease of use within the Valkey modules too.

natoscott · 2024-04-30T23:53:21Z

@KarthikSubbarao another possibility if you're super keen on ValkeyBloom and not something with 'Rust' in the name would be for me to use valkey-module-bloom for the existing modules. In hindsight I see I've used that prefix for -test and -sdk (python and C respectively) and that convention could be used on the C modules also perhaps? Anyway, let me know your thoughts, I'm happy to change it at this early stage. There was also mention of a new implementation of ValkeyJSON (not sure if it's using Rust) from someone at Alibaba IIRC - so this naming issue may not be an isolated problem.

madolson · 2024-05-01T00:14:39Z

happy to either work with you on it or have you take it over

Cool! Not an immediate something to figure out, but would love to collaborate on this.

hwware · 2024-05-01T14:51:12Z

Thanks @KarthikSubbarao for creating this.

This is one of the most popular modules, and I've seen users resort to various alternatives like Lua scripts or custom application logic around the BITSET command when the prior modules weren't accessible (due to licensing). I believe it would be good if the Valkey organization can make it part of the project.

Key questions :

How do we bundle modules? Should it be part of the binary/containers/release(s) by default?

Integration tests? Each module having its own testing framework might make maintenance difficult over the years. I would rather continue with TCL tests or introduce a new lightweight framework and use it for each module.

Here we are #408

daniel-house · 2024-05-01T15:19:54Z

Would it be possible to name this new module in a way that highlights the differences perhaps? (e.g. Valkey-Bloom-Rust?)

I like a name that highlights the differences in behavior but not one that gives the slightest hint about how it is implemented.

madolson · 2024-06-11T15:40:02Z

@valkey-io/core-team I guess maybe ask for a vote if we want to adopt this and continue developing it as an official bloom module? This is not committing to a specific date for when we will release it, just to start the ball rolling for a module based distribution.

Some things to consider: there are other modules, like the goodform modules. I believe Alibaba also has a module that implements bloom that they have not open sourced.

zuiderkwast · 2024-06-11T22:21:46Z

Regarding naming, I thought we had sort-of decided to reserve the valkey prefix for official modules and clients. OTOH, we agreed that the license precondition to become official is that it's open source / free software, which AGPL is, although cloud vendors and other enterprises don't like it. :)

Anyhow, I hope both can co-exist and that they're made API compatible. In that way, users don't need to worry about the differences running on their distro vs running against a hosted database as a service.

@KarthikSubbarao How complete is your module?

I'm fine with adding it, if you (or anyone else) promises to maintain it actively.

My name suggestion is "ValkeyBF", picking up the BF. prefix used in the command names.

PingXie · 2024-06-12T06:19:53Z

@zuiderkwast I think @KarthikSubbarao's bloom filter is licensed under BSD 3-clause and it is the one being proposed here. My vote is yes on the same conditions as Viktor mentioned above: 1) full command compat; 2) active maintenance. Name wise, my preference would be Valkey-Bloom. BF is too short IMO and I would also prefer a dash after Valkey

madolson · 2024-06-12T16:14:18Z

@PingXie Do you want to clarify what you mean with 1) full command compat;. I think right now there is not full command compatibility, since only some of the commands are implemented. Do you just mean that the APIs that do exist are compatible?

zuiderkwast · 2024-06-12T16:15:13Z

@zuiderkwast I think @KarthikSubbarao's bloom filter is licensed under BSD 3-clause and it is the one being proposed here.

Yes it is, but we're also discussing the already-existing AGPL "valkey-bloom" module here.

@natoscott It's good that you're willing to name the AGPL module "valkey-module-bloom" and we can name the BSD licensed module "Valkey-Bloom". That's no collision.

Even if we allow projects under the Valkey umbrella to be AGPL, it might be good to avoid it for modules that are to be included in the "Valkey+" (name TBD) package, which will be a container containing Valkey + some official modules.

PingXie · 2024-06-12T18:40:36Z

@PingXie Do you want to clarify what you mean with 1) full command compat;. I think right now there is not full command compatibility, since only some of the commands are implemented. Do you just mean that the APIs that do exist are compatible?

yeah existing commands being fully compatible is good for now. Also the maintainers (whoever they are) agree that eventual full compat (meaning new commands as well) is a p0 goal by default. We can always discuss exceptions on a case-by-case basis. "incremental perfection" (R) :-)

KarthikSubbarao · 2024-06-12T20:21:53Z

@KarthikSubbarao How complete is your module?

What is done:

Support for the Bloom Filter Module commands (compatible with the ReBloom Module syntax): BF.ADD, BF.EXISTS, BF.INFO, BF.INSERT, BF.MADD, BF.MEXISTS, BF.RESERVE
Auto Scaling of Bloom Filters
RDB save and load for Bloom Filter data types
Configs for bloom filter expansion rate (used for scaling) and max size of bloom filters (number of elements that can be "added")
Additional Bloom data type callbacks: Copy command, Free, Memory Usage check, Defrag, Free Effort, etc.
Initial sanity Memory Usage and Performance tests
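
As a rough illustration of how auto scaling with an expansion rate can work, the sketch below chains sub-filters and makes each new one `expansion` times larger than the previous one. This is a hedged Python model, not the module's Rust code: the class names, the hashing scheme, and the choice to keep the same false positive rate for every sub-filter are assumptions (the overall false positive rate therefore grows as sub-filters are added, which a real implementation may compensate for).

```python
import hashlib
import math


class SubFilter:
    """One fixed-size Bloom filter sized for `capacity` items."""

    def __init__(self, capacity: int, fp_rate: float) -> None:
        self.capacity = capacity
        self.count = 0
        # Standard sizing: m = -n * ln(p) / (ln 2)^2 bits, k = (m / n) * ln 2.
        self.num_bits = max(8, math.ceil(-capacity * math.log(fp_rate) / math.log(2) ** 2))
        self.num_hashes = max(1, round(self.num_bits / capacity * math.log(2)))
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, item: str):
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)
        self.count += 1

    def exists(self, item: str) -> bool:
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(item))


class ScalingBloom:
    """Chain of sub-filters; a new, larger one is added when the last is full."""

    def __init__(self, capacity: int, fp_rate: float, expansion: int = 2) -> None:
        self.fp_rate = fp_rate
        self.expansion = expansion
        self.filters = [SubFilter(capacity, fp_rate)]

    def exists(self, item: str) -> bool:
        # An item may live in any sub-filter, so all of them are checked.
        return any(f.exists(item) for f in self.filters)

    def add(self, item: str) -> bool:
        if self.exists(item):
            return False  # already present (or a false positive)
        last = self.filters[-1]
        if last.count >= last.capacity:
            # Scale out: the next sub-filter holds `expansion` times more items.
            last = SubFilter(last.capacity * self.expansion, self.fp_rate)
            self.filters.append(last)
        last.add(item)
        return True
```

With `capacity=10` and `expansion=2`, the sub-filter capacities grow as 10, 20, 40, and so on, which is why a max-size config is useful to bound the growth.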

What is remaining:

Perf testing to set a baseline. We can decide on a baseline scenario and run tests & document results
Integration Testing / Unit testing coverage
Additional Bloom data type callbacks: AOF rewrite and Digest. These are generic Module data type callbacks that can be implemented in the Module.
Memory Based restrictions - If the expected memory that will be allocated upon a bloom write type operation (such as BF.RESERVE, BF.CREATE) will result in exceeding allowed memory, then we should reject the command. We need to check if any additional logic needed to handle this should be added to the Module.
Additional Bloom specific Module configurations for customizing the created bloom objects & Tuning default/min/max config values.
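
For the memory based restriction item, the expected allocation can be estimated before the filter is created. The sketch below uses the textbook sizing formula for a Bloom filter, m = -n ln(p) / (ln 2)^2 bits for n items at false positive rate p. The function names and the simple `used + needed <= max` policy are hypothetical, and the real module's per-object overhead is not modeled.

```python
import math


def bloom_bytes(capacity: int, fp_rate: float) -> int:
    """Estimated bit-array size in bytes for `capacity` items at `fp_rate`."""
    bits = math.ceil(-capacity * math.log(fp_rate) / (math.log(2) ** 2))
    return (bits + 7) // 8


def can_reserve(capacity: int, fp_rate: float, used_bytes: int, max_bytes: int) -> bool:
    """Hypothetical admission check: reject if the new filter would exceed the limit."""
    return used_bytes + bloom_bytes(capacity, fp_rate) <= max_bytes
```

For example, one million items at a 0.001 false positive rate need roughly 1.8 MB for the bit array alone, so `can_reserve(1_000_000, 0.001, used_bytes=1_000_000, max_bytes=2_000_000)` would reject the request.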

full command compat;

This Module supports every Bloom Filter command (from ReBloom) except for the BF.LOADCHUNK and BF.SCANDUMP and the commands have been implemented with ReBloom compatibility. The reason for not implementing the two cmds is because the Module provides the ability to load and save BloomModule data type items during RDB load and save. BF.LOADCHUNK and BF.SCANDUMP are APIs to load BloomModule data types through commands, but since we will provide RDB save & load and also AOF Rewrite, having specific commands for the same purpose was not considered as required. This can always be re-evaluated if we think it is useful

active maintenance.

I would be glad to help with maintenance of the Module by addressing issues and having discussions on missing aspects that we would like to build into the Module's functionality and testing

hpatro · 2024-06-12T20:33:37Z

@zuiderkwast I think @KarthikSubbarao's bloom filter is licensed under BSD 3-clause and it is the one being proposed here. My vote is yes on the same conditions as Viktor mentioned above: 1) full command compat; 2) active maintenance. Name wise, my preference would be Valkey-Bloom. BF is too short IMO and I would also prefer a dash after Valkey

Full command compat is one of the points I wanted clarification on for all the future modules we're planning to build/accept. As we don't have any data points for Redis Modules, one can't be sure which API(s) were really used. Do you think it's wise to build full compatibility? Right now the changes which @KarthikSubbarao has made support all the bloom filter related API(s) but leave out some of the other probabilistic filter(s). I think we should not strive for full command compatibility to accept a Module. Rather, accept one if it meets the performance/memory/language/coding standards of the project. We can always improve/add as per user(s) request.

PingXie · 2024-06-12T21:33:51Z

As we don't have any data points for Redis Modules, one can't be sure which API(s) were really used.

We can argue the opposite way too without concrete data and this would become pure speculation at the end.

If there is a legit reason to not be fully compatible we can always take an exception but I think it is important to aim at a higher compat bar so that existing Redis users can migrate their workload seamlessly to Valkey. Any incompatibility adds adoption friction and they add up. I am not saying a module needs to be bit by bit compatible in order to be adopted under Valkey. I am talking about directional alignment on helping all customers move on to Valkey with minimum possible friction.

hpatro · 2024-06-12T21:39:53Z

As we don't have any data points for Redis Modules, one can't be sure which API(s) were really used.

We can argue the opposite way too without concrete data and this would become pure speculation at the end.

If there is a legit reason to not be fully compatible we can always take an exception but I think it is important to aim at a higher compat bar so that existing Redis users can migrate their workload seamlessly to Valkey. Any incompatibility adds adoption friction and they add up. I am not saying a module needs to be bit by bit compatible in order to be adopted under Valkey. I am talking about directional alignment on helping all customers move on to Valkey with minimum possible friction.

Well, the bloom filter module proposed here has all the bloom filter commands implemented. The remaining commands technically don't fit under bloom filter; they would ideally be under probabilistic filter.

@KarthikSubbarao could we also list out the remaining commands not built yet?

PingXie · 2024-06-12T21:40:18Z

This Module supports every Bloom Filter command (from ReBloom) except for the BF.LOADCHUNK and BF.SCANDUMP and the commands have been implemented with ReBloom compatibility. The reason for not implementing the two cmds is because the Module provides the ability to load and save BloomModule data type items during RDB load and save. BF.LOADCHUNK and BF.SCANDUMP are APIs to load BloomModule data types through commands, but since we will provide RDB save & load and also AOF Rewrite, having specific commands for the same purpose was not considered as required.

This got me thinking about the on-disk format compatibility, which would be another very valuable property. Though I can see it being harder to achieve.

This can always be re-evaluated if we think it is useful

I agree.

Along the compat topic, I would also like the module maintainer to provide migration best practices, when applicable.

hpatro · 2024-06-13T00:42:11Z

@KarthikSubbarao could we also list out the remaining commands not built yet?

Realized the other probabilistic filter/algorithm commands are each under a different command namespace: Cuckoo filter commands are under CF.*, count-min sketch commands are under CMS.*, etc.

zuiderkwast · 2024-06-13T02:34:59Z

Ok, so we can eventually have separate modules for cuckoo and minsketch. Seems reasonable.

madolson · 2024-06-13T04:22:11Z

If there is a legit reason to not be fully compatible we can always take an exception but I think it is important to aim at a higher compat bar so that existing Redis users can migrate their workload seamlessly to Valkey. Any incompatibility adds adoption friction and they add up. I am not saying a module needs to be bit by bit compatible in order to be adopted under Valkey. I am talking about directional alignment on helping all customers move on to Valkey with minimum possible friction.

I think we should start with first principles and decide what we want the APIs to look like, and then decide if we want to be API compatible with Redis. You are starting with the assumption that our users are migrating from Redis, but that need not be the case. They also might be net new developers, and we want to build the right application for them. We may want to alter the APIs to better suit those users.

It should always be evaluated case by case, and should not be a general tenet. I also would bias to skipping APIs that don't make a lot of sense. For example, I know in the search modules they implemented functionality like FT.CONFIG SET, which has largely been replaced with the module config functionality.

madolson · 2024-06-13T04:29:51Z

This Module supports every Bloom Filter command (from ReBloom) except for the BF.LOADCHUNK and BF.SCANDUMP and the commands have been implemented with ReBloom compatibility. The reason for not implementing the two cmds is because the Module provides the ability to load and save BloomModule data type items during RDB load and save. BF.LOADCHUNK and BF.SCANDUMP are APIs to load BloomModule data types through commands, but since we will provide RDB save & load and also AOF Rewrite, having specific commands for the same purpose was not considered as required.

This got me thinking about the on-disk format compatibility, which would be another very valuable property. Though I can see it being harder to achieve.

This can always be re-evaluated if we think it is useful

I agree.

Along the compat topic, I would also like the module maintainer to provide migration best practices, when applicable.

On the compat topic, we have a lot to deal with because the Redis RDB opcodes have changed. I documented the issue here: #645. We don't have a good compatibility story in general with Redis.

hpatro · 2024-06-25T17:34:47Z

@valkey-io/core-team Any TSC interested in helping shape this up? I think this would be a nice module to start with and help set the baseline for other modules in the future.

zuiderkwast · 2024-06-25T20:59:56Z

I'm not particularly interested in spending time with this, but I'm in favor of accepting it, with the relevant bloom filter commands being rebloom-compatible. It's no problem that it excludes non-bloom probabilistic filters (they can be provided by another module in the future) and the obsolete commands (dump/load).

PingXie · 2024-06-26T02:17:59Z

I think this would be a nice module to start with and help set the baseline for other modules in the future.

+1. I am in favor of accepting this module too.

hwware · 2024-06-26T13:50:04Z

@hpatro I already spoke with Ping and Viktor privately. I will take on supporting this module. Thanks

KarthikSubbarao · 2024-07-18T20:06:13Z

Hello - I wanted to post an update here regarding the work remaining and also follow up on the next steps regarding the valkey-bloom Module's review.

From this list, do we want to close out on all these items before the Module can be accepted into the Valkey project?

Or instead, would we want to pick and address the "high priority" items from this list as a requirement? (And continue addressing the remaining as follow up issues)

Issues remaining to close on:

Performance Comparison with ReBloom - Both Memory Usage and Latency/TPS
Integration Testing strategy for Modules
- We can consider using this approach to write integration tests for valkey-bloom: https://github.com/valkey-io/valkeymodule-rs/blob/main/tests/integration.rs
Compatibility Story (with ReBloom). We are currently API compatible with the BloomFilter APIs of ReBloom. However, we are not RDB compatible.
AOF Rewrite Support (Datatype callback) - If we want, we can support AOF Rewrite. However, Bloom is not like data types such as String, where we can write a command to the AOF to recreate the object exactly. We can execute a command (BF.RESERVE / BF.INSERT) and it will result in an empty bloom object being created (without any items added / bits set). If we want the exact same bloom object to be created, we need a mechanism for restoring an item using a dump containing bit arrays of existing objects. This means we need to support operations such as BF.LOADCHUNK - which restores from a dump of the BloomObject. On the other hand, if we are OK with AOF Rewrite resulting in empty bloom objects created, we can just write BF.RESERVE / BF.INSERT to the AOF file.
Large value consideration: We can consider exempting bloom objects greater than X bytes from synchronous free and from defrag. We can also consider available memory based validation before BloomObject creation.
Configs and Tuning - In general, we need to review min/max/default of all existing configs - Default Capacity, Default Expansion Rate. We can also consider the following additional configs:
- Default False Positive Rate.
- Max number of sub-filters (scaling) per Bloom object.
- Bytes Threshold after which we exempt items from synchronous free and from defrag.
Digest datatype callback: This will be invoked from DEBUG DIGEST and generates a checksum on the BloomObject.
Metrics / Counters (Optional)
- Bytes and Number of items, number of defrag hits

Performance Comparison

Default parameters:

Number of requests = 1000000
Number of Filters per BloomFilter object = 1
Expansion rate = Scaling disabled
Number of BloomFilter objects = 1
False positive rate = 0.001
Number of cores on the machine = 4
Command Used: BF.EXISTS

Starting the server (pinned to 2 cores):

valkey-server --loadmodule <path_to_module>
sudo taskset -cp 0,1 <valkey-server pid>

Creating the BloomObject & Running the benchmark (pinned to 1 core):

127.0.0.1:6379> bf.reserve key 0.001 <capacity>
sudo taskset -c 2 /home/ec2-user/valkey-benchmark -n 1000000 BF.EXISTS key item

Performance Comparison (Non Scaling)

Capacity	BloomFilter objects	ReBloom p50	ValkeyBloom p50	ValkeyBloom p50 % Increase	ReBloom p95	ValkeyBloom p95	ValkeyBloom p95 % Increase	ReBloom p99	ValkeyBloom p99	ValkeyBloom p99 % Increase	ReBloom TPS	ValkeyBloom TPS	ValkeyBloom TPS % Increase
1	1	0.24967	0.24433	-2.14%	0.303	0.30567	0.88%	0.51633	0.495	-4.13%	98130.28667	99324.09667	1.22%
100	1	0.24967	0.247	-1.07%	0.32167	0.25767	-19.90%	0.54033	0.431	-20.23%	97041.71333	103638.66333	6.80%
10000	1	0.247	0.247	0.00%	0.319	0.30833	-3.34%	0.52167	0.47367	-9.20%	97793.19667	99049.15667	1.28%
1000000	1	0.247	0.247	0.00%	0.30033	0.31367	4.44%	0.50033	0.48433	-3.20%	99641.27	99115.8	-0.53%
100000000	1	0.24967	0.24967	0.00%	0.327	0.327	0.00%	0.559	0.53233	-4.77%	94195.21	95751.18	1.65%

Regarding performance comparison - valkey-bloom performs roughly the same as Rebloom during Non Scaling tests.
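
For readers checking the "% Increase" columns: they appear to be the relative change of the ValkeyBloom value against the ReBloom value. A small Python helper (the function name is just for illustration) reproduces the figures in the table:

```python
def percent_change(baseline: float, candidate: float) -> float:
    """Relative change of `candidate` versus `baseline`, in percent."""
    return (candidate - baseline) / baseline * 100.0


# Capacity = 1 row: ReBloom p50 0.24967 vs ValkeyBloom p50 0.24433
print(round(percent_change(0.24967, 0.24433), 2))
# Capacity = 1 row: ReBloom TPS 98130.28667 vs ValkeyBloom TPS 99324.09667
print(round(percent_change(98130.28667, 99324.09667), 2))
```

Note the sign convention: for the latency percentiles a negative value means ValkeyBloom was faster, while for TPS a positive value means ValkeyBloom handled more requests per second.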

hwware · 2024-07-19T15:05:58Z

Thanks for sharing the information. Considering all the comments above, I would prefer that you begin with the following items.

Performance Test for Auto Scaling of Bloom Filters
Create the Integration Testing framework (but I am unsure which language we should use: Rust, Python, or TCL, which I do not like ^_^). Do you have any suggestions?

I always want to build a proper framework first and then add more features, so that other contributors can work in the same way.
And we can use the existing features to build the performance baseline; then it is easier to figure out how much a new feature affects the current system once it is added.

After the above 2 tasks are done, let us check which work we should do next. What do you think?

zuiderkwast · 2024-07-19T21:46:38Z

On the other hand, if we are OK with AOF Rewrite resulting in empty bloom objects created, we can just write BF.RESERVE / BF.INSERT to the AOF file.

I don't understand. Does this mean that the data is lost on AOF rewrite?

KarthikSubbarao · 2024-07-20T01:39:15Z

On the other hand, if we are OK with AOF Rewrite resulting in empty bloom objects created, we can just write BF.RESERVE / BF.INSERT to the AOF file.
I don't understand. Does this mean that the data is lost on AOF rewrite?

Currently, we have not yet implemented the AOF callback and I was hoping to first discuss this here.

If we handle AOF rewrite by saving commands such as BF.RESERVE or BF.INSERT we will be able to re-create a Bloom Object with the same properties (expansion, capacity, false positive rate). However, the bits will not be set and when restored from the AOF, no items would "exist" (be set) on the bloom object. So, yes, this can be considered data loss.

This is not an issue with RDB Load and Save because we save the raw BloomObject's byte vector data during RDB Save and we are able to restore this during RDB Load.

For AOF rewrite to support saving the exact state of the Bloom object (including the items that were "set"), we need to include the dump in the AOF and will need to support a command that can restore this data. ReBloom supports a BF.LOADCHUNK command to restore a bloom object from its dump
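
The difference between the two AOF rewrite strategies can be shown with a toy model. This is a sketch under stated assumptions, not the module's behavior: `TinyBloom`, `rewrite_as_reserve`, and `rewrite_with_dump` are hypothetical names, and the "dump" here is simply the raw bit array rather than any real ReBloom or RDB format.

```python
import hashlib


class TinyBloom:
    """Toy filter whose full state is (num_bits, num_hashes, bit array)."""

    def __init__(self, num_bits: int, num_hashes: int, bits: bytes = b"") -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(bits) if bits else bytearray((num_bits + 7) // 8)

    def _positions(self, item: str):
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def exists(self, item: str) -> bool:
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(item))

    def dump(self) -> bytes:
        return bytes(self.bits)


def rewrite_as_reserve(src: TinyBloom) -> TinyBloom:
    # Models writing only a BF.RESERVE-style command: same properties, no bits.
    return TinyBloom(src.num_bits, src.num_hashes)


def rewrite_with_dump(src: TinyBloom) -> TinyBloom:
    # Models writing the properties plus a dump that a restore command loads.
    return TinyBloom(src.num_bits, src.num_hashes, src.dump())


original = TinyBloom(num_bits=256, num_hashes=4)
original.add("apple")
print(rewrite_as_reserve(original).exists("apple"))  # membership is lost
print(rewrite_with_dump(original).exists("apple"))   # membership is preserved
```

The items themselves cannot be replayed as BF.ADD commands because the filter never stores them, which is why the bit array has to travel with the rewrite.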

zuiderkwast · 2024-07-20T21:54:45Z

Thanks. It sounds to me that LOADCHUNK is needed. It seems wrong to me to assume a BF is only volatile cache data.

hwware · 2024-07-23T17:39:48Z

As we discussed in the meeting, now we are blocked by 2 issues:

RDB compatibility
AOF rewrite problem

Please draft these 2 points in our rfc https://github.com/valkey-io/valkey-rfc

@KarthikSubbarao Thanks

KarthikSubbarao · 2024-07-25T23:28:54Z

Hello - I have documented the ValkeyBloom feature as an RFC here: valkey-io/valkey-rfc#4

Please do take a look when possible

ashtul · 2024-08-24T06:01:53Z

@hpatro there is an existing python-based test framework (BSD licensed) from the early days that has been kept and used with all of the goodform modules. The earlier version is named 'rmtest' (Redis Modules Test) and I've been working on transitioning it to 'vkmtest' (ValKey Modules Test). Maybe it'll work for the Rust module testing too - you can find the initial version here: https://github.com/goodform/valkey-module-test

I believe the current testing framework in use is https://github.com/RedisLabsModules/RLtest

ashtul · 2024-08-24T06:14:22Z

@hwware @zuiderkwast @KarthikSubbarao
Here is a link to an issue at RedisBloom about the AOF and RDB issues.

RedisBloom/RedisBloom#12

zuiderkwast · 2024-08-28T08:34:59Z

That issue mentions a limit of 512MB for DUMP/RESTORE due to the protocol limits. It's for any type, so not specific to bloom filters. A large string or hash has the same problem. This is configurable though. In valkey.conf, I find this comment:

# In the server protocol, bulk requests, that are, elements representing single
# strings, are normally limited to 512 mb. However you can change this limit
# here, but must be 1mb or greater
#
# proto-max-bulk-len 512mb

(We really should change "mb" to "MB" though, because in the metric system "m" = milli and "M" = mega, and internet standards define "b" = bit and "B" = byte. Millibit doesn't make much sense.)
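
One way around a bulk length limit is to move a large dump in pieces, which is the general idea behind chunked dump/load commands. The Python sketch below only illustrates the splitting and reassembly; the function names are hypothetical, and it does not reproduce the wire format of BF.SCANDUMP / BF.LOADCHUNK or of DUMP/RESTORE. The constant mirrors the default from the valkey.conf comment, where 512mb is interpreted as 512 * 1024 * 1024 bytes.

```python
from typing import Iterable, Iterator, Tuple

PROTO_MAX_BULK_LEN = 512 * 1024 * 1024  # default proto-max-bulk-len (512mb)


def iter_chunks(dump: bytes, max_chunk: int = PROTO_MAX_BULK_LEN) -> Iterator[Tuple[int, bytes]]:
    """Yield (offset, chunk) pairs where every chunk is at most max_chunk bytes."""
    if max_chunk <= 0:
        raise ValueError("max_chunk must be positive")
    for offset in range(0, len(dump), max_chunk):
        yield offset, dump[offset:offset + max_chunk]


def reassemble(chunks: Iterable[Tuple[int, bytes]]) -> bytes:
    """Rebuild the dump, insisting that chunks arrive in order with no gaps."""
    buf = bytearray()
    for offset, chunk in chunks:
        if offset != len(buf):
            raise ValueError(f"unexpected chunk offset {offset}, expected {len(buf)}")
        buf.extend(chunk)
    return bytes(buf)
```

With a small limit for demonstration, `list(iter_chunks(data, 300))` on 1024 bytes of data yields four chunks, and `reassemble` returns the original bytes.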

ashtul · 2024-09-01T10:40:00Z

@madolson @zuiderkwast @KarthikSubbarao @gkorland

I am working on a new rust bloom filter implementation which I think valkey can benefit from.

It will have the following features:

Scalability.
Number of hash functions is floored (link).
The filter can be locked.
- No additional items can be added.
- The filter is compressed to release unused memory and increase the false-positive rate to the user's defined rate.
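
The link behind "floored" is not available here, so the following is only one plausible reading, stated as an assumption: the optimal number of hash functions for a target false positive rate p is k = log2(1/p), which is rarely an integer, and flooring it trades a slightly higher false positive rate for one fewer hash per operation. A short Python sketch of that arithmetic, using the standard approximation fp = (1 - e^(-k / bits_per_item))^k:

```python
import math


def hash_count_floored(fp_rate: float) -> int:
    """Floor of the optimal hash count k = log2(1 / p), at least 1."""
    return max(1, math.floor(-math.log2(fp_rate)))


def bits_per_item(fp_rate: float) -> float:
    """Bits per item for an optimally sized filter: -ln(p) / (ln 2)^2."""
    return -math.log(fp_rate) / math.log(2) ** 2


def expected_fp(bits_per_item_value: float, num_hashes: int) -> float:
    """Approximate false positive rate of a full filter with k hashes."""
    return (1.0 - math.exp(-num_hashes / bits_per_item_value)) ** num_hashes


target = 0.001
k = hash_count_floored(target)
print(k, expected_fp(bits_per_item(target), k))
```

For a target of 0.001 the optimal k is about 9.97, so flooring gives 9 hashes and an expected rate only slightly above the target.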

Is there a timeline for the release of the module?

hpatro mentioned this issue May 1, 2024

[NEW] Valkey Modules Bundling #408

Open

madolson added the major-decision-pending Major decision pending by TSC team label Jun 12, 2024

madolson added major-decision-approved Major decision approved by TSC team and removed major-decision-pending Major decision pending by TSC team labels Sep 1, 2024
