Skip to content
This repository has been archived by the owner on Sep 22, 2022. It is now read-only.

Can help implement MDBX_SET_UPPERBOUND? #250

Closed
gcxfd opened this issue Dec 10, 2021 · 11 comments
Closed

Can help implement MDBX_SET_UPPERBOUND? #250

gcxfd opened this issue Dec 10, 2021 · 11 comments

Comments

@gcxfd
Copy link
Contributor

gcxfd commented Dec 10, 2021

I'll use an integer key to illustrate.

I have implemented range (1..3) in rust, where the interval is open and closed, that is, contains 1, does not contain 3, and uses MDBX_SET_LOWERBOUND is easy to implement.

Now, I want to implement range (3..1), which is an iterator from 3 to 1, with 3 and no 1.

But I find there is no MDBX_ SET_ UPPERBOUND, it is not easy to locate the first key less than or equal to 3 (also need to locate the last value of the key if it is DUPSORT).

If want to do this in rust, of course can, but it's not elegant. There are many logic to process (for example, if the starting number is greater than all values, MDBX_SET_RANGE returns MDBX_NOTFOUND, which requires MDBX_LAST).

If there is an MDBX_SET_UPPERBOUND Everything will be simple

@gcxfd gcxfd changed the title Can help implement MDBX_ SET_ UPPERBOUND? Can help implement MDBX_SET_UPPERBOUND? Dec 10, 2021
@erthink
Copy link
Owner

erthink commented Dec 11, 2021

Unfortunately, I don't have time to do this right now. Please wait a few days.

@gcxfd
Copy link
Contributor Author

gcxfd commented Dec 12, 2021

Thanks for the reply, I have tried to implement a version by myself, see pull request #251.

I am not in a hurry.
I will finish the other parts first.

@erthink
Copy link
Owner

erthink commented Dec 13, 2021

Please check it.

@gcxfd
Copy link
Contributor Author

gcxfd commented Dec 13, 2021

image
@erthink

I use test code
https://github.com/rmw-lib/mdbx/blob/f3be415a6f4956d947de69ec72e1eabd9b680e69/examples/main.rs

output as the image above

The left side of the screenshot is my code implementation output, and the right side of the screenshot is your code implementation output

I just change c code from my implementation to your implementation

I found that there are some inconsistencies with expectations

Database Test

  1. When the value is empty and DUPSORT, the last element of DUPSORT should be located

Database Test2

  1. range 3..1 why output 9 ?
  2. range 10..1 nothing output

@erthink
Copy link
Owner

erthink commented Dec 13, 2021

Semantic of the lower bound: to the first element in a set that is not less than (i.e. greater or equal to) value, or the end if no such element is found.
Semantic of the upper bound: to the first element in a set that is greater than value, or the end if no such element is found.
No any other (i.e. custom) lower/upper bounds will be implemented inside libmdbx.

Semantic of range 3..1 and range 10..1 is unclear for me, because this are empty ranges by definition, i.e. since begin ≥ end.
If you need inverted ranges -∞…A] [B…+∞ or ranges for descending/reverse order then you should implement ones on the basis of limbdbx' cursor operations like the libfpta do it.

@gcxfd
Copy link
Contributor Author

gcxfd commented Dec 13, 2021

Semantic of the lower bound: to the first element in a set that is not less than (i.e. greater or equal to) value, or the end if no such element is found. Semantic of the upper bound: to the first element in a set that is greater than value, or the end if no such element is found. No any other (i.e. custom) lower/upper bounds will be implemented inside libmdbx.

Semantic of range 3..1 and range 10..1 is unclear for me, because this are empty ranges by definition, i.e. since begin ≥ end. If you need inverted ranges -∞…A] [B…+∞ or ranges for descending/reverse order then you should implement ones on the basis of limbdbx' cursor operations like the libfpta do it.

I did not express myself accurately.

What I need is not MDBX_SET_UPPERBOUND
I need is MDBX_SET_LOWERBOUND_DESC
means in the descending order , the first element less or equal x

for example the database have value (3,0) (3,8) (9,0)
so when I use MDBX_SET_LOWERBOUND_DESC(key 3 , value empty) , I think it will locate at (3,8) not (3,0)
so I can simple use MDBX_PREV to return the descending range

and when I use key 10 , upper bound should be (9,0)

@erthink
Copy link
Owner

erthink commented Dec 13, 2021

think Semantic of the upper bound: should be in the descending order , the first element greater or equal...

This is wrong.
Please study the math/algorithms base of data structures etc.
For instance https://en.cppreference.com/w/cpp/algorithm/upper_bound

@gcxfd
Copy link
Contributor Author

gcxfd commented Dec 13, 2021

sorry, I corrected my description

What I need is not MDBX_SET_UPPERBOUND
I need is MDBX_SET_LOWERBOUND_DESC
means in the descending order , the first element less or equal x

for example the database have value (3,0) (3,8) (9,0)
so when I use MDBX_SET_LOWERBOUND_DESC(key 3 , value empty) , I think it will locate at (3,8) not (3,0)
so I can simple use MDBX_PREV to return the descending range

and when I use key 10 , upper bound should be (9,0)

@erthink
Copy link
Owner

erthink commented Dec 13, 2021

Anyway you can and should implement any custom search operation by available cursor operations, the ones are enough for this (sure).
But nothing like MDBX_SET_LOWERBOUND_DESC will be added, because it is actually extra and algorithmically unneeded operation that will doubt users and mesh the code.

Please study a theoretical basis to be known how do to that you need with currently available ops but without extra ones.

@gcxfd
Copy link
Contributor Author

gcxfd commented Dec 14, 2021

C++ only has upper_bound and lower_bound but not upper_bound_rev and lower_bound_rev because it is easy to achieve this with the C++ interface via the reverse iterator.

Find last element less than x:

auto iter = std::upper_bound(v.rbegin(), v.rend(), x, std::greater<int>());
if(iter == v.rend())
    std::cout<<"no found";
else
    std::cout<<*iter;

Find last element less than equal x:

auto iter = std::lower_bound(v.rbegin(), v.rend(), x, std::greater<int>());
if(iter == v.rend())
    std::cout<<"no found";
else
    std::cout<<*iter;

But the lower_bound interface of mdbx is different from the lower_bound interface of C++.

Based on the existing lower_bound interface of mdbx, it is very troublesome to implement Find last element less than or equal x.

For example, when the data entries in the database are (3,0), (3,1), (4,1), and lower_bound(10,empty), it will return not found. At this time, you need to locate mdbx_last by yourself. When lower_bound(3,empty), lower_bound will return (3,0) instead of (3,1)

So, I think if there is a reverse lower_bound interface, it will be easier for the upper layer to implement reverse iteration.

@erthink
Copy link
Owner

erthink commented Dec 14, 2021

Historically (inherited from LMDB) libmdbx have MDBX_SET_RANGE which works like the std::lower_bound() for keys only and not taking into acctount a values for a DUPSORT-cases. Then the MDBX_GET_BOTH_RANGE is available to continue lower-bounding in a DUPSORT-cases.
These ops don't returns MDBX_RESULT_TRUE for compatibility, since ones were before the MDBX_RESULT_TRUE was introduced.

Additionally libmdbx have the MDBX_LOWERBOUND (and now the MDBX_UPPERBOUND) that also work like std::lower_bound() but look to values in DUPSORT-cases. This op returns MDBX_SUCCESS(0) for exact match and MDBX_RESULT_TRUE(-1) for non-exact succeeded search.

So as I answered earlier:

  • The cursor operations available now are enough to implement the logic you need. Therefore, adding new ones is unreasonable.
  • You should just RTFM and choosing ops to use.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

2 participants