Expose support for custom indexing #3469
Comments
very excited for geospatial x blockchain integration (side note, I'd like to integrate whitebox GAT with cosmos one day)
I would do the reverse, given that you can already build custom indices and associated query functionality on the store... we do it in staking to query for validators by pubkey, by power, etc. This approach could easily be expanded to query based on geolocation, or any other arbitrary property :) Glad to walk you through the code for how to do the above. You'll want to program in a custom module. With regards to custom tx indexing, I'd imagine it's possible with hooks; however, this would be new functionality (more work than just building out in the existing framework).
Thanks @rigelrozanski, a walk-through would be great if you have the time. I'm also realizing that it might be better to do transaction indexing at the
I'd like to get a better feel for your intentions. What is the purpose (examples?) of using custom tx indexing over custom index queries on the store (as I proposed)? I'm having trouble thinking of a use case.
I described this in a bit of detail in the Problem Definition above. Basically we would like to have all geospatial data indexed in a database with dedicated geospatial support such as Postgres. One use case would be generating map layers on the fly or doing analytics. By custom index queries, is what's happening in For our use cases, I'm pretty sure we want to be able to store things in an external database with robust secondary index support. Emulating something like a spatial index would, I think, be pretty complex with just the built-in KV store.
There is no reason indexing to query by coordinates or any arbitrary field cannot be done using store indexing, so my perspective still holds: store indexing should be sufficient for your needs. If you wanted an external database that is subscribed to new data entering the blockchain, the approach would be to subscribe to the existing tagging system (already built ;) ) and then simply query the store for the record if it meets the required condition for inclusion in your external database.
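A minimal sketch of the tag-subscription approach described above, assuming a simplified event type and an in-memory map in place of the real store. `Event`, `Forward`, the `action` and `claim_id` tag names, and the `sink` callback are all hypothetical names chosen for illustration; they are not SDK or Tendermint APIs.

```go
package main

import "fmt"

// Event is a hypothetical, simplified stand-in for a tagged transaction event.
type Event struct{ Tags map[string]string }

// Forward reads events, keeps only those whose tags match the condition,
// looks the record up in the store, and hands it to the external sink.
// The store is a plain map here; a real node would query its KV store.
func Forward(events <-chan Event, store map[string]string, sink func(id, record string)) {
	for ev := range events {
		if ev.Tags["action"] != "submit_claim" { // hypothetical tag filter
			continue
		}
		id := ev.Tags["claim_id"]
		if rec, ok := store[id]; ok {
			sink(id, rec) // e.g. an INSERT into the external database
		}
	}
}

func main() {
	store := map[string]string{"claim-1": "lat=45.5,lon=-122.6"}
	events := make(chan Event, 2)
	events <- Event{Tags: map[string]string{"action": "submit_claim", "claim_id": "claim-1"}}
	events <- Event{Tags: map[string]string{"action": "send"}}
	close(events)
	Forward(events, store, func(id, record string) {
		fmt.Println("index", id, record)
	})
}
```

Only the first event passes the filter, so only `claim-1` reaches the sink.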
Yeah, I think that's the obvious short-term solution for complex GIS analysis... the blockchain itself serves as the consensus layer on the data, whereas any number of viewers can be spun up for further interpretation of the data using centralized databases which are just fed information from the blockchain. However, I don't see why, with further development, GIS tooling cannot in time be used directly on the blockchain state.
That is probably the most complicated example, but yes hahaha, that is an index. Check out the specs for some of the other, simpler indexes to get a better example: specs: https://github.com/cosmos/cosmos-sdk/blob/develop/docs/spec/staking/state.md#validator Here is a simpler example: validators are also indexed by consensus address (even though the core record is indexed by operator address). This index is set for the first time here: From there it can be updated as required if the consensus address changes (currently in x/staking this capability of updating the consensus address is not developed, however).
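The secondary-index pattern described above (core record keyed by operator address, plus an index entry keyed by consensus address) could be sketched roughly as follows. This is an illustration only: `KVStore` here is a toy in-memory map, and the key prefixes and function names are hypothetical, not the ones used in x/staking.

```go
package main

import "fmt"

// KVStore is a toy in-memory stand-in for a store (an assumption for
// illustration; the real store interface works on byte slices).
type KVStore map[string]string

// Hypothetical key prefixes separating primary records from index entries.
const (
	validatorPrefix = "val/"     // val/<operatorAddr> -> record
	byConsPrefix    = "valcons/" // valcons/<consAddr> -> operatorAddr
)

// SetValidator writes the primary record and its secondary index entry.
func SetValidator(s KVStore, operatorAddr, consAddr, record string) {
	s[validatorPrefix+operatorAddr] = record
	s[byConsPrefix+consAddr] = operatorAddr
}

// GetValidatorByConsAddr resolves the index entry to the operator address,
// then loads the core record stored under that address.
func GetValidatorByConsAddr(s KVStore, consAddr string) (string, bool) {
	op, ok := s[byConsPrefix+consAddr]
	if !ok {
		return "", false
	}
	rec, ok := s[validatorPrefix+op]
	return rec, ok
}

func main() {
	s := KVStore{}
	SetValidator(s, "operator1", "cons1", "moniker=alice")
	rec, ok := GetValidatorByConsAddr(s, "cons1")
	fmt.Println(rec, ok)
}
```

The index stores only the primary key rather than a copy of the record, so the record has a single source of truth; an index on any other property (power, coordinates) would follow the same two-step lookup.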
It would be really nice to have a driver model here where it is easy to specify which tags and implement outputs for common open source DBs (redis, postgres, mongo, etc...)
Thanks for all your comments @rigelrozanski. I agree that in the future it will definitely be possible to support a broader range of GIS tooling on-chain. For now, I think it makes sense to use the blockchain as the "consensus layer" of the data, as you say, and off-load the geospatial querying to more specialized data stores.
I've figured out a way that works for our purposes to have nodes optionally index data to PostGIS, which is more or less the industry standard for geospatial indexing, without any modifications needed to baseapp or ABCI. I'm doing this by intercepting the baseapp ABCI methods and optionally forwarding their data to the indexer. It may be a bit unconventional, but in Regen Ledger, keepers can also get a handle to the indexer and do some indexing while they're modifying the state store. I was initially going to use the tag approach as suggested, but in assessing our architecture I realized that this would result in a lot of duplication of the logic that is already in the keepers, and that some state changes wouldn't easily be reflected by tags. So effectively, in our model, the index becomes part of the app state lifecycle on nodes where it is enabled (except that it is never queried directly by nodes, unlike the store, which is the consensus view shared by all nodes).
Anyway, thus far this approach seems to work. The code for this lives here in case anybody's interested: https://github.com/regen-network/regen-ledger/tree/master/index. I've abstracted a generic Anyway, since this is currently working for us, I'm going to go ahead and close this issue. But please let me know if you think this approach is more generically useful. I know you mentioned a driver approach, @jackzampolin, where you can specify sets of tags. In this approach all tags are currently indexed, but otherwise I think it's not too far off, and I think it would not be a bad idea to support "indexing interceptors" like this directly in
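For readers who want a picture of the "intercept the ABCI methods and optionally forward to an indexer" idea, here is a rough sketch under heavy simplifying assumptions. `Application` is reduced to a single hypothetical method and `Indexer`, `IndexingApp`, `echoApp`, and `memIndexer` are invented for illustration; this is not the Regen Ledger code nor the real ABCI interface.

```go
package main

import "fmt"

// Application is a hypothetical, heavily reduced stand-in for the ABCI
// application interface; the real one has many more methods and types.
type Application interface {
	DeliverTx(tx []byte) map[string]string // returns the tx tags
}

// Indexer is a hypothetical sink, e.g. one backed by PostGIS.
type Indexer interface {
	IndexTx(tx []byte, tags map[string]string)
}

// IndexingApp wraps an Application and forwards each delivered tx to an
// optional Indexer. A nil indexer disables indexing on this node.
type IndexingApp struct {
	Application
	indexer Indexer
}

// DeliverTx runs the wrapped application first, then forwards the result.
func (a IndexingApp) DeliverTx(tx []byte) map[string]string {
	tags := a.Application.DeliverTx(tx)
	if a.indexer != nil {
		a.indexer.IndexTx(tx, tags)
	}
	return tags
}

// echoApp is a toy application that tags every tx the same way.
type echoApp struct{}

func (echoApp) DeliverTx(tx []byte) map[string]string {
	return map[string]string{"action": "echo"}
}

// memIndexer records what it was asked to index.
type memIndexer struct{ seen []string }

func (m *memIndexer) IndexTx(tx []byte, tags map[string]string) {
	m.seen = append(m.seen, string(tx))
}

func main() {
	idx := &memIndexer{}
	app := IndexingApp{Application: echoApp{}, indexer: idx}
	app.DeliverTx([]byte("tx1"))
	fmt.Println(idx.seen)
}
```

Because the wrapper satisfies the same interface as the application it wraps, the rest of the node does not need to know whether indexing is enabled.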
I'm going to re-open this since, after the dev call, it sounds like there may be interest in this approach.
tendermint/tendermint#4466 partially solves this? But this may need an issue in Tendermint as well.
@marbar3778 ideally I'd like to have support on the SDK side for listening to changes on the store, which is sort of orthogonal to what Tendermint can provide. But upstream improvements to Tendermint are always useful!
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
In a recent call it was said that the best solution for this is to not rely on Tendermint indexing and to provide a custom indexing solution directly from the SDK.
Yes, I believe we already have an ADR for this too -- https://github.com/cosmos/cosmos-sdk/blob/master/docs/architecture/adr-038-state-listening.md
closing this in favour of adr038 issue
Summary
Regen Network would like functionality exposed in the Cosmos SDK to do custom indexing of blockchain transactions and possibly state.
Problem Definition
Regen Ledger is being built as a global ledger for ecological applications. Our data set will form a global ledger of ecological claims. All of these claims will be geo-tagged. To begin with, we would like to have all blockchain data about ecological state indexed in a geospatial data store like PostGIS so that we can show a global map of data on the blockchain. Our blockchain will also serve as the backbone for a new decentralized ecological state verification infrastructure for verifying claims that a land owner may make, such as having sequestered carbon in soils or preserved forest land. Part of the infrastructure for doing this verification will involve "verification oracles" having access to an index of all claims made on the blockchain up to a certain point in time for a particular piece of land. For instance, field scientists may be taking measurements with a soil sensor that records the results on the blockchain. A verification algorithm may then request to search the blockchain data for all soil sensor reports for that piece of land recorded in the past year.
Proposal
Our need is primarily to expose functionality within Cosmos/Tendermint that lets us implement custom indexing.
Two possible approaches discussed with @jaekwon were using Tendermint's existing TxIndexer or possibly the web socket interface. Our assessment is that creating a custom implementation of TxIndexer would be preferable because of the possible instability of the web socket connection.
It seems that if we could get access to the underlying Tendermint Node (when running Tendermint in process), we could access the EventBus and from there create a new IndexerService with our custom TxIndexer. If that will work, I don't see any reason to make any other modifications to how indexing works at the Tendermint level. The main change would be creating a hook at the Cosmos level that exposes the underlying Node. It looks like the place to do this would be in StartCmd in server/start.go where startInProcess returns the Node but it gets discarded: https://github.com/cosmos/cosmos-sdk/blob/develop/server/start.go#L41. Let me know if this solution makes sense and how best to expose it and then I can put together a PR.
Beyond this, it occurs to me that being able to index directly off changes to the multi-store might be useful. My inclination is to start with just indexing transactions and then revisit indexing off of state later if needed. The way I can see this working is pretty similar to how TraceKVStore currently works, except that we'd only be observing writes and there would be no need to base64 encode keys and values. This approach also looks pretty straightforward, but we'll see if it's needed.
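The write-only store observer described in the last paragraph could look roughly like the sketch below. It is an illustration under stated assumptions, not the TraceKVStore implementation: `KVStore` is cut down to three methods, and `memStore`, `WriteListener`, and `ListenKVStore` are hypothetical names.

```go
package main

import "fmt"

// KVStore is a reduced, hypothetical store interface (the real SDK
// interface also has iterators and other methods).
type KVStore interface {
	Get(key []byte) []byte
	Set(key, value []byte)
	Delete(key []byte)
}

// memStore is a toy in-memory backing store.
type memStore map[string][]byte

func (m memStore) Get(key []byte) []byte { return m[string(key)] }
func (m memStore) Set(key, value []byte) { m[string(key)] = value }
func (m memStore) Delete(key []byte)     { delete(m, string(key)) }

// WriteListener receives raw keys and values; nothing is base64 encoded.
type WriteListener func(key, value []byte, deleted bool)

// ListenKVStore wraps a parent store and reports writes only.
type ListenKVStore struct {
	parent  KVStore
	onWrite WriteListener
}

// Get is passed straight through; reads are not observed.
func (s ListenKVStore) Get(key []byte) []byte { return s.parent.Get(key) }

// Set writes to the parent, then notifies the listener.
func (s ListenKVStore) Set(key, value []byte) {
	s.parent.Set(key, value)
	s.onWrite(key, value, false)
}

// Delete removes from the parent, then notifies the listener.
func (s ListenKVStore) Delete(key []byte) {
	s.parent.Delete(key)
	s.onWrite(key, nil, true)
}

func main() {
	store := ListenKVStore{
		parent: memStore{},
		onWrite: func(key, value []byte, deleted bool) {
			fmt.Printf("write key=%s value=%s deleted=%v\n", key, value, deleted)
		},
	}
	store.Set([]byte("claim/1"), []byte("soil-report"))
	store.Get([]byte("claim/1")) // not reported
	store.Delete([]byte("claim/1"))
}
```

An indexer would plug in as the `onWrite` callback, which keeps the indexing logic out of the keepers that perform the writes.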