Support ExtraData de-duplication in MySQL backend #1968
Comments
It's an interesting request, though as you say possibly more a CT thing than generic. We have considered options like not storing leaf data in the database. As it's immutable, it could be served from edge caches or whatever. That was more for performance than saving disk space though, e.g. you could pack up ranges of leaves and serve them much faster than [...]. Not sure we'd want to make this sort of schema change now, but it's possible that experiments along these lines could be done with a modified CT personality that stores some sort of cache ID instead of the leaf data. That might be a place to start.
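To make the "cache ID instead of the leaf data" idea concrete, here is a minimal sketch, assuming the cache ID is simply the SHA-256 of the chain bytes. `BlobCache`, `build_leaf`, and `resolve_extra_data` are hypothetical names for illustration only; they are not part of Trillian or the CT personality, and the in-memory dict stands in for whatever edge cache or object store would actually hold the immutable blobs.

```python
import hashlib


class BlobCache:
    """Hypothetical stand-in for an edge cache / object store keyed by content hash."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> bytes:
        # The cache ID is derived from the content, so identical chains share one entry.
        cache_id = hashlib.sha256(data).digest()
        self._blobs.setdefault(cache_id, data)
        return cache_id

    def get(self, cache_id: bytes) -> bytes:
        return self._blobs[cache_id]

    def stored_bytes(self) -> int:
        return sum(len(blob) for blob in self._blobs.values())


def build_leaf(cache: BlobCache, leaf_value: bytes, chain: bytes) -> dict:
    # Assumption: the personality stores a 32-byte cache ID where the full chain used to go.
    return {"leaf_value": leaf_value, "extra_data": cache.put(chain)}


def resolve_extra_data(cache: BlobCache, leaf: dict) -> bytes:
    # Readers swap the cache ID back for the original chain bytes.
    return cache.get(leaf["extra_data"])


cache = BlobCache()
chain = b"x" * 2000  # placeholder bytes standing in for an intermediate + root chain
leaves = [build_leaf(cache, b"leaf-%d" % i, chain) for i in range(100)]
bytes_in_leaves = sum(len(leaf["extra_data"]) for leaf in leaves)
```

In this toy run, 100 leaves sharing one chain hold 32 bytes each in place of the chain, and the chain itself is stored once. The real trade-off depends on how often chains repeat and on the cost of the extra lookup.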
From a purely Trillian point of view, having the data together makes it possible (even though it's not the case at the moment) to make [...]. Note that if Trillian didn't use revisions for subtrees of logs (which I believe it doesn't need; only maps need them?), you might be looking at both a speedup of proof retrieval and a very sizable reduction in storage requirements (I don't have numbers handy, but if someone told me about 50%, I'd believe it). It would also probably speed up sequencing.
This discussion has migrated to google/certificate-transparency-go#691, as it was CT-specific. Generalisations are possible, but we tend to think that they should happen on the personality side. I propose to close this issue here.
A good follow-up request from this one would be optional interleaving
(This is a rather complicated proposition, so it's more of a speculative request, rather than a concrete proposal.)
The CT personality builds leaves for inclusion that include, in the ExtraData field, the user-submitted chain minus the end-entity certificate. This data has relatively low cardinality and takes up a considerable amount of space in the storage backend.
It would be great if there was an option to store the ExtraData in a separate table (or something) from LeafData so that they could be deduped (based on data hash or something), with a reference to the relevant row in LeafData in a many-to-one setup. In certain setups this could save >50% of current storage requirements. It seems likely that this optimization is only really relevant to the CT usage of Trillian, so I'm not entirely sure if there are issues this could cause for other personalities. This would likely incur slightly more expensive MySQL queries.
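The many-to-one layout proposed above could look roughly like the following sketch. It is an illustration under stated assumptions, not Trillian's actual schema: the table and column names (`extra_data`, `leaf_data`, `extra_data_hash`) are hypothetical, the leaf key is simplified to a hash of the leaf value, and Python's built-in `sqlite3` is used as a stand-in for MySQL so the example is self-contained (SQLite's `INSERT OR IGNORE` corresponds roughly to MySQL's `INSERT IGNORE`).

```python
import hashlib
import sqlite3

# Hypothetical, simplified schema; NOT Trillian's real MySQL tables.
SCHEMA = """
CREATE TABLE extra_data (
    extra_data_hash BLOB PRIMARY KEY,  -- SHA-256 of the ExtraData bytes
    extra_data      BLOB NOT NULL
);
CREATE TABLE leaf_data (
    leaf_identity_hash BLOB PRIMARY KEY,
    leaf_value         BLOB NOT NULL,
    extra_data_hash    BLOB NOT NULL REFERENCES extra_data(extra_data_hash)
);
"""


def open_store() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def add_leaf(conn: sqlite3.Connection, leaf_value: bytes, extra_data: bytes) -> None:
    extra_hash = hashlib.sha256(extra_data).digest()
    # Store each distinct chain once; repeat submissions hit the primary key and are skipped.
    conn.execute(
        "INSERT OR IGNORE INTO extra_data VALUES (?, ?)", (extra_hash, extra_data)
    )
    # Many leaves can reference the same extra_data row.
    conn.execute(
        "INSERT OR IGNORE INTO leaf_data VALUES (?, ?, ?)",
        (hashlib.sha256(leaf_value).digest(), leaf_value, extra_hash),
    )


def get_leaf(conn: sqlite3.Connection, leaf_value: bytes):
    # Reads now need a join, which is the "slightly more expensive" query cost.
    return conn.execute(
        "SELECT l.leaf_value, e.extra_data FROM leaf_data AS l "
        "JOIN extra_data AS e ON e.extra_data_hash = l.extra_data_hash "
        "WHERE l.leaf_identity_hash = ?",
        (hashlib.sha256(leaf_value).digest(),),
    ).fetchone()


conn = open_store()
chain = b"intermediate-cert||root-cert"  # placeholder bytes, not a real chain
for i in range(3):
    add_leaf(conn, b"leaf-%d" % i, chain)

leaf_rows = conn.execute("SELECT COUNT(*) FROM leaf_data").fetchone()[0]
extra_rows = conn.execute("SELECT COUNT(*) FROM extra_data").fetchone()[0]
```

Here three leaves end up sharing a single `extra_data` row. Keying on the content hash keeps writes idempotent, at the price of one extra insert attempt per leaf and a join on every read.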