Skip to content
This repository has been archived by the owner on Mar 27, 2024. It is now read-only.

feat: Enforce EDV document ID format in EDVFormatProvider #2263

Merged

Conversation

DRK3
Copy link
Contributor

@DRK3 DRK3 commented Oct 21, 2020

Signed-off-by: Derek Trider Derek.Trider@securekey.com

also closes #2228

@DRK3
Copy link
Contributor Author

DRK3 commented Oct 21, 2020

An explanation of why this is needed:

While working on the EDV REST provider, I realized an issue with the approach I was taking. The spec says that the ID in a structured document and encrypted document MUST be a Base58-encoded 128-bit random value. This PR provides a solution to that problem while still allowing the client to use a "regular" non-Base58-encoded 128-bit random value.

Here's how it works:

  • The EDVFormatProvider is initialized with a key to be used for HMAC operations.
  • When storing documents, an EDV-compatible ID is randomly generated. This is the ID that is used for the underlying provider. The "user-provided" key (the one that's passed into the EDVFormatProvider) is hashed using an HMAC function. The hashed value is set as an EDV encrypted index
  • When retrieving documents, the "user-provided" key is hashed. This hashed value is then used to query the underlying provider for the original document.

This generic indexing+querying approach should work with the upcoming EDV REST provider and any other providers that support querying.

@sandrask @troyronda And anyone else who reviews this: let me know what you think of this approach.

@@ -38,10 +43,24 @@ var (

type marshalFunc func(interface{}) ([]byte, error)

type MACComputer interface {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realize this name is a little bit funny. Let me know if you have any suggestions.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MACDigester is better as we are usually talking about a message digest when referring to Mac.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

@DRK3 DRK3 force-pushed the EnforceEDVDocumentIDFormat branch from 4a5381e to 57eeefe Compare October 21, 2020 22:11
llorllale
llorllale previously approved these changes Oct 22, 2020
Copy link
Contributor

@llorllale llorllale left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved. Messy workaround necessary - request to remove this document ID constraint from the spec was filed a while ago: https://github.com/decentralized-identity/secure-data-store/issues/112

@DRK3 DRK3 requested a review from llorllale October 22, 2020 20:54
@llorllale llorllale dismissed their stale review October 22, 2020 20:54

failing CI

@@ -734,7 +734,7 @@ func (m *mockStore) Delete(k string) error {
return m.delete(k)
}

func (m *mockStore) Query(query string) (storage.StoreIterator, error) {
func (m *mockStore) Query(indexKey string, indexValue string) (storage.StoreIterator, error) {
return m.query(query)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

query argument is not refactored following the signature change above

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

@DRK3 DRK3 force-pushed the EnforceEDVDocumentIDFormat branch from 57eeefe to e19bd2e Compare October 24, 2020 02:27
@codecov
Copy link

codecov bot commented Oct 24, 2020

Codecov Report

Merging #2263 into master will decrease coverage by 0.02%.
The diff coverage is 86.48%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #2263      +/-   ##
==========================================
- Coverage   89.90%   89.87%   -0.03%     
==========================================
  Files         226      226              
  Lines       15103    15158      +55     
==========================================
+ Hits        13579    13624      +45     
- Misses        903      908       +5     
- Partials      621      626       +5     
Impacted Files Coverage Δ
pkg/storage/wrapper/prefix/prefix_wrapper.go 94.44% <ø> (ø)
pkg/storage/edv/formatprovider/formatprovider.go 90.09% <84.84%> (-9.91%) ⬇️
...storage/edv/documentprocessor/documentprocessor.go 86.66% <100.00%> (+0.95%) ⬆️
pkg/storage/mem/mem_store.go 100.00% <100.00%> (+1.08%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f8f6921...b51ab9a. Read the comment docs.

@DRK3 DRK3 force-pushed the EnforceEDVDocumentIDFormat branch from e19bd2e to 4d0b3d4 Compare October 24, 2020 02:33
// DocumentProcessor represents a type that can encrypt and decrypt between
// Structured Documents and Encrypted Documents.
type DocumentProcessor interface {
Encrypt(*edv.StructuredDocument) (*edv.EncryptedDocument, error)
Encrypt(*edv.StructuredDocument, []edv.IndexedAttributeCollection) (*edv.EncryptedDocument, error)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just noticed these issues:

  • DocumentProcessor is inside the edv/formatprovider package instead of being a first-class storage citizen (like we discussed previously). To recall: the "document processor" encrypts all artifacts into a standard form and would be reused with other storage implementations
  • It looks like the documentprocessor does not encrypt the indexed attribute collection. Does it really need to be an argument? It's not even returned from Decrypt(), so there is an asymmetry here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • The DocumentProcessor is used by the EDVFormatProvider, which allows the EDV format to be reused with any other storage implementation. There's no need for a "standard form".
  • The indexed attributes are hashed before being passed in to the DocumentProcessor. This is what allows the documents to be queried. This allows us to still get documents based on a key per the usual provider method signature, so from the outside, an EDVFormatProvider (with whatever underlying provider you want) still works the same as any other provider. StructuredDocuments don't have encrypted indices, so they aren't returned.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now, I could create some sort of even more general DocumentProcessor class that just assumes JWE as what's being used for encryption. Or I could make it even more general and have it deal with just []byte. What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just realized one potentially confusing thing: the EDV spec calls them "Encrypted Indices", although they're actually HMACed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that the EDVFormatProvider will still require the more specific DocumentProcessor that deals with Structured and Encrypted Documents per the spec

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@DRK3

Now, I could create some sort of even more general DocumentProcessor class that just assumes JWE as what's being used for encryption. Or I could make it even more general and have it deal with just []byte. What do you think?

Last time we agreed we'd use the EDV format across all stores. The format would be pluggable, thus we'd have an "edv format" and potentially other formats. All would adhere to some common interface. This interface would a first-class citizen under storage or somewhere not far from it.

I guess we can regroup on monday.

Copy link
Contributor

@llorllale llorllale Oct 24, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Output of monday's session will be a GitHub issue with the high-level tasks and design of pluggable encrypted formats that can be reused across all store implementations

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wrote down general direction here: #2199 (comment)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Created an issue and added a ToDo above the FormatProvider: #2273

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My next PR will be to generalize this FormatProvider. This will involve moving it out of the edv folder and moving logic around to different places

@DRK3 DRK3 force-pushed the EnforceEDVDocumentIDFormat branch 2 times, most recently from 4a1aad5 to f69d2ff Compare October 24, 2020 17:40
Copy link
Contributor

@llorllale llorllale left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's first discuss this PR's direction as indicated here

@DRK3 DRK3 force-pushed the EnforceEDVDocumentIDFormat branch 4 times, most recently from 3cffc81 to 2cd41c3 Compare October 26, 2020 16:37
macCrypto *MACCrypto
indexKeyMACBase64Encoded string
marshal marshalFunc
randomBytesFunc generateRandomBytesFunc
}

func (s formatStore) Put(k string, v []byte) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why isn't formatStore a pointer receiver (i.e., *formatStore)?

when in doubt, use a pointer receiver.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's a mistake. I had originally fixed it in this PR (you'll see it if you go back a few force-commits), but since I'll be moving this entire file per #2263 (comment), I figured that it would make more sense to keep this PR small and fix it in the next one. I'll make a note in that issue to fix it so that I don't miss it.

Signed-off-by: Derek Trider <Derek.Trider@securekey.com>
@DRK3 DRK3 force-pushed the EnforceEDVDocumentIDFormat branch from 2cd41c3 to b51ab9a Compare October 26, 2020 17:04
@llorllale llorllale merged commit c70b759 into hyperledger-archives:master Oct 26, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Development

Successfully merging this pull request may close these issues.

Query method for in-memory store
5 participants