Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(7/8) Develop APIs for ProtectedStorageEntry to enable testing and code refactoring #3583

Merged
merged 30 commits into from
Nov 18, 2019

Conversation

julianknutsen
Copy link
Contributor

@julianknutsen julianknutsen commented Nov 9, 2019

New commits start 54b4b4d

I expect a bit of discussion on this one based on my previous feedback, so I wanted to outline my thoughts here so that the best path forward can be taken with regards to testing, maintainability, and clean programming practices.

This entire stack was based on testing the P2PDataStore and refactoring it so it is more maintainable and easier to add new features in the future.

The module was almost impossible to test when starting out. I decided to write very heavy-handed integration tests while looking at the implementation to verify I covered all the branches and edge cases. The end result #3554 allowed refactoring to be done while ensuring the various combinations of ProtectedStorageEntrys and Payloads still worked correctly.

But in order to do that, the tests needed to create REAL ProtectedStorageEntry & ProtectedMailboxStorageEntry objects from scratch and manipulate them into the various valid and invalid configurations to exercise the code. Real keys, real hashes, a lot of overhead for testing that should be pretty straight forward.

The issue is that too much of the code exists in private helper functions inside the P2PDataStore itself. The only testing that can be done is front door integration testing where the tests know way too much about the implementation.

My solution to the complexity that I present in the next 2 pull requests is to turn ProtectedStorageEntry into a real object with a real API that can be unit tested. Determining whether or not an Entry is valid for an add operation shouldn't depend on the P2PDataStore internal state, or have to be run through the Client API to verify correctness.

If you are at all interested, I urge you to look at the unit tests in the following pull request (ProtectedStorageEntryTest & ProtectedMailboxStorageEntryTest) and see how easy it is to verify the correctness of an add or remove operation versus the path the integration test had to take in the beginning.

In addition to adding the ability to unit test the Entrys, real unit tests can now exist for the P2PDataStore. Instead of constructing a real object, they now just build a mock Entry (something that wasn't possible with the old object relationship). The result is that the unit tests for P2PDatastore can test only the code they need to (what data structures to update, who to signal, etc) and not worry about the implementation specifics of Mailbox vs. Normal Entrys.

The end result is code that has unit tests, has integration tests, is easy to navigate, and has line and branch coverage over 80%. More important than the numbers is that a developer can go in and change the code and get feedback on whether or not it worked. Something that wasn't possible before.

In summary, I would request that anyone who is interested look through the testing and weigh in on the maintainability and testing strategy of this refactor. The reality is that this code wasn't tested, has plenty of bugs, and was hard to maintain and develop against. It is only one of many that likely have the same problem and starting to have a discussion on the practices that everyone wants in Bisq should happen sooner than later.

Pull Request (7/8)

This pull request will introduce the API, test it, and use the new benefits to combine the remove() and removeMailboxData() paths.

Pull Request(8/8)

This pull request will make use of the new objects to create clean mocks that clean up the integration tests and remove a lot of the heavy-handed object creation.

If you got this far, thanks again for taking the time to look.

Fix a bug where remove() was called in the addMailboxData()
failure path.

1. Sender's can't remove mailbox entries. Only
   the receiver can remove it so even if the previous add() failed and
   left partial state, the remove() can never succeed.

2. Even if the sender could remove, this path used remove() instead
   of removeMailboxData() so it wouldn't have succeed anyway.

This patch cleans up the failure path as well as adds a precondition
for the remove() function to ensure future callers don't use them for
ProtectedMailboxStorageEntrys.
Add comments and refactor the body in order to make it easier to
understand.
Refactor addProtectedStorageEntry for more readability and add comments
to help future readers.
Refactor for readability and add comments for future readers.
Refactor for readability and add comments to help future readers.
Refactor for readability and add comments for future readers.
Removed duplicate log messages that are handled inside the various helper methods
and print more verbose state useful for debugging.

Updated potentially misleading comments around hashing collisions
…icates

Now returns false on duplicate sequence numbers. This matches more of
the expected behavior for an add() function when the element previously exists.

The only callers are either P2PService users that always increment the
sequence number or the onMessage() handler which doesn't verify the return
so there will be no visible change other than the increased readability
of the code and deduplication of the code paths.
Now returns false if the sequence number of the refresh matches
the last operation seen for the specified hash. This is a more expected
return value when no state change occurs.

The only callers are either P2PService users that always increment the
sequence number or the onMessage() handler which doesn't verify the return
so there will be no visible change other than the increased readability
of the code and deduplication of the code paths.
…duplicate sequence #s

Remove operations are now only processed if the sequence number
is greater than the last operation seen for a specific payload.

The only creator of new remove entrys is the P2PService layer that always increments
the sequence number. So, this is either left around from a time where
removes needed to work with non-incrementing sequence numbers or just
a longstanding bug.

With the completion of this patch, all operations now require increasing
sequence numbers so it should be easier to reason about the behavior in
future debugging.
Use the DI Clock object already available in P2PDataStore, instead
of calling System.currentTimeMillis() directly. These two functions
have the same behavior and switching over allows finer control
of time in the tests.
Deduplicate some code in the ProtectedStorageEntry constructors in
preparation for passing in a Clock parameter.
Switch from System.currentTimeMills() to
Clock.millis() so dependency injection can
be used for tests that need finer control of time.

This involves attaching a Clock to the resolver
so all fromProto methods have one available when they
reconstruct a message. This uses the Injector for the APP
and a default Clock.systemDefaultZone is used in the manual
instantiations.

Work was already done in bisq-network#3037 to make this possible.

All tests still use the default system clock for now.
Reduces non-deterministic failures of the refreshTTL tests that resulted
from the uncontrollable System.currentTimeMillis().

Now, all tests have extremely fine control over the elapsed time between
calls which makes the current and future tests much better.
Add tests for removing expired entries and optionally purging
the sequence number map. Now possible since these tests have
control over time with the ClockFake.

The remove validation needed to be improved since deletes through
the expire path don't signal HashMap listeners or write sequence numbers.
The original test would take over 5 seconds. Allow tests to set the number
of required entries before purge to a lower value so the tests
can run faster with the same confidence.
The custom code to verify the refreshTTLMessage's signature and update
an entry isn't necessary. Just have the code construct an updated
ProtectedStorageEntry from the existing and new data, verify it,
and add it to the map.

This also allows the removal of the ProtectedStorageEntry APIs
that modify internal state.
The code around validating MailboxStoragePayloads is subtle when
a MailboxStoragePayload is wrapped in a ProtectedStorageEntry. Add tests
to document the current behavior.
Method bodies are copied from P2PDataStore to separate refactoring
efforts and behavior changes.

Identified a bug where a ProtectedMailboxStorageEntry mailbox entry
could be added, but never removed.
Extract code from P2PDataStore, test it, and use new methods.
Now that the objects can answer questions about valid conditions
for add/remove, ask them directly.

This also pushes the logging down into the ProtectedStorageEntry and
ProtectedMailboxStorageEntry and cleans up the message.
Add toString() for ProtectedStorageEntry so log messages have useful
information and clean up the formatting.
Move the signature checks into the objects to clean up the calling code
and make it more testable.

The testing now has to take real hashes so some work was done in the fixtures
to create valid hashable objects.
Move the signature checks into the objects to clean up the calling code
and make it more testable.
This mailbox-only check can now exist inside the object for which it
belongs. This makes it easier to test and moves closer to allowing
the deduplication of the remove() methods.
The current check verifies that the stored Payload.ownerPubKey == stored Entry.ownerPubKey.

This is the same check that was done when the item was originally added
and there is no reason to do it again.
Let the objects compare their metadata instead of doing it for them. This
allows for actual unit testing and paves the way for deduplicating the
remove code paths.

This patch also removes an unnecessary check around comparing the hash
of the stored data to the new data's hash. That check can't fail since
the hash was a requirement for the map lookup in the first place.
Make the remove validation more robust by asserting that the
correct remove message is broadcast. This will provide a better
safety net when combining the remove functions.
Now that all the code is abstracted and tested, the remove()
and removeMailboxData() functions are identical. Combine them and update
callers appropriately.

Now, any caller don't need to know the difference and it removes the
sharp edge originally found in bisq-network#3556
Use a more compact version of string formatting
in log messages

Rename isMetadataEquals to matchesRelevantPubKey
which is more descriptive of the actual check
Copy link
Contributor

@freimair freimair left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you are correct, this will trigger some discussion. On the one hand, I really like how this clears up the code base, makes everything testable and more readable as well. On the other hand, you rely on an object to report on its own validity. From my side, I like the testing motivation so that is OK with me. We can always revisit this topic in the future.

please note the inline comment. Awaiting your response there.

@julianknutsen
Copy link
Contributor Author

julianknutsen commented Nov 15, 2019

Another way I have thought about this refactor and why I think it is a good path moving forward is that all of these instance methods are equivalent to just testable static methods that operate on final variables. For example, you could just have a bunch of helper functions that look like this:

public static boolean isValidForAddOperation(ProtectedStorageEntry entry) {
  // check signature
  Sig.verify(entry.getPubKey(), hash(entry.getProtectedPayload()))

  // check key matches... etc...
}

BUT, by using them as instance methods you get the benefit of being able to mock them easily for testing which I think outweighs the anxiety around hiding these checks behind an interface. It may make sense in the future to guarantee that these types of functions only operate on final variables so they are equivalent to static methods with test benefits.

Copy link
Contributor

@freimair freimair left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

utAck

Copy link
Contributor

@ripcurlx ripcurlx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

utACK

@ripcurlx ripcurlx merged commit f8372f4 into bisq-network:master Nov 18, 2019
@julianknutsen julianknutsen deleted the entry-oo branch November 18, 2019 18:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants