Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Response Assembler Refactor #138

Merged
merged 15 commits into from
Dec 23, 2020
Merged

Conversation

acruikshank
Copy link
Contributor

@acruikshank acruikshank commented Dec 21, 2020

Goal

A common complaint in this code base is tracking the flow from one module to another. This is a proposal to simplify that flow, at least from the standpoint of concurrency issues.

Implementation

Each response in go-graphsync touches several independent threads of execution, which creates additional confusion.
In the first thread, a selector traversal is executed. (see responsemanager/queryexecutor) as blocks load, we queue them up into blocks of response data to send over the wire with the PeerResponseSender (see responsemanager/peerresponsemanager/peerresponsesender.go)

In a second thread in PeerResponseSender, we take the blocks of response data and serialize them down to components that go into an actual graphsync wire message, and queue the message data to go over the wire with the MessageQueue
In a third thread in MessageQueue, we read the next message, serialize to protobuf and send it.

The second thread is the source of much difficulty, especially since it's not entirely independent of the third thread. Currently, the PeerResponseSender takes care of chunking up response data so we don't ever go over a certain size for the resulting message, but then has to stay in sync with the MessageQueue so that each block of response data a separate message goes over the wire.

Now that we've added the notification system, this adds a whole other rats nest of complexity, because we have to subscribe to notifications from the MessageQueue, and then republish them in the PeerResponseSender.

The goal of this refactor is simple: get rid of the second thread.

I started by attempting to do exactly that, and convert the AddResponse method of MessageQueue to operate like buildResponse method of PeerResponseSender. I attempted to move the chunking of responses into the MessageQueue while leaving the AddRequest method unchanged.

However, then I found that created a bunch of complexity and potential problems, which led to a different intuition: what if we made messages immutable, and forced them to be build entirely with a builder class when you wanted to mutate them (side note: this will be of much use should we at some point move to serializing these messages with github.com/ipld/go-ipld-prime). So I converted the ResponseBuilder to a general purpose message Builder, and then removed the AddRequest method from message queue so now we only have a single BuildMessage method.

I renamed the PeerResponseManager/PeerResponseSender to the ResponseAssembler as that's now all it does -- provide a nice DSL for assembler responses and it keeps track of block duplicates as it always has.

Few other extras:

  • graphSyncOptions struct in impl/graphsync.go to simplify the construction of the graphsync instance -- it was getting complicated when we had to construct graphsync itself before setting any options.
  • Refactored the peerProcess manager a bit to account for the fact that now the ResponseAssembler not longer has a thread. I may remove that abstraction entirely.

To Do:

  • renaming pass (check naming - possibly improve, clean up old shortened variable names, correct var names referenced in other modules)
  • code commenting pass (make sure public functions documented, add code comments in complicated logic -- i.e. message queue
    Failing tests:
  • github.com/ipfs/go-graphsync/messagequeue HARD
  • github.com/ipfs/go-graphsync/network EASY
  • github.com/ipfs/go-graphsync/peermanager EASY
  • github.com/ipfs/go-graphsync/requestmanager MEDIUM
  • github.com/ipfs/go-graphsync/responsemanager MEDIUM
  • github.com/ipfs/go-graphsync/responsemanager/responseassembler HARD

@hannahhoward hannahhoward force-pushed the chore/response_assembler_refactor branch from c789ca7 to e53444e Compare December 22, 2020 04:25
cleanup response manager test and continue cleaning up names
remove references to old peerresponsemanager, peerresponsesender
move large blocks test to message queue where it belongs, make sure we properly handle queued
messages
@hannahhoward hannahhoward changed the title fix broken tests from response sender refactor Response Assembler Refactor Dec 22, 2020
@hannahhoward hannahhoward marked this pull request as ready for review December 22, 2020 17:17
Copy link
Contributor Author

@acruikshank acruikshank left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, a nit and a naming suggestion.

type PeerResponseTransactionBuilder interface {
// TransactionBuilder is a limited interface for assembling responses inside a transaction, so that they are included
// in the same message on the protocol
type TransactionBuilder interface {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like ResponseBuilder or MessageBuilder more for this. I can see how this 'adds' to a transaction, but the name suggests it 'builds' a transaction in the sense that a message builder builds a message, but it doesn't really. I can also see TransactionContext, TransactionMutator or TransactionAPI as this represents the limited set of message building functionality available within a transaction.

}

// Transaction Build A Response
func (prm *ResponseAssembler) Transaction(p peer.ID, requestID graphsync.RequestID, transaction Transaction) error {
// Transaction build a response, and queues it for sending in the next outgoing message
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Transaction build a response, and queues it for sending in the next outgoing message
// Transaction builds a response, and queues it for sending in the next outgoing message

Co-authored-by: Alex Cruikshank <169613+acruikshank@users.noreply.github.com>
@hannahhoward hannahhoward force-pushed the chore/response_assembler_refactor branch from c0e2a73 to 2407411 Compare December 23, 2020 21:23
rename transaction builder to response builder
@hannahhoward hannahhoward force-pushed the chore/response_assembler_refactor branch from 2407411 to ab4f23f Compare December 23, 2020 21:30
@hannahhoward hannahhoward merged commit 319ab7e into master Dec 23, 2020
@aschmahmann aschmahmann mentioned this pull request Feb 18, 2021
73 tasks
@mvdan mvdan deleted the chore/response_assembler_refactor branch December 15, 2021 14:16
marten-seemann pushed a commit that referenced this pull request Mar 2, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants