Add Makefile for automating localnet setup #3718

Merged
merged 14 commits into from
Dec 6, 2019

cbeams
Contributor

@cbeams cbeams commented Nov 30, 2019

Problem: contributors old and new must read and follow many manual steps
spread across three documents (docs/{build,dev-setup,dao-setup}.md) in
order to get up and running with a local regtest Bisq network deployment
suitable for isolated development and end-to-end testing. This process
is not only manual, but requires considerable trial and error for most
contributors, and can amount to hours of effort. Perhaps most
detrimental is that this friction makes it much less likely that we get
"all hands on deck" to cover test scenarios at release time. Getting up
and running with what this change refers to as a "localnet" should be
among the very first things a new contributor does. It should be fast
and easy, maximizing the contributor's ability to get productive right
away.

Solution: this commit introduces a simple and well-documented makefile
to the root of the source tree. It instructs the user to issue a series
of simple make commands, at the end of which they'll have a fully
functional localnet deployment.

Caveats:

  • No support for Windows unless the user is running Git Bash, Cygwin or
    similar. In any case, the makefile serves as clear documentation
    about what a Windows user would need to do manually, i.e. without the
    benefit of make automating it all.

  • The aforementioned setup documents should be updated to point to this
    makefile instead of explaining everything in prose. The dev-setup.md
    and dao-setup.md documents may actually be candidates for deletion if
    this new approach proves successful.

  • These changes do not include passing the new -peerbloomfilters=1
    option to bitcoin versions 0.19 and above. Those who have already
    upgraded should take care to add that option.

Notes:

  • The introduction of this makefile has no impact on Bisq's use of
    Gradle as a build system. Everything there is as it has been. This
    makefile is a completely optional convenience being added into the
    mix. It has the added benefit of being a "friendly face" to those not
    familiar with the Java / JVM ecosystem. Developers from many
    different backgrounds are familiar with make and makefiles, and they
    may find this one a pleasant and inviting surprise.

Special thanks to @bodymindarts for the inspiration to take this makefile-based approach. The makefile in his bisq-workspace repo plus the pain I was experiencing trying to help out with v1.2.4 testing was what got this ball rolling.

For those interested in putting this to use in your current testing efforts, this PR is branched from the last common commit between the master and release/v1.2.4 branches, so you can merge it into your own local release/v1.2.4 branch or just cherry-pick the single commit. Both should work cleanly.

@cbeams
Contributor Author

cbeams commented Nov 30, 2019

I just realized that the commit comment and documentation in the makefile don't necessarily make clear how concise the process can be, especially for those comfortable using screen. These are the steps in a nutshell:

$ screen
$ make
$ make deploy

UPDATE: As of the latest changes, this is now as simple as:

$ make deploy

@cbeams
Contributor Author

cbeams commented Nov 30, 2019

An additional note/caveat not mentioned in the original message. This implementation works with the dao-setup.zip file, but most if not all are agreed we'd like to see the full automation of a from-scratch genesis tx. @KanoczTomas has been working on this approach, as can be seen in https://github.com/bisq-network/bisq/tree/master/docs/autosetup-regtest-dao. If feasible, it would be desirable to eliminate the pre-fab zip altogether and replace it with equally automated steps for setting everything up from scratch. This may require Bisq's gRPC API to be in place for certain aspects; fortunately that work is underway as we speak!

Contributor

@julianknutsen julianknutsen left a comment

I'm actually of the opinion that it is OK if devs don't understand all of the internals of the dao setup and genesis transactions on their first day when they are just trying to get the software up and running to mess around with the features.

This is a great first step and the docs should be updated to recommend this as opposed to the list of manual commands. Scripts like these are easy to bikeshed about, but erring on the side of getting something committed that people can use to save time seems prudent. It is easy to add features later that can be driven by more dev use cases.

The next iteration should probably do everything from scratch so the zip file can be deprecated. There are already issues with the zip file having different default accounts due to the pre-existing data. It is just error-prone to maintain default persistent data.

Going forward it seems like doing everything from scratch and utilizing the gRPC system to automate the "default" pieces that are time-consuming makes a lot of sense. It would be great to see things like default accounts added by a set of gRPC commands when that part of the API becomes available. Internal use cases are good drivers of features because the acceptance criteria is well-defined and the users are developers who can give feedback faster than typical Bisq users.

I've never seen a Makefile used in this type of manner, but as long as the docs help people unfamiliar with make know which commands to run it seems fine.

Makefile Outdated
screen -t seednode2 make seednode2
screen -t alice make alice
screen -t bob make bob
screen -t mediator make mediator
Contributor

This is close to what I have locally just in a .rc file. Does bisq not handle bitcoind not having the rpc server up first? I've never run into a problem, but I always start bitcoind first so maybe I just got lucky?

Contributor

Yes bitcoind must run first. If the seed runs as full DAO node bitcoind must run before the seed node.

Yeah that is why in the original bisq-workspace I start bitcoind with a script that doesn't complete until bitcoind is responsive.
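For readers who have not seen that script, the idea of "don't complete until bitcoind is responsive" can be sketched roughly as below. This is only an illustration: `wait_until_ready` is a hypothetical helper name, it is not the script from bisq-workspace, and the probe command in the usage comment assumes `bitcoin-cli` is on the PATH and can reach the regtest node.

```shell
#!/bin/sh
# Sketch only: wait_until_ready is a hypothetical helper, not part of the
# Bisq makefile or of bisq-workspace.
# Usage: wait_until_ready MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
wait_until_ready() {
    max_attempts=$1   # how many times to probe before giving up
    delay=$2          # seconds to sleep after a failed probe
    shift 2           # the remaining arguments form the probe command
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        if "$@" >/dev/null 2>&1; then
            return 0  # probe succeeded, so the service is responsive
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
    return 1          # the service never answered within the budget
}

# Example (assumes bitcoin-cli is installed and configured for regtest):
#   wait_until_ready 30 1 bitcoin-cli -regtest getblockchaininfo && make seednode
```

A helper like this keeps the ordering rule (bitcoind first, then the seednode) explicit without relying on a fixed sleep.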

Contributor Author

Aside from @chimp1984's point that bitcoind should run before the seednode in any case, the actual reason I introduced the sleep 2 and make block entries here is because if a new block (block 112) doesn't get created prior to starting the desktop nodes, they spin forever on DAO synchronization (which is annoying to look at but also eats up CPU, spins up fans and makes everything feel heavyweight). I believe this is because the desktop nodes need to actually receive a block notification, but it might also be that they need to have at least one confirmation on the genesis tx in block 111, maybe both. In any case, without block 112, there is a strange status message in DAO->BSQ wallet->transactions that reads "Awaiting synchronization... Validated 111 of 0 blocks". So something is off there. To see this behavior for yourself, just comment out the sleep and make block lines and run through the makefile instructions as usual.

@bodymindarts bodymindarts Nov 30, 2019

That makes a lot of sense. I think it should however be the responsibility of the bitcoind command to spin it up in a way that is consistent for the other commands to consume.

When adding an alternative deploy-tmux command for example that logic has to be duplicated.

Is it still a problem if block 112 isn't present when the desktop client is spun up but appears very soon thereafter?

Contributor Author

Is it still a problem if block 112 isn't present when the desktop client is spun up but appears very soon thereafter?

It is a problem in the sense that if 112 doesn't show up quickly, the contributor using the makefile is going to hear their fans spin up as the dao synchronization wait loop eats up CPU (we should obviously try to solve that problem at the root, but in the meantime...).

So if we're talking about the difference between generating block 112 just after deploying bitcoind vs. doing it just after deploying all the nodes, i.e. a difference measured in (sub)seconds, then it's not a problem in practice. Doing something like @KanoczTomas's start_bitcoind wrapper (which I believe you used elsewhere, @bodymindarts) might be a solution, but I've avoided any such shell scripts and/or variables thus far with the intention of making everything that's being done absolutely clear to the reader because it's all in one place (the makefile), free of any indirections or abstractions they need to dig into.

Yes I understand the intention with having it all self-contained in the makefile and like the approach.
It seems to be a trade-off between ease-of-use (via a wrapper script) and 'time searching for the exact commands being executed'. One is more focused on the developer who wants a streamlined workflow, the other on a first-timer just seeing how things work.
In general I like being explicit (i.e. no wrapper or indirection) and try to achieve that whenever possible, but in my experience (and I use a Makefile in every single project I do) when there are more than a few commands being executed or there are other complications having 1 level of indirection, like:

something:
    scripts/do_something.sh

is an acceptable trade off, especially if it improves the dev-workflow (and active devs are probably the primary user of this interface). For first timers having to jump into the scripts folder for more information should be fine.

Another benefit is having the individual make commands closer together in the Makefile, which means not having to scroll so far when trying to assess what all the main commands are that are used to interface with the project as a dev. This also benefits first-time contributors.

Contributor Author

+1, thanks. Where stuff gets unwieldy and actually hurts comprehension, we should shell out to something under scripts/. The current screen voodoo is probably a good example of this. I'd like to try to keep the starting of bitcoind and Bisq nodes script-free, as I think there's value in seeing them all together, explicitly parameterized in one place as we have it now. Let's just do whatever makes sense over time though.

Makefile Outdated
--userDataDir=localnet \
--appName=seednode

seednode2: build
Contributor

Is the second seednode just for more testing of p2p forwarding? My setup has always been fine with just one so curious as to the extra benefit and if it is worth the resources.

Contributor

Only one seed node is ok.

One seednode is okay but it leads to annoying error output in all the logs unless you remove one of these lines

Contributor Author

@cbeams cbeams Nov 30, 2019

One seednode is okay but it leads to annoying error output in all the logs

This is the main reason I spin it up. Gets everything closer to a no-broken-windows situation where the operator can assume there shouldn't be any (or many) errors / stack traces showing up in their Bitcoin and Bisq node logs.

The correct solution would perhaps be to issue a warning in the 2002 seednode log that the other well-known (3002) seednode cannot be found, and to stop trying after a reasonable number of attempts instead of perpetually issuing error messages.

In the meantime, the seednode2 target could be documented as optional and an explanatory note could be added about running it to avoid the errors mentioned above.

The other reason I wanted a second seed node was to test what happens when one seednode is a dao fullnode and the other isn't. Seems to work fine, but the current configuration in the makefile spins them both up as fullnodes anyway.

Note that this is the same reason why alice is a dao fullnode but bob and mediator aren't: just to make sure we have this heterogeneous setup in the out-of-the-box localnet config. It better reflects the actual state of the production network and might help catch issues.


# Generate a new block on your Bitcoin regtest network. Requires that
# bitcoind is already running. See the `bitcoind` target above.
block:
Contributor

I think this could use another sentence or so in the preamble. You end up needing to create blocks to test features like governance so helping new users fix common errors like "I created a proposal from Alice, but it isn't visible on Bob. Why not?" may help the onboarding.
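To make the "you need a new block before Bob sees it" point concrete, here is a rough sketch of what generating blocks on regtest involves. It is not the recipe of the `block` target. `mine_blocks` and the `BITCOIN_CLI` override are hypothetical names introduced for illustration, and a real localnet will likely need extra RPC options (credentials, port) that are left out here.

```shell
#!/bin/sh
# Sketch only: BITCOIN_CLI is an assumed override hook so the helper can be
# dry-run with a stub; by default it calls the real bitcoin-cli.
BITCOIN_CLI=${BITCOIN_CLI:-bitcoin-cli}

# mine_blocks is a hypothetical helper, not a target in the actual makefile.
# Usage: mine_blocks [COUNT]   (COUNT defaults to 1)
mine_blocks() {
    count=${1:-1}
    # Ask the regtest wallet for an address to receive the block reward.
    addr=$("$BITCOIN_CLI" -regtest getnewaddress) || return 1
    # Mine COUNT blocks paying to that address.
    "$BITCOIN_CLI" -regtest generatetoaddress "$count" "$addr"
}

# Example: after Alice submits a proposal, mine one block so that the
# other nodes can see the confirmed transaction:
#   mine_blocks 1
```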

Justin Carter added 3 commits November 30, 2019 07:50
Sometimes when running setup something goes wrong and the ./dao-state
dir is still hanging around, requiring manual cleanup and preventing the
user from simply re-running the command.
Contributor

@julianknutsen julianknutsen left a comment

I pulled this down and used it for a bit of testing. Here are a few UX comments.

Not everyone has bitcoind and bitcoin-cli in the path. Fixed this locally with symlinks, but since this doesn't cover installing bitcoind it is a sharp edge.

make clean && make build should rebuild. I think Justin had comments on this too?

Having a way to reset the node state, but keep the build state, seems like a nice-to-have. Rebuilding everything to reset a test from scratch isn't very efficient.

Ctrl+C inside screen tab closes it. There are quite a few test cases where you just need to take down a node temporarily or restart it. I use the pattern below in my previous localnet .rc file. It may not be optimal but gave me what I needed. The new way requires ctrl+c -> goto tab 0 -> type: screen -t alice make alice instead of ctrl+c -> up arrow -> enter

screen -t bisq-bob
select 4
stuff "/home/julian/bisq/bisq-desktop --userDataDir=/home/julian/dao/dao-setup/ --daoActivated=true --genesisBlockHeight=111 --genesisTxId=30af0050040befd8af25068cc697e418e09c2d8ebd8d411d2240591b9ec203cf --baseCurrencyNetwork=BTC_REGTEST --useDevPrivilegeKeys=true --useLocalhostForP2P=true --nodePort=8888 --appName=bisq-BTC_REGTEST_Bob_dao --fullDaoNode=true --rpcUser=bisq --rpcPassword=bisq --rpcPort=18443 --rpcBlockNotificationPort=5123"

This partially reverts commit e3a3fb5, removing the dependency from
the 'localnet' target to the 'clean-localnet' target. The reason for
this is that a number of higher level targets that deploy nodes, e.g.
the 'alice' and 'bob' targets depend on 'localnet' and, prior to this
reversion, therefore also depended on 'clean-localnet'. The effect was
that every time a node is deployed, the .localnet directory was removed
and re-created, destroying the state of any and all nodes that had been
deployed and modified thus far.

The change in the original commit that removes the temporary 'dao-setup'
directory in case of partial failures has been preserved.

This is a follow-up to #3.
This change follows up on commit 650c589, which:

  1. Renamed the 'localnet' directory to '.localnet' to better follow
  convention with how local data directories are often managed, e.g.
  .git and .gradle.

  2. Introduced the STATE_DIR variable to avoid duplication of the
  '.localnet' string throughout the Makefile, and at least in concept to
  allow this value to be customized via setting an environment variable.

The changes in (1) are preserved, while the changes in (2) have been
backed out. Rationale:

 - The STATE_DIR name introduces a new concept to the reader. They must
   reason about its meaning, and this works against the intention of the
   Makefile, which is to maximize understandability for the uninitiated.

 - The name, if we were to preserve the variable, probably should have
   been something like DATA_DIR_ROOT. 'STATE_DIR' is not conceptually
   incorrect, but industry convention is to refer to such directories as
   "data directories", e.g. Bitcoin Core's `datadir` option, LND's
   `datadir` option and Bisq's `userDataDir` and `appDataDir` options.

 - The variable, whatever its name, introduces a layer of indirection,
   which while convenient to the makefile maintainer, is a barrier to
   comprehension for the reader / contributor. For example, if a user
   wished to copy and paste the recipe for a target, say 'bob' from the
   makefile, with the variable in place, the user would have to figure
   out its correct value and replace it before they could paste and use
   the copied command. Like in the first note above, the idea with the
   makefile is to maximize understanding for the uninitiated, i.e.
   working code as executable documentation. It is reasonable given this
   goal to increase the burden on a few maintainers in order to ease the
   potentially many contributors.

Finally, this change follows up on the renaming of the 'localnet'
directory to '.localnet' by reflecting this change in the name of the
associated target as well. This is in order to avoid dependent targets e.g.
'bitcoind', 'alice' or 'bob' constantly re-running the localnet target.
In turn it also adds an 'alias' target named 'localnet' (without the
leading dot) because targets with a leading dot are (I believe) treated
as "implicit targets". In any case, they do not show up in a tab
completion context, so introducing the normally-named alias fixes that.

This is a follow-up to #3.
@cbeams
Contributor Author

cbeams commented Dec 2, 2019

make clean && make build should rebuild. I think Justin had comments on this too?

Yes. Changes are incoming that address this.

Having a way to reset the node state, but keep the build state, seems like a nice-to-have. Rebuilding everything to reset a test from scratch isn't very efficient.

Agreed. You may have sync'd an earlier rev of the makefile, but now you'll see that in addition to the global clean that blows away both node and build state, there are also clean-localnet and clean-build that do the same for each.

Ctrl+C inside screen tab closes it. There are quite a few test cases where you just need to take down a node temporarily or restart it.

Right, glad you caught this. I have zombie configured in my .screenrc such that when the process in a given window is killed, the window does not close, but is preserved in a "zombie" state such that ^[ kills it fully or pressing @ resurrects it, in this case by re-running the original command, e.g. make alice, make bitcoind or whatever. This arrangement has been productive enough for me to roll with, but I forgot that this zombie configuration is not a default in screen. Here it is if you're interested:

zombie "^["

More to the point, though, I'm working on improvements to the way screen is invoked by make that will naturally preserve the window when its process dies, allowing the user to get back to the natural ctrl+c -> up arrow -> enter workflow you mentioned and that we all likely want.

Problem: we use soft 4-space tabs throughout the Bisq codebase, and the
new makefile is a break to this rule due to make's default requirement
for hard tabs in recipes.

Solution: This commit updates our Editorconfig settings to reflect this
exception.

For vim users, it is also recommended that you add the following entry
to your .vimrc:

     au FileType make set tw=72 noet cc=72

It will ensure that you wrap (documentation) lines at 72 chars. It also
sets noexpandtab explicitly. Even though .editorconfig should already be
doing this for you when working in Bisq, this more general vim
configuration will ensure you use tabs correctly in any makefile. The
`cc=72` setting adds a visual right margin at 72 characters.

This commit also updates the existing makefile, wrapping lines of
documentation that had exceeded the 72-char margin.
@bodymindarts

bodymindarts commented Dec 2, 2019

I'd like to try and formulate an explicit set of use-cases and requirements.
This is just my intuition of how I would structure my dev-workflow:

  • Bring up a self-contained and consistent multi-node setup to test something
  • Simulate passing of time (via blocks) when testing workflows that require confirmed transactions
  • Be able to iterate on some feature or bug fix quickly
    This probably means, make a change to the source code, rebuild and then re-start just 1 of the nodes running the new source code. Usually when making changes I don't need all the nodes to be updated to try out if my change had the desired effect.
  • Reset the state back to some known situation for reproducing bugs (so some kind of snapshotting capability; after all, the initial localnet setup is essentially loading a snapshot, so perhaps this could be generalized).
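
As a rough illustration of the snapshotting idea in the last bullet, the state directory could be archived and restored with plain tar. This is only a sketch of the concept. `snapshot_state` and `restore_state` are hypothetical helpers, the makefile in this PR has no such targets, and all nodes would presumably need to be stopped before either step.

```shell
#!/bin/sh
# Sketch only: snapshot_state and restore_state are hypothetical helpers;
# the makefile under review does not provide snapshot targets.

# Usage: snapshot_state STATE_DIR ARCHIVE
snapshot_state() {
    state_dir=$1    # e.g. .localnet
    archive=$2      # e.g. /tmp/before-bug.tar.gz (keep it outside STATE_DIR)
    [ -d "$state_dir" ] || return 1
    # Archive the directory contents relative to the directory itself.
    tar -czf "$archive" -C "$state_dir" .
}

# Usage: restore_state STATE_DIR ARCHIVE
restore_state() {
    state_dir=$1
    archive=$2
    # Refuse to run with an empty directory name or a missing archive.
    [ -n "$state_dir" ] && [ -f "$archive" ] || return 1
    # Replace whatever is there with the archived contents.
    rm -rf "$state_dir"
    mkdir -p "$state_dir"
    tar -xzf "$archive" -C "$state_dir"
}

# Example:
#   snapshot_state .localnet /tmp/before-bug.tar.gz
#   ... reproduce the bug, then roll back ...
#   restore_state .localnet /tmp/before-bug.tar.gz
```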

I have not been able to determine if all of these are supported in a streamlined way and am just writing this as documentation (anyone else can add other workflows they think are important).

I like using tmux instead of screen and will add support for tmux once the screen workflows have stabilized.

Problem: Prior to this change, it was necessary to first create and
attach to a screen session and then to run `make deploy` within it. This
meant extra steps for the user and was generally error-prone.

Solution: Usage of screen has been refined such that a screen session
named 'localnet' is created on the user's behalf without any need to
attach to it. Individual node deployment targets such as `make
bitcoind`, `make alice`, et al. are issued to new windows within the
localnet screen session, and the user is free to attach or not whenever
they choose. The result is that a new user can clone the repository and
type nothing more than `make deploy` to get up and running with their
localnet.

This also reverts the changes in commit 97dd342 ("Make build target
phony") for the following reasons:

 - As mentioned in that commit message, Gradle was not deleting its
   'build' directory when running `gradle clean`, meaning that the
   'build' target was always up-to-date, even after running `make
   clean`. This made it impossible to get a correct rebuild workflow. On
   analysis, however, this situation was because of a badly behaving
   Kotlin plugin not cleaning up after itself, leaving a subdirectory at
   build/kotlin and preventing the build directory itself from being
   deleted altogether. To address this, the `make clean` target has been
   updated to `rm -rf build` instead of calling `gradle clean`. While
   it's a workaround until we back out the Kotlin changes that caused
   this, it does have the added benefit of being faster than invoking
   `gradle clean`.

 - By making the 'build' target PHONY, this meant that `./gradlew build`
   was getting invoked every time a dependent target was called. For
   example, `make alice` depends on the 'setup' target, which in turn
   depends on the 'build' target. When calling such targets in
   isolation, this arrangement works out fine, because the phony 'build'
   target always runs, invoking `./gradlew build`, and the Gradle build
   completes quickly assuming everything is up-to-date. The problem
   arises when calling a number of these targets in rapid succession, as
   we do when calling `make deploy` and running each individual node
   target in its own screen window. This causes contention in two ways.
   The first is that these multiple, simultaneous Gradle processes
   compete for access to an available Gradle daemon, and because each
   process needs its own, it ends up that as many Gradle daemons get
   created as Bisq nodes we need to deploy (5 in total). This is a big
   waste of time and resources. The second way it causes not only
   contention but outright failure is that each of these builds is
   operating in the same directory, and while most aspects of the build
   are in fact up-to-date and therefore not modified in any way, there
   are exceptions to this rule. The result is that build artifacts, e.g.
   jars are getting deleted and rebuilt from underneath competing Gradle
   processes, and all manner of chaos ensues, such as NoClassDefFound
   errors and much more. This change (reverting 'build' back to a
   normal, non-phony target) avoids these problems entirely. When
   running `make deploy`, we run the 'build' target once as a function
   of the 'deploy' target depending on it. At this point, the 'build'
   directory exists, and all subsequent node deployment targets, e.g.
   'alice', 'bob', etc do not re-run the build target because it is
   up-to-date. For workflows where the user definitely wants to rebuild
   prior to redeploying a given node, they can either run `make
   clean-build`, or drop down to issuing Gradle build commands directly,
   e.g. `./gradlew :desktop:build` followed by `make desktop`.

Problem: Bitcoin Core v0.19.0 changed the default value of its
'peerbloomfilters' option from 1 to 0, now disabling them by default.
Bisq requires bloom filters be enabled on the Bitcoin node(s) it
communicates with, so users who are running >= v0.19 would get errors
when attempting to run `make bitcoind` with that target's current
recipe.

Solution: This change explicitly sets the 'peerbloomfilters' option to
1, ensuring it is enabled in any case. Note that this option has existed
in Bitcoin Core since v0.12.0, so there is no real concern for this new
option breaking users that are still on 0.18.x or even much earlier.

In commit 5fb4b21 ("Refine deploy target..."), the 'build' target was
made normal, i.e. non-phony, but on further review it does in fact make
sense to declare 'build' phony, such that it is run no matter the status
of the root-level 'build' directory, but for different reasons.

Previously, we had been considering the presence of 'build' directory as
a reasonable proxy for determining whether the `./gradlew build` had
been run. If the directory was present, we considered the 'build' target
up-to-date. If not, then we would re-run `./gradlew build`. This is all
sensible enough, except for the fact that the root-level 'build'
directory has almost nothing to do with the actual output of `./gradlew
build`. Gradle does output 'build' directories, but in the respective
subdirectory for each module of the project. After `./gradlew build` has
been run, we would see a 'desktop/build' directory, a 'seednode/build'
directory and so forth. It just so happens that a root-level 'build'
directory was getting created at all due to idiosyncrasies of a
particular Kotlin plugin.

This commit updates the makefile to better respect this reality by:

 - preserving the 'build' target but marking it once again as PHONY

 - introducing new 'seednode/build' and 'desktop/build' targets that
   trigger `./gradlew :seednode:build` and `./gradlew :desktop:build`
   commands respectively.

 - making 'build' depend on these two new targets

In light of this realization of flawed thinking about the root-level
build dir, this change also restores `make clean` to calling `./gradlew
clean` instead of `rm -rf build`.
@cbeams
Contributor Author

cbeams commented Dec 2, 2019

Ok, I believe the latest commits here address most if not all of the feedback received so far. Please sync up, take it for another spin and let me know if you run into any problems. If everything works without error, I'd like to call this iteration "good enough", such that the PR can be merged and I can return to gRPC work. Further tweaks and improvements will no doubt be in order, but to the degree they're "nice-to-haves" let's try to manage them as subsequent PRs.

Here's what I consider the basic set of use cases that is supported right now, i.e. what should "just work" when you're using the makefile:

  1. The "get started from scratch" use case: `make deploy` in a fresh local bisq clone (or your freshly cleaned clone) should do everything from A to Z: it should build the desktop and seednode binaries, unpack and customize the dao-setup.zip file into the .localnet directory, and it should deploy bitcoind and all necessary Bisq nodes into a screen session named localnet.
  2. The "I want to get started from scratch without screen" use case: `make` followed by running each node deployment target, e.g. `make bitcoind`, `make seednode`, etc. in a separate window should get you up and running with everything.
  3. The "I want to start over with the state of my nodes but I don't want to rebuild the binaries" use case: `make clean-localnet deploy` will take care of your needs. Don't forget to kill all your existing node processes and screen session (if any) first, though.
  4. The "I want to iterate on development and redeploy just one desktop node while leaving the rest of my localnet up and running" use case: kill the desktop process in question, e.g. 'alice', make your changes to desktop sources and then run `./gradlew :desktop:build` followed by `make alice`. This will trigger the desktop build and deploy the alice node. (NOTE: edited as per Add Makefile for automating localnet setup #3718 (comment))
  5. ... there are probably other nameable use cases that are supported in the current makefile, but these are the basics. Feel free to add to this list, but again, anything net new should probably be managed separately from this PR.

Remember, this stuff doesn't replace the gradle build and doesn't intend to become a comprehensive abstraction over it. It should handle the basics for new contributors who have enough to figure out without grokking gradle. Once you're into an actual development task, you should feel comfortable using gradle more surgically for whatever you need it to do, just as is expected today.

Makefile Outdated
# create a new screen session named 'localnet'
screen -dmS localnet
# deploy each node in its own named screen window
targets=('bitcoind' 'seednode' 'seednode2' 'alice' 'bob' 'mediator'); \
Contributor

@julianknutsen julianknutsen Dec 2, 2019

The Makefile didn't specify the SHELL variable so mine defaulted to /bin/sh which doesn't support the array syntax here. Not sure if there is a shell agnostic way to do it, but adding SHELL=/bin/bash to the Makefile fixed it locally.

julian@dev:~/bisq$ make deploy
# create a new screen session named 'localnet'
screen -dmS localnet
# deploy each node in its own named screen window
targets=('bitcoind' 'seednode' 'seednode2' 'alice' 'bob' 'mediator'); \
for t in "${targets[@]}"; do \
	screen -S localnet -X screen -t $t; \
	screen -S localnet -p $t -X stuff "make $t\n"; \
done;
/bin/sh: 1: Syntax error: "(" unexpected
make: *** [Makefile:156: deploy] Error 2

The iterative single-node behavior also doesn't seem to work quite right. I checked out an older version of the code, killed bob, ran make bob and it just restarted without a build.

I also ran make desktop/build and no changes, but ./gradlew :desktop:build had something to do.

[screenshot: dependency_issue]

Even making alice & bob have a dependency on desktop/build instead of just setup didn't work.

Contributor Author

> The Makefile didn't specify the SHELL variable so mine defaulted to /bin/sh which doesn't support the array syntax here. Not sure if there is a shell agnostic way to do it, but adding SHELL=/bin/bash to the Makefile fixed it locally.

Good catch, thanks @julianknutsen. Commit 234c228 removes the offending bashism such that /bin/sh should run without error.

> The iterative single-node behavior also doesn't seem to work quite right. I checked out an older version of the code, killed bob, ran make bob and it just restarted without a build.

You're right, and I should have mentioned this in #3718 (comment). The rationale for why things need to be this way is laid out in 5fb4b21. Search for the first occurrence of the word 'contention' there. I'm updating the comment above now.

Contributor

@julianknutsen julianknutsen Dec 3, 2019

Thanks for the commit link and it makes sense w.r.t. contention. Verified the bashism issue is fixed as well and will do 1.2.4 testing today with your latest code and call out any other glaring issues. So expect an ACK by EOD from me if everything works out.


@bodymindarts bodymindarts Dec 3, 2019

@cbeams thanks for your instructive commit messages!
I understand your reasoning behind removing build from PHONY. I'm just wondering whether there is a better solution. Doing a clean-rebuild takes a very long time and is not practical for quick iteration cycles. Essentially the problem you have solved is a race condition. Couldn't we ensure somehow that the build runs just once before deploying the individual nodes? Perhaps running the individual node commands shouldn't depend on the setup/build commands. That way we could have both a one-time setup and fast iteration.
Is there a significant advantage of having all the individual node commands depend on setup/build that I'm missing?

Contributor Author

> Doing a clean-rebuild takes a very long time and is not practical for quick iteration cycles.

Here's what I'm doing in practice as I develop stuff. The following assumes I have already done a make deploy, and that now I'm iterating on something, using my alice desktop node to see my changes live:

# quit my running `alice` node, e.g. with CMD-Q in the UI or ^C on the running process
./gradlew :desktop:build && make alice

This picks up and rebuilds just the changes I've made and then deploys those new bits as alice.

> Couldn't we ensure somehow that the build runs just once before deploying the individual nodes?

It does effectively run just once. The individual node deployment targets, e.g. `alice` and `bob`, depend on `build`. Note that `build` is, in the latest commits, actually included once again in PHONY, so it does run every time, but that target just depends on the more specific `desktop/build` and `seednode/build` targets, which in turn run only if their respective directories do not already exist. The effect is that when we deploy more than one node, or all of them via `make deploy`, the {desktop|seednode}/build targets get run once and only once, avoiding the above-mentioned race condition and inefficient contention for Gradle resources. If you want a fast (incremental) rebuild, simply drop down to calling gradle directly as I've shown above. Calling `make clean-build` should not be necessary in any case, unless you actually want to blow away all the build directories.
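
The "run only if the directory does not already exist" behavior can be pictured in plain shell. This is only a sketch of the idea, not the makefile's actual recipe: `build_once` is a hypothetical helper name, and the makefile itself gets this behavior from make's own handling of file targets rather than from an explicit check.

```shell
#!/bin/sh
# Hypothetical helper (not part of the makefile): run a build command only
# when its output directory is missing, which is roughly how make treats a
# non-phony target named after a directory that has no prerequisites.
build_once() {
    dir=$1
    shift
    if [ -d "$dir" ]; then
        # Directory already exists: behave like an up-to-date make target.
        echo "skip: $dir already exists"
    else
        echo "run: $*"
        "$@"
    fi
}

# Example (not executed here):
#   build_once desktop/build ./gradlew :desktop:build
```

Called twice with the same directory, the second call is a no-op, which is why an incremental rebuild requires invoking gradle directly.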

> Is there a significant advantage of having all the individual node commands depend on setup/build that I'm missing?

It just ensures that deploying any given node causes build and localnet to run if they have not already done so. So someone can come along in a clean checkout and run only make bitcoind and make seednode and everything will work as they expect, meaning that the .localnet dir will get created and the seednode build will run. If that's already happened, then those targets are no-ops.

This fixes the problem described at [1] by replacing bash-specific array
syntax with a simpler sh-friendly for loop.

[1]: bisq-network#3718 (review)
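
As a rough illustration of the difference, the sketch below iterates over a plain space-separated list, which /bin/sh accepts, instead of a bash array. It prints the two screen invocations from the recipe rather than running them; the function name `deploy_commands` is made up for this example.

```shell
#!/bin/sh
# A space-separated string stands in for the bash array; word splitting of
# the unquoted variable gives one loop iteration per target in any POSIX sh.
targets="bitcoind seednode seednode2 alice bob mediator"

# Hypothetical function name: prints, but does not execute, the commands
# that the deploy recipe issues for each target.
deploy_commands() {
    for t in $targets; do
        printf 'screen -S localnet -X screen -t %s\n' "$t"
        printf 'screen -S localnet -p %s -X stuff "make %s\\n"\n' "$t" "$t"
    done
}
```

Running `deploy_commands` prints two lines per target, so the output can be inspected without a screen session.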
Problem: previously, in order to completely shut down a running
localnet, users had to attach to their 'localnet' screen and kill (^C)
each process, then quit and kill the entire screen session.

Solution: this change introduces an 'undeploy' target that automates
sending the ^C to each screen window followed by sending screen's 'kill'
command to any remaining windows, thus killing the entire 'localnet'
screen session.

The result is that users may now run the following two commands in
succession any number of times to bring their localnet up and down (to
'deploy' and 'undeploy' their localnet).

    # bring up localnet
    $ make deploy

    # use localnet to test, develop, etc...

    # bring down localnet
    $ make undeploy
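
A minimal sketch of what such an 'undeploy' sequence might look like in plain sh follows. It is an assumption-laden illustration, not the recipe from the commit: the window names are taken from the deploy loop quoted in this thread, the `RUN` variable and `undeploy` function are hypothetical, and the commands are echoed by default so nothing is sent to a real session.

```shell
#!/bin/sh
# Assumed window names, taken from the deploy loop shown in this thread.
targets="bitcoind seednode seednode2 alice bob mediator"

# Hypothetical switch: commands are echoed by default; export RUN as an
# empty string before running to execute them against a live session.
RUN=${RUN-echo}

undeploy() {
    for t in $targets; do
        # "^C" is screen's caret notation for Ctrl-C; 'stuff' types it into
        # the named window as if it were entered at the keyboard.
        $RUN screen -S localnet -p "$t" -X stuff "^C"
    done
    for t in $targets; do
        # 'kill' closes the window; once the last window is gone the
        # session itself ends.
        $RUN screen -S localnet -p "$t" -X kill
    done
}
```

Calling `undeploy` with the default setting lists the twelve screen commands in the order they would be issued.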
@cbeams
Contributor Author

cbeams commented Dec 3, 2019

FYI, there is now a make undeploy target as well. From commit message ed40afb:

$ git log -1 ed40afb
commit ed40afb1516a08aa651665ed1d61d60555543c5d
Author: Chris Beams <chris@beams.io>
Date:   Tue Dec 3 11:56:04 2019 +0100

    Add 'make undeploy' target to kill all running nodes
    
    Problem: previously, in order to completely shut down a running
    localnet, users had to attach to their 'localnet' screen and kill (^C)
    each process, then quit and kill the entire screen session.
    
    Solution: this change introduces an 'undeploy' target that automates
    sending the ^C to each screen window followed by sending screen's 'kill'
    command to any remaining windows, thus killing the entire 'localnet'
    screen session.
    
    The result is that users may now run the following two commands in
    succession any number of times to bring their localnet up and down (to
    'deploy' and 'undeploy' their localnet).
    
        # bring up localnet
        $ make deploy
    
        # use localnet to test, develop, etc...
    
        # bring down localnet
        $ make undeploy

julianknutsen
julianknutsen previously approved these changes Dec 3, 2019
Contributor

@julianknutsen julianknutsen left a comment

ACK

This is plenty good enough for the original intention. Chris has spent a lot of time iterating and I think we may be at the point of diminishing returns. Plenty of other high prio work to do.

Do any onboarding docs need to be updated to point to the make commands?

I was able to test this with quite a few node restart and recompile tests today without any issues other than a totally reasonable Ubuntu popup message:
[screenshot: undeploy_error_popup]

@cbeams
Contributor Author

cbeams commented Dec 5, 2019

I think I've addressed pretty much everything in the feedback thus far and there've been multiple reports of things working as expected, so I think this initial cut is ready for a merge. @ripcurlx, your ACK is the required one. I know you were having some kind of issue with the Makefile on your side, I'd be happy to help you with that if you want to get it sorted before the merge.

Thanks again to everyone. The feedback has really helped improve and harden this.

@cbeams
Contributor Author

cbeams commented Dec 5, 2019

@julianknutsen wrote:

> Do any onboarding docs need to be updated to point to the make commands?

Yes, and I've flagged this work to be done elsewhere. But I should probably make at least some basic change now. I'll do that real quick.

The old dev-setup.md and dao-setup.md docs have been marked as
deprecated for now and may be removed after we've gotten sufficient
feedback on the Makefile-based approach.
cbeams added a commit to cbeams/bisq that referenced this pull request Dec 5, 2019
The previous link format works fine while in the GitHub web interface,
but is not useful when trying to navigate from the filesystem. This
format works well in both contexts.

This minor issue was discovered in the course of documentation updates
for bisq-network#3718.
@cbeams
Contributor Author

cbeams commented Dec 5, 2019

> I should probably make at least some basic change now. I'll do that real quick.

Commit 7d16890 has addressed updating documentation in a minimal but hopefully effective fashion. Most resources that new developers would see first, be it CONTRIBUTING.md, the Contributor Checklist or the root-level README.md point to the "developer docs" at docs/README.md where these changes have been made.

@ripcurlx, I think this PR is ready to go now.

@ripcurlx
Contributor

ripcurlx commented Dec 6, 2019

> @ripcurlx, your ACK is the required one. I know you were having some kind of issue with the Makefile on your side, I'd be happy to help you with that if you want to get it sorted before the merge.

@cbeams It would be great to understand why it is not working with the screen multiplexer in my case on macOS Catalina with Z shell (zsh).

make deploy
# create a new screen session named 'localnet'
screen -dmS localnet
# deploy each node in its own named screen window
for target in \
			bitcoind \
			seednode \
			seednode2 \
			alice \
			bob \
			mediator; do \
		screen -S localnet -X screen -t $target; \
		screen -S localnet -p $target -X stuff "make $target\n"; \
	done;
# give bitcoind rpc server time to start
sleep 5
# generate a block to ensure Bisq nodes get dao-synced
make block
bitcoin-cli \
		-regtest \
		-rpcuser=bisqdao \
		-rpcpassword=bsq \
		getnewaddress \
		| xargs bitcoin-cli \
				-regtest \
				-rpcuser=bisqdao \
				-rpcpassword=bsq \
				generatetoaddress 1
error: Could not connect to the server 127.0.0.1:18443
Make sure the bitcoind server is running and that you are connecting to the correct RPC port.

As using the targets in regularly opened tabs works without any problems, I'll merge this PR anyway.
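
The thread does not establish why the RPC connection failed here. If the cause were simply that bitcoind needs longer than the fixed `sleep 5` to start its RPC server (an assumption), one option would be to poll until a command succeeds instead of waiting a fixed time. The helper name `wait_for` below is hypothetical and is not part of the makefile.

```shell
#!/bin/sh
# Hypothetical helper: retry a command about once per second until it
# succeeds or the given number of attempts is used up.
wait_for() {
    attempts=$1
    shift
    while [ "$attempts" -gt 0 ]; do
        # Discard the command's output; only its exit status matters here.
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        attempts=$((attempts - 1))
        sleep 1
    done
    return 1
}

# Example (not executed here), reusing the RPC credentials shown above:
#   wait_for 30 bitcoin-cli -regtest -rpcuser=bisqdao -rpcpassword=bsq getblockchaininfo
```

The function returns 0 as soon as the command succeeds and 1 if it never does, so a caller can decide whether to proceed to block generation.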

Contributor

@ripcurlx ripcurlx left a comment

ACK - Tested it locally and except for #3718 (comment) everything worked as expected.

@cbeams
Contributor Author

cbeams commented Dec 6, 2019

> It would be great to understand why it is not working with the screen multiplexer in my case

@ripcurlx I'll ping you 1:1 to debug this, thanks.

@ripcurlx ripcurlx merged commit b3e73b1 into bisq-network:master Dec 6, 2019
@cbeams cbeams mentioned this pull request Jan 10, 2020
6 participants