
Using the Admin tool

Cuong Nguyen edited this page Jul 10, 2021 · 26 revisions

Setting up and running a cluster of many machines can be tedious, so the Admin tool is the recommended way to perform these tasks.

One-time setup

On the machine you use to run the Admin tool, install the Python dependencies using the following command. You may want to create a virtualenv to isolate these packages from your system-wide packages.

$ pip install -r tools/requirements.txt

This tool uses the Docker SDK for Python and runs each SLOG server in a Docker container, so you need to install Docker on every machine that will run SLOG. You can use the Docker install script to expedite this process.

Important: You must also configure the machine to run Docker without sudo.
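As a quick sanity check for the requirement above, the following minimal sketch tests whether the current user can reach the Docker daemon without sudo. The function name `docker_works_without_sudo` is hypothetical and not part of the Admin tool; it simply runs `docker ps` and looks at the exit code. It only checks the machine it runs on, so on a distributed cluster you would run it on each machine.

```python
import shutil
import subprocess


def docker_works_without_sudo(timeout_s: float = 10.0) -> bool:
    """Return True if `docker ps` succeeds for the current user without sudo."""
    if shutil.which("docker") is None:
        return False  # Docker CLI is not installed or not on PATH
    try:
        result = subprocess.run(
            ["docker", "ps"],
            stdin=subprocess.DEVNULL,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False  # could not execute the CLI, or the daemon did not answer
    # A non-zero exit code usually means the daemon is unreachable
    # or the user lacks permission on the Docker socket.
    return result.returncode == 0


if __name__ == "__main__":
    print("Docker usable without sudo:", docker_works_without_sudo())
```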

If you run a distributed cluster, set up SSH such that you can log into the machines using keys instead of passwords. This is a good guide for how to do that. You may use a single key for all machines.

If you get the key files from your cloud provider (e.g. AWS), you need to add your keys to the ssh-agent. Note that referencing the keys through the ~/.ssh/config file doesn't work with this tool.

Build a Docker image

From the project root directory, run

$ docker build . -t <image-name>

If you're running on a distributed cluster, you need to push the image to a remote repository so that the machines in your cluster can pull the image. The required image name format for this case is <user-name>/<repo-name> (e.g. ctring/slog). To push the image, log in to Docker Hub with docker login then run

$ docker push <image-name>
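Before pushing, it can help to confirm that the image name has the <user-name>/<repo-name> shape described above. The sketch below is a rough check only: the helper `looks_like_hub_image` is hypothetical, and the pattern is a simplification that also accepts an optional `:tag` suffix; it is not Docker's full image reference grammar.

```python
import re

# One name component: lowercase letters and digits, optionally joined
# by single '.', '_' or '-' separators (a simplifying assumption).
_PART = r"[a-z0-9]+(?:[._-][a-z0-9]+)*"
_HUB_IMAGE_RE = re.compile(rf"{_PART}/{_PART}(?::[A-Za-z0-9_][A-Za-z0-9_.-]*)?")


def looks_like_hub_image(name: str) -> bool:
    """Return True if name looks like <user-name>/<repo-name>[:tag]."""
    return _HUB_IMAGE_RE.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ("ctring/slog", "slog"):
        print(candidate, looks_like_hub_image(candidate))
```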

If you're running locally, you can follow the same steps. Alternatively, you can skip the remote repository; in that case, add --no-pull to the start command of the Admin tool.

Operate on a local-machine cluster

When running the cluster on the local machine (i.e. the machine on which the Admin tool runs), each SLOG server is a process running in its own Docker container. These containers have their own subnet. You may put arbitrary strings in place of the IP addresses in the SLOG configuration file, and the tool will automatically fill in the actual IP addresses. For example:

replicas: {
    addresses: "machine1",
    addresses: "machine2",
}
replicas: {
    addresses: "machine3",
    addresses: "machine4",
}
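For larger local clusters, writing the placeholder entries by hand gets repetitive. The sketch below generates `replicas` blocks in the same layout as the example above. The function `placeholder_replicas` is hypothetical, and it assumes that each `addresses` entry corresponds to one partition of the replica. It emits only the `replicas` blocks; any other fields a real configuration file needs are not covered here.

```python
def placeholder_replicas(num_replicas: int, partitions_per_replica: int) -> str:
    """Build `replicas` blocks with placeholder names machine1, machine2, ..."""
    blocks = []
    machine = 1  # running counter so that every placeholder is unique
    for _ in range(num_replicas):
        lines = ["replicas: {"]
        for _ in range(partitions_per_replica):
            lines.append(f'    addresses: "machine{machine}",')
            machine += 1
        lines.append("}")
        blocks.append("\n".join(lines))
    return "\n".join(blocks)


if __name__ == "__main__":
    # Two replicas with two machines each, as in the example above.
    print(placeholder_replicas(2, 2))
```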

Every time you run one of the commands below, a new configuration file with the real IP addresses is created at /var/tmp/slog.conf if it does not yet exist. You can use the IP addresses in this file to send transactions to the cluster.
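To pick up the real addresses programmatically, something like the following sketch could read them out of the generated file. It assumes the generated file keeps the same `replicas: { addresses: "..." }` syntax as the example configuration and that the blocks contain no nested braces; the function `addresses_by_replica` is a hypothetical helper, not part of the Admin tool.

```python
import re
from pathlib import Path
from typing import List

_REPLICA_RE = re.compile(r"replicas:\s*\{(.*?)\}", flags=re.DOTALL)
_ADDRESS_RE = re.compile(r'addresses:\s*"([^"]+)"')


def addresses_by_replica(config_text: str) -> List[List[str]]:
    """Return one list of addresses per `replicas` block, in file order."""
    return [
        _ADDRESS_RE.findall(block.group(1))
        for block in _REPLICA_RE.finditer(config_text)
    ]


if __name__ == "__main__":
    generated = Path("/var/tmp/slog.conf")  # location stated above
    if generated.exists():
        print(addresses_by_replica(generated.read_text()))
```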

Start a local-machine cluster

$ python3 tools/admin.py local --start --image <image-name> <path-to-config-file>

You can optionally use the -e flag to set the environment variables. For example:

$ python3 tools/admin.py local --start --image ctring/slog examples/cluster.conf -e GLOG_v=1

Stop a local-machine cluster

$ python3 tools/admin.py local --stop --image <image-name> <path-to-config-file>

Show status of a local-machine cluster

$ python3 tools/admin.py local --status --image <image-name> <path-to-config-file>

Show logs of a machine

Since this is on your local machine, you can simply run the Docker command directly

$ docker logs slog_<r>_<p>

Where <r> is the replica number and <p> is the partition number. You can optionally use the -f option to tail the logs indefinitely. For example, to keep tailing the logs from the server corresponding to partition 0 in replica 0, run

$ docker logs slog_0_0 -f
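The naming scheme above makes it easy to script over every container in a local cluster. The following sketch derives container names from replica and partition numbers and builds the matching `docker logs` argument list. The helper names are hypothetical, and the sketch assumes replica and partition numbers start at 0, as in the example above.

```python
from typing import List


def container_name(replica: int, partition: int) -> str:
    """Name of the container for a given replica and partition."""
    return f"slog_{replica}_{partition}"


def all_container_names(num_replicas: int, num_partitions: int) -> List[str]:
    """Names of every container, ordered by replica, then partition."""
    return [
        container_name(r, p)
        for r in range(num_replicas)
        for p in range(num_partitions)
    ]


def docker_logs_command(replica: int, partition: int, follow: bool = False) -> List[str]:
    """Argument list for `docker logs`, with -f appended when following."""
    cmd = ["docker", "logs", container_name(replica, partition)]
    if follow:
        cmd.append("-f")
    return cmd


if __name__ == "__main__":
    print(all_container_names(2, 2))
    print(" ".join(docker_logs_command(0, 0, follow=True)))
```

The argument list can be passed to `subprocess.run` as-is if you want to capture the logs from a script.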

Operate on a fully-distributed cluster

Make sure the IP addresses are correct in the SLOG configuration file. In all of the following commands, you might need to use the -u option to specify the username used to ssh to the remote machines.

Start a cluster

$ python3 tools/admin.py start --image <image-name> <path-to-config-file>

You can optionally use the -e flag to set the environment variables. A complete example may look like

$ python3 tools/admin.py start --image ctring/slog -u ctring -e GLOG_v=1 examples/cluster.conf
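If you start clusters from scripts, assembling the command as an argument list avoids quoting mistakes. The sketch below reproduces the example above using only the options shown on this page (`--image`, `-u`, `-e`). The function `build_start_command` is hypothetical, and it passes a single KEY=VALUE pair to `-e` because that is the only form shown here.

```python
import shlex
from typing import List, Optional


def build_start_command(
    image: str,
    config_path: str,
    user: Optional[str] = None,
    env: Optional[str] = None,
) -> List[str]:
    """Argument list for starting a fully-distributed cluster."""
    cmd = ["python3", "tools/admin.py", "start", "--image", image]
    if user:
        cmd += ["-u", user]  # username used to ssh to the remote machines
    if env:
        cmd += ["-e", env]   # one KEY=VALUE pair, e.g. "GLOG_v=1"
    cmd.append(config_path)
    return cmd


if __name__ == "__main__":
    command = build_start_command(
        "ctring/slog", "examples/cluster.conf", user="ctring", env="GLOG_v=1"
    )
    print(shlex.join(command))
```

Run the resulting list from the project root with `subprocess.run(command, check=True)` so that the relative path to tools/admin.py resolves.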

Stop a cluster

$ python3 tools/admin.py stop <path-to-config-file>

Show status of a cluster

$ python3 tools/admin.py status <path-to-config-file>

Show logs of a machine

In the commands below, you may optionally use the -f option to tail the logs indefinitely.

Select machine by address

$ python3 tools/admin.py logs -a <address> <path-to-config-file>

Select machine by replica and partition ID

$ python3 tools/admin.py logs -rp <replica> <partition> <path-to-config-file>

Example

# Tail the logs from machine with IP address 192.168.2.11
$ python3 tools/admin.py logs -a 192.168.2.11 -f examples/cluster.conf

# Get the logs from machine corresponding to replica 1 and partition 2
$ python3 tools/admin.py logs -rp 1 2 examples/cluster.conf