diff --git a/README.md b/README.md
index bc8bbcbe..94414949 100644
--- a/README.md
+++ b/README.md
@@ -119,6 +119,11 @@
 1. `docker-compose build`
 1. `docker-compose up -d`
 
+## Cassandra
+
+Documentation and scripts to deploy and operate cassandra in
+production are available in [scripts/cassandra](scripts/cassandra).
+
 ## Backup and restore
 
 In the scripts/ folder there are backup and restore scripts for docker postgres.
diff --git a/scripts/cassandra/README.md b/scripts/cassandra/README.md
new file mode 100644
index 00000000..5dfb215f
--- /dev/null
+++ b/scripts/cassandra/README.md
@@ -0,0 +1,274 @@
+Cassandra Operation
+=================
+
+  * [Installing cassandra](#installing-cassandra)
+  * [Configuring cassandra](#configuring-cassandra)
+  * [Starting and stopping cassandra](#starting-and-stopping-cassandra)
+    * [Starting Cassandra](#starting-cassandra)
+    * [Stopping Cassandra](#stopping-cassandra)
+    * [Enabling cassandra service to auto-start on boot](#enabling-cassandra-service-to-auto-start-on-boot)
+  * [Adding a new cassandra node](#adding-a-new-cassandra-node)
+  * [Migrating cassandra to another host](#migrating-cassandra-to-another-host)
+  * [Upgrading Cassandra](#upgrading-cassandra)
+    * [Before the upgrade](#before-the-upgrade)
+    * [Minor upgrade](#minor-upgrade)
+    * [Major upgrade](#major-upgrade)
+      * [Testing the upgrade](#testing-the-upgrade)
+      * [Performing the major upgrade](#performing-the-major-upgrade)
+    * [Rollback from upgrade](#rollback-from-upgrade)
+    * [After the upgrade](#after-the-upgrade)
+  * [Backup & Restore](#backup--restore)
+    * [Backup](#backup)
+    * [Restore](#restore)
+  * [Repair data after node failure or backup recovery](#repair-data-after-node-failure-or-backup-recovery)
+
+# Installing cassandra
+
+1. Copy the sample `env.example` file to `.env` and edit variables with the proper configuration for this node:
+
+    `cp env.example .env`
+
+2. Install cassandra version `X.Y.Z` with the following command:
+
+    `./install_cassandra.sh X.Y.Z`
+
+    By default this script runs in dry-run mode so you can double check the install and configuration commands.
+    Use the -x flag (i.e. `./install_cassandra.sh -x X.Y.Z`) to perform the actual installation.
+
+Cassandra is installed in the `/opt/apache-cassandra-X.Y.Z` directory. A symlink is created from `/opt/cassandra`
+to `/opt/apache-cassandra-X.Y.Z`. The cassandra service is installed as a systemd unit and is disabled by default.
+The nodetool and cqlsh commands are placed in `/usr/local/bin`.
+
+The following variables are set during install:
+ * CASSANDRA\_HOME=/opt/cassandra
+ * CASSANDRA\_CONF=/opt/cassandra/conf
+ * CASSANDRA\_LOG\_DIR=/var/log/cassandra
+ * cassandra\_storagedir=/var/lib/cassandra
+
+This installation is targeted at Ubuntu systems and was tested with Ubuntu 20.04.
+
+# Configuring cassandra
+
+The installation script will automatically configure cassandra according to the parameters specified in the `.env` file.
+If any changes need to be made in the configuration after the `.env` file is updated, run the following command:
+
+    `./configure_cassandra.sh -x`
+
+Please note the `-x` flag must be specified, otherwise the script will be run in dry-run mode.
+
+The important parameters to specify are CLUSTER\_NAME, SEEDS, LISTEN\_ADDRESS, RPC\_ADDRESS.
+
+Please note that any parameters manually specified in /opt/cassandra/conf/cassandra.yaml may be lost when this script is
+run so it's important to update the script to take into account new parameters.
+
+# Starting and stopping cassandra
+
+## Starting Cassandra
+
+Start the cassandra service with:
+
+    `sudo service cassandra start`
+
+Check that cassandra was started without errors by inspecting the log at `/var/log/cassandra/system.log`.
+
+## Stopping Cassandra
+
+Before stopping cassandra, it's recommended to drain the node so all data is flushed to disk with:
+
+    `nodetool drain`
+
+After the node is drained, stop the cassandra service with:
+
+    `sudo service cassandra stop`
+
+## Enabling cassandra service to auto-start on boot
+
+Use the following command to configure the node to automatically start cassandra if the node is restarted:
+
+    `sudo systemctl enable cassandra.service`
+
+# Adding a new cassandra node
+
+New nodes must be added when there are performance issues on the cluster or when disk usage reaches around 75% of capacity.
+In order to add a new node just install cassandra according to the instructions above and make sure to set the
+CLUSTER\_NAME and SEEDS to point to the cluster where you want the new node to join.
+
+# Migrating cassandra to another host
+
+1. Install and configure the desired version of cassandra in the new server with:
+
+    `./install_cassandra.sh -x `.
+
+    Make sure the new server IP is configured on the `.env` file.
+
+2. Create the storage directory `/var/lib/cassandra` on the new host with:
+
+    `mkdir -p /var/lib/cassandra`
+
+3. On the new host, copy the data directory from the old host with rsync:
+
+    `rsync -aczP --stats user@old_host:/var/lib/cassandra/data /var/lib/cassandra/`
+
+4. Drain and stop the original cassandra node being migrated:
+
+    ```
+    nodetool drain
+    service cassandra stop
+    ```
+
+5. Run rsync again on the new host so the remaining data is copied over:
+
+    `rsync -aczP --stats user@old_host:/var/lib/cassandra/data /var/lib/cassandra/`
+
+6. Make sure the newly copied data is owned by the `cassandra` user:
+
+    `chown -R cassandra:cassandra /var/lib/cassandra/`
+
+7. Start cassandra on the new host with:
+
+    `service cassandra start`
+
+8. Check the system logs for any errors during startup:
+
+    `tail -f /var/log/cassandra/system.log`
+
+9. Check that the migrated node IP is seen by other hosts with:
+
+    `nodetool status`
+
+# Upgrading Cassandra
+
+## Before the upgrade
+
+It's recommended to take a snapshot of all tables before doing the upgrade, as a precaution in case something goes wrong during the upgrade:
+
+    `nodetool snapshot`
+
+This will create a snapshot on `/var/lib/cassandra/data//snapshots/`.
+
+## Minor upgrade
+
+Performing an upgrade on Cassandra between patch versions is very simple (e.g. from version 3.11.2 to 3.11.7).
+The upgrade must be performed in a rolling-restart manner one node at a time with the following steps on each node:
+
+1. Drain and stop cassandra:
+
+    ```
+    nodetool drain
+    service cassandra stop
+    ```
+
+2. Install the new cassandra version with:
+
+    `./install_cassandra.sh -x `
+
+3. Start the cassandra process on the new version:
+
+    `service cassandra start`
+
+4. Check logs and application to verify everything is running correctly before upgrading the next node.
+
+## Major upgrade
+
+### Testing the upgrade
+
+Since there can be changes in the data format between major versions, it is recommended to test the upgrade in a separate node
+before performing an upgrade between major versions (e.g. from 3.11 to 4.0). Perform the following steps to test a major upgrade:
+
+1. Install and configure the desired version of cassandra in the new server with:
+
+    `./install_cassandra.sh -x `.
+
+    Make sure the new server has a different CLUSTER\_NAME specified on the .env file
+    so it doesn't accidentally join the old cluster.
+
+2. Export the kairosdb schema from a running node:
+
+    `cqlsh -u USER -p PASS -e "DESC KEYSPACE kairosdb" NODE_IP > kairosdb_schema.cql`
+
+3. Start cassandra on the new node with:
+
+    `service cassandra start`
+
+4. Import the created schema by entering cqlsh and using the following command:
+
+    ```
+    cqlsh -u USER -p PASS NODE_IP
+    SOURCE 'kairosdb_schema.cql'
+    ```
+
+5. Stop the cassandra node:
+
+    `service cassandra stop`
+
+6. Copy the kairosdb data files from a node in the previous version:
+
+    `rsync -aczP --stats user@old_host:/var/lib/cassandra/data/kairosdb/ /var/lib/cassandra/data/kairosdb`
+
+7. Start the cassandra service:
+
+    `service cassandra start`
+
+8. Perform some queries via cqlsh or point the staging application to this cassandra server and verify
+    the data is being read correctly without errors.
+
+### Performing the major upgrade
+
+The steps are the same as for a minor upgrade, except that once a node has been upgraded
+you must run the following command on that node (before moving to the next one):
+
+    `nodetool upgradesstables`
+
+This command will ensure data files are upgraded to the newer version and may take a while to run.
+
+## Rollback from upgrade
+
+Failed upgrades are very unlikely, but if one happens, perform the following steps to roll back a node:
+
+1. Stop the cassandra server that was upgraded:
+
+    `service cassandra stop`
+
+2. Restore the old version with:
+
+    `ln -snf /opt/apache-cassandra- /opt/cassandra`
+
+3. Replace all the data in `/var/lib/cassandra/data/` with the snapshot files from `/var/lib/cassandra/data//snapshots/` for all tables.
+
+4. Clean all data from `/var/lib/cassandra/commitlog`, `/var/lib/cassandra/hints` and `/var/lib/cassandra/saved_caches`.
+
+5. Start cassandra:
+
+    `service cassandra start`
+
+## After the upgrade
+
+If everything goes well with the upgrade, don't forget to clean the snapshot files with:
+
+    `nodetool clearsnapshot`
+
+# Backup & Restore
+
+## Backup
+
+The simplest way to backup a cassandra node is to use the cloud provider's VM snapshot feature.
+
+## Restore
+
+After restoring the VM snapshot from your cloud provider, you need to update the node's IP address
+in the `.env` file (`LISTEN_ADDRESS`, `RPC_ADDRESS` and `SEEDS`) and run the following command:
+
+    `./configure_cassandra.sh -x`
+
+After that start the cassandra process with:
+
+    `service cassandra start`
+
+After restoring the node from backup it's recommended to run repair (as instructed below).
+
+# Repair data after node failure or backup recovery
+
+If a node is down for longer than 3 hours (max\_hint\_window\_in\_ms), run the following command to
+make the node synchronize data with other nodes in the cluster:
+
+    `./repair_node.sh USERNAME PASSWORD NODE_PRIVATE_IP`
diff --git a/scripts/cassandra/cassandra-rackdc.properties b/scripts/cassandra/cassandra-rackdc.properties
new file mode 100644
index 00000000..f85646e6
--- /dev/null
+++ b/scripts/cassandra/cassandra-rackdc.properties
@@ -0,0 +1,30 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# These properties are used with GossipingPropertyFileSnitch and will
+# indicate the rack and dc for this node
+#
+# When upgrading from SimpleSnitch, you will need to set your initial machines
+# to have rack=rack1
+dc=DC1
+rack=RAC1
+
+# Add a suffix to a datacenter name. Used by the Ec2Snitch and Ec2MultiRegionSnitch
+# to append a string to the EC2 region name.
+#dc_suffix=
+
+# Uncomment the following line to make this snitch prefer the internal ip when possible, as the Ec2MultiRegionSnitch does.
+# prefer_local=true
diff --git a/scripts/cassandra/cassandra-settings.service b/scripts/cassandra/cassandra-settings.service
new file mode 100644
index 00000000..1ce2722b
--- /dev/null
+++ b/scripts/cassandra/cassandra-settings.service
@@ -0,0 +1,17 @@
+[Unit]
+Description=Cassandra Recommended Settings
+DefaultDependencies=no
+After=sysinit.target local-fs.target
+Before=cassandra.service
+
+[Service]
+Type=oneshot
+ExecStart=/bin/sh -c 'echo never > /sys/kernel/mm/transparent_hugepage/defrag && \
+echo mq-deadline > /sys/block/sda/queue/scheduler && \
+echo 0 > /sys/class/block/sda/queue/rotational && \
+echo 8 > /sys/class/block/sda/queue/read_ahead_kb && \
+echo 0 > /proc/sys/vm/zone_reclaim_mode && \
+echo 32 > /sys/block/sda/queue/nr_requests'
+
+[Install]
+WantedBy=basic.target
diff --git a/scripts/cassandra/cassandra-sysctl.conf b/scripts/cassandra/cassandra-sysctl.conf
new file mode 100644
index 00000000..2c8469ea
--- /dev/null
+++ b/scripts/cassandra/cassandra-sysctl.conf
@@ -0,0 +1,9 @@
+# Settings from https://docs.datastax.com/en/cassandra-oss/3.0/cassandra/install/installRecommendSettings.html
+net.core.rmem_max = 16777216
+net.core.wmem_max = 16777216
+net.core.rmem_default = 16777216
+net.core.wmem_default = 16777216
+net.core.optmem_max = 40960
+net.ipv4.tcp_rmem = 4096 87380 16777216
+net.ipv4.tcp_wmem = 4096 65536 16777216
+vm.max_map_count = 1048575
diff --git a/scripts/cassandra/cassandra.service b/scripts/cassandra/cassandra.service
new file mode 100644
index 00000000..2a92bf5d
--- /dev/null
+++ b/scripts/cassandra/cassandra.service
@@ -0,0 +1,26 @@
+# /usr/lib/systemd/system/cassandra.service
+
+[Unit]
+Description=Cassandra
+After=network.target
+StartLimitInterval=200
+StartLimitBurst=5
+
+[Service]
+Type=forking
+PIDFile=/var/lib/cassandra/cassandra.pid
+User=cassandra
+Group=cassandra
+Environment="CASSANDRA_INCLUDE=/opt/cassandra/cassandra.in.sh"
+PassEnvironment="CASSANDRA_INCLUDE"
+ExecStart=/opt/cassandra/bin/cassandra -p /var/lib/cassandra/cassandra.pid
+Restart=always
+RestartSec=10
+SuccessExitStatus=143
+LimitMEMLOCK=infinity
+LimitNOFILE=10000
+LimitNPROC=32768
+LimitAS=infinity
+
+[Install]
+WantedBy=multi-user.target
diff --git a/scripts/cassandra/configure_cassandra.sh b/scripts/cassandra/configure_cassandra.sh
new file mode 100755
index 00000000..bd640d68
--- /dev/null
+++ b/scripts/cassandra/configure_cassandra.sh
@@ -0,0 +1,109 @@
+#!/bin/sh
+#
+# based on https://stackoverflow.com/a/7755563
+#
+## Usage: configure_cassandra.sh [options]
+##
+## Configures cassandra on this server according to the configuration defined on the .env file
+##
+## Options:
+##   -h, --help              Display this message.
+##   -x, --disable-dry-run   By default script is run in dry-run mode, use this argument to disable dry-run
+##   -c, --conf-dir          Cassandra configuration directory (by default /opt/cassandra/conf)
+##
+
+usage() {
+    [ "$*" ] && echo "$0: $*"
+    sed -n '/^##/,/^$/s/^## \{0,1\}//p' "$0"
+    exit 2
+} 2>/dev/null
+
+parse_args() {
+    CONF_DIR=/opt/cassandra/conf
+    while [ $# -gt 0 ]; do
+        case $1 in
+        (-x|--disable-dry-run) NO_DRY_RUN=1;;
+        (-c|--conf-dir) shift; CONF_DIR=$1;;
+        (-h|--help) usage 2>&1;;
+        (--) shift; break;;
+        (-*) usage "$1: unknown option";;
+        (*) break;;
+        esac
+        shift
+    done
+    if [ -z "$NO_DRY_RUN" ]; then
+        echo "WARNING: Running in dry-run mode. Execute with -x argument to execute commands"
+    fi
+}
+
+# Run command if not in dry-run mode
+run_cmd() {
+    echo "Executing \"$*\""
+    if [ -n "$NO_DRY_RUN" ]; then
+        eval "$*"
+    fi
+}
+
+check_non_empty() {
+    VAR_NAME=$1
+    VAR_VALUE=$2
+    if [ -z "$VAR_VALUE" ]; then
+        echo "ERROR: Required variable $VAR_NAME is undefined on .env file"
+        exit 2
+    fi
+}
+
+parse_args "$@"
+
+# Need to run as root
+if [ "$(id -u)" -ne 0 ]; then
+    echo "Script must be executed as root"
+    exit 2
+fi
+
+# Check if environment file is present
+if [ ! -f ".env" ]; then
+    echo "Environment file \".env\" is not present. Make a copy of env.example and adjust parameters."
+    exit 2
+fi
+
+. ./.env
+
+if [ ! -d "$CONF_DIR" ]; then
+    echo "ERROR: Cassandra configuration directory not found at $CONF_DIR."
+    exit 2
+fi
+
+# Set cluster name
+check_non_empty CLUSTER_NAME "$CLUSTER_NAME"
+run_cmd "sed -i 's/cluster_name:.*/cluster_name: $CLUSTER_NAME/g' $CONF_DIR/cassandra.yaml"
+
+# Set seeds
+check_non_empty SEEDS "$SEEDS"
+run_cmd "sed -i 's/- seeds:.*/- seeds: \"$SEEDS\"/g' $CONF_DIR/cassandra.yaml"
+
+# Set listen_address
+check_non_empty LISTEN_ADDRESS "$LISTEN_ADDRESS"
+run_cmd "sed -i 's/listen_address:.*/listen_address: $LISTEN_ADDRESS/g' $CONF_DIR/cassandra.yaml"
+
+# Set rpc_address
+check_non_empty RPC_ADDRESS "$RPC_ADDRESS"
+run_cmd "sed -i 's/start_rpc:.*/start_rpc: true/g' $CONF_DIR/cassandra.yaml"
+run_cmd "sed -i 's/rpc_address:.*/rpc_address: $RPC_ADDRESS/g' $CONF_DIR/cassandra.yaml"
+
+# Set authenticator
+check_non_empty AUTHENTICATOR "$AUTHENTICATOR"
+run_cmd "sed -i 's/authenticator:.*/authenticator: $AUTHENTICATOR/g' $CONF_DIR/cassandra.yaml"
+
+# Set snitch
+check_non_empty ENDPOINT_SNITCH "$ENDPOINT_SNITCH"
+run_cmd "sed -i 's/endpoint_snitch:.*/endpoint_snitch: $ENDPOINT_SNITCH/g' $CONF_DIR/cassandra.yaml"
+
+# Set heap size (the leading # is optional so re-running the script updates an already uncommented value)
+run_cmd "sed -i 's/^#\{0,1\}MAX_HEAP_SIZE=.*/MAX_HEAP_SIZE=$MAX_HEAP_SIZE/' $CONF_DIR/cassandra-env.sh"
+run_cmd "sed -i 's/^#\{0,1\}HEAP_NEWSIZE=.*/HEAP_NEWSIZE=$HEAP_NEWSIZE/' $CONF_DIR/cassandra-env.sh"
+
+# Override cassandra-rackdc.properties file
+run_cmd "cp cassandra-rackdc.properties $CONF_DIR"
+
+echo "Cassandra successfully configured."
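Note on the `sed` substitutions in `configure_cassandra.sh`: the `cassandra.yaml` patterns are not anchored to the start of a line, so `rpc_address:.*` would also rewrite any line containing `broadcast_rpc_address:`, and a value containing `/` or `&` would break the substitution. Below is a minimal sketch of an anchored, escaped variant. The helper name `set_yaml_key` is hypothetical and not part of this change, and the sketch assumes GNU `sed` (for `-i`), as found on the Ubuntu target.

```shell
#!/bin/sh
# Hypothetical helper (not part of this change): set "key: value" in a YAML
# file, matching the key only when it starts a line.
set_yaml_key() {
    yaml_file=$1
    yaml_key=$2
    # Escape \ / and & so the value is safe inside a sed replacement.
    yaml_value=$(printf '%s' "$3" | sed 's|[\\/&]|\\&|g')
    sed -i "s/^${yaml_key}:.*/${yaml_key}: ${yaml_value}/" "$yaml_file"
}

# Example call (assumed path, taken from the default CONF_DIR):
#   set_yaml_key /opt/cassandra/conf/cassandra.yaml rpc_address 10.0.0.5
```

The `- seeds:` entry is indented inside a list in `cassandra.yaml`, so it would still need its own pattern; the sketch only covers top-level keys.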
diff --git a/scripts/cassandra/env.example b/scripts/cassandra/env.example
new file mode 100644
index 00000000..851738da
--- /dev/null
+++ b/scripts/cassandra/env.example
@@ -0,0 +1,10 @@
+STORAGE_DIR=/var/lib/cassandra
+LOG_DIR=/var/log/cassandra
+CLUSTER_NAME=clustername # Cluster name
+SEEDS=127.0.0.1 # Comma-separated list of seed nodes in the current cluster
+LISTEN_ADDRESS=127.0.0.1 # Private IP of this node
+RPC_ADDRESS=127.0.0.1 # Private IP of this node
+AUTHENTICATOR=PasswordAuthenticator
+ENDPOINT_SNITCH=GossipingPropertyFileSnitch
+MAX_HEAP_SIZE=2048M
+HEAP_NEWSIZE=512M
diff --git a/scripts/cassandra/install_cassandra.sh b/scripts/cassandra/install_cassandra.sh
new file mode 100755
index 00000000..14a8558e
--- /dev/null
+++ b/scripts/cassandra/install_cassandra.sh
@@ -0,0 +1,170 @@
+#!/bin/sh
+#
+# based on https://stackoverflow.com/a/7755563
+#
+## Usage: install_cassandra.sh [options] CASSANDRA_VERSION
+##
+## Where CASSANDRA_VERSION is X.Y.Z (e.g. 2.1.10, 3.11.2, etc)
+##
+## Installs and configures cassandra on this server with recommended system settings
+##
+## Latest version numbers are available on https://cassandra.apache.org/download/
+##
+## Options:
+##   -h, --help              Display this message.
+##   -x, --disable-dry-run   By default script is run in dry-run mode, use this argument to disable dry-run
+##
+
+usage() {
+    [ "$*" ] && echo "$0: $*"
+    sed -n '/^##/,/^$/s/^## \{0,1\}//p' "$0"
+    exit 2
+} 2>/dev/null
+
+parse_args() {
+    while [ $# -gt 0 ]; do
+        case $1 in
+        (-x|--disable-dry-run) NO_DRY_RUN=1;;
+        (-h|--help) usage 2>&1;;
+        (--) shift; break;;
+        (-*) usage "$1: unknown option";;
+        (*) break;;
+        esac
+        shift
+    done
+    if [ $# != 1 ]; then
+        echo "Wrong number of arguments: $#"
+        usage
+    fi
+    CASSANDRA_VERSION=$1
+    if [ -z "$NO_DRY_RUN" ]; then
+        echo "WARNING: Running in dry-run mode. Execute with -x argument to execute commands"
+    fi
+}
+
+# Run command if not in dry-run mode
+run_cmd() {
+    echo "Executing \"$*\""
+    if [ -n "$NO_DRY_RUN" ]; then
+        eval "$*"
+    fi
+}
+
+INITIAL_PARAMS=$*
+parse_args "$@"
+
+# Need to run as root
+if [ "$(id -u)" -ne 0 ]; then
+    echo "Script must be executed as root"
+    exit 2
+fi
+
+# Check if environment file is present
+if [ ! -f ".env" ]; then
+    echo "Environment file \".env\" is not present. Make a copy of env.example and adjust parameters."
+    exit 2
+fi
+
+. ./.env
+
+BASE_DIR="/opt"
+DOWNLOAD_URL=http://archive.apache.org/dist/cassandra/$CASSANDRA_VERSION/apache-cassandra-$CASSANDRA_VERSION-bin.tar.gz
+TARBALL=`basename $DOWNLOAD_URL`
+VERSION_DIR="$BASE_DIR/apache-cassandra-$CASSANDRA_VERSION"
+CASSANDRA_DIR="$BASE_DIR/cassandra"
+
+# Check if version is valid
+curl -s --head $DOWNLOAD_URL | head -n 1 | grep "HTTP/1.[01] [23].." > /dev/null # from https://stackoverflow.com/a/2924444
+if [ $? -ne 0 ]; then
+    echo "ERROR: Version $CASSANDRA_VERSION is not valid or cannot download tarball from $DOWNLOAD_URL."
+    exit 2
+fi
+
+# Check if cassandra is already installed
+if [ -d "$VERSION_DIR" ]; then
+    echo "ERROR: version $CASSANDRA_VERSION is already installed."
+    exit 2
+fi
+
+# Check if environment file is present
+if [ ! -f ".env" ]; then
+    echo "Environment file \".env\" is not present. Make a copy of env.example and adjust parameters."
+    exit 2
+fi
+
+echo "Installing cassandra $CASSANDRA_VERSION on $VERSION_DIR"
+
+# Download tarball
+run_cmd "wget $DOWNLOAD_URL"
+
+# Unpack tarball
+run_cmd "tar xvzf $TARBALL -C $BASE_DIR"
+
+# Make /opt/cassandra point to the current version
+run_cmd "ln -snf $VERSION_DIR $CASSANDRA_DIR"
+
+# Fix cassandra logdir to use $CASSANDRA_LOG_DIR variable (necessary on 2.X series)
+run_cmd "sed -i 's/\$CASSANDRA_HOME\/logs/\$CASSANDRA_LOG_DIR/' $CASSANDRA_DIR/bin/cassandra"
+
+# Set CASSANDRA_HOME, CASSANDRA_CONF and CASSANDRA_LOG_DIR on /opt/cassandra/cassandra.in.sh
+run_cmd "sed '/.*limitations under the License.*/a CASSANDRA_HOME=$CASSANDRA_DIR\nCASSANDRA_CONF=$CASSANDRA_DIR/conf\nCASSANDRA_LOG_DIR=$LOG_DIR' \
+$CASSANDRA_DIR/bin/cassandra.in.sh > $CASSANDRA_DIR/cassandra.in.sh"
+
+# Set cassandra storage dir
+ESCAPED_STORAGE_DIR=`echo $STORAGE_DIR| sed 's/\//\\\\\//g'`
+run_cmd "sed -i 's/cassandra_storagedir=.*/cassandra_storagedir=\"$ESCAPED_STORAGE_DIR\"/g' $CASSANDRA_DIR/cassandra.in.sh"
+
+# Remove tarball
+run_cmd "rm $TARBALL"
+
+# If the cassandra user exists, cassandra was installed previously, so the system configuration below is not needed.
+if id -u cassandra > /dev/null 2>&1; then
+    echo "Cassandra installed, will now configure."
+    ./configure_cassandra.sh $INITIAL_PARAMS
+    exit 0
+fi
+
+# System configuration (only on first install)
+
+# Install dependencies (From https://cassandra.apache.org/doc/latest/getting_started/installing.html)
+run_cmd "apt install -y openjdk-8-jdk"
+run_cmd "apt install -y ntp"
+run_cmd "apt install -y python2"
+run_cmd "ln -snf /usr/bin/python2 /usr/bin/python"
+
+# Add cassandra user if it does not exist
+run_cmd "useradd cassandra"
+
+# Install nodetool on /usr/local/bin
+run_cmd "ln -snf $CASSANDRA_DIR/bin/nodetool /usr/local/bin/nodetool"
+
+# Install cqlsh on /usr/local/bin
+run_cmd "ln -snf $CASSANDRA_DIR/bin/cqlsh /usr/local/bin/cqlsh"
+
+# Create storage and log dirs and make cassandra user own them
+run_cmd "mkdir -p $STORAGE_DIR"
+run_cmd "mkdir -p $LOG_DIR"
+run_cmd "chown -R cassandra:cassandra $STORAGE_DIR"
+run_cmd "chown -R cassandra:cassandra $LOG_DIR"
+
+# Recommended production settings (from https://docs.datastax.com/en/cassandra-oss/3.0/cassandra/install/installRecommendSettings.html)
+
+# Apply sysctl settings (sysctl -p without a file argument only reads /etc/sysctl.conf)
+run_cmd "cp cassandra-sysctl.conf /etc/sysctl.d/"
+run_cmd "sysctl -p /etc/sysctl.d/cassandra-sysctl.conf"
+
+# Disable swap
+run_cmd "swapoff --all"
+run_cmd "sed -i '/swap/d' /etc/fstab"
+
+# Install cassandra service
+run_cmd "cp cassandra-settings.service /usr/lib/systemd/system/"
+run_cmd "cp cassandra.service /usr/lib/systemd/system/"
+run_cmd "systemctl daemon-reload"
+run_cmd "systemctl start cassandra-settings.service"
+run_cmd "systemctl enable cassandra-settings.service"
+
+echo "Cassandra installed, will now configure"
+./configure_cassandra.sh $INITIAL_PARAMS
+
+echo "Cassandra $CASSANDRA_VERSION successfully installed and configured. A reboot is required to apply all system changes."
diff --git a/scripts/cassandra/repair_node.sh b/scripts/cassandra/repair_node.sh
new file mode 100755
index 00000000..12c6136f
--- /dev/null
+++ b/scripts/cassandra/repair_node.sh
@@ -0,0 +1,53 @@
+#!/bin/sh
+#
+# based on https://stackoverflow.com/a/7755563
+#
+## Usage: repair_node.sh USERNAME PASSWORD NODE_PRIVATE_IP
+##
+## Repairs all ranges of this node
+##
+## Options:
+##   -h, --help    Display this message.
+##
+
+usage() {
+    [ "$*" ] && echo "$0: $*"
+    sed -n '/^##/,/^$/s/^## \{0,1\}//p' "$0"
+    exit 2
+} 2>/dev/null
+
+parse_args() {
+    CONF_DIR=/opt/cassandra/conf
+    while [ $# -gt 0 ]; do
+        case $1 in
+        (-h|--help) usage 2>&1;;
+        (--) shift; break;;
+        (-*) usage "$1: unknown option";;
+        (*) break;;
+        esac
+        shift
+    done
+    if [ $# != 3 ]; then
+        echo "Wrong number of arguments: $#"
+        usage
+    fi
+    USERNAME=$1
+    PASSWORD=$2
+    IP=$3
+}
+
+parse_args "$@"
+
+SIZE_ESTIMATES_CSV="size_estimates.csv"
+REPAIR_SCRIPT="repair.sh"
+
+# Get the node's primary ranges via the system.size_estimates table
+cqlsh -u "$USERNAME" -p "$PASSWORD" -e "COPY system.size_estimates to '$SIZE_ESTIMATES_CSV'" "$IP"
+awk -F',' '{ print "nodetool repair -st "$3" -et "$4 }' "$SIZE_ESTIMATES_CSV" | sort | uniq > $REPAIR_SCRIPT
+rm $SIZE_ESTIMATES_CSV
+
+echo "Will start repair"
+bash -x $REPAIR_SCRIPT
+
+echo "Completed repair of `wc -l < $REPAIR_SCRIPT` ranges successfully."
+rm $REPAIR_SCRIPT
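The range extraction in `repair_node.sh` can be tried without a running cluster. The sketch below wraps the same `awk` step in a function and feeds it a hand-written CSV. It assumes the exported columns come out in the order `keyspace_name,table_name,range_start,range_end,...`, which is what the `$3` and `$4` fields in the script imply. The function name `ranges_to_repair_cmds` and the sample rows are hypothetical.

```shell
#!/bin/sh
# Sketch: turn an exported size_estimates CSV into unique repair commands.
# Assumed column order: keyspace_name,table_name,range_start,range_end,...
ranges_to_repair_cmds() {
    # Several tables share the same token range, so duplicates are dropped.
    awk -F',' '{ print "nodetool repair -st " $3 " -et " $4 }' "$1" | sort -u
}

# Example with made-up rows (two tables sharing one range, plus a second range):
#   printf '%s\n' 'ks,t1,-100,200,64,10' 'ks,t2,-100,200,32,5' 'ks,t1,200,-100,64,10' > sample.csv
#   ranges_to_repair_cmds sample.csv
```

Printing the commands first, rather than running them straight away, keeps the step inspectable in the same spirit as the dry-run mode of the install and configure scripts.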