Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure that binary logs for PITR are in a shared directory #541

Open
wants to merge 27 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 13 commits
Commits
Show all changes
27 commits
Select commit Hold shift + click to select a range
d259730
Ensure that binary logs for PITR are restored to a shared location
mattlord Mar 12, 2024
e2e5e8b
Add ability to specify flags for mysqlctld
mattlord Mar 16, 2024
9992074
Merge remote-tracking branch 'origin/main' into point-in-time-recovery
mattlord Mar 26, 2024
10a9524
It's a tablet flag not a mysqlctld one
mattlord Mar 26, 2024
af6ebcb
Merge remote-tracking branch 'origin/main' into point-in-time-recovery
mattlord Mar 26, 2024
48ba49f
Move the flag handling code
mattlord Mar 27, 2024
411993c
Only set flag on 2.12+
mattlord Mar 27, 2024
79bf28d
Add PITR restore to backup test
mattlord Mar 27, 2024
8277844
Use recent vtctldclient
mattlord Mar 27, 2024
9d2b978
Use different date command
mattlord Mar 27, 2024
db311ed
Use different date format
mattlord Mar 27, 2024
63abf80
Test incremental backup and restore
mattlord Mar 27, 2024
425b725
WiP
mattlord Mar 28, 2024
75c701d
attempt to fix tests
shlomi-noach Mar 28, 2024
f09d7be
more debug output
shlomi-noach Mar 28, 2024
7716135
more debug info
shlomi-noach Mar 28, 2024
3452359
list backups as debug step
shlomi-noach Mar 28, 2024
b1a0cb3
GetBackups
shlomi-noach Mar 28, 2024
919119b
Get test working locally
mattlord Mar 28, 2024
49b9332
Ignore vtdataroot path created when running local tests
mattlord Apr 10, 2024
5c809cc
Merge remote-tracking branch 'origin/main' into point-in-time-recovery
mattlord Apr 10, 2024
2aeb36b
Fix backup test; use percona image
mattlord Apr 10, 2024
573b44f
Merge remote-tracking branch 'origin/main' into point-in-time-recovery
mattlord Apr 30, 2024
2f7768a
Use /vt/bin for mysqlbinlog and add it to PATH
mattlord May 1, 2024
34b93ad
Merge remote-tracking branch 'origin/main' into point-in-time-recovery
mattlord Jul 12, 2024
ce28c33
Add mysqlbinlog to $VTROOT/bin in backup pods
mattlord Jul 12, 2024
5e7df2b
Testing...
mattlord Jul 12, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
43 changes: 43 additions & 0 deletions docs/api/index.html
Original file line number Diff line number Diff line change
Expand Up @@ -1410,6 +1410,49 @@ <h3 id="planetscale.com/v2.LockserverStatus">LockserverStatus
</tr>
</tbody>
</table>
<h3 id="planetscale.com/v2.MysqlctldSpec">MysqlctldSpec
</h3>
<p>
<p>MysqlctldSpec configures the local mysqlctld gRPC server within a tablet.</p>
</p>
<table class="table table-striped">
<thead class="thead-dark">
<tr>
<th>Field</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>
<code>resources</code></br>
<em>
<a href="https://v1-18.docs.kubernetes.io/docs/reference/generated/kubernetes-api/v1.18/#resourcerequirements-v1-core">
Kubernetes core/v1.ResourceRequirements
</a>
</em>
</td>
<td>
<p>Resources specify the compute resources to allocate for just the MySQL Control Daemon.</p>
</td>
</tr>
<tr>
<td>
<code>extraFlags</code></br>
<em>
map[string]string
</em>
</td>
<td>
<p>ExtraFlags can optionally be used to override default flags set by the
operator, or pass additional flags to mysqlctld. All entries must be
key-value string pairs of the form &ldquo;flag&rdquo;: &ldquo;value&rdquo;. The flag name should
not have any prefix (just &ldquo;flag&rdquo;, not &ldquo;-flag&rdquo;). To set a boolean flag,
set the string value to either &ldquo;true&rdquo; or &ldquo;false&rdquo;.</p>
</td>
</tr>
</tbody>
</table>
<h3 id="planetscale.com/v2.MysqldExporterSpec">MysqldExporterSpec
</h3>
<p>
Expand Down
13 changes: 13 additions & 0 deletions pkg/apis/planetscale/v2/vitessshard_types.go
Original file line number Diff line number Diff line change
Expand Up @@ -315,6 +315,19 @@ type MysqldSpec struct {
ConfigOverrides string `json:"configOverrides,omitempty"`
}

// MysqlctldSpec configures the local mysqlctld gRPC server within a tablet.
type MysqlctldSpec struct {
// Resources specify the compute resources to allocate for just the MySQL Control Daemon.
Resources corev1.ResourceRequirements `json:"resources"`

// ExtraFlags can optionally be used to override default flags set by the
// operator, or pass additional flags to mysqlctld. All entries must be
// key-value string pairs of the form "flag": "value". The flag name should
// not have any prefix (just "flag", not "-flag"). To set a boolean flag,
// set the string value to either "true" or "false".
ExtraFlags map[string]string `json:"extraFlags,omitempty"`
}
Comment on lines +318 to +329
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless I am missing something, this does not seem to be used anywhere. I think we should remove it to avoid unrequired changes

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll remove it if it indeed is not needed after I do some testing to confirm that things are now working.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Coming back on what I said. I think it would be nice to keep that around so that people can use custom mysqlctld flags.


// MysqldExporterSpec configures the local MySQL exporter within a tablet.
type MysqldExporterSpec struct {
// Resources specify the compute resources to allocate for just the MySQL Exporter.
Expand Down
23 changes: 23 additions & 0 deletions pkg/apis/planetscale/v2/zz_generated.deepcopy.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

3 changes: 2 additions & 1 deletion pkg/operator/vttablet/mysqlctld.go
Original file line number Diff line number Diff line change
Expand Up @@ -29,6 +29,7 @@ const (
vtRootInitScript = `set -ex
mkdir -p /mnt/vt/bin
cp --no-clobber /vt/bin/mysqlctld /mnt/vt/bin/
cp --no-clobber $(command -v mysqlbinlog) /mnt/vt/bin/ || true
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The directory /mnt/.../ is shared across all the containers in the pod I am assuming? In which case, this line would resolve what you wrote in the PR's description:

/tmp is not a shared mount point within the pod so when mysqlbinlog subsequently tries to read them from within the mysqld container it cannot find them in its container's /tmp directory and it fails with an error

Copy link
Collaborator Author

@mattlord mattlord Mar 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is simply about copying the mysqlbinlog binary from the vitess/lite container image to the mysqlctld/vtbackup container (if it's not already there), as it looks like we'll need to keep that around in the lite image because the MySQL images do not contain that binary and it's needed for PITR.

mkdir -p /mnt/vt/config
if [[ -d /vt/config/mycnf ]]; then
cp --no-clobber -R /vt/config/mycnf /mnt/vt/config/
Expand Down Expand Up @@ -68,7 +69,7 @@ func init() {

securityContext := &corev1.SecurityContext{}
if planetscalev2.DefaultVitessRunAsUser >= 0 {
securityContext.RunAsUser = pointer.Int64Ptr(planetscalev2.DefaultVitessRunAsUser)
securityContext.RunAsUser = pointer.Int64(planetscalev2.DefaultVitessRunAsUser)
}

// Use an init container to copy only the files we need from the Vitess image.
Expand Down
23 changes: 20 additions & 3 deletions pkg/operator/vttablet/pod.go
Original file line number Diff line number Diff line change
Expand Up @@ -87,6 +87,23 @@ func UpdatePod(obj *corev1.Pod, spec *Spec) {
key = strings.TrimLeft(key, "-")
vttabletAllFlags[key] = value
}
// Ensure that binary logs are restored to/from a location that all containers
// in the pod can access if no location was explicitly provided.
if _, ok := vttabletAllFlags["builtinbackup-incremental-restore-path"]; !ok {
// This flag only exists in 2.12.0 and later as it includes vitess
// 20.0 as a dependency.
/*
parts := strings.Split(version.Version, ".")
if len(parts) > 0 {
major, _ := strconv.Atoi(parts[0])
minor, _ := strconv.Atoi(parts[1])
if major >= 2 && minor >= 12 {
vttabletAllFlags["builtinbackup-incremental-restore-path"] = vtDataRootPath
}
}
*/
vttabletAllFlags["builtinbackup-incremental-restore-path"] = vtDataRootPath
}
mysql.UpdateMySQLServerVersion(vttabletAllFlags, spec.Images.Mysqld.Image())

// Compute all operator-generated env vars first.
Expand All @@ -108,7 +125,7 @@ func UpdatePod(obj *corev1.Pod, spec *Spec) {

securityContext := &corev1.SecurityContext{}
if planetscalev2.DefaultVitessRunAsUser >= 0 {
securityContext.RunAsUser = pointer.Int64Ptr(planetscalev2.DefaultVitessRunAsUser)
securityContext.RunAsUser = pointer.Int64(planetscalev2.DefaultVitessRunAsUser)
}

vttabletLifecycle := &spec.Vttablet.Lifecycle
Expand Down Expand Up @@ -315,13 +332,13 @@ func UpdatePod(obj *corev1.Pod, spec *Spec) {
obj.Spec.SecurityContext = &corev1.PodSecurityContext{}
}
if planetscalev2.DefaultVitessFSGroup >= 0 {
obj.Spec.SecurityContext.FSGroup = pointer.Int64Ptr(planetscalev2.DefaultVitessFSGroup)
obj.Spec.SecurityContext.FSGroup = pointer.Int64(planetscalev2.DefaultVitessFSGroup)
}

if spec.Vttablet.TerminationGracePeriodSeconds != nil {
obj.Spec.TerminationGracePeriodSeconds = spec.Vttablet.TerminationGracePeriodSeconds
} else {
obj.Spec.TerminationGracePeriodSeconds = pointer.Int64Ptr(defaultTerminationGracePeriodSeconds)
obj.Spec.TerminationGracePeriodSeconds = pointer.Int64(defaultTerminationGracePeriodSeconds)
}

// In both the case of the user injecting their own affinity and the default, we
Expand Down
9 changes: 7 additions & 2 deletions pkg/operator/vttablet/vtbackup_pod.go
Original file line number Diff line number Diff line change
Expand Up @@ -122,18 +122,23 @@ func NewBackupPod(key client.ObjectKey, backupSpec *BackupSpec, mysqldImage stri

podSecurityContext := &corev1.PodSecurityContext{}
if planetscalev2.DefaultVitessFSGroup >= 0 {
podSecurityContext.FSGroup = pointer.Int64Ptr(planetscalev2.DefaultVitessFSGroup)
podSecurityContext.FSGroup = pointer.Int64(planetscalev2.DefaultVitessFSGroup)
}
securityContext := &corev1.SecurityContext{}
if planetscalev2.DefaultVitessRunAsUser >= 0 {
securityContext.RunAsUser = pointer.Int64Ptr(planetscalev2.DefaultVitessRunAsUser)
securityContext.RunAsUser = pointer.Int64(planetscalev2.DefaultVitessRunAsUser)
}

var containerResources corev1.ResourceRequirements
// Make a copy of Resources since it contains pointers.
update.ResourceRequirements(&containerResources, &tabletSpec.Mysqld.Resources)

vtbackupAllFlags := vtbackupFlags.Get(backupSpec)
// Ensure that binary logs are restored to/from a location that all containers
// in the pod can access if no location was explicitly provided.
if _, ok := vtbackupAllFlags["builtinbackup-incremental-restore-path"]; !ok {
vtbackupAllFlags["builtinbackup-incremental-restore-path"] = vtDataRootPath
}
mysql.UpdateMySQLServerVersion(vtbackupAllFlags, mysqldImage)
pod := &corev1.Pod{
ObjectMeta: metav1.ObjectMeta{
Expand Down
1 change: 1 addition & 0 deletions test/endtoend/backup_restore_test.sh
Original file line number Diff line number Diff line change
Expand Up @@ -107,6 +107,7 @@ verifyVtGateVersion "20.0.0"
checkSemiSyncSetup
takeBackup "commerce/-"
verifyListBackupsOutput
restoreBackup "$(vtctldclient GetTablets --keyspace commerce --tablet-type replica --shard '-' | head -1 | awk '{print $1}')"
takedownShard
resurrectShard
checkSemiSyncSetup
Expand Down
73 changes: 67 additions & 6 deletions test/endtoend/utils.sh
Original file line number Diff line number Diff line change
Expand Up @@ -83,24 +83,85 @@ function removeBackupFiles() {

# takeBackup:
# $1: keyspace-shard for which the backup needs to be taken
declare INCREMENTAL_RESTORE_TIMESTAMP=""
function takeBackup() {
keyspaceShard=$1
initialBackupCount=$(kubectl get vtb --no-headers | wc -l)
finalBackupCount=$((initialBackupCount+1))

# issue the backupShard command to vtctldclient
vtctldclient BackupShard "$keyspaceShard"
# Issue the BackupShard command to vtctldclient.
vtctldclient BackupShard "${keyspaceShard}"

for i in {1..600} ; do
out=$(kubectl get vtb --no-headers | wc -l)
echo "$out" | grep "$finalBackupCount" > /dev/null 2>&1
if [ $? -eq 0 ]; then
echo "Backup created"
if echo "${out}" | grep -c "${finalBackupCount}" >/dev/null; then
echo "Full backup created"
break
fi
sleep 3
done

sleep 10

mysql 'commerce:-' -e "flush binary logs" || echo "flushing binlogs failed"
mattlord marked this conversation as resolved.
Show resolved Hide resolved

# Now perform an incremental backup.
insertWithRetry
INCREMENTAL_RESTORE_TIMESTAMP=$(date -u "+%Y-%m-%dT%H:%M:%SZ")
sleep 1

mysql 'commerce:-' -e "flush binary logs" || echo "flushing binlogs failed"

sleep 10

vtctldclient BackupShard --incremental-from-pos=auto "${keyspaceShard}"
let finalBackupCount=${finalBackupCount}+1

mysql 'commerce:-' -e "flush binary logs" || echo "flushing binlogs failed"

for i in {1..600} ; do
out=$(kubectl get vtb --no-headers | wc -l)
if echo "${out}" | grep -c "${finalBackupCount}" >/dev/null; then
echo "Incremental backup created"
return 0
fi
sleep 3
done
echo -e "ERROR: Backup not created - $out. $backupCount backups expected."

echo -e "ERROR: Backups not created - ${out}. ${backupCount} backups expected."
exit 1
}

# restoreBackup:
# $1: tablet alias for which the backup needs to be restored
function restoreBackup() {
tabletAlias=$1
if [[ -z "${tabletAlias}" ]]; then
echo "Tablet alias not provided as restore target"
exit 1
fi

# Issue the PITR restore command to vtctldclient.
# This should restore the last full backup, followed by applying the
# binary logs to reach the desired timestamp.
if ! vtctldclient RestoreFromBackup --restore-to-timestamp "${INCREMENTAL_RESTORE_TIMESTAMP}" "${tabletAlias}"; then
echo "ERROR: failed to perform incremental restore"
exit 1
fi

cell="${tabletAlias%-*}"
uid="${tabletAlias##*-}"

for i in {1..600} ; do
out=$(kubectl get pods --no-headers -l "planetscale.com/cell=${cell},planetscale.com/tablet-uid=${uid}" | grep "Running" | wc -l)
if echo "$out" | grep -c "1" >/dev/null; then
echo "Tablet ${tabletAlias} restore complete"
return 0
fi
sleep 3
done

echo -e "ERROR: restored tablet ${tabletAlias} did not become healthy after the restore."
exit 1
}

Expand Down
5 changes: 3 additions & 2 deletions tools/get-e2e-test-deps.sh
Original file line number Diff line number Diff line change
Expand Up @@ -40,8 +40,9 @@ fi
if ! command -v vtctldclient &> /dev/null
then
echo "Downloading vtctldclient..."
version=16.0.1
file=vitess-${version}-d1ba625.tar.gz
version="19.0.1"
hash="3a43ab8"
file="vitess-${version}-${hash}.tar.gz"
wget https://github.com/vitessio/vitess/releases/download/v${version}/${file}
tar -xzf ${file}
cd ${file/.tar.gz/}
Expand Down
Loading