v5.0.0: backup/restore apps, overhaul TTS, add new node to existing cluster + 🐛 fixes #210
Merged
jessebot
changed the title
Feature: restore app + some 🐛 fixes
Features: restore app, add new node to existing cluster + some 🐛 fixes
Apr 14, 2024
jessebot
changed the title
Features: restore app, add new node to existing cluster + some 🐛 fixes
Features: restore app, add new node to existing cluster + 🐛 fixes
Apr 19, 2024
jessebot
changed the title
Features: restore app, add new node to existing cluster + 🐛 fixes
Features: restore app, overhaul TTS, add new node to existing cluster + 🐛 fixes
Apr 19, 2024
jessebot
changed the title
Features: restore app, overhaul TTS, add new node to existing cluster + 🐛 fixes
New Major Version: backup/restore apps, overhaul TTS, add new node to existing cluster + 🐛 fixes
May 9, 2024
… more restore job checking to fail
…smol_k8s_lab.utils.value_from.extract_secret
… change verison back to v5
just some docs to do, and then we're all set :D
jessebot
changed the title
New Major Version: backup/restore apps, overhaul TTS, add new node to existing cluster + 🐛 fixes
v5.0.0: backup/restore apps, overhaul TTS, add new node to existing cluster + 🐛 fixes
May 15, 2024
jessebot
added a commit
that referenced
this pull request
May 15, 2024
… cluster + 🐛 fixes (#210) * actually apply the seaweedfs appset after restoring the seaweedfs PVCs * fix subproc calls for recovery job checking * refine postgres recovery job checking a bit more * add some color to success and failure reporting in the logs and allow more restore job checking to fail * drastically simply how we check that the recovery job is done by just waiting on it * put restore into it's own tab * make sure if we can't get the deployment immediately for nextcloud, we keep trying * add restic to required docs * fix cap header for matrix * fixing matrix restores and parametizing more of nextcloud restores * switch to using argocd as an object * updating poetry lock file * fix typo of recusion to recursion * start fleshing out the new backup button to do restic pvc backups * add cnpg backups to default supported backups * rig up backup button for on demand backups ❇️ * databases exist outside of nextcloud * simplify nextcloud occ commands in backup.py * change name from postgresql to postgres * overhaul of backups and restores. backups via the tui should work for nextcloud now * move value_from function out of tui widget and into general utils as smol_k8s_lab.utils.value_from.extract_secret * update value from function to do more error checking * use backups section instead of secret keys but still update secret keys in appset secret plugin secret * move repetitive backup processing to value_from lib and have nextcloud use it * setup backups and restores for matrix * finish up initial setting up of backup and restore functions for zitadel, mastodon, matrix, nextcloud, and home assistant * catch error of unable to get serverVersion when docker is not enabled and the cluster is k3d or kind. 
we now log an error and suggest enabling docker but set platform and version to unkown * should be serverVersion not semverVersion * generate audio for macos 64 bit arm, and unknown cluster versions * make backup jobs unique by giving them timestamps * update poetry lock * finally found the perfect kubectl cmd to make backup button finish the backup button wasn't finishing because the job wait command was timing out. set timeout to 15m because backup can take a really long time depending on how much data you are backing up and what your connection speeds are * split off trigger_backup into its own worker method * change where we we declare the hostname for zitadel * fix up disabling of displayed rows for restore widget * fix home assistant header and vouch's too, but also clean up unused keycloak stuff in vouch * fix display of snapshot grid on start if we have restores disabled * cache getting restore_enabled and snapshots out of dicts into self for restores widget * change RestoreAppConfig to RestoreApp and change references to restic_snapshot_ids to snapshots within that widget * update backup tooltips and try to speed up mounting * never print output from create secret unless there's an error * fix variable names for vouch and comment out more keycloak stuff * update how we do zitadel headers so we talk about explicitly syncing vs setting up zitadel * quietly do backups in the background via the backup widget * display an orange loading indicator while we do the backup in the background * fix more places where we don't need spinner if this is called from the tui * fix notfiy spacing and add tooltip to loading indicator for backups in tui * speed up input widget a tad * clean up names of smol-k8s-lab generated backups and further clean up backup notifications in tui * fix color of header rows * fix schedule name input for backup widget * fix OAuth typo * add backup credentials to default generated home assistant credentials * fix home assistant s3 backups 
credentials * fix tool tips for s3 backups section in tui to display key instead of value * catch issue where sometimes cnpg restore is not possible at all * fix issue where we were using _ instead of - for home assistant backups and restores * catch more issues with _ vs - * create home assistant pvc with new pvc capacity * update constants for smol-tts to output audio to a config directory * allow for running using integrated macos gpu when on arm64 machine types, else, check for cuda, and if not cuda use cpu for torch * update poetry.lock for a mac * only generate audio file if the old one doesn't exist * update smol-tts to do more checking before regenerating an mp3 * fix underscore to hyphen issue, again, with home assistant * fix restic repo password for prcoess backup vals func * fix backup schedule appset secret plugin updates * always apply the external secrets for home assistant restores * fix allt he references to external_secrets_appset.yaml to be external_secrets_argocd_appset.yaml * udpate to use pyglet instead of playsound or playaudio * update poetry lock * switching to pyglet everywhhere * add a plain non-k8up restic restore job and a recreate_pvc function to share between that and the seaweedfs pvc creation * add timestamps to restore jobs and mount_path to plain restic restore job * add a wait section to restore plain restic job function * reload home assistant deployment after we restore it and template out the home assistant namespace for restores * allow always restoring home assistant, even if it's already installed and running * optimize getting deployments and pod names and always use defined argocd namespace for appset secret plugin * fix create_resitc_restore_job typo to be create_restic_restore_job * switch to using sync argocd app instead of refreshing deployment for appset secret plugin * need to pass in HOME to get restic snapshots, need to pass in namespace to put the restore job in the right place * fix where we get home assistant 
namespace and fix occurances of tolerations_ to be toleration_ for all variables * adding namespace to getting pod names and making sure to not get list index of pods unless it list is populated * fix where we get sensitive values, and make sure we get restic_repo_password with a default value * switching to pygame for audio because nothing else is consistent * debian: verify pygame is now working appropriately for audio in the tui * remove commented cruft * add delete app button * fix delete button spacing * fix restore_seaweedfs call for nextcloud and allow rollout check to fail * add some more logging for syncing and deleting apps * add some error catching for if we can't find a nextcloud pod, and use our K8s lib for getting the pod * add some more logging and checking around restores and use re-usable function for restoring app PVCs for matrix and nextcloud * catch issue where sometimes a snapshot ID is only numbers, so we convert the int to a str * restores: label values file with app name, reuse barman object for cnpg restore, remove trailing slash from s3 bucket destination for cnpg * name the cnpg cluster the same as the end result when recovering * allow anything with postgres-cluster to grab the cnpg-cluster targetRevision from argocd * don't require getting pod to finish with return for nextcloud, add a timeout of 30 minutes to the postgres restore job * allow extra labels for getting pod name * update how we fix maintainence mode for nextcloud after restore * make recovery backup and scheduled backup sections for cnpg {} instead of [] and use copy of barman_obj for recovery * fix incorrect username used for restores of cnpg * clean up unused values for cnpg operator * simplify the restore dict updates after restore for cnpg cluster * try installing alsa for linux ci * add docs about installing alsa on debian * attempt to get alsa working via ci * only mess with secrets if matrix's restore is enabled * fix post restore job for cnpg * fix matrix namespace 
declaration * simplify updating matrix pvc during restores by templating the pvc name * set externalClusters to [] after restore of cnpg cluster * add wal parallel back to backups and compress the restore dict a bit * adding gzip and maxparallel 8 for wal archive for cnpg restores untested with matrix or nextcloud * try just disabling mixer if audio device not enabled * add log message of no audio device found * remove gzip from wal archives for cnpg restores * adjust wall archives to be 4 at once instead of 8 * max said they would order pizza when this was working :fingers_crossed: * move minio_lib to utils and add get_object and list_object methods, then make sure we pass in the backup id to restores * always make sure the final wal archive is there for backups of cnpg databases * update ArgoCD to have optional k8s requirement * add backup credentials getting * always grab the s3 endpoint if cnpg restores are enabled * only use ArgoCD in apps_screen if this is an existing cluster * fix namespace missing from backup * add .decode('utf-8') to get str of pgsql s3 creds * don't show backup now button unless this is an existing cluster * update backups to always check for end wal file for cnpg, and clean up backups tui * check in attempts to make restores work again * wait an additional 30 seconds on that backup just in case * wait for s3 to be up before applying recovery job for cnpg operator, and always download the backup.info * retry syncs if they fail * immediately install the argocd appset plugin before argo is fully managed by itself * update install the argocd appset plugin * add more logging for restores and call it restore_cnpg_cluster instead of restore_postgresql * maybe fix appset secret plugin url * fix missing updates of s/restore_postgresql/restore_cnpg_cluster/ * updating argocd appset plugin to create the argocd project ahead of time * actually break out of loop checking for s3 being up for cnpg restores * make sure to get the correct source_repos 
for the project, and properly template all the namespaces too * fix unexpected key error for vouch * switching back to immediate restore and adding more backup safety checks * change how we do backups for cnpg to always wait till we can consistently get the correct wal, this time for real, we hope * add a bit of a pause between checks for the backup.info file in s3 for cnpg restores * use new kwargs format for helm class * accomadate postgres and pvc schedule settings * switch from s3_user and s3_pass to secretAccessKey and accessKeyId * finish up standardizing s3 credentials * don't check in any logs we generate locally * simply restores everywhere and always take postgres schedule for restores as a variable * add a basic wait command for kubernetes and make sure we wait for seaweedfs to be fully up before continuing restore process * fix default config to run postgres backups at midnight and file backups at midnight ten * keep trying after a wait fails to find resources for smol-k8s-lab * allow waits to fail for k8s and set loglevel to warn for argocd app wait/sync * try to fix calls to helm lib * add comment about what we're doing in argo setup func * don't show the hello from pygame message * always ignore the main en and nl dirs * always ignore the full audio files * update appimage creation process * linting and commenting * add tar and untar commands * print how long it took in both english and dutch, change order of checking which options were passed in for smol-tts * add a keys section and update unknown verison text * add project name for argocd tests * add argocd_config['argo']['cluster'] for all the ci tests * speed up init values loading by swiching away from a collapsible * clean up colors * change green to explicit hex * fix mastodon restore error of too many arguments * fix space typo for argocd app sync command * optimize tui loading for apps screen a bit more * add first app is audio for selection list * add first app is audio for selection list * 
cleaning up and refactoring for speed and audio in tui * remove a layer of vertical scroll container for apps screen * update how we deal with unfound audio files; also add additional phrases; also fix the scroll bars and nested containers * tidying help text screenshot * move k9s to be run command and move (and subproc) under utils.run also do minor refactor of both smol-k8s-lab and tui-config screens * clean up run command some more and upgrade textual * cleaning up screenshots * finally finish up final_command styling and option selection * change all - commands to have spaces instead for run_command * actually insert the final command * fix option evaluation for final run command * fix ci tests to include final command test and make sure we accept same window as option for window behavior * update credentials screen sizing and screenshot * change size of apps config modify globals button * add id to modify globals button so we can use it for tcss queries. add new screnshot of apps screen * update apps screen screenshot again * cleaning up a bit * more cleaning * update existing clsuter screenshot * add new start screen with existing cluster example * update start screen screenshot * add better logging password config and run command screenshot * fix modal screen buttons for some font types * update new node widget and screen * adding new screenshots for new node widget and new nodes screen * add modify global params modal screen screenshot * update make screenshots script * add modify node modal screen * tidy up the audio for node modification screens * add delete node modal screenshot * docs: add new apps screen screenshots, linting, replace jessebot with small-hack org * linting and updating descriptions * update the add remote node screenshot and alt text * update tui screenshots and config file examples * add cluster parameter to all apps and change ref to revision anywhere that was left over * update the backup sections of all the backup supported apps and 
also all the sensitive values for all the supported apps, and update the libraries and format of default landing page * do a minor clean up of all the experimental apps * add new input names for k3d and kind node inputs for audio * finish up generating audio for all of the distro screen for both kind and k3d * update networking tab audio to be 'networking options tab' * add backup and restore tabs for audio generation * add some more phrases for backups * fix saying app bug * fix how we say PVC * fix more backups input audio * more troubleshooting of restic repo password audio * fix restic repo input audio generation * update s3 configuration collapsible audio generation * regenerate many input fields audio * add more input to the ends of things * update audio widget to process node datatable and always say input after input id is read * add button as a default thing we say and remove button from ending of all other phrases * add button say method * update screen descriptions for config screens * remove word button from focused so we don't try to say it twice * switch to saying drop down menu if we find a select * switch to special switch method * add switch phrase * try to say split better and add window behavior select phrase * clean up more input fields to reduce words needed * add some more links for accessibility * adding the audio files finally * change all the refs of feature branch back to regular main branch and change verison back to v5 * switch from valueFrom to value_from to be consistent * update docs for both nextcloud and matrix backup and restores * add a basic roadmap * update help image * add more roadmap stuff * prep for appimage test * add logo for smol-k8s-lab, why not * updating deps * add latest audio tarball * update appimage config yaml for testing * note that brew is still wonky and disable generating audio on tag * update home assistant and zitadel backups and restores and clean up typos in matrix and nextcloud
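Two of the commits above describe making backup jobs unique by giving them timestamps and waiting on the job with a 15 minute timeout. A minimal sketch of that idea follows. The function names `unique_job_name` and `job_wait_command` and the exact name format are hypothetical and are not taken from smol-k8s-lab; only the `kubectl wait` flags shown are standard kubectl.

```python
from datetime import datetime, timezone
from typing import List, Optional


def unique_job_name(app: str, now: Optional[datetime] = None) -> str:
    """Build a job name that is unique per run by appending a UTC timestamp.

    Hypothetical helper: the real project may format its job names differently.
    """
    now = now or datetime.now(timezone.utc)
    return f"{app}-backup-{now.strftime('%Y-%m-%d-%H-%M-%S')}"


def job_wait_command(job: str, namespace: str, timeout: str = "15m") -> List[str]:
    """Return a kubectl command that blocks until the job completes.

    The default of 15m mirrors the timeout mentioned in the commit message,
    since a backup can take a long time depending on data size and bandwidth.
    """
    return [
        "kubectl", "wait", f"job/{job}",
        "--namespace", namespace,
        "--for=condition=complete",
        f"--timeout={timeout}",
    ]
```

The returned list can be passed to `subprocess.run(cmd, check=True)` on a machine with cluster access. Building the command separately from running it keeps the logic easy to test without a cluster.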
Features
Sensitive Values Overhaul
- `value_from` map for both `apps.APP_NAME.init.values` and any value under `apps.APP_NAME.backup`
- `apps.APP_NAME.init.sensitive_values` list has been removed
NOTE: the TUI doesn't support setting sensitive values via `value_from` right now. It will just pull your sensitive value and change it to dots. This feature will come at a later date.
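
To illustrate the idea of a `value_from` map, here is a minimal sketch of resolving one section of config. It assumes a `value_from` map with an `env` key that names an environment variable; that key and the helper names `resolve_value` and `resolve_section` are assumptions for illustration and are not the project's `smol_k8s_lab.utils.value_from.extract_secret`.

```python
import os
from typing import Any, Dict


def resolve_value(entry: Any) -> Any:
    """Return a plain value as-is, or look it up when given a value_from map.

    Assumption: the map looks like {"value_from": {"env": "VAR_NAME"}}.
    """
    if isinstance(entry, dict) and "value_from" in entry:
        source = entry["value_from"]
        env_var = source.get("env")
        if env_var is None:
            raise ValueError(f"unsupported value_from source: {source}")
        value = os.environ.get(env_var)
        if value is None:
            raise KeyError(f"environment variable {env_var} is not set")
        return value
    return entry


def resolve_section(section: Dict[str, Any]) -> Dict[str, Any]:
    """Resolve every key in a section such as apps.APP_NAME.init.values."""
    return {key: resolve_value(val) for key, val in section.items()}
```

With `DEMO_ADMIN_PASSWORD` exported in the shell, `resolve_section({"admin_user": "admin", "admin_password": {"value_from": {"env": "DEMO_ADMIN_PASSWORD"}}})` returns the same dict with the password filled in from the environment, so the secret never has to be written into the config file.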
Backups and Restores
- currently only supported for a handful of apps (nextcloud, matrix, mastodon, home assistant, and zitadel), but more coming soon!
- StorageClassName called `apps_global_config.pvc_storage_class` in the yaml (used by default for nextcloud and matrix right now)
Overhaul the text to speech features to be their own widget called SmolAudio
- `smol_k8s_lab.tui.accessibility.text_to_speech.speech_program`
- creates a `config/audio/en.yaml` and `config/audio/nl.yaml` for custom titles, descriptions, phrases, and common words
- `smol-tts` for generating text to speech audio files for each language
New nodes for k3s after cluster is already up
Release process
We will now attempt to release an appimage each time we release :) This will assist in ensuring the brew install goes more smoothly in the future. This has never been done for this project before, so we expect some initial growing pains on this. Please be patient as we get a consistent appimage.
Misc
- 3.12!
- `run_command` section that allows you to run commands either during smol-k8s-lab's app config phase, or after it. It looks like this:
Bug Fixes
- `BW_HOST`, `BW_SESSION`
- when `apps_global_config.external_secrets` was set to bitwarden, the bitwarden credentials were requested even if the password manager was disabled AND the external secrets operator app was disabled
Misc changes
outstanding tasks
This PR will be merged in conjunction with: small-hack/argocd-apps#695