Fix parser for root (only) dataset names - investigate #619

Merged
merged 6 commits into from
Jan 9, 2024

@jimklimov jimklimov commented Jan 8, 2024

Follows up from #585 and a Gitter discussion https://matrix.to/#/!XZJhhFzueFpkcSXLHR:gitter.im/$gVcghYo80LIz2zIdyJQyfxRYJkyb07mQjoB5bGH5zTU?via=gitter.im&via=inf.ethz.ch&via=matrix.org (and later posts during the day)

The crux of it is that znapzend v0.21.2 (current release) got confused when a backup schedule was defined on a root dataset of a pool AND the custom tsformat included colons to separate hours-minutes-seconds:

[2024-01-08 09:57:54.87309] [154415] [debug] === getDataSetProperties():
        Collected: $VAR1 = {
          'pre_znap_cmd' => 'off',
          'mbuffer' => 'off',
          'enabled' => 'on',
          'src_plan' => '1months=>1weeks,1years=>1months,10years=>6months',
          'mbuffer_size' => '128M',
          'zend_delay' => '0',
          'post_znap_cmd' => 'off',
          'src' => 'bpool',
          'dst_dst_0_plan' => '1months=>1week,1years=>1month,10years=>6months',
          'recursive' => 'on',
          'tsformat' => 'znapzend-auto-%Y-%m-%dT%H:%M:%SZ',
          'dst_dst_0' => 'znapzend:pond/export/DUMP/ci-deb/bpool'
        };

...

[2024-01-08 09:57:54.89115] [154583] [info] creating recursive snapshot on bpool
# ssh -o batchMode=yes -o ConnectTimeout=30 bpool@znapzend-auto-2024-01-08T10 zfs snapshot -r '22:13Z'
ssh: Could not resolve hostname znapzend-auto-2024-01-08t10: Name or service not known
# ssh -o batchMode=yes -o ConnectTimeout=30 22 zfs list -H -o name -t snapshot 13Z

So a request to snapshot the local bpool dataset as bpool@znapzend-auto-2024-01-08T10:22:13Z was interpreted by the generic routine as user=bpool, host=znapzend-auto-2024-01-08T10, dsname=22:13Z, with no snap. For the subsequent command, that dsname was apparently parsed further into host=22 and dsname=13Z.
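
To illustrate the failure mode, here is a minimal Python sketch (not the Perl code from ZFS.pm). The regex and the helper names naive_split and split_user_host are hypothetical. They only assume a generic splitter that treats everything before the first colon as [user@]host whenever no slash precedes that colon.

```python
import re

# Hypothetical stand-in for a generic "[[user@]host:]dataset" splitter.
# NOT the regex from ZFS.pm: it only assumes that text before the first
# colon is taken as the remote part when that text contains no slash.
NAIVE = re.compile(r"^(?:(?P<remote>[^:/]+):)?(?P<dataset>.+)$")


def naive_split(spec):
    """Split spec into (remote, dataset); remote is None when absent."""
    match = NAIVE.match(spec)
    return match.group("remote"), match.group("dataset")


def split_user_host(remote):
    """Split "[user@]host" into (user, host); user is None when absent."""
    user, _, host = remote.rpartition("@")
    return (user or None), host


snapshot = "bpool@znapzend-auto-2024-01-08T10:22:13Z"
remote, dataset = naive_split(snapshot)
print(split_user_host(remote), dataset)
# prints: ('bpool', 'znapzend-auto-2024-01-08T10') 22:13Z

# The leftover "dataset" is split once more by the next call:
print(naive_split(dataset))
# prints: ('22', '13Z')
```

Under this assumed regex, a slash ahead of the first colon (as in rpool/ROOT@znapzend-auto-2024-01-08T10:22:13Z) blocks the remote match, which would be consistent with the problem showing up only for a pool's root dataset.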

This visibly affects sub createSnapshot {...}, sub destroySnapshots {...}, and probably other consumers of sub $splitHostDataSet.
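
Since these consumers go on to run zfs commands with whatever the splitter returned, one defensive option is to validate the parsed pieces against the dataset the caller already knows it is working on. The following Python sketch is only an illustration of that idea: the function check_snapshot_target and its rules are assumptions, not code from ZFS.pm.

```python
def check_snapshot_target(remote, dataset, snap, expected_dataset):
    """Return a list of problems; an empty list means the target looks sane.

    Hypothetical guard for a LOCAL snapshot operation on expected_dataset
    (or, for recursive plans, one of its descendants).
    """
    problems = []
    if remote is not None:
        problems.append(f"unexpected remote part {remote!r} for a local dataset")
    is_same = dataset == expected_dataset
    is_child = dataset.startswith(expected_dataset + "/")
    if not (is_same or is_child):
        problems.append(f"dataset {dataset!r} is not under {expected_dataset!r}")
    if not snap:
        problems.append("snapshot name is missing")
    return problems


# Values as mis-parsed in the log above: refuse to act on them.
bad = check_snapshot_target(
    "bpool@znapzend-auto-2024-01-08T10", "22:13Z", None, "bpool"
)
for problem in bad:
    print("refusing:", problem)

# A correct parse of the same request passes the guard.
good = check_snapshot_target(
    None, "bpool", "znapzend-auto-2024-01-08T10:22:13Z", "bpool"
)
print("problems for correct parse:", good)
```

A caller would skip the create or destroy step whenever the returned list is non-empty, so a parsing bug produces an error message instead of a command run against the wrong target.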

If the "DST" definition remains (per help/man) [[user@]host:]dataset, where dataset is a plain dataset name for DST but may be dsname[@snap] in general parsing (and snap may contain some, but not all, unusual characters), then in some cases we seem to have no good criterion (a regex or otherwise) to tell a user@host prefix apart from part of a dataset@snap string.
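
The ambiguity can be shown without any parser at all. In this small Python sketch, two different sets of components concatenate to the same string. The join_spec helper and the names tank, backup and data are made up for illustration, and the sketch assumes that a colon is acceptable inside a snapshot name, as in the tsformat above.

```python
def join_spec(user=None, host=None, dataset="", snap=None):
    """Concatenate components as [[user@]host:]dataset[@snap]."""
    remote = ""
    if host is not None:
        remote = (f"{user}@" if user is not None else "") + host + ":"
    return remote + dataset + (f"@{snap}" if snap is not None else "")


# Reading 1: local dataset "tank" with snapshot "backup:data".
local_snapshot = join_spec(dataset="tank", snap="backup:data")

# Reading 2: dataset "data" on host "backup", reached as user "tank".
remote_dataset = join_spec(user="tank", host="backup", dataset="data")

print(local_snapshot)
print(remote_dataset)
print("identical:", local_snapshot == remote_dataset)
```

Because both readings serialize to tank@backup:data, a splitter that sees only the string cannot recover both; it needs either extra context from the caller or a stricter naming rule.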

There are various ideas in that thread that can be pursued as separate PRs. This one lays the foundation for them by adding a few run-time sanity checks (e.g. to avoid destructive actions with bogus values) and some self-test code to gauge the success of different solution attempts. A major help in this effort is the t/znapzend-lib-splitter.t script, which calls whatever implementations we have in ZFS.pm and runs them against a matrix of known remote, dataset and snapname strings (concatenated into what can be seen in production from configs and ZFS queries) to see whether they get parsed back properly.
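
The round-trip idea behind such a test matrix can be sketched in Python as follows. This is an analogy, not the contents of t/znapzend-lib-splitter.t: the matrix values (apart from bpool and the timestamped snapshot name taken from the log above), the join_spec and run_matrix helpers, and the candidate_split implementation under test are all hypothetical.

```python
import itertools
import re

# Known components; None means "this part is absent".
REMOTES = [None, "host", "user@host"]
DATASETS = ["bpool", "rpool/ROOT"]
SNAPNAMES = [None, "snap1", "znapzend-auto-2024-01-08T10:22:13Z"]


def join_spec(remote, dataset, snap):
    """Build the string a config or a ZFS query would show."""
    spec = dataset if snap is None else f"{dataset}@{snap}"
    return spec if remote is None else f"{remote}:{spec}"


def candidate_split(spec):
    """Hypothetical splitter under test; returns (remote, dataset, snap)."""
    match = re.match(r"^(?:([^:/]+):)?(.+)$", spec)
    remote, rest = match.group(1), match.group(2)
    dataset, at, snap = rest.partition("@")
    return remote, dataset, (snap if at else None)


def run_matrix(split):
    """Return (spec, expected, actual) for every combination parsed wrongly."""
    failures = []
    for expected in itertools.product(REMOTES, DATASETS, SNAPNAMES):
        spec = join_spec(*expected)
        actual = split(spec)
        if actual != expected:
            failures.append((spec, expected, actual))
    return failures


for spec, expected, actual in run_matrix(candidate_split):
    print(f"FAIL {spec}: expected {expected}, got {actual}")
```

With this candidate, the only failing combination is the root dataset with the colon-bearing snapshot name and no remote. Swapping in another split function makes it easy to compare solution attempts against the same matrix.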

@jimklimov jimklimov changed the title from "Fix parser rootds investigate" to "Fix parser rootds- investigate" on Jan 8, 2024

github-actions bot commented Jan 8, 2024

@check-spelling-bot Report

Unrecognized words, please review:

  • a'u
  • aaaa
  • aaaf
  • aab
  • aabee
  • aabf
  • aac
  • aacba
  • aacc
  • aacea
  • aacf
  • aad
  • aaddeab
  • aadf
  • aae
  • aaea
  • aaeaececace
  • aaec
  • aaed
  • aaf
  • aafd
  • aaff
  • abbae
  • abcc
  • abcd
  • abdbae
  • abe
  • abeda
  • abee
  • abf
  • abfc
  • abff
  • aca
  • acaa
  • acaafa
  • acab
  • acaf
  • acb
  • acbba
  • acbce
  • acbef
  • accabf
  • accb
  • accbf
  • accd
  • acd
  • acdd
  • acddf
  • acea
  • acebe
  • acecd
  • acf
  • ada
  • adab
  • adacfa
  • adb
  • adbaa
  • adbb
  • adbc
  • adbe
  • adcbbd
  • adcce
  • addc
  • addes
  • adedd
  • adeea
  • adf
  • adfd
  • aea
  • aeaba
  • aeabccdf
  • aeac
  • aeaefea
  • aeb
  • aebcb
  • aebdf
  • aec
  • aeca
  • aecf
  • aed
  • aeddafc
  • aee
  • aeeecb
  • aef
  • aefb
  • aefbba
  • afad
  • afafb
  • afafba
  • afbb
  • afbd
  • afc
  • afcabf
  • afcb
  • afdcfd
  • afe
  • affaff
  • albundy
  • amd
  • amoser
  • antoneliasson
  • Aqs
  • arglist
  • asciidoc
  • assignements
  • atime
  • atj
  • autocreation
  • baaa
  • babc
  • bace
  • badb
  • baf
  • bafdc
  • bafe
  • baffde
  • bashism
  • bashrc
  • bba
  • bbac
  • bbb
  • bbba
  • bbbdf
  • bbbe
  • bbc
  • bbcb
  • bbce
  • bbd
  • bbdb
  • bbdc
  • bbdeb
  • bbe
  • bbeb
  • bbf
  • bbfc
  • bbfe
  • bbff
  • bca
  • bcabc
  • bcadf
  • bcaf
  • bcafdd
  • bcb
  • bcc
  • bccaed
  • bccb
  • bcccb
  • bccd
  • bcda
  • bcdd
  • bceb
  • bcecb
  • bcee
  • bcfccc
  • bda
  • bdaa
  • bdacb
  • bdaebf
  • bdaf
  • bdb
  • bdbbf
  • bdc
  • bdce
  • bdcf
  • bdd
  • bdda
  • bddcf
  • bddd
  • bdeff
  • bdf
  • bdfcfe
  • bdfd
  • bdfdd
  • bdfece
  • bea
  • beabbdd
  • beb
  • bebb
  • beced
  • bedbf
  • bedc
  • beddf
  • beeb
  • beec
  • beed
  • befa
  • befbaa
  • Benter
  • bentertain
  • bfa
  • bfab
  • bfad
  • bfb
  • bfbb
  • bfbf
  • bfbfc
  • bfc
  • bfca
  • bfcc
  • bfce
  • bfd
  • bfdfe
  • bfdff
  • bfe
  • bff
  • bffe
  • bfff
  • blockquotes
  • booleanish
  • bossert
  • buffersize
  • bugfix
  • bugfixes
  • buildpackage
  • bulletpoints
  • caa
  • caab
  • cabf
  • cac
  • cacb
  • cacbb
  • cadb
  • caddb
  • cae
  • caebfe
  • caee
  • cafc
  • cafcfab
  • cafdd
  • cafddfe
  • cafeccaeb
  • cange
  • canmount
  • cba
  • cbaac
  • cbae
  • cbaec
  • cbb
  • cbbd
  • cbbe
  • cbc
  • cbd
  • cbdc
  • cbdd
  • cbe
  • cbea
  • cbeb
  • cbee
  • cbefa
  • cbefc
  • cbfc
  • cbfd
  • cbfe
  • cca
  • ccab
  • ccac
  • ccae
  • ccba
  • ccbe
  • ccc
  • cccd
  • cccf
  • ccd
  • ccdb
  • ccdc
  • ccdcc
  • ccdfd
  • cce
  • cced
  • ccf
  • ccff
  • ccffef
  • cda
  • cdaaa
  • cdab
  • cdad
  • cdaeb
  • cdb
  • cdbcdeea
  • cdbfe
  • cdc
  • cdcb
  • cdce
  • cdd
  • cdda
  • cddb
  • cddd
  • cdde
  • cddf
  • cde
  • cdec
  • cdee
  • cdefe
  • cdfaafc
  • cdfe
  • cdff
  • cdffaed
  • cea
  • ceaf
  • ceafe
  • ceb
  • cec
  • ced
  • cedde
  • ceeb
  • ceebab
  • ceecf
  • ceef
  • cef
  • cefae
  • cefba
  • cfa
  • cfabe
  • cfafde
  • cfb
  • cfbd
  • cfbf
  • cfc
  • cfcbcfd
  • cfe
  • cfed
  • cff
  • cffc
  • cfff
  • changelog
  • chrigel
  • chrisridd
  • christo
  • chroot
  • cmdfail
  • cmds
  • codebase
  • concating
  • conmplete
  • coprs
  • copypasted
  • coredumps
  • cpanminus
  • crfl
  • crlf
  • cron
  • CVS
  • daa
  • daad
  • daae
  • daba
  • dabc
  • dabf
  • dac
  • daccb
  • dadb
  • dadc
  • daded
  • dadfbd
  • daed
  • daf
  • dafb
  • dafeeee
  • datarootdir
  • datastream
  • dba
  • dbab
  • dbacc
  • dbacf
  • dbad
  • dbb
  • dbba
  • dbbb
  • dbbff
  • dbc
  • dbca
  • dbcb
  • dbcbcc
  • dbce
  • dbdbe
  • dbdd
  • dbe
  • dbecd
  • dbf
  • dbfa
  • dbfb
  • dbfcdacf
  • dcaa
  • dcac
  • dcafee
  • dcbe
  • dcc
  • dccba
  • dccc
  • dcccbb
  • dccdcc
  • dccf
  • dcd
  • dcda
  • dcdaa
  • dcdc
  • dcdcae
  • dce
  • dceb
  • dcebde
  • dcee
  • dcf
  • dcfcdf
  • ddad
  • ddae
  • ddb
  • ddc
  • ddcc
  • ddcf
  • ddd
  • dddacf
  • dddb
  • dddd
  • ddddfef
  • dddf
  • dde
  • ddeb
  • ddec
  • ddf
  • ddfaddc
  • ddfd
  • ddfe
  • deaa
  • deaab
  • deac
  • debb
  • debd
  • debhelper
  • decf
  • decfcd
  • deconstructing
  • ded
  • dede
  • dedup
  • deeaefda
  • defbf
  • defcf
  • deffd
  • definedness
  • dependecy
  • dependeny
  • deref
  • destorying
  • dfa
  • dfaa
  • dfacff
  • dfb
  • dfbde
  • dfc
  • dfca
  • dfcd
  • dfd
  • dfda
  • dfde
  • dfe
  • dfeba
  • dfec
  • dfed
  • dfee
  • dff
  • dffd
  • dglushenok
  • dickson
  • dirs
  • DISTRIBUTIONNAME
  • docu
  • dominik
  • domnik
  • dpkg
  • dse
  • DTDs
  • Dungen
  • dylan
  • eaa
  • eaab
  • eab
  • eabe
  • eabedbaeacc
  • eac
  • eacaf
  • eade
  • eae
  • eaedb
  • eaf
  • eafae
  • eafd
  • eafff
  • eba
  • ebad
  • ebbad
  • ebbb
  • ebbc
  • ebbf
  • ebbfe
  • ebc
  • ebca
  • ebcb
  • ebd
  • ebdbcfc
  • ebdd
  • ebe
  • ebea
  • ebebe
  • ebf
  • ebfe
  • eca
  • ecaac
  • ecae
  • ecbc
  • ecc
  • eccc
  • ecd
  • ecdceef
  • ecde
  • ecdsa
  • ece
  • eced
  • eceeb
  • ecf
  • eda
  • edaad
  • edac
  • edaf
  • edb
  • edbaac
  • edbc
  • edbd
  • edbe
  • edc
  • edcfa
  • edd
  • eddac
  • eddec
  • eddfe
  • ede
  • edeba
  • edec
  • ededad
  • edede
  • edef
  • edf
  • edfa
  • edffb
  • edffd
  • edouard
  • edu
  • eea
  • eeabba
  • eeac
  • eead
  • eeaf
  • eeb
  • eebae
  • eec
  • eeca
  • eecaa
  • eecbcccc
  • eed
  • eedb
  • eedbe
  • eede
  • eee
  • eeea
  • eeebfcc
  • eeefc
  • eef
  • eefb
  • efa
  • efaa
  • efac
  • efadde
  • efaf
  • efb
  • efbade
  • efbd
  • efbdc
  • efbf
  • efc
  • efcbf
  • efcd
  • efd
  • efdd
  • efddbf
  • efe
  • efebb
  • effc
  • effec
  • Eliasson
  • Elmar
  • ENOMEM
  • environement
  • eol
  • epruesse
  • erroring
  • extist
  • faa
  • faab
  • faac
  • faad
  • fabbc
  • fabd
  • fabdbf
  • fabe
  • fabeffc
  • faca
  • fadfb
  • fadfde
  • faeb
  • faecf
  • faef
  • faf
  • fafd
  • failsafes
  • fba
  • fbae
  • fbb
  • fbba
  • fbbb
  • fbc
  • fbca
  • fbcb
  • fbcf
  • fbd
  • fbde
  • fbdffce
  • fbe
  • fbea
  • fbed
  • fbf
  • fbfc
  • fbfcaa
  • fca
  • fcaba
  • fcb
  • fcbee
  • fcbf
  • fcc
  • fccb
  • fccd
  • fcd
  • fcddf
  • fce
  • fceba
  • fcec
  • fcef
  • fcf
  • fcff
  • fda
  • fdadb
  • fdadf
  • fdae
  • fdb
  • fdbbc
  • fdbd
  • fdc
  • fdcbeca
  • fdcc
  • fdd
  • fddc
  • fddccfab
  • fddea
  • fddfda
  • fde
  • fdeb
  • fdeccfcf
  • fdef
  • fdefcb
  • fdf
  • fdfb
  • fdfbf
  • fdfc
  • fea
  • fead
  • feaf
  • feb
  • fecbcc
  • fedc
  • fedd
  • fede
  • fedoraproject
  • feeaf
  • feedad
  • feeeeb
  • feefb
  • fefd
  • fefdcbfa
  • ffaee
  • ffb
  • ffbf
  • ffbfa
  • ffc
  • ffca
  • ffcc
  • ffcdc
  • ffd
  • ffdb
  • ffde
  • ffdea
  • ffdf
  • ffdfba
  • ffe
  • ffea
  • ffed
  • fff
  • fffaf
  • fffcdbe
  • fffd
  • fffe
  • flaged
  • flixman
  • freenode
  • FRONTEND
  • frubar
  • generatable
  • ghanima
  • grantwwu
  • greggbg
  • griffith
  • guyz
  • HAARG
  • hardcoded
  • hardcoding
  • hashpointers
  • haystask
  • healthian
  • homedir
  • hotfix
  • howtogeek
  • I'u
  • ico
  • iki
  • ilm
  • implem
  • incrementals
  • informatique
  • inhmode
  • initialising
  • instanciating
  • invokations
  • irc
  • issuecomment
  • jamesmarsh
  • jenkins
  • jimkilmov
  • JMo
  • jsoref
  • justinscholz
  • karssen
  • kauffman
  • keygen
  • kngnt
  • Kuzmarski
  • lauri
  • lckarssen
  • Lennart
  • leoj
  • logbias
  • lotheac
  • machanics
  • Makefiles
  • malc
  • manpade
  • manuel
  • metaworx
  • morphsen
  • nahall
  • nameing
  • noreply
  • Nyman
  • Oostendorp
  • oss
  • OSX
  • parseable
  • Phlogi
  • png
  • pobox
  • polyomica
  • poolname
  • poolrootfs
  • prebuild
  • previosuly
  • primarycache
  • Proxmox
  • Pruesse
  • pullrequests
  • pulsewidth
  • rageltman
  • rbash
  • rczei
  • refactor
  • refquota
  • refreservation
  • regen
  • regexes
  • regexp
  • remotehost
  • renard
  • repoen
  • respinn
  • Ridd
  • rsa
  • rsync
  • rueegg
  • rwilkey
  • schould
  • secondarycache
  • sempervictus
  • shapshot
  • shaun
  • shess
  • shlibs
  • simplifie
  • smv
  • snaptime
  • snyman
  • softprops
  • somecommand
  • Soref
  • spashot
  • spellcheck
  • spellchecker
  • spikings
  • stringify
  • subdir
  • subfolder
  • substvars
  • supress
  • svn
  • sylvain
  • symlinking
  • syslogstyle
  • testbird
  • testmode
  • Thu
  • timewarp
  • Tirkkonen
  • tisc
  • tiscarabee
  • trialen
  • truobleshooting
  • tuxera
  • uchicago
  • unneccessary
  • usecase
  • useradd
  • usermod
  • whitelist
  • Wiedenroth
  • wiedi
  • wip
  • workaround
  • wouter
  • xtrue
  • yandex
  • Zends
  • zet
  • zfsonlinux
  • zie
  • znapdest
  • znapzendztats
  • ztatz
Previously acknowledged words that are now absent:
aix Autotools bashisms CBuilder Cwd cygwin DBD ev Fcntl fh forkcall gh Gregy gz Ip JB JBERGER LEONT Mkbootstrap nf nh oi Pipely qq qw RCAPUTO README rr rw SUBDIRS SZ Ubuntu ve VOS wu wx xargs xf yy ZL
Some files were automatically ignored

These sample patterns would exclude them:

^AUTHORS$
^debian/znapzend\.links\.in$

You should consider adding them to:

.github/workflows//spelling/excludes.txt

File matching is via Perl regular expressions.

To check these files, more of their words need to be in the dictionary than not. You can use patterns.txt to exclude portions, add items to the dictionary (e.g. by adding them to allow.txt), or fix typos.

To accept these unrecognized words as correct (and remove the previously acknowledged and now absent words), run the following commands

... in a clone of the null repository
on the master branch:

update_files() {
perl -e '
my @expect_files=qw('".github/workflows//spelling/whitelist.txt"');
@ARGV=@expect_files;
my @stale=qw('"$patch_remove"');
my $re=join "|", @stale;
my $suffix=".".time();
my $previous="";
sub maybe_unlink { unlink($_[0]) if $_[0]; }
while (<>) {
if ($ARGV ne $old_argv) { maybe_unlink($previous); $previous="$ARGV$suffix"; rename($ARGV, $previous); open(ARGV_OUT, ">$ARGV"); select(ARGV_OUT); $old_argv = $ARGV; }
next if /^(?:$re)(?:(?:\r|\n)*$| .*)/; print;
}; maybe_unlink($previous);'
perl -e '
my $new_expect_file=".github/workflows//spelling/whitelist.txt";
use File::Path qw(make_path);
use File::Basename qw(dirname);
make_path (dirname($new_expect_file));
open FILE, q{<}, $new_expect_file; chomp(my @words = <FILE>); close FILE;
my @add=qw('"$patch_add"');
my %items; @items{@words} = @words x (1); @items{@add} = @add x (1);
@words = sort {lc($a)."-".$a cmp lc($b)."-".$b} keys %items;
open FILE, q{>}, $new_expect_file; for my $word (@words) { print FILE "$word\n" if $word =~ /\w/; };
close FILE;
system("git", "add", $new_expect_file);
'
(cat '.github/workflows//spelling/excludes.txt' - <<EOF
$should_exclude_patterns
EOF
) |grep .|
sort -f |
uniq > '.github/workflows//spelling/excludes.txt.temp' &&
mv '.github/workflows//spelling/excludes.txt.temp' '.github/workflows//spelling/excludes.txt'
}

comment_json=$(mktemp)
curl -L -s -S \
  --header "Content-Type: application/json" \
  "https://api.github.com/repos/oetiker/znapzend/issues/comments/1881510638" > "$comment_json"
comment_body=$(mktemp)
jq -r .body < "$comment_json" > "$comment_body"
rm "$comment_json"

patch_remove=$(perl -ne 'next unless s{^</summary>(.*)</details>$}{$1}; print' < "$comment_body")
  

patch_add=$(perl -e '$/=undef;
$_=<>;
s{<details>.*}{}s;
s{^#.*}{};
s{\n##.*}{};
s{(?:^|\n)\s*\*}{}g;
s{\s+}{ }g;
print' < "$comment_body")
  

should_exclude_patterns=$(perl -e '$/=undef;
$_=<>;
exit unless s{(?:You should consider excluding directory paths|You should consider adding them to).*}{}s;
s{.*These sample patterns would exclude them:}{}s;
s{.*\`\`\`([^`]*)\`\`\`.*}{$1}m;
print' < "$comment_body" | grep . || true)

update_files
rm "$comment_body"
git add -u

@jimklimov jimklimov changed the title from "Fix parser rootds- investigate" to "Fix parser for root (only) dataset names - investigate" on Jan 8, 2024
…a pool [oetiker#585]

Signed-off-by: Jim Klimov <jimklimov@gmail.com>
…Set [oetiker#585]

Signed-off-by: Jim Klimov <jimklimov@gmail.com>
…) and splitDataSetSnapshot() [oetiker#585]

Signed-off-by: Jim Klimov <jimklimov@gmail.com>
…ent is actually a "dataset@snapname", return an undef snapname in the array if not [oetiker#585]

Signed-off-by: Jim Klimov <jimklimov@gmail.com>
@oetiker oetiker merged commit 64f26ce into oetiker:master Jan 9, 2024
4 checks passed