Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(PDB-5739) Reduce factsets "stable||volatile" merging #3972

Merged
merged 20 commits into from
May 14, 2024

Conversation

rbrw
Copy link
Contributor

@rbrw rbrw commented May 2, 2024

Without more potentially substantial changes to the query engine, we can't drop all the merging (e.g. the facts projection from the inventory endpoint) and switch to independent stable and volatile indexes, so for now, just consider some incremental changes along those lines.
cf. #3955

@CLAassistant
Copy link

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

@rbrw rbrw force-pushed the pdb-5739-reduce-fact-merging branch 2 times, most recently from f83ec9e to d8f2300 Compare May 7, 2024 16:37
rbrw added 4 commits May 9, 2024 14:23
Previously this function could try to call .getSQLState on unrelated
exceptions.
Noticed failing schema check in command-test output.
Since the other endpoints have to manufacture a synthetic
representation of the data (i.e. flattened key/values, etc.) and so
are likely to be more expensive, at least when a large number of
factsets is involved.
@rbrw rbrw force-pushed the pdb-5739-reduce-fact-merging branch 2 times, most recently from 329680e to e9de9b1 Compare May 13, 2024 22:49
rbrw added 16 commits May 14, 2024 11:58
Since we've observed that the || operation can be expensive, and we
don't actually want a full factset map here (we want the individual
top-level subtrees), rewrite it as a union.  This shouldn't change
anything material since pdb ensures the two key sets are disjoint.

This also assumes that it's not important to be compatible with the
existing stable||volatile index here.  The existing query could only
have been using the index if postgres is clever enough to see all the
way through the name -> key (via jsonb_each/.*) operations to identify
cases where the index might be relevant.
Since we've observed that the || operation can be expensive, and we
don't actually want a full factset map here (we want the individual
top-level subtrees), rewrite it as a union.  This shouldn't change
anything material since pdb ensures the two key sets are disjoint.

This also assumes that it's not important to be compatible with the
existing stable||volatile index here.  The existing query could only
have been using the index if postgres is clever enough to see all the
way through the recursive CTE, name -> key (via jsonb_each/.*)
operations to identify cases where the index might be relevant.
Since we've observed that the || operation can be expensive, and we
don't actually want a full factset map here (we want the individual
top-level subtrees), rewrite it as a union.  This shouldn't change
anything material since pdb ensures the two key sets are disjoint.

This also assumes that it's not important to be compatible with the
existing stable||volatile index here.  The existing query could only
have been using the index if postgres is clever enough to see all the
way through the json_agg -> json_build_object -> name -> key (via
jsonb_each/.*) operations to identify cases where the index might be
relevant.
Since we've observed that the || operation can be expensive, and we
don't actually want a full factset map here (we want the individual
top-level subtrees), rewrite it as a union.  This shouldn't change
anything material since pdb ensures the two key sets are disjoint.

This also assumes that it's not important to be compatible with the
existing stable||volatile index here.  The existing query could only
have been using the index if postgres is clever enough to see through
the key (via jsonb_each/.*) operation to identify cases where the
index might be relevant.
@rbrw rbrw force-pushed the pdb-5739-reduce-fact-merging branch from e9de9b1 to 9df5ae2 Compare May 14, 2024 17:00
@austb austb marked this pull request as ready for review May 14, 2024 17:48
@austb austb requested review from a team as code owners May 14, 2024 17:48
@austb austb merged commit a368ab3 into puppetlabs:main May 14, 2024
9 of 10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants