-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sql: implement information_schema._pg_index_position #69911
sql: implement information_schema._pg_index_position #69911
Conversation
6033017
to
29b12be
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should really do #69495 in v22.1. Until then, these queries really hurt if this gets called more than once in a transaction because we'll pull all of system.descriptor
every time this function is invoked. It's bad enough that we have to pull it once per transaction.
One approach to make things not suck would be to allocate unique OIDs per index and index those somewhere. That's a bigger project. That feels like a pretty big thing. I'm decreasingly inclined to pursue that as the next step. It's too many migrations and is pretty painful to leverage.
Instead I often think about a protocol to lease all of the descriptors with a "one version invariant". By that I mean there'd be some sort of lease on some resource which means that no schema changes can happen while that lease is outstanding. We'd need some way to pre-empt that sort of lease so that sessions fall back to reading the table per transaction. This is a very similar thing to what we'd want for low-latency changefeeds.
Given that pg_catalog
is database scoped, we could have a mechanism to lease all of the descriptors in a database and do that at the node level. That way we could still avoid any need to keep all descriptors for all databases in RAM at any point.
All this being said, I don't think these are the sorts of visions we're going to realize this release due to being seriously short on resources.
Reviewable status: complete! 0 of 0 LGTMs obtained
Could you remind me why we're discussing leasing, whether session-scoped or database-scoped, and not a more simple form of coordination-free caching? I'd imagine that if the concern is pulling all descriptors every time one of these builtins is invoked (per row in many cases), a session or even statement-scoped cache would provide a large win. You mention "It's bad enough that we have to pull it once per transaction.", which is fair, but also, maybe obscuring the lower hanging fruit. Am I thinking about this correctly? Do we have examples of where the descriptor lookups hurt the most? |
29b12be
to
bdaa65b
Compare
We have a transaction-scoped cache, hence the first paragraph and "It's bad enough that we have to pull it once per transaction." The problem is that it hangs off of the |
Got it, that makes sense. Thanks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just tiny nits
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten)
pkg/sql/sem/builtins/pg_builtins.go, line 1921 at r2 (raw file):
// (or NULL if not there). // // NOTE: this could be defined as a user-defined function, like
glad you thought so too! i had commented the same w.r.t. other builtins on #58356 (comment)
pkg/sql/sem/builtins/pg_builtins.go, line 1942 at r2 (raw file):
ReturnType: tree.FixedReturnType(types.Int), Fn: func(ctx *tree.EvalContext, args tree.Datums) (tree.Datum, error) { r, err := ctx.InternalExecutor.QueryRow(
similar to the pg_my_temp_schema PR - could you add a benchmark for this?
pkg/sql/sem/builtins/pg_builtins.go, line 1953 at r2 (raw file):
return nil, err } if len(r) == 0 {
nit: would it be more clear if this check were if r == nil
? (asking since it may not be clear that tree.Datums is a slice of Datums.) up to you
pkg/sql/sem/builtins/pg_builtins.go, line 1956 at r2 (raw file):
return tree.DNull, nil } return r[0], nil
nit: return tree.MustBeDInt(r[0]), nil
pkg/sql/sem/builtins/pg_builtins.go, line 1958 at r2 (raw file):
return r[0], nil }, Info: notUsableInfo,
nit: should the Info have a real description?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah and one more nit: could you mention this new function in a release note?
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten)
bdaa65b
to
dfb8e4a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could you mention this new function in a release note?
Done.
TFTRs!
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @rafiss)
pkg/sql/sem/builtins/pg_builtins.go, line 1942 at r2 (raw file):
Previously, rafiss (Rafi Shamim) wrote…
similar to the pg_my_temp_schema PR - could you add a benchmark for this?
Done.
BenchmarkORMQueries/information_schema._pg_index_position-16 13 103740333 ns/op 15.00 roundtrips
pkg/sql/sem/builtins/pg_builtins.go, line 1953 at r2 (raw file):
Previously, rafiss (Rafi Shamim) wrote…
nit: would it be more clear if this check were
if r == nil
? (asking since it may not be clear that tree.Datums is a slice of Datums.) up to you
We seem to use the len(r) == 0
check everywhere else in this file, so I'll keep it here for consistency.
pkg/sql/sem/builtins/pg_builtins.go, line 1956 at r2 (raw file):
Previously, rafiss (Rafi Shamim) wrote…
nit:
return tree.MustBeDInt(r[0]), nil
DInt
(the return type of MustBeDInt
) doesn't implement Datum
, *DInt
does. We could eat an allocation here, but it doesn't seem worth it.
pkg/sql/sem/builtins/pg_builtins.go, line 1958 at r2 (raw file):
Previously, rafiss (Rafi Shamim) wrote…
nit: should the Info have a real description?
I don't think so. These information_schema._pg_xxx
functions are just about as internal as they get. They aren't even documented by Postgres. So I feel comfortable not documenting this.
69913: sql: implement information_schema.{_pg_truetypid, _pg_truetypmod, _pg_char_max_length} r=nvanbenschoten a=nvanbenschoten Needed for #69010. This PR adds implementations for the following three builtin functions `information_schema._pg_truetypid` `information_schema._pg_truetypmod` `information_schema._pg_char_max_length` The first two functions return the "true" type ID and modifier, disregarding indirection introduced by domain types. The third returns the maximum character length of a type with the provided ID and modifier. The builtins are implemented as user-defined functions in Postgres here: https://github.com/postgres/postgres/blob/master/src/backend/catalog/information_schema.sql Combined with #69909 and #69911, this PR unlocks these two gnarly introspection queries in PostgREST: - https://github.com/PostgREST/postgrest/blob/b05898d17f8e33c8c82fc1d05a30eb3044999668/src/PostgREST/DbStructure.hs#L538 - https://github.com/PostgREST/postgrest/blob/b05898d17f8e33c8c82fc1d05a30eb3044999668/src/PostgREST/DbStructure.hs#L709 Release justification: None, waiting for v22.1. Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>
Needed for cockroachdb#69010. This commit implements the `information_schema._pg_index_position` builtin function. Given an index's OID and an underlying-table column number, `information_schema._pg_index_position` return the column's position in the index (or NULL if not there). The function is implemented as a user-defined function in Postgres here: https://github.com/postgres/postgres/blob/master/src/backend/catalog/information_schema.sql Release note (sql change): The `information_schema._pg_index_position` builtin function is now supported, which improves compatibility with PostgreSQL. Release justification: None, waiting for v22.1.
dfb8e4a
to
a459229
Compare
bors r+ |
Build succeeded: |
Needed for #69010.
This commit implements the
information_schema._pg_index_position
builtinfunction. Given an index's OID and an underlying-table column number,
information_schema._pg_index_position
return the column's position in theindex (or NULL if not there).
The function is implemented as a user-defined function in Postgres here:
https://github.com/postgres/postgres/blob/master/src/backend/catalog/information_schema.sql
Release note (sql change): The
information_schema._pg_index_position
builtin function is now supported, which improves compatibility with
PostgreSQL.
Release justification: None, waiting for v22.1.