Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: implement information_schema._pg_index_position #69911

Merged
merged 1 commit into from
Sep 14, 2021

Conversation

nvanbenschoten
Copy link
Member

@nvanbenschoten nvanbenschoten commented Sep 8, 2021

Needed for #69010.

This commit implements the information_schema._pg_index_position builtin
function. Given an index's OID and an underlying-table column number,
information_schema._pg_index_position return the column's position in the
index (or NULL if not there).

The function is implemented as a user-defined function in Postgres here:
https://github.com/postgres/postgres/blob/master/src/backend/catalog/information_schema.sql

Release note (sql change): The information_schema._pg_index_position
builtin function is now supported, which improves compatibility with
PostgreSQL.

Release justification: None, waiting for v22.1.

@nvanbenschoten nvanbenschoten requested a review from a team September 8, 2021 03:41
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@nvanbenschoten nvanbenschoten changed the title sql: implement information_schema._pg_char_max_length sql: implement information_schema._pg_index_position Sep 8, 2021
@nvanbenschoten nvanbenschoten force-pushed the nvanbenschoten/pg_has_role branch 2 times, most recently from 6033017 to 29b12be Compare September 8, 2021 05:11
Copy link
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should really do #69495 in v22.1. Until then, these queries really hurt if this gets called more than once in a transaction because we'll pull all of system.descriptor every time this function is invoked. It's bad enough that we have to pull it once per transaction.

One approach to make things not suck would be to allocate unique OIDs per index and index those somewhere. That's a bigger project. That feels like a pretty big thing. I'm decreasingly inclined to pursue that as the next step. It's too many migrations and is pretty painful to leverage.

Instead I often think about a protocol to lease all of the descriptors with a "one version invariant". By that I mean there'd be some sort of lease on some resource which means that no schema changes can happen while that lease is outstanding. We'd need some way to pre-empt that sort of lease so that sessions fall back to reading the table per transaction. This is a very similar thing to what we'd want for low-latency changefeeds.

Given that pg_catalog is database scoped, we could have a mechanism to lease all of the descriptors in a database and do that at the node level. That way we could still avoid any need to keep all descriptors for all databases in RAM at any point.

All this being said, I don't think these are the sorts of visions we're going to realize this release due to being seriously short on resources.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained

@nvanbenschoten
Copy link
Member Author

Could you remind me why we're discussing leasing, whether session-scoped or database-scoped, and not a more simple form of coordination-free caching? I'd imagine that if the concern is pulling all descriptors every time one of these builtins is invoked (per row in many cases), a session or even statement-scoped cache would provide a large win. You mention "It's bad enough that we have to pull it once per transaction.", which is fair, but also, maybe obscuring the lower hanging fruit. Am I thinking about this correctly? Do we have examples of where the descriptor lookups hurt the most?

@ajwerner
Copy link
Contributor

ajwerner commented Sep 8, 2021

Could you remind me why we're discussing leasing, whether session-scoped or database-scoped, and not a more simple form of coordination-free caching?

We have a transaction-scoped cache, hence the first paragraph and "It's bad enough that we have to pull it once per transaction." The problem is that it hangs off of the extraTxnState which is not shared by the internal executor, hence the reference to #69495.

@nvanbenschoten
Copy link
Member Author

The problem is that it hangs off of the extraTxnState which is not shared by the internal executor, hence the reference to #69495.

Got it, that makes sense. Thanks.

Copy link
Collaborator

@rafiss rafiss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just tiny nits

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten)


pkg/sql/sem/builtins/pg_builtins.go, line 1921 at r2 (raw file):

	// (or NULL if not there).
	//
	// NOTE: this could be defined as a user-defined function, like

glad you thought so too! i had commented the same w.r.t. other builtins on #58356 (comment)


pkg/sql/sem/builtins/pg_builtins.go, line 1942 at r2 (raw file):

			ReturnType: tree.FixedReturnType(types.Int),
			Fn: func(ctx *tree.EvalContext, args tree.Datums) (tree.Datum, error) {
				r, err := ctx.InternalExecutor.QueryRow(

similar to the pg_my_temp_schema PR - could you add a benchmark for this?


pkg/sql/sem/builtins/pg_builtins.go, line 1953 at r2 (raw file):

					return nil, err
				}
				if len(r) == 0 {

nit: would it be more clear if this check were if r == nil ? (asking since it may not be clear that tree.Datums is a slice of Datums.) up to you


pkg/sql/sem/builtins/pg_builtins.go, line 1956 at r2 (raw file):

					return tree.DNull, nil
				}
				return r[0], nil

nit: return tree.MustBeDInt(r[0]), nil


pkg/sql/sem/builtins/pg_builtins.go, line 1958 at r2 (raw file):

				return r[0], nil
			},
			Info:       notUsableInfo,

nit: should the Info have a real description?

Copy link
Collaborator

@rafiss rafiss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah and one more nit: could you mention this new function in a release note?

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten)

Copy link
Member Author

@nvanbenschoten nvanbenschoten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you mention this new function in a release note?

Done.

TFTRs!

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @rafiss)


pkg/sql/sem/builtins/pg_builtins.go, line 1942 at r2 (raw file):

Previously, rafiss (Rafi Shamim) wrote…

similar to the pg_my_temp_schema PR - could you add a benchmark for this?

Done.

BenchmarkORMQueries/information_schema._pg_index_position-16         	      13	 103740333 ns/op	        15.00 roundtrips

pkg/sql/sem/builtins/pg_builtins.go, line 1953 at r2 (raw file):

Previously, rafiss (Rafi Shamim) wrote…

nit: would it be more clear if this check were if r == nil ? (asking since it may not be clear that tree.Datums is a slice of Datums.) up to you

We seem to use the len(r) == 0 check everywhere else in this file, so I'll keep it here for consistency.


pkg/sql/sem/builtins/pg_builtins.go, line 1956 at r2 (raw file):

Previously, rafiss (Rafi Shamim) wrote…

nit: return tree.MustBeDInt(r[0]), nil

DInt (the return type of MustBeDInt) doesn't implement Datum, *DInt does. We could eat an allocation here, but it doesn't seem worth it.


pkg/sql/sem/builtins/pg_builtins.go, line 1958 at r2 (raw file):

Previously, rafiss (Rafi Shamim) wrote…

nit: should the Info have a real description?

I don't think so. These information_schema._pg_xxx functions are just about as internal as they get. They aren't even documented by Postgres. So I feel comfortable not documenting this.

craig bot pushed a commit that referenced this pull request Sep 14, 2021
69913: sql: implement information_schema.{_pg_truetypid, _pg_truetypmod, _pg_char_max_length} r=nvanbenschoten a=nvanbenschoten

Needed for #69010.

This PR adds implementations for the following three builtin functions
`information_schema._pg_truetypid`
`information_schema._pg_truetypmod`
`information_schema._pg_char_max_length`

The first two functions return the "true" type ID and modifier, disregarding indirection introduced by domain types. The third returns the maximum character length of a type with the provided ID and modifier.

The builtins are implemented as user-defined functions in Postgres here: https://github.com/postgres/postgres/blob/master/src/backend/catalog/information_schema.sql

Combined with #69909 and #69911, this PR unlocks these two gnarly introspection queries in PostgREST:
- https://github.com/PostgREST/postgrest/blob/b05898d17f8e33c8c82fc1d05a30eb3044999668/src/PostgREST/DbStructure.hs#L538
- https://github.com/PostgREST/postgrest/blob/b05898d17f8e33c8c82fc1d05a30eb3044999668/src/PostgREST/DbStructure.hs#L709

Release justification: None, waiting for v22.1.

Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>
Needed for cockroachdb#69010.

This commit implements the `information_schema._pg_index_position` builtin
function. Given an index's OID and an underlying-table column number,
`information_schema._pg_index_position` return the column's position in the
index (or NULL if not there).

The function is implemented as a user-defined function in Postgres here:
https://github.com/postgres/postgres/blob/master/src/backend/catalog/information_schema.sql

Release note (sql change): The `information_schema._pg_index_position`
builtin function is now supported, which improves compatibility with
PostgreSQL.

Release justification: None, waiting for v22.1.
@nvanbenschoten
Copy link
Member Author

bors r+

@craig
Copy link
Contributor

craig bot commented Sep 14, 2021

Build succeeded:

@craig craig bot merged commit c7d1ab7 into cockroachdb:master Sep 14, 2021
@nvanbenschoten nvanbenschoten deleted the nvanbenschoten/pg_has_role branch September 14, 2021 03:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants