Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: Map SHOW statements onto queries from information_schema #9966

Closed
nvanbenschoten opened this issue Oct 13, 2016 · 25 comments
Closed

sql: Map SHOW statements onto queries from information_schema #9966

nvanbenschoten opened this issue Oct 13, 2016 · 25 comments
Labels
C-cleanup Tech debt, refactors, loose ends, etc. Solution not expected to significantly change behavior. E-intermediate Intermediate complexity, needs a contributor with 3-6 months of past contribution experience. help wanted Help is requested / needed by the one who filed the issue to fix it.

Comments

@nvanbenschoten
Copy link
Member

There is a good amount of duplication between metadata returned from SHOW statements and metadata in the information_schema. It would be nice to map SHOW statements directly onto queries from the information_schema. This would probably require adding a few more information_schema tables.

@nvanbenschoten nvanbenschoten added the C-cleanup Tech debt, refactors, loose ends, etc. Solution not expected to significantly change behavior. label Oct 13, 2016
@knz knz added help wanted Help is requested / needed by the one who filed the issue to fix it. E-intermediate Intermediate complexity, needs a contributor with 3-6 months of past contribution experience. labels Oct 13, 2016
@yznming
Copy link
Contributor

yznming commented Oct 18, 2016

@nvanbenschoten Maybe I can help to fix it.
Shell we just cleanup the SHOW statements that fetch data from system tables as follow:

  • SHOW COLUMNS
  • SHOW CONSTRAINTS
  • SHOW CREATE TABLE
  • SHOW DATABASES
  • SHOW GRANTS
  • SHOW INDEX
  • SHOW TABLES

@nvanbenschoten
Copy link
Member Author

Hi @yznming, that's correct. The goal here is for each of the statements in sql/show.go to be substituted with SELECT statements from information_schema.

Your help would be very welcomed! As a first step, I would recommend writing out here the SQL query you intend to perform for each SHOW statement (ie. SHOW TABLES -> SELECT x, y, z FROM information_schema.tables WHERE ...). Documentation on all existing information_schema tables might come in handy, as some of these SHOW statements might require introducing new information_schema tables to CockroachDB first. Once we have a good idea of what SQL queries we're going to need for each of the SHOW STATEMENT, then we can look into what will be required to hook them up.

@knz
Copy link
Contributor

knz commented Oct 19, 2016

@yznming yes that's the idea.

@yznming
Copy link
Contributor

yznming commented Oct 19, 2016

@nvanbenschoten @knz Here is my analysis. I would appreciate if you could suggest more advice here, thanks.

  • SHOW COLUMNS:
SELECT COLUMN_NAME, DATA_TYPE,  IS_NULLABLE != 'NO', COLUMN_DEFAULT  
FROM information_schema.columns WHERE TABLE_NAME=$1 AND TABLE_SCHEMA=$2

Missing a record for rowid column if create table without specifying primary key.

  • SHOW CONSTRAINTS: under development.
  • SHOW CREATE TABLE: remain unchanged
  • SHOW DATABASES: need a virtual table for databases
  • SHOW GRANTS ON DATABASE:
SELECT TABLE_SCHEMA,GRANTEE, PRIVILEGE_TYPE FROM 
information_schema.schema_privileges WHERE TABLE_SCHEMA='$1'
  • SHOW GRANTS ON DATABASE FOR user_name:
SELECT TABLE_SCHEMA,GRANTEE, PRIVILEGE_TYPE FROM 
information_schema.schema_privileges WHERE TABLE_SCHEMA='$1' AND GRANTEE='$2'
  • SHOW GRANTS ON TABLE:
SELECT TABLE_NAME,GRANTEE, PRIVILEGE_TYPE FROM 
information_schema.table_privileges WHERE TABLE_SCHEMA='$1' AND TABLE_NAME='$2'
  • SHOW GRANTS ON DATABASE FOR user_name:
SELECT TABLE_NAME,GRANTEE, PRIVILEGE_TYPE FROM information_schema.table_privileges 
WHERE TABLE_SCHEMA='$1' AND TABLE_NAME='$2' AND GRANTEE='$3'
  • SHOW INDEX: Need a virtual table provides Unique, Seq, Direction, Storing
SELECT TABLE_NAME, CONSTRAINT_NAME, COLUMN_NAME, FROM 
information_schema.key_column_usage where TABLE_name='$1' and TABLE_SCHEMA='$2';
  • SHOW TABLES:
SELECT TABLE_NAME FROM information_schema.tables WHERE TABLE_SCHEMA = $1

@nvanbenschoten
Copy link
Member Author

This looks like a great start, nice job! A few pointers that might help as you continue working on the last few statments are:

  • I think you're looking for the table information_schema.schemata for the SHOW DATABASES statement, which is already implemented
  • Ideally these should be drop in replacements for the current implementations so that we don't even need to touch the tests. To accomplish this, we may need to alias some of the select renders (ie. SELECT TABLE_NAME FROM information_schema.tables ... could be replaced with SELECT TABLE_NAME AS Tables FROM information_schema.tables ...)
  • Our current SHOW statements, by the nature of their implementation, generally have a deterministic row ordering. We may need to mess with ORDER BY clauses to reproduce this (again, so that this change is a direct drop in replacement)

Let me know if you have any questions or if there are any statements you're finding particularly tricky.

@yznming
Copy link
Contributor

yznming commented Oct 20, 2016

@nvanbenschoten thanks a lot. still have a few questions:

  • How to get Unique, Seq, Direction, Storing for SHOW INDEX
  • To execute the statements, should start with makeInternalPlanner, and then run the query with planer.queryRow(), right?

@knz
Copy link
Contributor

knz commented Oct 21, 2016

To execute the statements it's simpler than that. See this comment:
https://reviewable.io/reviews/cockroachdb/cockroach/10088#-KUT9QC0T01aJlGi4IMs

Something like:

func (p *planner) ShowUsers(...) (planNode, error) {
   stmt, err := parser.ParseOneTraditional("SELECT ... ")
   if err != nil { return nil, err }
   return p.newPlan(stmt, true)
}

In cases where they are no variable parts in the SQL syntax you can even optimize further:

var stmtShow Statement
func init() {
   stmtShow, _ = parser.ParseOne(...)
}
func (p *planner) ShowUsers(...) (planNode, error) {
   return p.newPlan(stmtShow, true)
}

@yznming
Copy link
Contributor

yznming commented Oct 21, 2016

@knz it works. Thank you very much.

@nvanbenschoten
Copy link
Member Author

@yznming w.r.t your question about Unique, Seq, etc., it may be possible that the information_schema is missing information we want to expose. If that is indeed the case then we can look into extending the schema a bit with extension columns, which seems to be well precedented in other databases. For now, would you mind just double checking that each piece of metadata is nowhere in the information_schema spec, and if not, documenting what we'll need to add?

@yznming
Copy link
Contributor

yznming commented Oct 24, 2016

@nvanbenschoten currently, I think we need to extend the information_schema as follow:

  1. The information_schema.columns misses rowid column if table is created without specifying primary key.
  2. The information_schema.key_column_usage misses Unique, Direction and Storing columns.
  3. The information_schema.table_constraints misses Column(s) and Details columns.
  4. Need a information_schema.view table. Result of SHOW TABLES is union of table and view.
  5. The type TimestampTZ should be display as Timestamp WITH TIME ZONE

@nvanbenschoten
Copy link
Member Author

  1. This was an intentional decision for the information_schema, however I'm starting to second guess it. @knz do you think we should expose Hidden columns in the information_schema? Either way, information_schema and SHOW statements should behave the same way in this regard.

  2. Unique: check out information_schema.table_constraints

    Direction: based on this answer, it seems like a prime candidate for extension

    Storing: also seems like a good extension column

  3. Would a combination of check_constraints and constraint_column_usage help?

  4. information_schema.views seems straightforward to implement.

  5. I wouldn't worry about that, we can change the tests for minor issues.

@knz
Copy link
Contributor

knz commented Oct 24, 2016

Regarding point 1 I am not in favor of adding hidden columns (Also apparently PG doesn't show them there either)

@yznming
Copy link
Contributor

yznming commented Oct 24, 2016

@nvanbenschoten Is that means what we need to do next step:

  1. add Unique to information_schema.table_constraints
  2. add Storing to information_schema.key_column_usage
  3. implement information_schema tables:STATISTICS, check_constraints, constraint_column_usage, view

@yznming
Copy link
Contributor

yznming commented Oct 24, 2016

@nvanbenschoten BTW, I'm going to order the result of SHOW statements as follow, do you have any ideas:

  1. SHOW DATABASE: ordered by database name
  2. SHOW TABLE: ordered by table name
  3. SHOW COLUMNS: ordered by ORDINAL_POSITION
  4. SHOW CONSTRAINTS: ordered by (Database, User, Privileges) or (Table, User, Privileges)

@nvanbenschoten
Copy link
Member Author

@yznming Those sorting orders sound fine to me.

You brought up information_schema.statistics, which is not actually part of the standard, but is included in MySQL for the purpose of indexes. I'd be ok with introducing this table and extending it with Unique, Direction, and Storing. Would you want to help implement this?

@yznming
Copy link
Contributor

yznming commented Oct 25, 2016

@nvanbenschoten Of course. I think these information_schema tables can be finished in another PR related to #8675.

@nvanbenschoten
Copy link
Member Author

Great, there are a few previous changes related to that issue which demonstrate what adding a new information_schema table entails. Feel free to ping me if you have any questions.

@jseldess
Copy link
Contributor

@nvanbenschoten, I'm seeing some difference in the way show tables and information_schema.tables handles privileges, and I'm not sure if it's intentional or not: SHOW TABLES shows a user all tables/views in the db, even those the user doesn't have access to, whereas SELECT * FROM information_schema.tables; shows the user only those tables/views the user has access to.

Example:

$ cockroach sql
# Welcome to the cockroach SQL interface.
# All statements must be terminated by a semicolon.
# To exit: CTRL + D.

root@:26257> create database info_schema_check;
CREATE DATABASE

root@:26257> set database = info_schema_check;
SET

root@:26257> create table t1 (a int);
CREATE TABLE

root@:26257> create view v1 as select a from t1;
CREATE VIEW

root@:26257> grant select on t1 to jseldess;
GRANT

root@:26257> SHOW GRANTS ON t1;
+-------+----------+------------+
| Table |   User   | Privileges |
+-------+----------+------------+
| t1    | jseldess | SELECT     |
| t1    | root     | ALL        |
+-------+----------+------------+
(2 rows)

root@:26257> show grants on v1;
+-------+------+------------+
| Table | User | Privileges |
+-------+------+------------+
| v1    | root | ALL        |
+-------+------+------------+
(1 row)

root@:26257> ^D

~/src/github.com/cockroachdb/cockroach$ cockroach sql --user=jseldess
# Welcome to the cockroach SQL interface.
# All statements must be terminated by a semicolon.
# To exit: CTRL + D.

jseldess@:26257> set database = info_schema_check;
SET

jseldess@:26257> show tables;
+-------+
| Table |
+-------+
| t1    |
| v1    |
+-------+
(2 rows)

jseldess@:26257> SELECT * FROM information_schema.tables where table_schema = 'info_schema_check';
+---------------+-------------------+------------+------------+---------+
| TABLE_CATALOG |   TABLE_SCHEMA    | TABLE_NAME | TABLE_TYPE | VERSION |
+---------------+-------------------+------------+------------+---------+
| def           | info_schema_check | t1         | BASE TABLE |       3 |
+---------------+-------------------+------------+------------+---------+
(1 row)

@yznming
Copy link
Contributor

yznming commented Oct 28, 2016

@nvanbenschoten
if we drop a table in a transaction, the table would be marked as dropped. Shall we keep this behavior?

root@:26660> begin;
BEGIN
root@:26660> drop table t;
DROP TABLE
root@:26660> show tables;
+-------------+
|    Table    |
+-------------+
| t (dropped) |
+-------------+
(1 row)

@nvanbenschoten
Copy link
Member Author

@jseldess Thanks for pointing that out. I don't think the discrepancy was intentional, but luckily the work @yznming is doing should unify this behavior.

@yznming What behavior would querying information_schema.tables have if you didnt try to handle that situation at all? Would it be displayed as a normal table, or would it not show up at all?

@yznming
Copy link
Contributor

yznming commented Nov 1, 2016

@nvanbenschoten if the table is drop , it would not show up:

root@:26660> begin;
BEGIN
root@:26660> drop table t;
DROP TABLE
root@:26660> select table_name from information_schema.tables where table_name='t';
+-------------+
|    Table    |
+-------------+
+-------------+
(0 row)

@nvanbenschoten
Copy link
Member Author

@yznming That seems fine for now.

@sploiselle
Copy link
Contributor

sploiselle commented Nov 17, 2016

After this refactor, adding support for SHOW VIEWS would be great (#10771)

@nvanbenschoten
Copy link
Member Author

@sploiselle: that might be something @yznming would be interested in working on after this change, and will certainly take advantage of the major improvements he's made to the SHOW infrastructure, but we shouldn't try to add new functionality in during such an involved refactor.

@nvanbenschoten
Copy link
Member Author

Closed by #10196.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-cleanup Tech debt, refactors, loose ends, etc. Solution not expected to significantly change behavior. E-intermediate Intermediate complexity, needs a contributor with 3-6 months of past contribution experience. help wanted Help is requested / needed by the one who filed the issue to fix it.
Projects
None yet
Development

No branches or pull requests

5 participants