-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support ShowVariable Statement #3455
Conversation
builder.add_table( | ||
&catalog_name, | ||
INFORMATION_SCHEMA, | ||
SETTINGS, | ||
TableType::View, | ||
); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add SETTINGS
into information_schema
so that
❯ show tables;
+---------------+--------------------+------------+------------+
| table_catalog | table_schema | table_name | table_type |
+---------------+--------------------+------------+------------+
| datafusion | information_schema | tables | VIEW |
| datafusion | information_schema | views | VIEW |
| datafusion | information_schema | columns | VIEW |
| datafusion | information_schema | settings | VIEW |
+---------------+--------------------+------------+------------+
4 rows in set. Query took 0.004 seconds.
struct InformationSchemaSettingsBuilder { | ||
names: StringBuilder, | ||
settings: StringBuilder, | ||
} | ||
|
||
impl InformationSchemaSettingsBuilder { | ||
fn new() -> Self { | ||
Self { | ||
names: StringBuilder::new(), | ||
settings: StringBuilder::new(), | ||
} | ||
} | ||
|
||
fn add_setting(&mut self, name: impl AsRef<str>, setting: impl AsRef<str>) { | ||
self.names.append_value(name.as_ref()); | ||
self.settings.append_value(setting.as_ref()); | ||
} | ||
} | ||
|
||
impl From<InformationSchemaSettingsBuilder> for MemTable { | ||
fn from(value: InformationSchemaSettingsBuilder) -> MemTable { | ||
let schema = Schema::new(vec![ | ||
Field::new("name", DataType::Utf8, false), | ||
Field::new("setting", DataType::Utf8, false), | ||
]); | ||
|
||
let InformationSchemaSettingsBuilder { | ||
mut names, | ||
mut settings, | ||
} = value; | ||
|
||
let schema = Arc::new(schema); | ||
let batch = RecordBatch::try_new( | ||
schema.clone(), | ||
vec![Arc::new(names.finish()), Arc::new(settings.finish())], | ||
) | ||
.unwrap(); | ||
|
||
MemTable::try_new(schema, vec![vec![batch]]).unwrap() | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
implement InformationSchemaSettingsBuilder
only two columns, name
and setting
supported now
this is what postgresql has
show all;
name | setting | description
----------------------------------------+--------------------------------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------
allow_system_table_mods | off | Allows modifications of the structure of system tables.
application_name | psql | Sets the application name to be reported in statistics and logs.
archive_cleanup_command | | Sets the shell command that will be executed at every restart point.
archive_command | (disabled) | Sets the shell command that will be called to archive a WAL file.
archive_mode | off | Allows archiving of WAL files using archive_command.
archive_timeout | 0 | Forces a switch
/// Configuration options | ||
pub config_options: ConfigOptions, | ||
pub config_options: Arc<RwLock<ConfigOptions>>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change SessionConfig.config_options
from ConfigOptions
to Arc<RwLock<ConfigOptions>
since
- we need to share this with
CatalogWithInformationSchema
- allow to set config
pub fn set(self, key: &str, value: ScalarValue) -> Self { | ||
self.config_options.write().set(key, value); | ||
self |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
config_options.write()
to unlock the write lock and set the value
pub fn batch_size(&self) -> usize { | ||
self.config_options | ||
.read() | ||
.get_u64(OPT_BATCH_SIZE) | ||
.try_into() | ||
.unwrap() | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
config_options.read()
to unlock the read lock to get the value
//#[tokio::test] | ||
//async fn show_unsupported() { | ||
// let ctx = SessionContext::with_config(SessionConfig::new()); | ||
// | ||
// let err = plan_and_collect(&ctx, "SHOW SOMETHING_UNKNOWN") | ||
// .await | ||
// .unwrap_err(); | ||
// | ||
// assert_eq!(err.to_string(), "This feature is not implemented: SHOW SOMETHING_UNKNOWN not implemented. Supported syntax: SHOW <TABLES>"); | ||
//} | ||
|
||
// FIXME | ||
// currently we cannot know whether a variable exists or not while the collect, this will output 0 row instead | ||
// one way to fix this is to generate a ConfigOptions and get options' key to compare | ||
// however config.rs is currently in core lib, could not be used by datafusion_sql due to the dependency cycle | ||
#[tokio::test] | ||
async fn show_unsupported() { | ||
let ctx = SessionContext::with_config(SessionConfig::new()); | ||
async fn show_non_existing_variable() { | ||
let ctx = | ||
SessionContext::with_config(SessionConfig::new().with_information_schema(true)); | ||
|
||
let err = plan_and_collect(&ctx, "SHOW SOMETHING_UNKNOWN") | ||
let result = plan_and_collect(&ctx, "SHOW SOMETHING_UNKNOWN") | ||
.await | ||
.unwrap_err(); | ||
.unwrap(); | ||
|
||
assert_eq!(err.to_string(), "This feature is not implemented: SHOW SOMETHING_UNKNOWN not implemented. Supported syntax: SHOW <TABLES>"); | ||
assert_eq!(result.len(), 0); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
currently we cannot know whether a variable exists or not, it will always show 0 rows instead while trying to show non-existing variable
❯ show a;
0 rows in set. Query took 0.003 seconds.
one way to solve this is to generate a ConfigOptions
here, however config.rs is currently in core lib, could not be used by datafusion_sql due to the dependency cycle
datafusion/sql/src/planner.rs
Outdated
fn show_variable_to_plan(&self, variable: &[Ident]) -> Result<LogicalPlan> { | ||
let variable = ObjectName(variable.to_vec()).to_string(); | ||
Err(DataFusionError::NotImplemented(format!( | ||
"SHOW {} not implemented. Supported syntax: SHOW <TABLES>", | ||
variable | ||
))) | ||
|
||
if !self.has_table("information_schema", "settings") { | ||
return Err(DataFusionError::Plan( | ||
"SHOW [VARIABLE] is not supported unless information_schema is enabled" | ||
.to_string(), | ||
)); | ||
} | ||
|
||
let query = if variable.to_lowercase() == "all" { | ||
String::from("SELECT name, setting FROM information_schema.settings") | ||
} else { | ||
format!( | ||
"SELECT name, setting FROM information_schema.settings WHERE name = '{}'", | ||
variable | ||
) | ||
}; | ||
|
||
let mut rewrite = DFParser::parse_sql(&query)?; | ||
assert_eq!(rewrite.len(), 1); | ||
|
||
self.statement_to_plan(rewrite.pop_front().unwrap()) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
show [Variable]
will be converted to the select
statement
Codecov Report
@@ Coverage Diff @@
## master #3455 +/- ##
==========================================
+ Coverage 85.71% 85.74% +0.03%
==========================================
Files 298 299 +1
Lines 54882 55224 +342
==========================================
+ Hits 47040 47354 +314
- Misses 7842 7870 +28
📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
THanks @waitingkuo -- I think this PR could be merged as is but I have a question about tests and a suggestion on naming that I think are worth considering
@@ -436,15 +444,40 @@ async fn information_schema_show_table_table_names() { | |||
); | |||
} | |||
|
|||
//#[tokio::test] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
did you mean to leave this commented out?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, i thought we might have a way to check whether a variable exists or not in the future
const INFORMATION_SCHEMA: &str = "information_schema"; | ||
const TABLES: &str = "tables"; | ||
const VIEWS: &str = "views"; | ||
const COLUMNS: &str = "columns"; | ||
const SETTINGS: &str = "settings"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think settings
is an "offical" information_schema (if there is such a thing) table. Perhaps we could call this table something like df_settings
to make it clear that this table is a datafusion extension?
https://www.postgresql.org/docs/current/information-schema.html
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no it's not. postgresl has this
willy=# select name, setting from pg_settings limit 1;
name | setting
-------------------------+---------
allow_system_table_mods | off
(1 row)
df_settings sounds good. should we leave it in information_schema, or move it out?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
df_settings sounds good. should we leave it in information_schema, or move it out?
I don't have any preference -- whatever is easier at first (information_schema
seems reasonable to me)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@alamb updated
❯ show tables;
+---------------+--------------------+-------------+------------+
| table_catalog | table_schema | table_name | table_type |
+---------------+--------------------+-------------+------------+
| datafusion | information_schema | tables | VIEW |
| datafusion | information_schema | views | VIEW |
| datafusion | information_schema | columns | VIEW |
| datafusion | information_schema | df_settings | VIEW |
+---------------+--------------------+-------------+------------+
4 rows in set. Query took 0.003 seconds.
❯ select * from information_schema.df_settings;
+-------------------------------------------------+---------+
| name | setting |
+-------------------------------------------------+---------+
| datafusion.execution.coalesce_batches | true |
| datafusion.explain.physical_plan_only | false |
| datafusion.execution.coalesce_target_batch_size | 4096 |
| datafusion.execution.batch_size | 8192 |
| datafusion.optimizer.skip_failed_rules | true |
| datafusion.optimizer.filter_null_join_keys | false |
| datafusion.explain.logical_plan_only | false |
+-------------------------------------------------+---------+
8 rows in set. Query took 0.002 seconds.
a504c83
to
0aa1055
Compare
const INFORMATION_SCHEMA: &str = "information_schema"; | ||
const TABLES: &str = "tables"; | ||
const VIEWS: &str = "views"; | ||
const COLUMNS: &str = "columns"; | ||
const DF_SETTINGS: &str = "df_settings"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rename settings -> df_settings
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great to me - thanks @waitingkuo
Benchmark runs are scheduled for baseline = 9fbee1a and contender = 0388682. 0388682 is a master commit associated with this PR. Results will be available as each benchmark for each run completes. |
Which issue does this PR close?
Closes #3364
Rationale for this change
To Support ShowVariable
What changes are included in this PR?
Support
SHOW ALL;
This will show all the options in
ConfigOptions
Support
SHOW [VARIABLE];
Are there any user-facing changes?
Show [Variable];
works now