Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support identifiers with . in them #1449

Merged
merged 6 commits into from
Dec 15, 2021
Merged

Conversation

alamb
Copy link
Contributor

@alamb alamb commented Dec 14, 2021

Which issue does this PR close?

Closes #1432

Rationale for this change

Field names containing period such as f.c1 cannot be named in SQL query

What changes are included in this PR?

  1. Directly construct Expr::Column references rather than trying to parse a string using col() when converting SQL

Are there any user-facing changes?

columns with periods can now be used

@github-actions github-actions bot added datafusion Changes in the datafusion crate sql SQL Planner labels Dec 14, 2021
// interpret names with '.' as if they were
// compound indenfiers, but this is not a compound
// identifier. (e.g. it is "foo.bar" not foo.bar)
Ok(Expr::Column(Column {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the core change -- col() attempts to parse a string as a potentially compound identifier. For a SQLExpr::Identifier this should not be done and the value should be used as an unqualified relation

@@ -1062,8 +1062,7 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> {
}

let field = schema.field(field_index - 1);
let col_ident = SQLExpr::Identifier(Ident::new(field.qualified_name()));
self.sql_expr_to_logical_expr(&col_ident, schema)?
Expr::Column(field.qualified_column())
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change avoids trying to reparse the string as a column name, and instead just creates the Column reference directly

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍, FYI @hntd187

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is much better than what I had :-)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are all learning together!

@alamb alamb requested a review from houqp December 14, 2021 22:51
@alamb
Copy link
Contributor Author

alamb commented Dec 14, 2021

FYI @liukun4515

Copy link
Member

@houqp houqp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Simple and elegant fix, thanks @alamb !

@houqp houqp added the bug Something isn't working label Dec 14, 2021
@alamb alamb mentioned this pull request Dec 14, 2021
let actual = execute_to_batches(&mut ctx, &sql).await;
let expected = vec![
"+--------+",
"| f.c1 |",
Copy link
Member

@jimexist jimexist Dec 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i wonder if it is okay to create column "...."

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is "ok" from the point of view of DataFusion should run the query. I don't think anyone should actually do it 😆

Added a test for this case in 3e4418c

Copy link
Contributor

@liukun4515 liukun4515 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Member

@xudong963 xudong963 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks @alamb

ctx.register_table("test", Arc::new(table))?;

// referring to the unquoted column is an error
let sql = format!(r#"SELECT f1.c1 from test"#);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clippy: useless_format

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will fix

@@ -1062,8 +1062,7 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> {
}

let field = schema.field(field_index - 1);
let col_ident = SQLExpr::Identifier(Ident::new(field.qualified_name()));
self.sql_expr_to_logical_expr(&col_ident, schema)?
Expr::Column(field.qualified_column())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍, FYI @hntd187

@alamb alamb merged commit 0052667 into apache:master Dec 15, 2021
@alamb alamb deleted the alamb/fix_identifier branch December 15, 2021 20:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working datafusion Changes in the datafusion crate sql SQL Planner
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Field names containing period such as f.c1 cannot be named in SQL query
6 participants