-
Notifications
You must be signed in to change notification settings - Fork 28.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-48668][SQL] Support ALTER NAMESPACE ... UNSET PROPERTIES in v2 #47038
Conversation
case class UnsetNamespaceProperties( | ||
namespace: LogicalPlan, | ||
propertyKeys: Seq[String], | ||
ifExists: Boolean) extends UnaryCommand { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can probably use RunnableCommand
to simplify the v2 command implementation. We can add a UnaryRunnableCommand
, and then do
case class UnsetNamespacePropertiesCommand(...) extends UnaryRunnableCommand {
...
def run(...) {
val ResolvedIdentifier(catalog, ident) = child
...
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, let me give it a try, thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@cloud-fan I have improved it in the way you suggested.
@@ -51,6 +51,7 @@ trait RunnableCommand extends Command { | |||
} | |||
|
|||
trait LeafRunnableCommand extends RunnableCommand with LeafLike[LogicalPlan] | |||
trait UnaryRunnableCommand extends RunnableCommand with UnaryLike[LogicalPlan] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because RunnableCommand
is defined in the module sql/core
, it seems that adding UnaryRunnableCommand
here is more appropriate.
@@ -2659,12 +2659,12 @@ private[sql] object QueryCompilationErrors extends QueryErrorsBase with Compilat | |||
} | |||
|
|||
def unsetNonExistentPropertiesError( | |||
properties: Seq[String], table: TableIdentifier): Throwable = { | |||
properties: Seq[String], nameParts: Seq[String]): Throwable = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reuse UNSET_NONEXISTENT_PROPERTIES
@@ -98,3 +83,20 @@ class ResolveCatalogs(val catalogManager: CatalogManager) | |||
} | |||
} | |||
} | |||
|
|||
object ResolveCatalogs { | |||
def resolveNamespace( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Extract method resolveNamespace
from class ResolveCatalogs
to object ResolveCatalogs
and reuse it in UnsetNamespacePropertiesCommand
@@ -4231,7 +4231,7 @@ | |||
}, | |||
"UNSET_NONEXISTENT_PROPERTIES" : { | |||
"message" : [ | |||
"Attempted to unset non-existent properties [<properties>] in table <table>." | |||
"Attempted to unset non-existent properties [<properties>] in relation <relationId>." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I want to reuse it in the namespace
scene
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but namespace is not a relation... maybe just table or namespace <name>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea, okay.
* ALTER (DATABASE|SCHEMA|NAMESPACE) ... UNSET (DBPROPERTIES|PROPERTIES) [IF EXISTS] ...; | ||
* }}} | ||
*/ | ||
case class UnsetNamespacePropertiesCommand( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have to define UnsetNamespacePropertiesCommand
in module sql/core
because it inherits UnaryRunnableCommand extends RunnableCommand
and cannot be accessed in the module sql/catalyst
(sql/core
depends on sql/catalyst
)
@@ -1098,4 +1098,24 @@ class SparkSqlAstBuilder extends AstBuilder { | |||
|
|||
(ctx.LOCAL != null, finalStorage, Some(DDLUtils.HIVE_PROVIDER)) | |||
} | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have to put the parse process of UnsetNamespacePropertiesCommand
here, because UnsetNamespacePropertiesCommand
can only be defined within the module sql/core
, and AstBuilder
cannot access the class UnsetNamespacePropertiesCommand
within the module
override def run(sparkSession: SparkSession): Seq[Row] = { | ||
val ResolvedIdentifier(catalog, ident) = child | ||
val ns = ResolveCatalogs.resolveNamespace( | ||
catalog, ident.namespace.toSeq :+ ident.name, fetchMetadata = true) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh, we should leave this to the analyzer. We should put UnresolvedNamespace
inside UnsetNamespacePropertiesCommand
, instead of UnresolvedIdentifier
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, let me fix it.
Also, I found an issue during the migration of Unify v1 and v2 ALTER TABLE .. UNSET TBLPROPERTIES IF EXISTS tests
.
The command (AlterTableExec
) for ALTER TABLE ... UNSET TBLPROPERTIES ...
in v2 seems to have ignored whether the IF EXISTS
parameter.
I think we should fix it. WDYH?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh, we should leave this to the analyzer. We should put
UnresolvedNamespace
insideUnsetNamespacePropertiesCommand
, instead ofUnresolvedIdentifier
Thanks, updated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should fix it. WDYH?
Yea we should fix, but in a separated PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, let me do it in another separate PR.
@@ -4411,6 +4411,11 @@ | |||
"<functionName> with AES-<mode> does not support initialization vectors (IVs)." | |||
] | |||
}, | |||
"ALTER_NAMESPACE_PROPERTY" : { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rename SET_NAMESPACE_PROPERTY
to ALTER_NAMESPACE_PROPERTY
, We will reuse it in SparkSqlParser
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In SQL, SET also means update, I think it's fine
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because in the following two cases, it is UNSUPPORTED_FEATURE.SET_NAMESPACE_PROPERTY
, and I am concerned that it may cause misunderstandings.
test reserved properties
sql(s"CREATE NAMESPACE $ns")
val sqlText = s"ALTER NAMESPACE $ns SET PROPERTIES ('$key'='dummyVal')"
sql(s"CREATE NAMESPACE $ns")
val sqlText = s"ALTER NAMESPACE $ns UNSET PROPERTIES ('$key')"
Okay, let me restore it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what are the corresponding errors for table properties?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The corresponding errors for table properties: UNSUPPORTED_FEATURE.SET_TABLE_PROPERTY
, eg:
ALTER TABLE ... SET PROPERTIES ('$reservedKey'='...')"
ALTER TABLE ... UNSET PROPERTIES ('$reservedKey')"
sql(s"ALTER NAMESPACE $ns SET PROPERTIES ('$key'='dummyVal')") | ||
} | ||
assert(exception.getMessage.contains(s"$key is a reserved namespace property")) | ||
val sqlText = s"ALTER NAMESPACE $ns SET PROPERTIES ('$key'='dummyVal')" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not related to this PR,
Only let's use checkErrorMatchPVals
to check for error-conditions
import org.apache.spark.sql.errors.QueryCompilationErrors | ||
|
||
/** | ||
* A command that ALTER NAMESPACE UNSET PROPERTIES command. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please fix the grammar.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
.../src/main/scala/org/apache/spark/sql/execution/command/UnsetNamespacePropertiesCommand.scala
Show resolved
Hide resolved
@@ -25,7 +25,7 @@ license: | | |||
`DATABASE`, `SCHEMA` and `NAMESPACE` are interchangeable and one can be used in place of the others. An error message | |||
is issued if the database is not found in the system. | |||
|
|||
### ALTER PROPERTIES | |||
### ALTER SET PROPERTIES |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since the entire page is for ALTER DATABASE
, I think SET PROPERTIES
is a better section title.
@@ -43,6 +43,23 @@ ALTER { DATABASE | SCHEMA | NAMESPACE } database_name | |||
|
|||
Specifies the name of the database to be altered. | |||
|
|||
### ALTER UNSET PROPERTIES |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we also add examples for this new syntax, in the ### Examples
section?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
* **database_name** | ||
|
||
Specifies the name of the database to be altered. | ||
|
||
### ALTER LOCATION |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
/** | ||
* Create a [[UnsetNamespacePropertiesCommand]] command. | ||
* | ||
* For example: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
* For example: | |
* Expected format: |
* For example: | ||
* {{{ | ||
* ALTER (DATABASE|SCHEMA|NAMESPACE) database | ||
* UNSET (DBPROPERTIES | PROPERTIES) ('comment', 'key'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
* UNSET (DBPROPERTIES | PROPERTIES) ('comment', 'key'); | |
* UNSET (DBPROPERTIES | PROPERTIES) ('key1', 'key2'); |
@cloud-fan |
thanks, merging to master! |
### What changes were proposed in this pull request? The pr aims to support `ALTER NAMESPACE ... UNSET PROPERTIES` in `v2`. ### Why are the changes needed? - For `table` and `view`, we can `add`, `update`, or `delete` table's `properties` using the following SQL: ``` ALTER (TABLE | VIEW) ... SET TBLPROPERTIES ... ALTER (TABLE | VIEW) ... UNSET TBLPROPERTIES (IF EXISTS)? ... ``` https://github.com/apache/spark/blob/3469ec6b41967b1b4c7b2549174ed0c199815977/sql/api/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBaseParser.g4#L148-L151 - But at the SQL level, there is only `SET` syntax for `namespace`, not `UNSET` syntax ``` ALTER namespace ... SET (DBPROPERTIES | PROPERTIES) ... ``` https://github.com/apache/spark/blob/3469ec6b41967b1b4c7b2549174ed0c199815977/sql/api/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBaseParser.g4#L106-L107 - In addition, the underlying `SupportsNamespaces` interface supports deleting properties. I propose adding SQL syntax to facilitate users to use SQL instead of relying solely on APIs to manipulate the properties of `namespace` https://github.com/apache/spark/blob/3469ec6b41967b1b4c7b2549174ed0c199815977/sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/SupportsNamespaces.java#L127-L137 https://github.com/apache/spark/blob/3469ec6b41967b1b4c7b2549174ed0c199815977/sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/NamespaceChange.java#L59-L61 ### Does this PR introduce _any_ user-facing change? Yes, end users can delete the properties of `namespace` through SQL. ### How was this patch tested? Add new UT. ### Was this patch authored or co-authored using generative AI tooling? No. Closes apache#47038 from panbingkun/SPARK-48668. Authored-by: panbingkun <panbingkun@baidu.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com>
} | ||
checkError(e, | ||
errorClass = "SCHEMA_NOT_FOUND", | ||
parameters = Map("schemaName" -> s"`$ns`")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just noticed one thing: why does the error message omit the catalog name? For table not found, the error message contains the user-given table name in the SQL statement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me investigate it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-
v1 catalog
spark/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala
Line 316 in 4c99c4d
requireDbExists(dbName)
spark/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala
Line 253 in 4c99c4d
throw new NoSuchDatabaseException(db) -
v2 catalog
spark/sql/catalyst/src/test/scala/org/apache/spark/sql/connector/catalog/InMemoryTableCatalog.scala
Lines 206 to 213 in 4c99c4d
override def loadNamespaceMetadata(namespace: Array[String]): util.Map[String, String] = { Option(namespaces.get(namespace.toSeq)) match { case Some(metadata) => metadata.asJava case _ if namespaceExists(namespace) => util.Collections.emptyMap[String, String] case _ => throw new NoSuchNamespaceException(namespace)
From the user's perspective (when he uses multiple catalogs
for federated
queries and encounters some tables that do not exist), we should fix it, WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, do we also need to fix similar scenarios (when table
not exist) below together?
spark/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala
Lines 257 to 267 in 4c99c4d
private def requireTableExists(name: TableIdentifier): Unit = { | |
if (!tableExists(name)) { | |
throw new NoSuchTableException(db = name.database.get, table = name.table) | |
} | |
} | |
private def requireTableNotExists(name: TableIdentifier): Unit = { | |
if (tableExists(name)) { | |
throw new TableAlreadyExistsException(db = name.database.get, table = name.table) | |
} | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@cloud-fan Initial PR, just attach catalog name
before the namespace
. If this way is ok, I will continue to add table related as well.
#47276
Thanks!
…pace` to `catalog.namespace` ### What changes were proposed in this pull request? The pr aims to change the value of `SCHEMA_NOT_FOUND` from `namespace` to `catalog.namespace`. ### Why are the changes needed? As discussing #47038 (comment), we should provide more friendly and clear prompt error message. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Update existed UT & Pass GA. ### Was this patch authored or co-authored using generative AI tooling? No. Closes #47276 from panbingkun/db_with_catalog. Authored-by: panbingkun <panbingkun@baidu.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com>
…pace` to `catalog.namespace` ### What changes were proposed in this pull request? The pr aims to change the value of `SCHEMA_NOT_FOUND` from `namespace` to `catalog.namespace`. ### Why are the changes needed? As discussing apache#47038 (comment), we should provide more friendly and clear prompt error message. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Update existed UT & Pass GA. ### Was this patch authored or co-authored using generative AI tooling? No. Closes apache#47276 from panbingkun/db_with_catalog. Authored-by: panbingkun <panbingkun@baidu.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com>
…pace` to `catalog.namespace` ### What changes were proposed in this pull request? The pr aims to change the value of `SCHEMA_NOT_FOUND` from `namespace` to `catalog.namespace`. ### Why are the changes needed? As discussing apache#47038 (comment), we should provide more friendly and clear prompt error message. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Update existed UT & Pass GA. ### Was this patch authored or co-authored using generative AI tooling? No. Closes apache#47276 from panbingkun/db_with_catalog. Authored-by: panbingkun <panbingkun@baidu.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com>
### What changes were proposed in this pull request? The pr aims to support `ALTER NAMESPACE ... UNSET PROPERTIES` in `v2`. ### Why are the changes needed? - For `table` and `view`, we can `add`, `update`, or `delete` table's `properties` using the following SQL: ``` ALTER (TABLE | VIEW) ... SET TBLPROPERTIES ... ALTER (TABLE | VIEW) ... UNSET TBLPROPERTIES (IF EXISTS)? ... ``` https://github.com/apache/spark/blob/3469ec6b41967b1b4c7b2549174ed0c199815977/sql/api/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBaseParser.g4#L148-L151 - But at the SQL level, there is only `SET` syntax for `namespace`, not `UNSET` syntax ``` ALTER namespace ... SET (DBPROPERTIES | PROPERTIES) ... ``` https://github.com/apache/spark/blob/3469ec6b41967b1b4c7b2549174ed0c199815977/sql/api/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBaseParser.g4#L106-L107 - In addition, the underlying `SupportsNamespaces` interface supports deleting properties. I propose adding SQL syntax to facilitate users to use SQL instead of relying solely on APIs to manipulate the properties of `namespace` https://github.com/apache/spark/blob/3469ec6b41967b1b4c7b2549174ed0c199815977/sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/SupportsNamespaces.java#L127-L137 https://github.com/apache/spark/blob/3469ec6b41967b1b4c7b2549174ed0c199815977/sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/NamespaceChange.java#L59-L61 ### Does this PR introduce _any_ user-facing change? Yes, end users can delete the properties of `namespace` through SQL. ### How was this patch tested? Add new UT. ### Was this patch authored or co-authored using generative AI tooling? No. Closes apache#47038 from panbingkun/SPARK-48668. Authored-by: panbingkun <panbingkun@baidu.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com>
…pace` to `catalog.namespace` ### What changes were proposed in this pull request? The pr aims to change the value of `SCHEMA_NOT_FOUND` from `namespace` to `catalog.namespace`. ### Why are the changes needed? As discussing apache#47038 (comment), we should provide more friendly and clear prompt error message. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Update existed UT & Pass GA. ### Was this patch authored or co-authored using generative AI tooling? No. Closes apache#47276 from panbingkun/db_with_catalog. Authored-by: panbingkun <panbingkun@baidu.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com>
…pace` to `catalog.namespace` ### What changes were proposed in this pull request? The pr aims to change the value of `SCHEMA_NOT_FOUND` from `namespace` to `catalog.namespace`. ### Why are the changes needed? As discussing apache#47038 (comment), we should provide more friendly and clear prompt error message. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Update existed UT & Pass GA. ### Was this patch authored or co-authored using generative AI tooling? No. Closes apache#47276 from panbingkun/db_with_catalog. Authored-by: panbingkun <panbingkun@baidu.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com>
What changes were proposed in this pull request?
The pr aims to support
ALTER NAMESPACE ... UNSET PROPERTIES
inv2
.Why are the changes needed?
table
andview
, we canadd
,update
, ordelete
table'sproperties
using the following SQL:spark/sql/api/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBaseParser.g4
Lines 148 to 151 in 3469ec6
SET
syntax fornamespace
, notUNSET
syntaxspark/sql/api/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBaseParser.g4
Lines 106 to 107 in 3469ec6
SupportsNamespaces
interface supports deleting properties. I propose adding SQL syntax to facilitate users to use SQL instead of relying solely on APIs to manipulate the properties ofnamespace
spark/sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/SupportsNamespaces.java
Lines 127 to 137 in 3469ec6
spark/sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/NamespaceChange.java
Lines 59 to 61 in 3469ec6
Does this PR introduce any user-facing change?
Yes, end users can delete the properties of
namespace
through SQL.How was this patch tested?
Add new UT.
Was this patch authored or co-authored using generative AI tooling?
No.