[SPARK-40360] ALREADY_EXISTS and NOT_FOUND exceptions #37887
Can one of the admins verify this patch?
@MaxGekk @cloud-fan
What I have NOT done and need help with:
thanks, merging to master!
    class IndexAlreadyExistsException(message: String, cause: Option[Throwable] = None)
    -  extends AnalysisException(message, cause = cause)
    +  extends AnalysisException(errorClass = "INDEX_NOT_FOUND",
Should this be INDEX_ALREADY_EXISTS?
This has been fixed in another PR as far as I remember. Please, check the master branch.
### Description

Supports the new error messages.

`SparkAdapter.get_columns_in_relation` checks the error message raised when the specified table or view doesn't exist:

https://github.com/dbt-labs/dbt-spark/blob/c87b6b2c48bcefb0ce52cd64984d3129d6f14ea0/dbt/adapters/spark/impl.py#L223

However, Spark will change that error message in a future release (apache/spark#37887), which will cause the function to raise `dbt.exceptions.RuntimeException` instead of returning an empty list. The function should also check whether the error message contains `[TABLE_OR_VIEW_NOT_FOUND]`.

This will be reverted once dbt-labs/dbt-spark#515 is resolved.
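A minimal sketch of the check this description asks for, not the actual dbt-spark code. The names `is_missing_relation_error` and `columns_or_empty` are hypothetical, `RuntimeError` stands in for `dbt.exceptions.RuntimeException`, and the two legacy substrings are assumptions about the older Spark messages. Only `[TABLE_OR_VIEW_NOT_FOUND]` is taken from the text above.

```python
# Marker named in the description above (new Spark error class).
NEW_MARKERS = ("[TABLE_OR_VIEW_NOT_FOUND]",)
# Assumed substrings of the older messages; adjust to what your Spark emits.
LEGACY_MARKERS = ("Table or view not found", "NoSuchTableException")


def is_missing_relation_error(errmsg: str) -> bool:
    """Return True if the message looks like a 'relation does not exist' error."""
    return any(marker in errmsg for marker in NEW_MARKERS + LEGACY_MARKERS)


def columns_or_empty(fetch_columns, relation):
    """Hypothetical wrapper: return [] for a missing relation, re-raise otherwise.

    `fetch_columns` is any callable that takes the relation and returns a list
    of columns. RuntimeError is a stand-in for dbt.exceptions.RuntimeException.
    """
    try:
        return fetch_columns(relation)
    except RuntimeError as exc:
        if is_missing_relation_error(str(exc)):
            return []
        raise
```

Matching on both the old and the new text keeps the adapter working across Spark versions, at the cost of depending on message wording. That dependency is presumably why the description treats this as a temporary measure.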
### What changes were proposed in this pull request?

This PR introduces the following error classes:

- PARTITIONS_ALREADY_EXIST
  Cannot ADD or RENAME TO partition(s) `<partitionList>` in table `<tableName>` because they already exist.
  Choose a different name, drop the existing partition, or add the IF NOT EXISTS clause to tolerate a pre-existing partition.
- PARTITIONS_NOT_FOUND
  The partition(s) `<partitionList>` cannot be found in table `<tableName>`.
  Verify the partition specification and table name.
  To tolerate the error on drop use ALTER TABLE … DROP IF EXISTS PARTITION.
- ROUTINE_ALREADY_EXISTS
  Cannot create the function `<routineName>` because it already exists.
  Choose a different name, drop or replace the existing function, or add the IF NOT EXISTS clause to tolerate a pre-existing function.
- ROUTINE_NOT_FOUND
  The function `<routineName>` cannot be found. Verify the spelling and correctness of the schema and catalog.
  If you did not qualify the name with a schema and catalog, verify the current_schema() output, or qualify the name with the correct schema and catalog.
  To tolerate the error on drop use DROP FUNCTION IF EXISTS.
- SCHEMA_ALREADY_EXISTS
  Cannot create schema `<schemaName>` because it already exists.
  Choose a different name, drop the existing schema, or add the IF NOT EXISTS clause to tolerate a pre-existing schema.
- SCHEMA_NOT_EMPTY
  Cannot drop a schema `<schemaName>` because it contains objects.
  Use DROP SCHEMA ... CASCADE to drop the schema and all its objects.
- SCHEMA_NOT_FOUND
  The schema `<schemaName>` cannot be found. Verify the spelling and correctness of the schema and catalog.
  If you did not qualify the name with a catalog, verify the current_schema() output, or qualify the name with the correct catalog.
  To tolerate the error on drop use DROP SCHEMA IF EXISTS.
- TABLE_OR_VIEW_ALREADY_EXISTS
  Cannot create table or view `<relationName>` because it already exists.
  Choose a different name, drop or replace the existing object, or add the IF NOT EXISTS clause to tolerate pre-existing objects.
- TABLE_OR_VIEW_NOT_FOUND
  The table or view `<relationName>` cannot be found. Verify the spelling and correctness of the schema and catalog.
  If you did not qualify the name with a schema, verify the current_schema() output, or qualify the name with the correct schema and catalog.
  To tolerate the error on drop use DROP VIEW IF EXISTS or DROP TABLE IF EXISTS.
- TEMP_TABLE_OR_VIEW_ALREADY_EXISTS
  Cannot create the temporary view `<relationName>` because it already exists.
  Choose a different name, drop or replace the existing view, or add the IF NOT EXISTS clause to tolerate pre-existing views.

Also (for JDBC data sources):

- INDEX_ALREADY_EXISTS
  Cannot create the index because it already exists. `<message>`.
- INDEX_NOT_FOUND
  Cannot find the index. `<message>`.

Some background:

* We use ROUTINE over FUNCTION to be future proof, if/when PROCEDUREs appear.
* We coarsify around TABLE_OR_VIEW_NOT_FOUND and TABLE_OR_VIEW_ALREADY_EXISTS (getting rid of dedicated reasons such as RENAME TABLE, etc.).
* We combine PARTITION and PARTITIONS errors.
* I use SCHEMA religiously. A debate can be had whether/how/when to return NAMESPACE.

There is currently one failure caused by:
https://issues.apache.org/jira/browse/SPARK-40521
Hive-based ALTER TABLE ADD PARTITION returns too many partitions in case of PARTITIONS_ALREADY_EXIST.

### Why are the changes needed?

We want to convert all errors to use the error-class framework.

### Does this PR introduce _any_ user-facing change?

Yes, we are moving away from "free text" and consolidating errors in error-classes.json.
This hardens the QA and code, allowing us to improve error messages without breaking changes.

### How was this patch tested?

Ran the existing QA suite.

Closes apache#37887 from srielau/SPARK-40360-Convert-some-ddl-mesages.

Authored-by: Serge Rielau <serge.rielau@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
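To illustrate why moving from free text to error classes makes messages easier to change safely, here is a small sketch of filling `<param>` placeholders in a template keyed by error class. It is an illustration only, not Spark's implementation: `TEMPLATES`, `render_error`, and the `[CLASS] ` prefix format are assumptions, and the dict is just a stand-in for error-classes.json, whose real layout is not shown here. The two templates are copied from the list above.

```python
import re

# Stand-in for error-classes.json (hypothetical layout, two entries only).
TEMPLATES = {
    "SCHEMA_NOT_EMPTY": (
        "Cannot drop a schema <schemaName> because it contains objects. "
        "Use DROP SCHEMA ... CASCADE to drop the schema and all its objects."
    ),
    "INDEX_NOT_FOUND": "Cannot find the index. <message>.",
}

# Matches placeholders such as <schemaName> or <message>.
_PLACEHOLDER = re.compile(r"<(\w+)>")


def render_error(error_class: str, params: dict) -> str:
    """Fill the template for `error_class` and prefix it with the class name."""
    template = TEMPLATES[error_class]

    def fill(match):
        # A missing parameter raises KeyError instead of leaving "<name>" behind.
        return str(params[match.group(1)])

    return f"[{error_class}] " + _PLACEHOLDER.sub(fill, template)
```

With this shape, callers and tests can key on the error class and parameters, so the wording of a template can be revised without breaking code that matches on the class name.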