[SPARK-40360] ALREADY_EXISTS and NOT_FOUND exceptions
### What changes were proposed in this pull request?

This PR introduces the following error classes:

- PARTITIONS_ALREADY_EXIST
  Cannot ADD or RENAME TO partition(s) <partitionList> in table <tableName> because they already exist.
  Choose a different name, drop the existing partition, or add the IF NOT EXISTS clause to tolerate a pre-existing partition

- PARTITIONS_NOT_FOUND
  The partition(s) <partitionList> cannot be found in table <tableName>.
  Verify the partition specification and table name.
  To tolerate the error on drop use ALTER TABLE … DROP IF EXISTS PARTITION.

- ROUTINE_ALREADY_EXISTS
  Cannot create the function <routineName> because it already exists.
  Choose a different name, drop or replace the existing function, or add the IF NOT EXISTS clause to tolerate a pre-existing function

- ROUTINE_NOT_FOUND
  The function <routineName> cannot be found. Verify the spelling and correctness of the schema and catalog.
  If you did not qualify the name with a schema and catalog, verify the current_schema() output, or qualify the name with the correct schema and catalog.
  To tolerate the error on drop use DROP FUNCTION IF EXISTS

- SCHEMA_ALREADY_EXISTS
  Cannot create schema <schemaName> because it already exists.
  Choose a different name, drop the existing schema, or add the IF NOT EXISTS clause to tolerate pre-existing schema

- SCHEMA_NOT_EMPTY
  Cannot drop a schema <schemaName> because it contains objects.
  Use DROP SCHEMA ... CASCADE to drop the schema and all its objects.

- SCHEMA_NOT_FOUND
  The schema <schemaName> cannot be found. Verify the spelling and correctness of the schema and catalog.
  If you did not qualify the name with a catalog, verify the current_schema() output, or qualify the name with the correct catalog.
  To tolerate the error on drop use DROP SCHEMA IF EXISTS.

- TABLE_OR_VIEW_ALREADY_EXISTS
  Cannot create table or view <relationName> because it already exists.
  Choose a different name, drop or replace the existing object, or add the IF NOT EXISTS clause to tolerate pre-existing objects

- TABLE_OR_VIEW_NOT_FOUND
  The table or view <relationName> cannot be found. Verify the spelling and correctness of the schema and catalog.
  If you did not qualify the name with a schema, verify the current_schema() output, or qualify the name with the correct schema and catalog.
  To tolerate the error on drop use DROP VIEW IF EXISTS or DROP TABLE IF EXISTS.

- TEMP_TABLE_OR_VIEW_ALREADY_EXISTS
  Cannot create the temporary view <relationName> because it already exists.
  Choose a different name, drop or replace the existing view, or add the IF NOT EXISTS clause to tolerate pre-existing views.

Also (for JDBC data sources):

- INDEX_ALREADY_EXISTS
  Cannot create the index because it already exists. <message>.

- INDEX_NOT_FOUND
  Cannot find the index. <message>.
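
To illustrate how these classes surface at runtime, here is a minimal sketch assuming a local SparkSession; the object and schema names are made up for the example:

```scala
import org.apache.spark.sql.{AnalysisException, SparkSession}

object ErrorClassDemo extends App {
  val spark = SparkSession.builder().master("local[1]").getOrCreate()

  spark.sql("CREATE SCHEMA demo_schema")
  try {
    // The second CREATE fails with a stable, machine-readable error class.
    spark.sql("CREATE SCHEMA demo_schema")
  } catch {
    case e: AnalysisException =>
      println(e.getErrorClass) // expected: SCHEMA_ALREADY_EXISTS
  }
  // As the message text suggests, IF NOT EXISTS tolerates the pre-existing schema.
  spark.sql("CREATE SCHEMA IF NOT EXISTS demo_schema")

  spark.stop()
}
```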

Some background:
* We use ROUTINE over FUNCTION to be future-proof if/when PROCEDUREs appear.
* We coarsen the granularity to TABLE_OR_VIEW_NOT_FOUND and TABLE_OR_VIEW_ALREADY_EXISTS (getting rid of dedicated reasons such as RENAME TABLE, etc.).
* We combine the PARTITION and PARTITIONS errors.
* I use SCHEMA religiously. A debate can be had over whether/how/when to return NAMESPACE.

There is currently one test failure, caused by:

https://issues.apache.org/jira/browse/SPARK-40521
Hive-based ALTER TABLE ADD PARTITION returns too many partitions in the case of PARTITIONS_ALREADY_EXIST.

### Why are the changes needed?
We want to convert all errors to use the error-class framework.

### Does this PR introduce _any_ user-facing change?

Yes, we are moving away from "free text" and consolidating errors in error-classes.json.
This hardens the QA and the code, allowing us to improve error messages without breaking changes.
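
For reference, the consolidated entries in error-classes.json take roughly this shape (a sketch assembled from the message templates above; the exact JSON layout and any extra fields such as sqlState are assumptions):

```json
"SCHEMA_ALREADY_EXISTS" : {
  "message" : [
    "Cannot create schema <schemaName> because it already exists.",
    "Choose a different name, drop the existing schema, or add the IF NOT EXISTS clause to tolerate pre-existing schema."
  ]
}
```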

### How was this patch tested?

Run existing QA suite
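
The migrated suites assert on the error class and its message parameters rather than on raw text, roughly in this shape (a sketch modeled on the test changes below; the parameter name follows the <relationName> placeholder above):

```scala
test("TABLE_OR_VIEW_NOT_FOUND carries the relation name") {
  val e = intercept[AnalysisException] {
    sql("SELECT * FROM nonexistent_table")
  }
  checkError(e,
    errorClass = "TABLE_OR_VIEW_NOT_FOUND",
    parameters = Map("relationName" -> "`nonexistent_table`"))
}
```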

Closes #37887 from srielau/SPARK-40360-Convert-some-ddl-mesages.

Authored-by: Serge Rielau <serge.rielau@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
srielau authored and cloud-fan committed Oct 18, 2022
1 parent d0ab83c commit e7fbefe
Showing 103 changed files with 1,656 additions and 641 deletions.
16 changes: 7 additions & 9 deletions R/pkg/tests/fulltests/test_sparkSQL.R
@@ -722,7 +722,7 @@ test_that("test tableExists, cache, uncache and clearCache", {
   clearCache()
 
   expect_error(uncacheTable("zxwtyswklpf"),
-               "Error in uncacheTable : analysis error - Table or view not found: zxwtyswklpf")
+               "[TABLE_OR_VIEW_NOT_FOUND]*`zxwtyswklpf`*")
 
   expect_true(tableExists("table1"))
   expect_true(tableExists("default.table1"))
@@ -3367,8 +3367,8 @@ test_that("approxQuantile() on a DataFrame", {
 
 test_that("SQL error message is returned from JVM", {
   retError <- tryCatch(sql("select * from blah"), error = function(e) e)
-  expect_equal(grepl("Table or view not found", retError), TRUE)
-  expect_equal(grepl("blah", retError), TRUE)
+  expect_equal(grepl("[TABLE_OR_VIEW_NOT_FOUND]", retError), TRUE)
+  expect_equal(grepl("`blah`", retError), TRUE)
 })
 
 irisDF <- suppressWarnings(createDataFrame(iris))
@@ -4077,8 +4077,7 @@ test_that("catalog APIs, currentDatabase, setCurrentDatabase, listDatabases, get
   expect_equal(currentDatabase(), "default")
   expect_error(setCurrentDatabase("default"), NA)
   expect_error(setCurrentDatabase("zxwtyswklpf"),
-               paste0("Error in setCurrentDatabase : no such database - Database ",
-                      "'zxwtyswklpf' not found"))
+               "[SCHEMA_NOT_FOUND]*`zxwtyswklpf`*")
 
   expect_true(databaseExists("default"))
   expect_true(databaseExists("spark_catalog.default"))
@@ -4110,15 +4109,15 @@ test_that("catalog APIs, listTables, getTable, listColumns, listFunctions, funct
   tbs <- collect(tb)
   expect_true(nrow(tbs[tbs$name == "cars", ]) > 0)
   expect_error(listTables("bar"),
-               "Error in listTables : no such database - Database 'bar' not found")
+               "[SCHEMA_NOT_FOUND]*`bar`*")
 
   c <- listColumns("cars")
   expect_equal(nrow(c), 2)
   expect_equal(colnames(c),
                c("name", "description", "dataType", "nullable", "isPartition", "isBucket"))
   expect_equal(collect(c)[[1]][[1]], "speed")
   expect_error(listColumns("zxwtyswklpf", "default"),
-               paste("Table or view not found: spark_catalog.default.zxwtyswklpf"))
+               "[TABLE_OR_VIEW_NOT_FOUND]*`spark_catalog`.`default`.`zxwtyswklpf`*")
 
   f <- listFunctions()
   expect_true(nrow(f) >= 200) # 250
@@ -4127,8 +4126,7 @@ test_that("catalog APIs, listTables, getTable, listColumns, listFunctions, funct
   expect_equal(take(orderBy(filter(f, "className IS NOT NULL"), "className"), 1)$className,
                "org.apache.spark.sql.catalyst.expressions.Abs")
   expect_error(listFunctions("zxwtyswklpf_db"),
-               paste("Error in listFunctions : no such database - Database",
-                     "'zxwtyswklpf_db' not found"))
+               "[SCHEMA_NOT_FOUND]*`zxwtyswklpf_db`*")
 
   expect_true(functionExists("abs"))
   expect_false(functionExists("aabbss"))
@@ -87,10 +87,12 @@ private[v2] trait V2JDBCNamespaceTest extends SharedSparkSession with DockerInte
     }
     assert(catalog.namespaceExists(Array("foo")) === false)
     assert(catalog.listNamespaces() === builtinNamespaces)
-    val msg = intercept[AnalysisException] {
+    val e = intercept[AnalysisException] {
       catalog.listNamespaces(Array("foo"))
-    }.getMessage
-    assert(msg.contains("Namespace 'foo' not found"))
+    }
+    checkError(e,
+      errorClass = "SCHEMA_NOT_FOUND",
+      parameters = Map("schemaName" -> "`foo`"))
   }
 }

@@ -20,8 +20,9 @@ package org.apache.spark.sql.jdbc.v2
 import org.apache.logging.log4j.Level
 
 import org.apache.spark.sql.{AnalysisException, DataFrame}
-import org.apache.spark.sql.catalyst.analysis.{IndexAlreadyExistsException, NoSuchIndexException}
+import org.apache.spark.sql.catalyst.analysis.{IndexAlreadyExistsException, NoSuchIndexException, UnresolvedAttribute}
 import org.apache.spark.sql.catalyst.plans.logical.{Aggregate, Filter, Sample}
+import org.apache.spark.sql.catalyst.util.quoteIdentifier
 import org.apache.spark.sql.connector.catalog.{Catalogs, Identifier, TableCatalog}
 import org.apache.spark.sql.connector.catalog.index.SupportsIndex
 import org.apache.spark.sql.connector.expressions.aggregate.GeneralAggregateFunc
@@ -99,10 +100,12 @@ private[v2] trait V2JDBCTest extends SharedSparkSession with DockerIntegrationFu
       assert(msg.contains("Cannot add column, because C3 already exists"))
     }
     // Add a column to not existing table
-    val msg = intercept[AnalysisException] {
+    val e = intercept[AnalysisException] {
       sql(s"ALTER TABLE $catalogName.not_existing_table ADD COLUMNS (C4 STRING)")
-    }.getMessage
-    assert(msg.contains("Table not found"))
+    }
+    checkErrorTableNotFound(e, s"`$catalogName`.`not_existing_table`",
+      ExpectedContext(s"$catalogName.not_existing_table", 12,
+        11 + s"$catalogName.not_existing_table".length))
   }

test("SPARK-33034: ALTER TABLE ... drop column") {
@@ -120,10 +123,12 @@ private[v2] trait V2JDBCTest extends SharedSparkSession with DockerIntegrationFu
       assert(msg.contains(s"Missing field bad_column in table $catalogName.alt_table"))
     }
     // Drop a column from a not existing table
-    val msg = intercept[AnalysisException] {
+    val e = intercept[AnalysisException] {
       sql(s"ALTER TABLE $catalogName.not_existing_table DROP COLUMN C1")
-    }.getMessage
-    assert(msg.contains("Table not found"))
+    }
+    checkErrorTableNotFound(e, s"`$catalogName`.`not_existing_table`",
+      ExpectedContext(s"$catalogName.not_existing_table", 12,
+        11 + s"$catalogName.not_existing_table".length))
   }

test("SPARK-33034: ALTER TABLE ... update column type") {
@@ -136,10 +141,12 @@ private[v2] trait V2JDBCTest extends SharedSparkSession with DockerIntegrationFu
       assert(msg2.contains("Missing field bad_column"))
     }
     // Update column type in not existing table
-    val msg = intercept[AnalysisException] {
+    val e = intercept[AnalysisException] {
       sql(s"ALTER TABLE $catalogName.not_existing_table ALTER COLUMN id TYPE DOUBLE")
-    }.getMessage
-    assert(msg.contains("Table not found"))
+    }
+    checkErrorTableNotFound(e, s"`$catalogName`.`not_existing_table`",
+      ExpectedContext(s"$catalogName.not_existing_table", 12,
+        11 + s"$catalogName.not_existing_table".length))
   }

test("SPARK-33034: ALTER TABLE ... rename column") {
@@ -154,21 +161,27 @@ private[v2] trait V2JDBCTest extends SharedSparkSession with DockerIntegrationFu
       assert(msg.contains("Cannot rename column, because ID2 already exists"))
     }
     // Rename a column in a not existing table
-    val msg = intercept[AnalysisException] {
+    val e = intercept[AnalysisException] {
       sql(s"ALTER TABLE $catalogName.not_existing_table RENAME COLUMN ID TO C")
-    }.getMessage
-    assert(msg.contains("Table not found"))
+    }
+    checkErrorTableNotFound(e,
+      UnresolvedAttribute.parseAttributeName(s"$catalogName.not_existing_table")
+        .map(part => quoteIdentifier(part)).mkString("."),
+      ExpectedContext(s"$catalogName.not_existing_table", 12,
+        11 + s"$catalogName.not_existing_table".length))
   }
 
   test("SPARK-33034: ALTER TABLE ... update column nullability") {
     withTable(s"$catalogName.alt_table") {
       testUpdateColumnNullability(s"$catalogName.alt_table")
     }
     // Update column nullability in not existing table
-    val msg = intercept[AnalysisException] {
+    val e = intercept[AnalysisException] {
       sql(s"ALTER TABLE $catalogName.not_existing_table ALTER COLUMN ID DROP NOT NULL")
-    }.getMessage
-    assert(msg.contains("Table not found"))
+    }
+    checkErrorTableNotFound(e, s"`$catalogName`.`not_existing_table`",
+      ExpectedContext(s"$catalogName.not_existing_table", 12,
+        11 + s"$catalogName.not_existing_table".length))
   }

test("CREATE TABLE with table comment") {
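
A note on the ExpectedContext arguments repeated in the hunks above: the two offsets pin down where the unresolved name sits in the SQL text. Because each statement begins with `ALTER TABLE ` (12 characters), the name starts at offset 12 and its last character sits at `11 + name.length`. A small sketch of that arithmetic (the catalog name is hypothetical):

```scala
val catalogName = "h2" // hypothetical catalog name, for illustration only
val sqlText = s"ALTER TABLE $catalogName.not_existing_table DROP COLUMN C1"
val name = s"$catalogName.not_existing_table"

val start = "ALTER TABLE ".length // 12: offset of the name's first character
val stop = 11 + name.length       // offset of the name's last character (inclusive)

// String.substring's end index is exclusive, hence stop + 1.
assert(sqlText.substring(start, stop + 1) == name)
```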
