[SPARK-40360] ALREADY_EXISTS and NOT_FOUND exceptions
### What changes were proposed in this pull request?

This PR introduces the following error classes:

- PARTITIONS_ALREADY_EXIST
  Cannot ADD or RENAME TO partition(s) <partitionList> in table <tableName> because they already exist.
  Choose a different name, drop the existing partition, or add the IF NOT EXISTS clause to tolerate a pre-existing partition

- PARTITIONS_NOT_FOUND
  The partition(s) <partitionList> cannot be found in table <tableName>.
  Verify the partition specification and table name.
  To tolerate the error on drop use ALTER TABLE … DROP IF EXISTS PARTITION.

- ROUTINE_ALREADY_EXISTS
  Cannot create the function <routineName> because it already exists.
  Choose a different name, drop or replace the existing function, or add the IF NOT EXISTS clause to tolerate a pre-existing function

- ROUTINE_NOT_FOUND
  The function <routineName> cannot be found. Verify the spelling and correctness of the schema and catalog.
  If you did not qualify the name with a schema and catalog, verify the current_schema() output, or qualify the name with the correct schema and catalog.
  To tolerate the error on drop use DROP FUNCTION IF EXISTS

- SCHEMA_ALREADY_EXISTS
  Cannot create schema <schemaName> because it already exists.
  Choose a different name, drop the existing schema, or add the IF NOT EXISTS clause to tolerate pre-existing schema

- SCHEMA_NOT_EMPTY
  Cannot drop a schema <schemaName> because it contains objects.
  Use DROP SCHEMA ... CASCADE to drop the schema and all its objects.

- SCHEMA_NOT_FOUND
  The schema <schemaName> cannot be found. Verify the spelling and correctness of the schema and catalog.
  If you did not qualify the name with a catalog, verify the current_schema() output, or qualify the name with the correct catalog.
  To tolerate the error on drop use DROP SCHEMA IF EXISTS.

- TABLE_OR_VIEW_ALREADY_EXISTS
  Cannot create table or view <relationName> because it already exists.
  Choose a different name, drop or replace the existing object, or add the IF NOT EXISTS clause to tolerate pre-existing objects

- TABLE_OR_VIEW_NOT_FOUND
  The table or view <relationName> cannot be found. Verify the spelling and correctness of the schema and catalog.
  If you did not qualify the name with a schema, verify the current_schema() output, or qualify the name with the correct schema and catalog.
  To tolerate the error on drop use DROP VIEW IF EXISTS or DROP TABLE IF EXISTS.

- TEMP_TABLE_OR_VIEW_ALREADY_EXISTS
  Cannot create the temporary view <relationName> because it already exists.
  Choose a different name, drop or replace the existing view, or add the IF NOT EXISTS clause to tolerate pre-existing views.

Also (for JDBC data sources):

- INDEX_ALREADY_EXISTS
  Cannot create the index because it already exists. <message>.

- INDEX_NOT_FOUND
  Cannot find the index. <message>.
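
To illustrate how these classes surface at runtime, here is a minimal sketch assuming a local SparkSession; the object and schema names are made up for the example:

```scala
import org.apache.spark.sql.{AnalysisException, SparkSession}

object ErrorClassDemo extends App {
  val spark = SparkSession.builder().master("local[1]").getOrCreate()

  spark.sql("CREATE SCHEMA demo_schema")
  try {
    // The second CREATE fails with a stable, machine-readable error class.
    spark.sql("CREATE SCHEMA demo_schema")
  } catch {
    case e: AnalysisException =>
      println(e.getErrorClass) // expected: SCHEMA_ALREADY_EXISTS
  }
  // As the message text suggests, IF NOT EXISTS tolerates the pre-existing schema.
  spark.sql("CREATE SCHEMA IF NOT EXISTS demo_schema")

  spark.stop()
}
```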

Some background:
* We use ROUTINE over FUNCTION to be future-proof if/when PROCEDUREs appear.
* We coarsen the granularity to TABLE_OR_VIEW_NOT_FOUND and TABLE_OR_VIEW_ALREADY_EXISTS (getting rid of dedicated reasons such as RENAME TABLE, etc.).
* We combine the PARTITION and PARTITIONS errors.
* I use SCHEMA religiously. A debate can be had over whether/how/when to return NAMESPACE.

There is currently one test failure, caused by:

https://issues.apache.org/jira/browse/SPARK-40521
Hive-based ALTER TABLE ADD PARTITION returns too many partitions in the case of PARTITIONS_ALREADY_EXIST.

### Why are the changes needed?
We want to convert all errors to use the error-class framework.

### Does this PR introduce _any_ user-facing change?

Yes, we are moving away from "free text" and consolidating errors in error-classes.json.
This hardens the QA and the code, allowing us to improve error messages without breaking changes.
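
For reference, the consolidated entries in error-classes.json take roughly this shape (a sketch assembled from the message templates above; the exact JSON layout and any extra fields such as sqlState are assumptions):

```json
"SCHEMA_ALREADY_EXISTS" : {
  "message" : [
    "Cannot create schema <schemaName> because it already exists.",
    "Choose a different name, drop the existing schema, or add the IF NOT EXISTS clause to tolerate pre-existing schema."
  ]
}
```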

### How was this patch tested?

Run existing QA suite
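
The migrated suites assert on the error class and its message parameters rather than on raw text, roughly in this shape (a sketch modeled on the test changes below; the parameter name follows the <relationName> placeholder above):

```scala
test("TABLE_OR_VIEW_NOT_FOUND carries the relation name") {
  val e = intercept[AnalysisException] {
    sql("SELECT * FROM nonexistent_table")
  }
  checkError(e,
    errorClass = "TABLE_OR_VIEW_NOT_FOUND",
    parameters = Map("relationName" -> "`nonexistent_table`"))
}
```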

Closes #37887 from srielau/SPARK-40360-Convert-some-ddl-mesages.

Authored-by: Serge Rielau <serge.rielau@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
srielau authored and cloud-fan committed Oct 18, 2022
1 parent d0ab83c commit e7fbefe
Showing 103 changed files with 1,656 additions and 641 deletions.
16 changes: 7 additions & 9 deletions R/pkg/tests/fulltests/test_sparkSQL.R
@@ -722,7 +722,7 @@ test_that("test tableExists, cache, uncache and clearCache", {
   clearCache()
 
   expect_error(uncacheTable("zxwtyswklpf"),
-               "Error in uncacheTable : analysis error - Table or view not found: zxwtyswklpf")
+               "[TABLE_OR_VIEW_NOT_FOUND]*`zxwtyswklpf`*")
 
   expect_true(tableExists("table1"))
   expect_true(tableExists("default.table1"))
@@ -3367,8 +3367,8 @@ test_that("approxQuantile() on a DataFrame", {
 
 test_that("SQL error message is returned from JVM", {
   retError <- tryCatch(sql("select * from blah"), error = function(e) e)
-  expect_equal(grepl("Table or view not found", retError), TRUE)
-  expect_equal(grepl("blah", retError), TRUE)
+  expect_equal(grepl("[TABLE_OR_VIEW_NOT_FOUND]", retError), TRUE)
+  expect_equal(grepl("`blah`", retError), TRUE)
 })
 
 irisDF <- suppressWarnings(createDataFrame(iris))
@@ -4077,8 +4077,7 @@ test_that("catalog APIs, currentDatabase, setCurrentDatabase, listDatabases, get
   expect_equal(currentDatabase(), "default")
   expect_error(setCurrentDatabase("default"), NA)
   expect_error(setCurrentDatabase("zxwtyswklpf"),
-               paste0("Error in setCurrentDatabase : no such database - Database ",
-                      "'zxwtyswklpf' not found"))
+               "[SCHEMA_NOT_FOUND]*`zxwtyswklpf`*")
 
   expect_true(databaseExists("default"))
   expect_true(databaseExists("spark_catalog.default"))
@@ -4110,15 +4109,15 @@ test_that("catalog APIs, listTables, getTable, listColumns, listFunctions, funct
   tbs <- collect(tb)
   expect_true(nrow(tbs[tbs$name == "cars", ]) > 0)
   expect_error(listTables("bar"),
-               "Error in listTables : no such database - Database 'bar' not found")
+               "[SCHEMA_NOT_FOUND]*`bar`*")
 
   c <- listColumns("cars")
   expect_equal(nrow(c), 2)
   expect_equal(colnames(c),
                c("name", "description", "dataType", "nullable", "isPartition", "isBucket"))
   expect_equal(collect(c)[[1]][[1]], "speed")
   expect_error(listColumns("zxwtyswklpf", "default"),
-               paste("Table or view not found: spark_catalog.default.zxwtyswklpf"))
+               "[TABLE_OR_VIEW_NOT_FOUND]*`spark_catalog`.`default`.`zxwtyswklpf`*")
 
   f <- listFunctions()
   expect_true(nrow(f) >= 200) # 250
@@ -4127,8 +4126,7 @@ test_that("catalog APIs, listTables, getTable, listColumns, listFunctions, funct
   expect_equal(take(orderBy(filter(f, "className IS NOT NULL"), "className"), 1)$className,
                "org.apache.spark.sql.catalyst.expressions.Abs")
   expect_error(listFunctions("zxwtyswklpf_db"),
-               paste("Error in listFunctions : no such database - Database",
-                     "'zxwtyswklpf_db' not found"))
+               "[SCHEMA_NOT_FOUND]*`zxwtyswklpf_db`*")
 
   expect_true(functionExists("abs"))
   expect_false(functionExists("aabbss"))
@@ -87,10 +87,12 @@ private[v2] trait V2JDBCNamespaceTest extends SharedSparkSession with DockerInte
     }
     assert(catalog.namespaceExists(Array("foo")) === false)
     assert(catalog.listNamespaces() === builtinNamespaces)
-    val msg = intercept[AnalysisException] {
+    val e = intercept[AnalysisException] {
       catalog.listNamespaces(Array("foo"))
-    }.getMessage
-    assert(msg.contains("Namespace 'foo' not found"))
+    }
+    checkError(e,
+      errorClass = "SCHEMA_NOT_FOUND",
+      parameters = Map("schemaName" -> "`foo`"))
   }
 }

@@ -20,8 +20,9 @@ package org.apache.spark.sql.jdbc.v2
 import org.apache.logging.log4j.Level
 
 import org.apache.spark.sql.{AnalysisException, DataFrame}
-import org.apache.spark.sql.catalyst.analysis.{IndexAlreadyExistsException, NoSuchIndexException}
+import org.apache.spark.sql.catalyst.analysis.{IndexAlreadyExistsException, NoSuchIndexException, UnresolvedAttribute}
 import org.apache.spark.sql.catalyst.plans.logical.{Aggregate, Filter, Sample}
+import org.apache.spark.sql.catalyst.util.quoteIdentifier
 import org.apache.spark.sql.connector.catalog.{Catalogs, Identifier, TableCatalog}
 import org.apache.spark.sql.connector.catalog.index.SupportsIndex
 import org.apache.spark.sql.connector.expressions.aggregate.GeneralAggregateFunc
@@ -99,10 +100,12 @@ private[v2] trait V2JDBCTest extends SharedSparkSession with DockerIntegrationFu
       assert(msg.contains("Cannot add column, because C3 already exists"))
     }
     // Add a column to not existing table
-    val msg = intercept[AnalysisException] {
+    val e = intercept[AnalysisException] {
       sql(s"ALTER TABLE $catalogName.not_existing_table ADD COLUMNS (C4 STRING)")
-    }.getMessage
-    assert(msg.contains("Table not found"))
+    }
+    checkErrorTableNotFound(e, s"`$catalogName`.`not_existing_table`",
+      ExpectedContext(s"$catalogName.not_existing_table", 12,
+        11 + s"$catalogName.not_existing_table".length))
   }

test("SPARK-33034: ALTER TABLE ... drop column") {
@@ -120,10 +123,12 @@ private[v2] trait V2JDBCTest extends SharedSparkSession with DockerIntegrationFu
       assert(msg.contains(s"Missing field bad_column in table $catalogName.alt_table"))
     }
     // Drop a column from a not existing table
-    val msg = intercept[AnalysisException] {
+    val e = intercept[AnalysisException] {
       sql(s"ALTER TABLE $catalogName.not_existing_table DROP COLUMN C1")
-    }.getMessage
-    assert(msg.contains("Table not found"))
+    }
+    checkErrorTableNotFound(e, s"`$catalogName`.`not_existing_table`",
+      ExpectedContext(s"$catalogName.not_existing_table", 12,
+        11 + s"$catalogName.not_existing_table".length))
   }

test("SPARK-33034: ALTER TABLE ... update column type") {
@@ -136,10 +141,12 @@ private[v2] trait V2JDBCTest extends SharedSparkSession with DockerIntegrationFu
       assert(msg2.contains("Missing field bad_column"))
     }
     // Update column type in not existing table
-    val msg = intercept[AnalysisException] {
+    val e = intercept[AnalysisException] {
       sql(s"ALTER TABLE $catalogName.not_existing_table ALTER COLUMN id TYPE DOUBLE")
-    }.getMessage
-    assert(msg.contains("Table not found"))
+    }
+    checkErrorTableNotFound(e, s"`$catalogName`.`not_existing_table`",
+      ExpectedContext(s"$catalogName.not_existing_table", 12,
+        11 + s"$catalogName.not_existing_table".length))
   }

test("SPARK-33034: ALTER TABLE ... rename column") {
@@ -154,21 +161,27 @@ private[v2] trait V2JDBCTest extends SharedSparkSession with DockerIntegrationFu
       assert(msg.contains("Cannot rename column, because ID2 already exists"))
     }
     // Rename a column in a not existing table
-    val msg = intercept[AnalysisException] {
+    val e = intercept[AnalysisException] {
       sql(s"ALTER TABLE $catalogName.not_existing_table RENAME COLUMN ID TO C")
-    }.getMessage
-    assert(msg.contains("Table not found"))
+    }
+    checkErrorTableNotFound(e,
+      UnresolvedAttribute.parseAttributeName(s"$catalogName.not_existing_table")
+        .map(part => quoteIdentifier(part)).mkString("."),
+      ExpectedContext(s"$catalogName.not_existing_table", 12,
+        11 + s"$catalogName.not_existing_table".length))
   }
 
   test("SPARK-33034: ALTER TABLE ... update column nullability") {
     withTable(s"$catalogName.alt_table") {
       testUpdateColumnNullability(s"$catalogName.alt_table")
     }
     // Update column nullability in not existing table
-    val msg = intercept[AnalysisException] {
+    val e = intercept[AnalysisException] {
       sql(s"ALTER TABLE $catalogName.not_existing_table ALTER COLUMN ID DROP NOT NULL")
-    }.getMessage
-    assert(msg.contains("Table not found"))
+    }
+    checkErrorTableNotFound(e, s"`$catalogName`.`not_existing_table`",
+      ExpectedContext(s"$catalogName.not_existing_table", 12,
+        11 + s"$catalogName.not_existing_table".length))
   }

test("CREATE TABLE with table comment") {
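
A note on the ExpectedContext arguments repeated in the hunks above: the two offsets pin down where the unresolved name sits in the SQL text. Because each statement begins with `ALTER TABLE ` (12 characters), the name starts at offset 12 and its last character sits at `11 + name.length`. A small sketch of that arithmetic (the catalog name is hypothetical):

```scala
val catalogName = "h2" // hypothetical catalog name, for illustration only
val sqlText = s"ALTER TABLE $catalogName.not_existing_table DROP COLUMN C1"
val name = s"$catalogName.not_existing_table"

val start = "ALTER TABLE ".length // 12: offset of the name's first character
val stop = 11 + name.length       // offset of the name's last character (inclusive)

// String.substring's end index is exclusive, hence stop + 1.
assert(sqlText.substring(start, stop + 1) == name)
```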
