[SPARK-46875][SQL] When the mode is null, a NullPointerException should `not` be thrown

### What changes were proposed in this pull request?
This PR aims to handle a null `mode` option gracefully: Spark logs a warning and falls back to a default instead of failing.

### Why are the changes needed?
In the original logic, if `mode` is null, Spark throws a `NullPointerException`, which is unfriendly to the user.

```
val cars = spark.read
      .format("csv")
      .options(Map("header" -> "true", "mode" -> null))
      .load(testFile(carsFile))
cars.show(false)
```

Before:
```
Cannot invoke "String.toUpperCase(java.util.Locale)" because "mode" is null
java.lang.NullPointerException: Cannot invoke "String.toUpperCase(java.util.Locale)" because "mode" is null
	at org.apache.spark.sql.catalyst.util.ParseMode$.fromString(ParseMode.scala:50)
	at org.apache.spark.sql.catalyst.csv.CSVOptions.$anonfun$parseMode$1(CSVOptions.scala:105)
	at scala.Option.map(Option.scala:242)
	at org.apache.spark.sql.catalyst.csv.CSVOptions.<init>(CSVOptions.scala:105)
	at org.apache.spark.sql.catalyst.csv.CSVOptions.<init>(CSVOptions.scala:49)
	at org.apache.spark.sql.execution.datasources.csv.CSVFileFormat.inferSchema(CSVFileFormat.scala:60)
```
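
The failure does not depend on Spark itself. As a minimal sketch using only the Scala standard library, where `NullModeRepro` and `upper` are hypothetical names standing in for the first step of the old `ParseMode.fromString`:

```scala
import java.util.Locale
import scala.util.Try

object NullModeRepro {
  // Hypothetical stand-in for the old first step: it dereferences
  // `mode` without a null check, as the stack trace above indicates.
  def upper(mode: String): String = mode.toUpperCase(Locale.ROOT)

  // Captures the outcome instead of letting the exception propagate.
  def attempt(mode: String): Try[String] = Try(upper(mode))
}
```

Calling `NullModeRepro.attempt(null)` yields a `Failure` wrapping a `NullPointerException`, because the call on `mode` happens before any pattern matching on the mode name.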

After:
Spark falls back to `PermissiveMode` and displays the data normally, as shown below:
```
18:54:06.727 WARN org.apache.spark.sql.catalyst.util.ParseMode: mode is null and not a valid parse mode. Using PERMISSIVE.

+----+-----+-----+----------------------------------+-----+
|year|make |model|comment                           |blank|
+----+-----+-----+----------------------------------+-----+
|2012|Tesla|S    |No comment                        |NULL |
|1997|Ford |E350 |Go get one now they are going fast|NULL |
|2015|Chevy|Volt |NULL                              |NULL |
+----+-----+-----+----------------------------------+-----+
```
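
The null-safe fallback can be sketched outside Spark. This is a simplified illustration, not the actual Spark classes: `Mode`, `Permissive`, `DropMalformed`, and `FailFast` are hypothetical stand-ins, and the name strings other than `PERMISSIVE` are assumed.

```scala
import java.util.Locale

// Hypothetical, simplified mirror of the parse modes; not Spark's real types.
sealed trait Mode { def name: String }
case object Permissive extends Mode { val name = "PERMISSIVE" }
case object DropMalformed extends Mode { val name = "DROPMALFORMED" } // assumed name
case object FailFast extends Mode { val name = "FAILFAST" }           // assumed name

object Mode {
  // Option(mode) is None when mode is null, so toUpperCase never runs on null.
  def fromString(mode: String): Mode =
    Option(mode).map(_.toUpperCase(Locale.ROOT)) match {
      case Some(Permissive.name)    => Permissive
      case Some(DropMalformed.name) => DropMalformed
      case Some(FailFast.name)      => FailFast
      case _                        => Permissive // null or unrecognized: fall back
    }
}
```

This sketch folds the null and unrecognized cases into one branch and omits logging. The patch itself keeps them separate so that each case logs its own warning.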

### Does this PR introduce _any_ user-facing change?
Yes. When `mode` is null, Spark falls back to `PermissiveMode` instead of throwing a `NullPointerException`.

### How was this patch tested?
- Add new UT.
- Pass GA.

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes apache#44900 from panbingkun/SPARK-46875.

Authored-by: panbingkun <panbingkun@baidu.com>
Signed-off-by: Max Gekk <max.gekk@gmail.com>
panbingkun authored and MaxGekk committed Jan 27, 2024
1 parent ecdacf8 commit 6d29c72
Showing 2 changed files with 22 additions and 7 deletions.
@@ -47,12 +47,17 @@ object ParseMode extends Logging {
   /**
    * Returns the parse mode from the given string.
    */
-  def fromString(mode: String): ParseMode = mode.toUpperCase(Locale.ROOT) match {
-    case PermissiveMode.name => PermissiveMode
-    case DropMalformedMode.name => DropMalformedMode
-    case FailFastMode.name => FailFastMode
-    case _ =>
-      logWarning(s"$mode is not a valid parse mode. Using ${PermissiveMode.name}.")
-      PermissiveMode
+  def fromString(mode: String): ParseMode = Option(mode).map {
+    v => v.toUpperCase(Locale.ROOT) match {
+      case PermissiveMode.name => PermissiveMode
+      case DropMalformedMode.name => DropMalformedMode
+      case FailFastMode.name => FailFastMode
+      case _ =>
+        logWarning(s"$v is not a valid parse mode. Using ${PermissiveMode.name}.")
+        PermissiveMode
+    }
+  }.getOrElse {
+    logWarning(s"mode is null and not a valid parse mode. Using ${PermissiveMode.name}.")
+    PermissiveMode
   }
 }
@@ -348,6 +348,16 @@ abstract class CSVSuite
     }
   }

+  test("when mode is null, will fall back to PermissiveMode mode") {
+    val cars = spark.read
+      .format("csv")
+      .options(Map("header" -> "true", "mode" -> null))
+      .load(testFile(carsFile))
+    assert(cars.collect().length == 3)
+    assert(cars.select("make").collect() sameElements
+      Array(Row("Tesla"), Row("Ford"), Row("Chevy")))
+  }
+
   test("test for blank column names on read and select columns") {
     val cars = spark.read
       .format("csv")