[SPARK-46875][SQL] When the `mode` is null, a NullPointerException should `not` be thrown

### What changes were proposed in this pull request?
This PR provides a clearer message when the `mode` option is null.

### Why are the changes needed?
In the original logic, if `mode` is null, Spark throws a `NullPointerException`, which is unfriendly to the user.
```
val cars = spark.read
  .format("csv")
  .options(Map("header" -> "true", "mode" -> null))
  .load(testFile(carsFile))
cars.show(false)
```

Before:
```
Cannot invoke "String.toUpperCase(java.util.Locale)" because "mode" is null
java.lang.NullPointerException: Cannot invoke "String.toUpperCase(java.util.Locale)" because "mode" is null
	at org.apache.spark.sql.catalyst.util.ParseMode$.fromString(ParseMode.scala:50)
	at org.apache.spark.sql.catalyst.csv.CSVOptions.$anonfun$parseMode$1(CSVOptions.scala:105)
	at scala.Option.map(Option.scala:242)
	at org.apache.spark.sql.catalyst.csv.CSVOptions.<init>(CSVOptions.scala:105)
	at org.apache.spark.sql.catalyst.csv.CSVOptions.<init>(CSVOptions.scala:49)
	at org.apache.spark.sql.execution.datasources.csv.CSVFileFormat.inferSchema(CSVFileFormat.scala:60)
```

After:
Spark falls back to `PermissiveMode`, logs a warning, and displays the data normally, as shown below:
```
18:54:06.727 WARN org.apache.spark.sql.catalyst.util.ParseMode: mode is null and not a valid parse mode. Using PERMISSIVE.
+----+-----+-----+----------------------------------+-----+
|year|make |model|comment                           |blank|
+----+-----+-----+----------------------------------+-----+
|2012|Tesla|S    |No comment                        |NULL |
|1997|Ford |E350 |Go get one now they are going fast|NULL |
|2015|Chevy|Volt |NULL                              |NULL |
+----+-----+-----+----------------------------------+-----+
```

### Does this PR introduce _any_ user-facing change?
Yes. When `mode` is null, Spark falls back to `PermissiveMode` instead of throwing a `NullPointerException`.

### How was this patch tested?
- Add new UT.
- Pass GA.

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes apache#44900 from panbingkun/SPARK-46875.

Authored-by: panbingkun <panbingkun@baidu.com>
Signed-off-by: Max Gekk <max.gekk@gmail.com>
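
For readers who want to see the shape of the fallback described above, here is a minimal, self-contained sketch of null-safe mode parsing. It is not the code from this patch: `Mode`, `Permissive`, `DropMalformed`, `FailFast`, and `ModeParser` are hypothetical stand-ins, and `Console.err.println` stands in for Spark's logging. The idea is to wrap the possibly-null string in `Option` before calling `toUpperCase`, so a null value reaches the fallback branch instead of raising a `NullPointerException`.

```scala
import java.util.Locale

// Hypothetical stand-ins for the parse modes; not Spark's actual classes.
sealed trait Mode
case object Permissive extends Mode
case object DropMalformed extends Mode
case object FailFast extends Mode

object ModeParser {
  // Option(mode) maps a null reference to None, so toUpperCase is never
  // invoked on null.
  def fromString(mode: String): Mode =
    Option(mode).map(_.toUpperCase(Locale.ROOT)) match {
      case Some("PERMISSIVE")    => Permissive
      case Some("DROPMALFORMED") => DropMalformed
      case Some("FAILFAST")      => FailFast
      case Some(other) =>
        // Unknown value: warn and fall back (stderr stands in for a logger).
        Console.err.println(s"$other is not a valid parse mode. Using PERMISSIVE.")
        Permissive
      case None =>
        // Null value: warn and fall back instead of throwing.
        Console.err.println("mode is null and not a valid parse mode. Using PERMISSIVE.")
        Permissive
    }
}
```

Under these assumptions, `ModeParser.fromString(null)` yields `Permissive`, as does an unrecognized string, while matching stays case-insensitive for the known mode names.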