Unable to Load Geojson File using Sedona Context in Databricks #1617
Comments
Thank you for your interest in Apache Sedona! We appreciate you opening your first issue. Contributions like yours help make Apache Sedona better.
Are you using a shared access cluster in Databricks? Copying something Jia said in another thread:
Hi James,
I can confirm that the function exists in open-source Spark with that signature. I wonder if there is some difference in the implementation of the Databricks JSON data source. @Kontinuation have you been able to test the GeoJSON reader in Databricks?
I've reproduced this on DBR 15.4 LTS. The
Expected behavior
I am trying to run the following code in Databricks, taken from the official Sedona documentation:
from pyspark.sql import functions as f

# Parentheses let the method chain span several lines.
df = (
    sedona.read.format("geojson").option("multiLine", "true").load("PATH/TO/MYFILE.json")
    .selectExpr("explode(features) as features")  # Explode the envelope to get one feature per row.
    .select("features.*")  # Unpack the features struct.
    .withColumn("prop0", f.expr("properties['prop0']")).drop("properties").drop("type")
)
df.show()
df.printSchema()
Ref: https://sedona.apache.org/latest-snapshot/tutorial/sql/#__tabbed_14_3
I am getting the following error:
Caused by: java.lang.NoSuchMethodError: org.apache.spark.sql.execution.datasources.json.JsonDataSource.readFile(Lorg/apache/hadoop/conf/Configuration;Lorg/apache/spark/sql/execution/datasources/PartitionedFile;Lorg/apache/spark/sql/catalyst/json/JacksonParser;Lorg/apache/spark/sql/types/StructType;)Lscala/collection/Iterator;
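For anyone reading the stack trace, the part in parentheses is a JVM method descriptor: each `L...;` entry is a parameter type, and the type after the closing parenthesis is the return type. Below is a minimal sketch that decodes it into readable class names, so the signature Sedona expects can be compared with what the runtime provides. The helper name `decode_object_descriptor` is hypothetical, and it only handles object reference types (no primitives or arrays), which is all this particular descriptor contains.

```python
import re

# Descriptor copied from the NoSuchMethodError message above.
ERROR_DESCRIPTOR = (
    "(Lorg/apache/hadoop/conf/Configuration;"
    "Lorg/apache/spark/sql/execution/datasources/PartitionedFile;"
    "Lorg/apache/spark/sql/catalyst/json/JacksonParser;"
    "Lorg/apache/spark/sql/types/StructType;)"
    "Lscala/collection/Iterator;"
)


def _to_class_name(internal_name):
    """Convert a JVM internal name (slashes) to a dotted class name."""
    return internal_name.replace("/", ".")


def decode_object_descriptor(descriptor):
    """Hypothetical helper: split a descriptor into (parameter types, return type).

    Assumption: every type is an object reference of the form L<name>;
    primitives and arrays are not supported in this sketch.
    """
    match = re.fullmatch(r"\((.*)\)(.+)", descriptor)
    if match is None:
        raise ValueError("not a JVM method descriptor: %r" % descriptor)
    params_part, return_part = match.groups()

    params = [_to_class_name(name) for name in re.findall(r"L([^;]+);", params_part)]

    return_match = re.fullmatch(r"L([^;]+);", return_part)
    if return_match is None:
        raise ValueError("unsupported return type: %r" % return_part)
    return params, _to_class_name(return_match.group(1))


params, return_type = decode_object_descriptor(ERROR_DESCRIPTOR)
print("JsonDataSource.readFile expects:")
for param in params:
    print("  ", param)
print("returns:", return_type)
```

A `NoSuchMethodError` with this descriptor means the `JsonDataSource` class loaded at runtime has no `readFile` method with exactly these four parameter types and this return type.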
Actual behavior
The GeoJSON file should be loaded into the DataFrame, but the read fails with the error above.
Steps to reproduce the problem
I have installed the following jar files
I have installed the following libraries
Settings
Sedona version = 1.6.1
Apache Spark version = 3.5.0 (also fails with Spark 3.4)
Apache Flink version = NA
API type = Python
Scala version = 2.12
JRE version = 1.8
Python version = ?
Environment = Databricks