I have a set of Excel files that need to be read by Spark (2.4.0) as soon as each file is placed in a local directory. The Scala version used here is 2.11.8.
I tried the readStream method of SparkSession, but I am not able to read the files as a stream. The code is:
val spark = SparkSession.builder().master("local[*]").appName("Spark SQL Example").getOrCreate()
spark.sqlContext.setConf("spark.sql.streaming.schemaInference","true")
import spark.implicits._
val df = spark.readStream.format("com.crealytics.spark.excel").option("header", true).load("file:///filepath/*.xlsx")
df.writeStream.format("memory").queryName("tab").start().awaitTermination()
val res = spark.sql("select * from tab")
res.show()
The error log is:
22/04/01 16:14:16 INFO StateStoreCoordinatorRef: Registered StateStoreCoordinator endpoint
Exception in thread "main" java.lang.UnsupportedOperationException: Data source com.crealytics.spark.excel does not support streamed reading
at org.apache.spark.sql.execution.datasources.DataSource.sourceSchema(DataSource.scala:246)
at org.apache.spark.sql.execution.datasources.DataSource.sourceInfo$lzycompute(DataSource.scala:95)
at org.apache.spark.sql.execution.datasources.DataSource.sourceInfo(DataSource.scala:95)
at org.apache.spark.sql.execution.streaming.StreamingRelation$.apply(StreamingRelation.scala:33)
at org.apache.spark.sql.streaming.DataStreamReader.load(DataStreamReader.scala:215)
at org.apache.spark.sql.streaming.DataStreamReader.load(DataStreamReader.scala:225)
at com.chinafusiongroup.dcp.ExcelStreamApp$.main(ExcelStreamApp.scala:12)
at com.chinafusiongroup.dcp.ExcelStreamApp.main(ExcelStreamApp.scala)
22/04/01 16:14:16 INFO SparkContext: Invoking stop() from shutdown hook
Any answers would be helpful.
I fear you are out of luck here...
Streaming read probably works in v2 (I haven't tested it myself yet), but we stopped supporting Scala 2.11 quite a while ago...
If you're proficient with Scala, you could try building it yourself for Scala 2.11, but I fear many dependencies have also stopped publishing packages for 2.11...
Thanks for your answer. I hope streaming will be supported in a new version as soon as possible. After some research and verification, we are currently using hadoopoffice (spark-hadoopoffice-ds) to meet our needs.
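For anyone else stuck on Spark 2.4 / Scala 2.11, one possible workaround (not something the maintainers suggested, and not tested against this library) is to skip readStream and poll the directory instead, running an ordinary batch read for each new file. Below is a minimal sketch of the polling part only. The names ExcelDirectoryPoller, newExcelFiles and poll are hypothetical, and the batch read in the trailing comment assumes a spark-excel release that still works with your Spark and Scala versions.

```scala
import java.nio.file.{Files, Path}
import scala.collection.JavaConverters._

// Hypothetical helper: detects newly arrived .xlsx files by polling a directory.
object ExcelDirectoryPoller {

  /** Returns the .xlsx files in `dir` that are not in `seen`, sorted by path string. */
  def newExcelFiles(dir: Path, seen: Set[Path]): Seq[Path] = {
    val stream = Files.list(dir)
    try {
      stream.iterator().asScala
        .filter(p => Files.isRegularFile(p) &&
          p.getFileName.toString.toLowerCase.endsWith(".xlsx"))
        .filterNot(seen.contains)
        .toList // materialize before the directory stream is closed
        .sortBy(_.toString)
    } finally stream.close()
  }

  /** Polls `dir` for `rounds` iterations and calls `process` once per new file.
    * Returns every file that was handed to `process`. */
  def poll(dir: Path, rounds: Int, intervalMs: Long)(process: Path => Unit): Set[Path] = {
    var seen = Set.empty[Path]
    for (_ <- 1 to rounds) {
      val fresh = newExcelFiles(dir, seen)
      fresh.foreach(process)
      seen ++= fresh
      Thread.sleep(intervalMs)
    }
    seen
  }
}

// Hypothetical usage with an existing SparkSession named `spark`
// (assumes the batch reader of spark-excel is available for your versions):
//
// ExcelDirectoryPoller.poll(java.nio.file.Paths.get("/filepath"), rounds = 10, intervalMs = 5000) { p =>
//   spark.read.format("com.crealytics.spark.excel")
//     .option("header", "true")
//     .load(p.toUri.toString)
//     .show()
// }
```

Note that this sketch keeps the set of processed files in memory only, so it starts over after a restart, and it may pick up a file that is still being written. Copying files into the directory under a temporary name and renaming them when complete would be one way around the second problem.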