I have searched the existing issues and found no similar ones.
Describe the bug
I'm trying to submit a PySpark batch job using the REST endpoint. I can see that the Spark job completes, but Kyuubi reports the batch as stuck in the PENDING state.
Setup:
Spark running on Docker with one master and one worker, using the bitnami/spark image
Kyuubi running in its own Docker container, built with the provided script, with bitnami/spark as the base image
Sample pyspark code:
from pyspark.sql import SparkSession

spark = SparkSession.builder \
    .appName("Sample PySpark Code") \
    .getOrCreate()

data = [
    (1, "Alice", 28),
    (2, "Bob", 24),
    (3, "Cathy", 29),
    (4, "David", 30),
    (5, "Eve", 22)
]
columns = ["id", "name", "age"]
df = spark.createDataFrame(data, columns)

# Show the schema and some rows of the DataFrame
df.printSchema()
df.show()

df_selected = df.select("name", "age")

# Filter rows based on a condition
df_filtered = df_selected.filter(df_selected["age"] > 25)

# Show the transformed DataFrame
df_filtered.show()

spark.stop()
curl --location --request POST 'http://localhost:10099/api/v1/batches' \
--form 'batchRequest="{\"batchType\":\"pyspark\",\"name\":\"PySpark Pi App\", \"conf\": {\"spark.master\":\"spark://spark-master:7077\"}}";type=application/json' \
--form 'resourceFile=@"/code/kyuubi/pyspark_sample/main.py"'
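For reference, the same multipart submission can be sketched in Python instead of curl. This is a minimal sketch mirroring the curl command above; the host/port, app name, and file path are taken from this report, and the `requests` usage shown in the comment is an assumption about how one would post it, not Kyuubi-specific client code.

```python
import json

def build_batch_request(master_url, name, batch_type="pyspark"):
    """Build the JSON metadata part ("batchRequest") of the multipart POST,
    matching the payload in the curl command above."""
    return json.dumps({
        "batchType": batch_type,
        "name": name,
        "conf": {"spark.master": master_url},
    })

# Posting would look roughly like this (requires the third-party
# `requests` package; endpoint and paths as in the curl command):
#
# import requests
# resp = requests.post(
#     "http://localhost:10099/api/v1/batches",
#     files={
#         "batchRequest": (None,
#             build_batch_request("spark://spark-master:7077", "PySpark Pi App"),
#             "application/json"),
#         "resourceFile": open("/code/kyuubi/pyspark_sample/main.py", "rb"),
#     },
# )
```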
Spark job log
Spark Executor Command: "/opt/bitnami/java/bin/java" "-cp" "/opt/bitnami/spark/conf/:/opt/bitnami/spark/jars/*" "-Xmx1024M" "-Dspark.driver.port=44035" "-Djava.net.preferIPv6Addresses=false" "-XX:+IgnoreUnrecognizedVMOptions" "--add-opens=java.base/java.lang=ALL-UNNAMED" "--add-opens=java.base/java.lang.invoke=ALL-UNNAMED" "--add-opens=java.base/java.lang.reflect=ALL-UNNAMED" "--add-opens=java.base/java.io=ALL-UNNAMED" "--add-opens=java.base/java.net=ALL-UNNAMED" "--add-opens=java.base/java.nio=ALL-UNNAMED" "--add-opens=java.base/java.util=ALL-UNNAMED" "--add-opens=java.base/java.util.concurrent=ALL-UNNAMED" "--add-opens=java.base/java.util.concurrent.atomic=ALL-UNNAMED" "--add-opens=java.base/jdk.internal.ref=ALL-UNNAMED" "--add-opens=java.base/sun.nio.ch=ALL-UNNAMED" "--add-opens=java.base/sun.nio.cs=ALL-UNNAMED" "--add-opens=java.base/sun.security.action=ALL-UNNAMED" "--add-opens=java.base/sun.util.calendar=ALL-UNNAMED" "--add-opens=java.security.jgss/sun.security.krb5=ALL-UNNAMED" "-Djdk.reflect.useDirectMethodHandle=false" "org.apache.spark.executor.CoarseGrainedExecutorBackend" "--driver-url" "spark://CoarseGrainedScheduler@7e5c53b23aa7:44035" "--executor-id" "0" "--hostname" "172.20.0.3" "--cores" "1" "--app-id" "app-20240716160002-0003" "--worker-url" "spark://Worker@172.20.0.3:44167" "--resourceProfileId" "0"
========================================
Picked up _JAVA_OPTIONS: --add-opens=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.lang.invoke=ALL-UNNAMED --add-opens=java.base/java.lang.reflect=ALL-UNNAMED --add-opens=java.base/java.io=ALL-UNNAMED --add-opens=java.base/java.net=ALL-UNNAMED --add-opens=java.base/java.nio=ALL-UNNAMED --add-opens=java.base/java.util=ALL-UNNAMED --add-opens=java.base/java.util.concurrent=ALL-UNNAMED --add-opens=java.base/java.util.concurrent.atomic=ALL-UNNAMED --add-opens=java.base/sun.nio.ch=ALL-UNNAMED --add-opens=java.base/sun.nio.cs=ALL-UNNAMED --add-opens=java.base/sun.security.action=ALL-UNNAMED --add-opens=java.base/sun.util.calendar=ALL-UNNAMED
Using Spark's default log4j profile: org/apache/spark/log4j2-defaults.properties
24/07/16 16:00:03 INFO CoarseGrainedExecutorBackend: Started daemon with process name: 394@73aecc79e3f8
24/07/16 16:00:03 INFO SignalUtils: Registering signal handler for TERM
24/07/16 16:00:03 INFO SignalUtils: Registering signal handler for HUP
24/07/16 16:00:03 INFO SignalUtils: Registering signal handler for INT
24/07/16 16:00:03 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
24/07/16 16:00:03 INFO SecurityManager: Changing view acls to: spark
24/07/16 16:00:03 INFO SecurityManager: Changing modify acls to: spark
24/07/16 16:00:03 INFO SecurityManager: Changing view acls groups to:
24/07/16 16:00:03 INFO SecurityManager: Changing modify acls groups to:
24/07/16 16:00:03 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: spark; groups with view permissions: EMPTY; users with modify permissions: spark; groups with modify permissions: EMPTY
24/07/16 16:00:03 INFO TransportClientFactory: Successfully created connection to 7e5c53b23aa7/172.20.0.4:44035 after 35 ms (0 ms spent in bootstraps)
24/07/16 16:00:03 INFO SecurityManager: Changing view acls to: spark
24/07/16 16:00:03 INFO SecurityManager: Changing modify acls to: spark
24/07/16 16:00:03 INFO SecurityManager: Changing view acls groups to:
24/07/16 16:00:03 INFO SecurityManager: Changing modify acls groups to:
24/07/16 16:00:03 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: spark; groups with view permissions: EMPTY; users with modify permissions: spark; groups with modify permissions: EMPTY
24/07/16 16:00:03 INFO TransportClientFactory: Successfully created connection to 7e5c53b23aa7/172.20.0.4:44035 after 6 ms (0 ms spent in bootstraps)
24/07/16 16:00:04 INFO DiskBlockManager: Created local directory at /tmp/spark-e30bc5b9-d9f6-4658-be36-98af0238de03/executor-f988152e-b542-4f1d-a816-fc8b29e72ed3/blockmgr-9aad88fe-77ab-487e-a24b-070e0b84e696
24/07/16 16:00:04 INFO MemoryStore: MemoryStore started with capacity 434.4 MiB
24/07/16 16:00:04 INFO WorkerWatcher: Connecting to worker spark://Worker@172.20.0.3:44167
24/07/16 16:00:04 INFO CoarseGrainedExecutorBackend: Connecting to driver: spark://CoarseGrainedScheduler@7e5c53b23aa7:44035
24/07/16 16:00:04 INFO TransportClientFactory: Successfully created connection to /172.20.0.3:44167 after 1 ms (0 ms spent in bootstraps)
24/07/16 16:00:04 INFO ResourceUtils: ==============================================================
24/07/16 16:00:04 INFO ResourceUtils: No custom resources configured for spark.executor.
24/07/16 16:00:04 INFO ResourceUtils: ==============================================================
24/07/16 16:00:04 INFO WorkerWatcher: Successfully connected to spark://Worker@172.20.0.3:44167
24/07/16 16:00:04 INFO CoarseGrainedExecutorBackend: Successfully registered with driver
24/07/16 16:00:04 INFO Executor: Starting executor ID 0 on host 172.20.0.3
24/07/16 16:00:04 INFO Executor: OS info Linux, 5.10.104-linuxkit, aarch64
24/07/16 16:00:04 INFO Executor: Java version 17.0.11
24/07/16 16:00:04 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 38653.
24/07/16 16:00:04 INFO NettyBlockTransferService: Server created on 172.20.0.3:38653
24/07/16 16:00:04 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
24/07/16 16:00:04 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(0, 172.20.0.3, 38653, None)
24/07/16 16:00:04 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(0, 172.20.0.3, 38653, None)
24/07/16 16:00:04 INFO BlockManager: Initialized BlockManager: BlockManagerId(0, 172.20.0.3, 38653, None)
24/07/16 16:00:04 INFO Executor: Starting executor with user classpath (userClassPathFirst = false): ''
24/07/16 16:00:04 INFO Executor: Created or updated repl class loader org.apache.spark.util.MutableURLClassLoader@153e2660 for default.
24/07/16 16:00:04 INFO CoarseGrainedExecutorBackend: Got assigned task 0
24/07/16 16:00:04 INFO Executor: Running task 0.0 in stage 0.0 (TID 0)
24/07/16 16:00:04 INFO TorrentBroadcast: Started reading broadcast variable 0 with 1 pieces (estimated total size 4.0 MiB)
24/07/16 16:00:04 INFO TransportClientFactory: Successfully created connection to 7e5c53b23aa7/172.20.0.4:34119 after 1 ms (0 ms spent in bootstraps)
24/07/16 16:00:04 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 6.7 KiB, free 434.4 MiB)
24/07/16 16:00:04 INFO TorrentBroadcast: Reading broadcast variable 0 took 81 ms
24/07/16 16:00:04 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 13.0 KiB, free 434.4 MiB)
24/07/16 16:00:05 INFO CodeGenerator: Code generated in 108.676333 ms
24/07/16 16:00:05 INFO PythonRunner: Times: total = 411, boot = 323, init = 87, finish = 1
24/07/16 16:00:05 INFO Executor: Finished task 0.0 in stage 0.0 (TID 0). 2004 bytes result sent to driver
24/07/16 16:00:05 INFO CoarseGrainedExecutorBackend: Got assigned task 1
24/07/16 16:00:05 INFO Executor: Running task 0.0 in stage 1.0 (TID 1)
24/07/16 16:00:05 INFO TorrentBroadcast: Started reading broadcast variable 1 with 1 pieces (estimated total size 4.0 MiB)
24/07/16 16:00:05 INFO MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 6.7 KiB, free 434.4 MiB)
24/07/16 16:00:05 INFO TorrentBroadcast: Reading broadcast variable 1 took 7 ms
24/07/16 16:00:05 INFO MemoryStore: Block broadcast_1 stored as values in memory (estimated size 13.0 KiB, free 434.4 MiB)
24/07/16 16:00:05 INFO PythonRunner: Times: total = 50, boot = -163, init = 213, finish = 0
24/07/16 16:00:05 INFO Executor: Finished task 0.0 in stage 1.0 (TID 1). 2031 bytes result sent to driver
24/07/16 16:00:06 INFO CoarseGrainedExecutorBackend: Got assigned task 2
24/07/16 16:00:06 INFO Executor: Running task 0.0 in stage 2.0 (TID 2)
24/07/16 16:00:06 INFO TorrentBroadcast: Started reading broadcast variable 2 with 1 pieces (estimated total size 4.0 MiB)
24/07/16 16:00:06 INFO MemoryStore: Block broadcast_2_piece0 stored as bytes in memory (estimated size 6.8 KiB, free 434.4 MiB)
24/07/16 16:00:06 INFO TorrentBroadcast: Reading broadcast variable 2 took 6 ms
24/07/16 16:00:06 INFO MemoryStore: Block broadcast_2 stored as values in memory (estimated size 13.1 KiB, free 434.4 MiB)
24/07/16 16:00:06 INFO CodeGenerator: Code generated in 9.61425 ms
24/07/16 16:00:06 INFO PythonRunner: Times: total = 47, boot = -618, init = 665, finish = 0
24/07/16 16:00:06 INFO Executor: Finished task 0.0 in stage 2.0 (TID 2). 2027 bytes result sent to driver
24/07/16 16:00:06 INFO CoarseGrainedExecutorBackend: Got assigned task 3
24/07/16 16:00:06 INFO Executor: Running task 0.0 in stage 3.0 (TID 3)
24/07/16 16:00:06 INFO TorrentBroadcast: Started reading broadcast variable 3 with 1 pieces (estimated total size 4.0 MiB)
24/07/16 16:00:06 INFO MemoryStore: Block broadcast_3_piece0 stored as bytes in memory (estimated size 6.8 KiB, free 434.4 MiB)
24/07/16 16:00:06 INFO TorrentBroadcast: Reading broadcast variable 3 took 6 ms
24/07/16 16:00:06 INFO MemoryStore: Block broadcast_3 stored as values in memory (estimated size 13.1 KiB, free 434.4 MiB)
24/07/16 16:00:06 INFO PythonRunner: Times: total = 46, boot = -32, init = 78, finish = 0
24/07/16 16:00:06 INFO Executor: Finished task 0.0 in stage 3.0 (TID 3). 1959 bytes result sent to driver
24/07/16 16:00:06 INFO CoarseGrainedExecutorBackend: Driver commanded a shutdown
24/07/16 16:00:06 ERROR CoarseGrainedExecutorBackend: RECEIVED SIGNAL TERM
Conclusion
From the logs it is clear that the job executes as expected, but Kyuubi cannot determine the job status and reports it as PENDING forever.
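To show how the stuck state is observed: the batch state can be polled via Kyuubi's REST API at `GET /api/v1/batches/<batchId>`, where `batchId` comes from the submission response. A minimal sketch, assuming the host/port from this setup; the placeholder batch ID is hypothetical.

```python
def batch_status_url(base, batch_id):
    """Build the URL used to poll a single batch's state via the REST API."""
    return f"{base}/api/v1/batches/{batch_id}"

# Example (batch ID is whatever the POST /api/v1/batches response returned):
#
# import json, urllib.request
# url = batch_status_url("http://localhost:10099", "<batchId>")
# with urllib.request.urlopen(url) as r:
#     print(json.load(r).get("state"))  # remains PENDING in this report
```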
Affects Version(s)
1.9.1
Kyuubi Server Log Output
No logs available
Kyuubi Engine Log Output
24/07/16 16:00:01 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
24/07/16 16:00:02 INFO SparkContext: Running Spark version 3.5.1
24/07/16 16:00:02 INFO SparkContext: OS info Linux, 5.10.104-linuxkit, aarch64
24/07/16 16:00:02 INFO SparkContext: Java version 17.0.11
24/07/16 16:00:02 INFO ResourceUtils: ==============================================================
24/07/16 16:00:02 INFO ResourceUtils: No custom resources configured for spark.driver.
24/07/16 16:00:02 INFO ResourceUtils: ==============================================================
24/07/16 16:00:02 INFO SparkContext: Submitted application: Sample PySpark Code
24/07/16 16:00:02 INFO ResourceProfile: Default ResourceProfile created, executor resources: Map(memory -> name: memory, amount: 1024, script: , vendor: , offHeap -> name: offHeap, amount: 0, script: , vendor: ), task resources: Map(cpus -> name: cpus, amount: 1.0)
24/07/16 16:00:02 INFO ResourceProfile: Limiting resource is cpu
24/07/16 16:00:02 INFO ResourceProfileManager: Added ResourceProfile id: 0
24/07/16 16:00:02 INFO SecurityManager: Changing view acls to: spark
24/07/16 16:00:02 INFO SecurityManager: Changing modify acls to: spark
24/07/16 16:00:02 INFO SecurityManager: Changing view acls groups to:
24/07/16 16:00:02 INFO SecurityManager: Changing modify acls groups to:
24/07/16 16:00:02 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: spark; groups with view permissions: EMPTY; users with modify permissions: spark; groups with modify permissions: EMPTY
24/07/16 16:00:02 INFO Utils: Successfully started service 'sparkDriver' on port 44035.
24/07/16 16:00:02 INFO SparkEnv: Registering MapOutputTracker
24/07/16 16:00:02 INFO SparkEnv: Registering BlockManagerMaster
24/07/16 16:00:02 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
24/07/16 16:00:02 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
24/07/16 16:00:02 INFO SparkEnv: Registering BlockManagerMasterHeartbeat
24/07/16 16:00:02 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-d6e03fc3-f910-49de-b842-2fdce0701bd8
24/07/16 16:00:02 INFO MemoryStore: MemoryStore started with capacity 434.4 MiB
24/07/16 16:00:02 INFO SparkEnv: Registering OutputCommitCoordinator
24/07/16 16:00:02 INFO JettyUtils: Start Jetty 0.0.0.0:4040 for SparkUI
24/07/16 16:00:02 INFO Utils: Successfully started service 'SparkUI' on port 4040.
24/07/16 16:00:02 INFO StandaloneAppClient$ClientEndpoint: Connecting to master spark://spark-master:7077...
24/07/16 16:00:02 INFO TransportClientFactory: Successfully created connection to spark-master/172.20.0.2:7077 after 15 ms (0 ms spent in bootstraps)
24/07/16 16:00:02 INFO StandaloneSchedulerBackend: Connected to Spark cluster with app ID app-20240716160002-0003
24/07/16 16:00:02 INFO StandaloneAppClient$ClientEndpoint: Executor added: app-20240716160002-0003/0 on worker-20240716154203-172.20.0.3-44167 (172.20.0.3:44167) with 1 core(s)
24/07/16 16:00:02 INFO StandaloneSchedulerBackend: Granted executor ID app-20240716160002-0003/0 on hostPort 172.20.0.3:44167 with 1 core(s), 1024.0 MiB RAM
24/07/16 16:00:02 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 34119.
24/07/16 16:00:02 INFO NettyBlockTransferService: Server created on 7e5c53b23aa7:34119
24/07/16 16:00:02 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
24/07/16 16:00:02 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 7e5c53b23aa7, 34119, None)
24/07/16 16:00:02 INFO BlockManagerMasterEndpoint: Registering block manager 7e5c53b23aa7:34119 with 434.4 MiB RAM, BlockManagerId(driver, 7e5c53b23aa7, 34119, None)
24/07/16 16:00:02 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 7e5c53b23aa7, 34119, None)
24/07/16 16:00:02 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, 7e5c53b23aa7, 34119, None)
24/07/16 16:00:02 INFO StandaloneAppClient$ClientEndpoint: Executor updated: app-20240716160002-0003/0 is now RUNNING
24/07/16 16:00:02 INFO StandaloneSchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.0
24/07/16 16:00:02 INFO SharedState: Setting hive.metastore.warehouse.dir ('null') to the value of spark.sql.warehouse.dir.
24/07/16 16:00:02 INFO SharedState: Warehouse path is 'file:/opt/kyuubi/work/anonymous/spark-warehouse'.
root
 |-- id: long (nullable = true)
 |-- name: string (nullable = true)
 |-- age: long (nullable = true)
24/07/16 16:00:04 INFO StandaloneSchedulerBackend$StandaloneDriverEndpoint: Registered executor NettyRpcEndpointRef(spark-client://Executor) (172.20.0.3:48738) with ID 0, ResourceProfileId 0
24/07/16 16:00:04 INFO BlockManagerMasterEndpoint: Registering block manager 172.20.0.3:38653 with 434.4 MiB RAM, BlockManagerId(0, 172.20.0.3, 38653, None)
24/07/16 16:00:04 INFO CodeGenerator: Code generated in 96.000666 ms
24/07/16 16:00:04 INFO SparkContext: Starting job: showString at NativeMethodAccessorImpl.java:0
24/07/16 16:00:04 INFO DAGScheduler: Got job 0 (showString at NativeMethodAccessorImpl.java:0) with 1 output partitions
24/07/16 16:00:04 INFO DAGScheduler: Final stage: ResultStage 0 (showString at NativeMethodAccessorImpl.java:0)
24/07/16 16:00:04 INFO DAGScheduler: Parents of final stage: List()
24/07/16 16:00:04 INFO DAGScheduler: Missing parents: List()
24/07/16 16:00:04 INFO DAGScheduler: Submitting ResultStage 0 (MapPartitionsRDD[6] at showString at NativeMethodAccessorImpl.java:0), which has no missing parents
24/07/16 16:00:04 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 13.0 KiB, free 434.4 MiB)
24/07/16 16:00:04 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 6.7 KiB, free 434.4 MiB)
24/07/16 16:00:04 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 7e5c53b23aa7:34119 (size: 6.7 KiB, free: 434.4 MiB)
24/07/16 16:00:04 INFO SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:1585
24/07/16 16:00:04 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 0 (MapPartitionsRDD[6] at showString at NativeMethodAccessorImpl.java:0) (first 15 tasks are for partitions Vector(0))
24/07/16 16:00:04 INFO TaskSchedulerImpl: Adding task set 0.0 with 1 tasks resource profile 0
24/07/16 16:00:04 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0) (172.20.0.3, executor 0, partition 0, PROCESS_LOCAL, 7651 bytes)
24/07/16 16:00:04 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 172.20.0.3:38653 (size: 6.7 KiB, free: 434.4 MiB)
24/07/16 16:00:05 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 854 ms on 172.20.0.3 (executor 0) (1/1)
24/07/16 16:00:05 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
24/07/16 16:00:05 INFO PythonAccumulatorV2: Connected to AccumulatorServer at host: 127.0.0.1 port: 52719
24/07/16 16:00:05 INFO DAGScheduler: ResultStage 0 (showString at NativeMethodAccessorImpl.java:0) finished in 0.933 s
24/07/16 16:00:05 INFO DAGScheduler: Job 0 is finished. Cancelling potential speculative or zombie tasks for this job
24/07/16 16:00:05 INFO TaskSchedulerImpl: Killing all running tasks in stage 0: Stage finished
24/07/16 16:00:05 INFO DAGScheduler: Job 0 finished: showString at NativeMethodAccessorImpl.java:0, took 0.949888 s
24/07/16 16:00:05 INFO SparkContext: Starting job: showString at NativeMethodAccessorImpl.java:0
24/07/16 16:00:05 INFO DAGScheduler: Got job 1 (showString at NativeMethodAccessorImpl.java:0) with 1 output partitions
24/07/16 16:00:05 INFO DAGScheduler: Final stage: ResultStage 1 (showString at NativeMethodAccessorImpl.java:0)
24/07/16 16:00:05 INFO DAGScheduler: Parents of final stage: List()
24/07/16 16:00:05 INFO DAGScheduler: Missing parents: List()
24/07/16 16:00:05 INFO DAGScheduler: Submitting ResultStage 1 (MapPartitionsRDD[6] at showString at NativeMethodAccessorImpl.java:0), which has no missing parents
24/07/16 16:00:05 INFO MemoryStore: Block broadcast_1 stored as values in memory (estimated size 13.0 KiB, free 434.4 MiB)
24/07/16 16:00:05 INFO MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 6.7 KiB, free 434.4 MiB)
24/07/16 16:00:05 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on 7e5c53b23aa7:34119 (size: 6.7 KiB, free: 434.4 MiB)
24/07/16 16:00:05 INFO SparkContext: Created broadcast 1 from broadcast at DAGScheduler.scala:1585
24/07/16 16:00:05 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 1 (MapPartitionsRDD[6] at showString at NativeMethodAccessorImpl.java:0) (first 15 tasks are for partitions Vector(1))
24/07/16 16:00:05 INFO TaskSchedulerImpl: Adding task set 1.0 with 1 tasks resource profile 0
24/07/16 16:00:05 INFO TaskSetManager: Starting task 0.0 in stage 1.0 (TID 1) (172.20.0.3, executor 0, partition 1, PROCESS_LOCAL, 7690 bytes)
24/07/16 16:00:05 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on 172.20.0.3:38653 (size: 6.7 KiB, free: 434.4 MiB)
24/07/16 16:00:05 INFO TaskSetManager: Finished task 0.0 in stage 1.0 (TID 1) in 82 ms on 172.20.0.3 (executor 0) (1/1)
24/07/16 16:00:05 INFO TaskSchedulerImpl: Removed TaskSet 1.0, whose tasks have all completed, from pool
24/07/16 16:00:05 INFO DAGScheduler: ResultStage 1 (showString at NativeMethodAccessorImpl.java:0) finished in 0.090 s
24/07/16 16:00:05 INFO DAGScheduler: Job 1 is finished. Cancelling potential speculative or zombie tasks for this job
24/07/16 16:00:05 INFO TaskSchedulerImpl: Killing all running tasks in stage 1: Stage finished
24/07/16 16:00:05 INFO DAGScheduler: Job 1 finished: showString at NativeMethodAccessorImpl.java:0, took 0.092564 s
24/07/16 16:00:05 INFO BlockManagerInfo: Removed broadcast_0_piece0 on 7e5c53b23aa7:34119 in memory (size: 6.7 KiB, free: 434.4 MiB)
24/07/16 16:00:05 INFO BlockManagerInfo: Removed broadcast_0_piece0 on 172.20.0.3:38653 in memory (size: 6.7 KiB, free: 434.4 MiB)
24/07/16 16:00:06 INFO BlockManagerInfo: Removed broadcast_1_piece0 on 7e5c53b23aa7:34119 in memory (size: 6.7 KiB, free: 434.4 MiB)
24/07/16 16:00:06 INFO BlockManagerInfo: Removed broadcast_1_piece0 on 172.20.0.3:38653 in memory (size: 6.7 KiB, free: 434.4 MiB)
24/07/16 16:00:06 INFO CodeGenerator: Code generated in 22.422917 ms
+---+-----+---+
| id| name|age|
+---+-----+---+
| 1|Alice| 28|
| 2| Bob| 24|
| 3|Cathy| 29|
| 4|David| 30|
| 5| Eve| 22|
+---+-----+---+
24/07/16 16:00:06 INFO CodeGenerator: Code generated in 7.778417 ms
24/07/16 16:00:06 INFO SparkContext: Starting job: showString at NativeMethodAccessorImpl.java:0
24/07/16 16:00:06 INFO DAGScheduler: Got job 2 (showString at NativeMethodAccessorImpl.java:0) with 1 output partitions
24/07/16 16:00:06 INFO DAGScheduler: Final stage: ResultStage 2 (showString at NativeMethodAccessorImpl.java:0)
24/07/16 16:00:06 INFO DAGScheduler: Parents of final stage: List()
24/07/16 16:00:06 INFO DAGScheduler: Missing parents: List()
24/07/16 16:00:06 INFO DAGScheduler: Submitting ResultStage 2 (MapPartitionsRDD[8] at showString at NativeMethodAccessorImpl.java:0), which has no missing parents
24/07/16 16:00:06 INFO MemoryStore: Block broadcast_2 stored as values in memory (estimated size 13.1 KiB, free 434.4 MiB)
24/07/16 16:00:06 INFO MemoryStore: Block broadcast_2_piece0 stored as bytes in memory (estimated size 6.8 KiB, free 434.4 MiB)
24/07/16 16:00:06 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on 7e5c53b23aa7:34119 (size: 6.8 KiB, free: 434.4 MiB)
24/07/16 16:00:06 INFO SparkContext: Created broadcast 2 from broadcast at DAGScheduler.scala:1585
24/07/16 16:00:06 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 2 (MapPartitionsRDD[8] at showString at NativeMethodAccessorImpl.java:0) (first 15 tasks are for partitions Vector(0))
24/07/16 16:00:06 INFO TaskSchedulerImpl: Adding task set 2.0 with 1 tasks resource profile 0
24/07/16 16:00:06 INFO TaskSetManager: Starting task 0.0 in stage 2.0 (TID 2) (172.20.0.3, executor 0, partition 0, PROCESS_LOCAL, 7651 bytes)
24/07/16 16:00:06 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on 172.20.0.3:38653 (size: 6.8 KiB, free: 434.4 MiB)
24/07/16 16:00:06 INFO TaskSetManager: Finished task 0.0 in stage 2.0 (TID 2) in 72 ms on 172.20.0.3 (executor 0) (1/1)
24/07/16 16:00:06 INFO TaskSchedulerImpl: Removed TaskSet 2.0, whose tasks have all completed, from pool
24/07/16 16:00:06 INFO DAGScheduler: ResultStage 2 (showString at NativeMethodAccessorImpl.java:0) finished in 0.078 s
24/07/16 16:00:06 INFO DAGScheduler: Job 2 is finished. Cancelling potential speculative or zombie tasks for this job
24/07/16 16:00:06 INFO TaskSchedulerImpl: Killing all running tasks in stage 2: Stage finished
24/07/16 16:00:06 INFO DAGScheduler: Job 2 finished: showString at NativeMethodAccessorImpl.java:0, took 0.080722 s
24/07/16 16:00:06 INFO SparkContext: Starting job: showString at NativeMethodAccessorImpl.java:0
24/07/16 16:00:06 INFO DAGScheduler: Got job 3 (showString at NativeMethodAccessorImpl.java:0) with 1 output partitions
24/07/16 16:00:06 INFO DAGScheduler: Final stage: ResultStage 3 (showString at NativeMethodAccessorImpl.java:0)
24/07/16 16:00:06 INFO DAGScheduler: Parents of final stage: List()
24/07/16 16:00:06 INFO DAGScheduler: Missing parents: List()
24/07/16 16:00:06 INFO DAGScheduler: Submitting ResultStage 3 (MapPartitionsRDD[8] at showString at NativeMethodAccessorImpl.java:0), which has no missing parents
24/07/16 16:00:06 INFO MemoryStore: Block broadcast_3 stored as values in memory (estimated size 13.1 KiB, free 434.4 MiB)
24/07/16 16:00:06 INFO MemoryStore: Block broadcast_3_piece0 stored as bytes in memory (estimated size 6.8 KiB, free 434.4 MiB)
24/07/16 16:00:06 INFO BlockManagerInfo: Added broadcast_3_piece0 in memory on 7e5c53b23aa7:34119 (size: 6.8 KiB, free: 434.4 MiB)
24/07/16 16:00:06 INFO SparkContext: Created broadcast 3 from broadcast at DAGScheduler.scala:1585
24/07/16 16:00:06 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 3 (MapPartitionsRDD[8] at showString at NativeMethodAccessorImpl.java:0) (first 15 tasks are for partitions Vector(1))
24/07/16 16:00:06 INFO TaskSchedulerImpl: Adding task set 3.0 with 1 tasks resource profile 0
24/07/16 16:00:06 INFO TaskSetManager: Starting task 0.0 in stage 3.0 (TID 3) (172.20.0.3, executor 0, partition 1, PROCESS_LOCAL, 7690 bytes)
24/07/16 16:00:06 INFO BlockManagerInfo: Added broadcast_3_piece0 in memory on 172.20.0.3:38653 (size: 6.8 KiB, free: 434.4 MiB)
24/07/16 16:00:06 INFO TaskSetManager: Finished task 0.0 in stage 3.0 (TID 3) in 68 ms on 172.20.0.3 (executor 0) (1/1)
24/07/16 16:00:06 INFO TaskSchedulerImpl: Removed TaskSet 3.0, whose tasks have all completed, from pool
24/07/16 16:00:06 INFO DAGScheduler: ResultStage 3 (showString at NativeMethodAccessorImpl.java:0) finished in 0.072 s
24/07/16 16:00:06 INFO DAGScheduler: Job 3 is finished. Cancelling potential speculative or zombie tasks for this job
24/07/16 16:00:06 INFO TaskSchedulerImpl: Killing all running tasks in stage 3: Stage finished
24/07/16 16:00:06 INFO DAGScheduler: Job 3 finished: showString at NativeMethodAccessorImpl.java:0, took 0.075610 s
24/07/16 16:00:06 INFO CodeGenerator: Code generated in 5.160208 ms
+-----+---+
| name|age|
+-----+---+
|Alice| 28|
|Cathy| 29|
|David| 30|
+-----+---+
24/07/16 16:00:06 INFO SparkContext: Invoking stop() from shutdown hook
24/07/16 16:00:06 INFO SparkContext: SparkContext is stopping with exitCode 0.
24/07/16 16:00:06 INFO SparkUI: Stopped Spark web UI at http://7e5c53b23aa7:4040
24/07/16 16:00:06 INFO StandaloneSchedulerBackend: Shutting down all executors
24/07/16 16:00:06 INFO StandaloneSchedulerBackend$StandaloneDriverEndpoint: Asking each executor to shut down
24/07/16 16:00:06 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
24/07/16 16:00:06 INFO MemoryStore: MemoryStore cleared
24/07/16 16:00:06 INFO BlockManager: BlockManager stopped
24/07/16 16:00:06 INFO BlockManagerMaster: BlockManagerMaster stopped
24/07/16 16:00:06 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
24/07/16 16:00:06 INFO SparkContext: Successfully stopped SparkContext
24/07/16 16:00:06 INFO ShutdownHookManager: Shutdown hook called
24/07/16 16:00:06 INFO ShutdownHookManager: Deleting directory /tmp/spark-4156bfce-659a-4d64-b1b2-be7bf15a9ea4
24/07/16 16:00:06 INFO ShutdownHookManager: Deleting directory /tmp/spark-c948551e-35b5-4ab8-8dd5-76db70beddf3
24/07/16 16:00:06 INFO ShutdownHookManager: Deleting directory /tmp/spark-c948551e-35b5-4ab8-8dd5-76db70beddf3/pyspark-3f35a35e-7954-4f6c-83b3-ba67744ec2ba
This is because Kyuubi does not implement an ApplicationOperation for Spark Standalone yet.
For interactive sessions (not your case), Kyuubi submits a custom Spark app with a built-in RPC server, so the Kyuubi server can communicate with the Spark driver directly to learn the query status.
For batch Spark jobs (whether a Java/Scala jar, PySpark, or R), Kyuubi just assembles the spark-submit command to submit your job to the resource manager (Local, Mesos, YARN, K8s, or Standalone), and then uses an ApplicationOperation to query the job status. Currently, only Local, YARN, and K8s are supported.
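For what a Standalone ApplicationOperation could poll, the Spark Standalone master's web UI exposes a JSON view of its applications (typically at `http://<master>:8080/json/`, the web UI port, not the 7077 RPC port). This is a minimal sketch, not Kyuubi code; the response keys (`activeapps`, `completedapps`, `id`, `state`) and the master address are assumptions about that endpoint's shape.

```python
def find_app_state(master_json, app_id):
    """Search the master's active and completed app lists for app_id and
    return its state string (e.g. RUNNING, FINISHED), or None if absent."""
    for key in ("activeapps", "completedapps"):
        for app in master_json.get(key, []):
            if app.get("id") == app_id:
                return app.get("state")
    return None

# Polling would look roughly like (addresses from this setup):
#
# import json, urllib.request
# with urllib.request.urlopen("http://spark-master:8080/json/") as r:
#     print(find_app_state(json.load(r), "app-20240716160002-0003"))
```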
Kyuubi Server Configurations
Kyuubi Engine Configurations
No response
Additional context
I'm new to Kyuubi and Scala, but I would still be happy to help.
Are you willing to submit PR?