Output mode (`OutputMode`) describes what data is written to a streaming sink when there is new data available in the streaming data sources (in a trigger / streaming batch).
The output mode of a streaming query is specified using the `DataStreamWriter.outputMode` method.
```scala
import org.apache.spark.sql.streaming.{OutputMode, Trigger}
import scala.concurrent.duration._

val inputStream = spark
  .readStream
  .format("rate")
  .load

val consoleOutput = inputStream
  .writeStream
  .format("console")
  .option("truncate", false)
  .trigger(Trigger.ProcessingTime(10.seconds))
  .queryName("rate-console")
  .option("checkpointLocation", "checkpoint")
  .outputMode(OutputMode.Update) // <-- update output mode
  .start
```
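The output mode can also be given by its (case-insensitive) name using the `DataStreamWriter.outputMode(String)` variant. A minimal sketch, reusing `inputStream` from above (the checkpoint directory is a placeholder):

```scala
// Same effect as OutputMode.Update, but using the string alias
val consoleOutputByName = inputStream
  .writeStream
  .format("console")
  .option("checkpointLocation", "checkpoint-update-by-name") // placeholder directory
  .outputMode("update") // case-insensitive alias of OutputMode.Update
  .start
```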
| OutputMode | Name | Behaviour |
|---|---|---|
| `Append` | `append` | Default output mode that writes "new" rows only. For streaming aggregations, a "new" row is one whose intermediate state has become final, i.e. when new events for the grouping key can only be considered late, which is when the watermark moves past the event time of the key. Required for datasets with … |
| `Complete` | `complete` | Does not drop old aggregation state and preserves all data in the Result Table. Supported only for streaming aggregation queries with `groupBy` or `groupByKey` aggregations (as asserted by `UnsupportedOperationChecker`). |
| `Update` | `update` | Writes the rows that were updated (every time there are updates). If the query does not contain aggregations, it is equivalent to `Append` mode. |
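As a sketch of how the modes differ in practice, consider a streaming aggregation over the rate source from the example above (the window duration, watermark delay, and checkpoint directory are arbitrary choices for illustration). Complete re-emits the whole Result Table at every trigger, Update emits only the groups that changed, and Append emits a group only once the watermark has passed its event time, which is why Append requires an event-time watermark for such aggregations (as asserted by `UnsupportedOperationChecker`).

```scala
import org.apache.spark.sql.functions.{col, window}

// Streaming aggregation: number of rate events per 10-second event-time window
val counts = inputStream
  .withWatermark("timestamp", "10 seconds") // event-time watermark (needed for Append mode)
  .groupBy(window(col("timestamp"), "10 seconds"))
  .count

// Complete mode: the entire Result Table is written to the sink at every trigger
val completeOutput = counts
  .writeStream
  .format("console")
  .option("checkpointLocation", "checkpoint-complete") // placeholder directory
  .outputMode(OutputMode.Complete)
  .start
```

Switching `outputMode` to `OutputMode.Update` would emit only the windows whose counts changed in a trigger, while `OutputMode.Append` would emit each window once, after the watermark passes its end.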