DataStreamWriter

Interface used to write a streaming DataFrame to external storage systems (for example, file systems and key-value stores). Access it through the `writeStream` attribute of a streaming DataFrame, as in `df.writeStream`.

Syntax

# Access through DataFrame
df.writeStream

Methods

Method	Description
`outputMode(outputMode)`	Specifies how data of a streaming DataFrame is written to the sink. Options are `append`, `complete`, and `update`.
`format(source)`	Specifies the output data source format.
`option(key, value)`	Adds an output option for the underlying data source.
`options(**options)`	Adds multiple output options for the underlying data source.
`partitionBy(*cols)`	Partitions the output by the given columns on the file system.
`clusterBy(*cols)`	Clusters the output by the given columns.
`queryName(queryName)`	Specifies the name of the streaming query.
`trigger(**kwargs)`	Sets the trigger for the streaming query execution, for example `processingTime="5 seconds"` or `availableNow=True`.
`foreach(f)`	Sets the output of the streaming query to be processed by the given function or object.
`foreachBatch(func)`	Sets the output of each microbatch to be processed by the given function.
`start(path)`	Starts the execution of the streaming query and returns a `StreamingQuery` object.
`table(tableName)`	Alias for `toTable()`. Writes data to the specified table and returns a `StreamingQuery` object.
`toTable(tableName)`	Starts the execution of the streaming query, continually outputting results to the given table.
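
The configuration methods in the table return the writer itself, so they are usually chained and finished with `start()`. The following is a minimal sketch of that pattern, not a definitive recipe: the function name `start_console_query`, the query name `console_demo`, and the `checkpoint_dir` argument are hypothetical, and `df` is assumed to be a streaming DataFrame.

```python
def start_console_query(df, checkpoint_dir):
    """Configure and start a query that prints each micro-batch to the console.

    `df` is assumed to be a streaming DataFrame; `checkpoint_dir` is a
    hypothetical path chosen by the caller.
    """
    return (
        df.writeStream
        .outputMode("append")                          # write only newly arrived rows
        .format("console")                             # print results to stdout
        .option("checkpointLocation", checkpoint_dir)  # where query progress is stored
        .queryName("console_demo")                     # hypothetical query name
        .trigger(processingTime="5 seconds")           # run a micro-batch every 5 seconds
        .start()                                       # returns a StreamingQuery
    )
```

A caller would typically keep the returned `StreamingQuery` and call `stop()` on it when finished, as in the example further down this page.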

Examples

Load a rate stream, apply a transformation, write to the console, and stop after 3 seconds.

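import time
# The built-in "rate" source generates rows with `timestamp` and `value` columns.
df = spark.readStream.format("rate").load()
# Keep a single derived column: `value` modulo 3.
df = df.selectExpr("value % 3 as v")
# Start the query; each micro-batch is printed to the console.
q = df.writeStream.format("console").start()
# Let the query run for about 3 seconds, then stop it.
time.sleep(3)
q.stop()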
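
The `foreachBatch(func)` method listed under Methods can replace a built-in sink when each micro-batch needs custom handling. The sketch below is illustrative only: `count_batch` and `start_foreach_batch_query` are hypothetical names, and `df` is assumed to be a streaming DataFrame such as the rate stream above.

```python
def count_batch(batch_df, batch_id):
    """Handle one micro-batch.

    `batch_df` holds the rows of the micro-batch and `batch_id` is its
    identifier. Here the handler only reports the row count.
    """
    print(f"batch {batch_id}: {batch_df.count()} rows")


def start_foreach_batch_query(df):
    """Send every micro-batch of the streaming DataFrame `df` to `count_batch`."""
    return df.writeStream.foreachBatch(count_batch).start()
```

Inside the handler, `batch_df` can be treated like a regular DataFrame, so batch operations such as `count()` are available there.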

Last updated on 2026-04-17