StreamingQuery

A handle to a query that is executing continuously in the background as new data arrives. All methods are thread-safe.

Syntax

# Returned by DataStreamWriter.start() or DataStreamWriter.toTable()
q = df.writeStream.format("console").start()

Properties

id — Returns the unique ID of this query; it persists across restarts from checkpoint data.
runId — Returns the unique ID of the current run of this query; it does not persist across restarts.
name — Returns the user-specified name of the query, or None if no name was specified.
isActive — Returns whether this streaming query is currently active.
status — Returns the current status of the query as a dict.
recentProgress — Returns a list of the most recent StreamingQueryProgress updates for this query.
lastProgress — Returns the most recent StreamingQueryProgress update, or None if there have been no updates.
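Progress reports are plain dicts, so basic metrics can be pulled out without touching Spark itself. Below is a minimal sketch: `summarize_progress` is a hypothetical helper, not part of the PySpark API; the field names it reads (`batchId`, `numInputRows`, `inputRowsPerSecond`) follow the StreamingQueryProgress JSON returned by `lastProgress`.

```python
# Hypothetical helper: summarize a StreamingQueryProgress-style dict,
# as returned by StreamingQuery.lastProgress. Not part of the PySpark API.

def summarize_progress(progress):
    """Return (batch_id, input_rows, rows_per_sec), or None before the first batch."""
    if progress is None:  # lastProgress is None until a batch completes
        return None
    return (
        progress.get("batchId"),
        progress.get("numInputRows", 0),
        progress.get("inputRowsPerSecond", 0.0),
    )

# Usage against a live query (sketch):
# print(summarize_progress(sq.lastProgress))
```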

Methods

awaitTermination(timeout=None) — Waits for the termination of this query, either by stop() or by an exception. With a timeout (in seconds), returns True if the query terminated and False if the timeout elapsed; with no timeout, blocks until termination.
processAllAvailable() — Blocks until all data available in the source has been processed and committed to the sink. Intended for testing.
stop() — Stops this streaming query.
explain(extended=False) — Prints the physical plan to the console for debugging; with extended=True, prints the logical plans as well.
exception() — Returns the StreamingQueryException if the query terminated with an exception, otherwise None.
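The methods above combine naturally into a bounded shutdown routine. The sketch below is a hypothetical helper, not part of the API; it relies only on awaitTermination(timeout), stop(), and exception() as documented, and works with any object exposing those members.

```python
# Hypothetical helper built on the documented StreamingQuery methods.
# Waits up to `timeout` seconds; if the query is still running, stops it.
# Returns the terminating exception, or None after a clean stop.

def wait_then_stop(query, timeout=10):
    terminated = query.awaitTermination(timeout)  # False means the timeout elapsed
    if not terminated:
        query.stop()  # request a stop of the still-running query
    return query.exception()  # StreamingQueryException, or None

# Usage (sketch):
# err = wait_then_stop(sq, timeout=30)
# if err is not None:
#     raise err
```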

Examples

sdf = spark.readStream.format("rate").load()
sq = sdf.writeStream.format("memory").queryName("this_query").start()
sq.isActive
# True
sq.name
# 'this_query'
sq.awaitTermination(5)
# False (the timeout elapsed; the query is still running)
sq.stop()
sq.isActive
# False