toTable (DataStreamWriter)

Starts the execution of the streaming query, continually outputting results to the given table as new data arrives. Returns a StreamingQuery object.

Syntax

toTable(tableName, format=None, outputMode=None, partitionBy=None, queryName=None, **options)

Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| `tableName` | str | Name of the table. |
| `format` | str, optional | The format used to save. |
| `outputMode` | str, optional | How data is written to the sink: `append`, `complete`, or `update`. |
| `partitionBy` | str or list, optional | Names of partitioning columns. Ignored for v2 tables that already exist. |
| `queryName` | str, optional | Unique name for the query. |
| `**options` | | All other string options. Provide a `checkpointLocation` for most streams. |
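In PySpark, the keyword arguments correspond to the DataStreamWriter builder methods of the same name, and extra keyword options are applied as writer options. The following is a minimal sketch of the chained equivalent, assuming `df` is a streaming DataFrame. The function name `start_with_builder` and the specific values are illustrative only and are not part of the API.

```python
def start_with_builder(df, table_name, checkpoint_dir):
    """Configure the writer with chained calls, then start the query.

    Assumes `df` is a streaming DataFrame (for example, the result of
    spark.readStream...load()). The values below are illustrative.
    """
    return (
        df.writeStream
        .format("parquet")                             # same as format="parquet"
        .outputMode("append")                          # same as outputMode="append"
        .queryName("that_query")                       # same as queryName="that_query"
        .option("checkpointLocation", checkpoint_dir)  # same as checkpointLocation=...
        .toTable(table_name)
    )
```

Either style starts the query when `toTable` is called. The keyword form keeps the whole configuration in a single call.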

Returns

StreamingQuery
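The returned StreamingQuery is a handle to the running query. As a hedged sketch of one way to manage it, the helper below lets a query run for a bounded time and then stops it. The name `run_for` is hypothetical. It relies only on the `awaitTermination`, `isActive`, and `stop` members of StreamingQuery.

```python
def run_for(query, seconds):
    """Let a StreamingQuery run for at most `seconds`, then stop it.

    Returns True if the query terminated on its own within the timeout,
    False if it was still running when the timeout expired.
    """
    try:
        # awaitTermination(timeout) returns whether the query has
        # terminated within the timeout (in seconds).
        finished = query.awaitTermination(seconds)
    finally:
        # Stop the query if it is still running, even when an error
        # was raised while waiting.
        if query.isActive:
            query.stop()
    return finished
```

For example, `run_for(df.writeStream.toTable("my_table", checkpointLocation=path), 10)` would start a query, let it run for up to ten seconds, and stop it. Here `df`, `my_table`, and `path` are placeholders.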

Notes

For v1 tables, partitionBy columns are always respected. For v2 tables, partitionBy is only respected if the table does not yet exist.
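To illustrate `partitionBy`, the sketch below derives a low-cardinality column from the built-in `rate` source and partitions the output table by it. The function name `start_partitioned_stream` and the `bucket` column are assumptions made for this example. Whether the partitioning takes effect depends on the rule above, so this is most meaningful when the target table does not exist yet.

```python
def start_partitioned_stream(spark, table_name, checkpoint_dir):
    """Start a streaming write partitioned by a derived column.

    Assumes `spark` is an active SparkSession. The "bucket" column is
    introduced only for this example.
    """
    events = (
        spark.readStream.format("rate")   # emits `timestamp` and `value` columns
        .option("rowsPerSecond", 10)
        .load()
        .selectExpr("value", "value % 4 AS bucket")
    )
    return events.writeStream.toTable(
        table_name,
        format="parquet",
        outputMode="append",
        partitionBy="bucket",             # a list of column names also works
        checkpointLocation=checkpoint_dir,
    )
```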

Examples

Save a data stream to a table:

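import tempfile
import time

# `spark` is assumed to be an active SparkSession.
# Drop any leftover table so the example can be rerun.
_ = spark.sql("DROP TABLE IF EXISTS my_table2")

# Use a temporary directory as the checkpoint location.
with tempfile.TemporaryDirectory(prefix="toTable") as d:
    # The built-in "rate" source generates rows at a fixed rate.
    q = (
        spark.readStream.format("rate")
        .option("rowsPerSecond", 10)
        .load()
        .writeStream.toTable(
            "my_table2",
            queryName="that_query",
            outputMode="append",
            format="parquet",
            checkpointLocation=d,
        )
    )
    time.sleep(3)  # Give the query time to write some rows.
    q.stop()
    spark.read.table("my_table2").show()
    _ = spark.sql("DROP TABLE my_table2")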