Saves the contents of the DataFrame to a data source. The data source is specified by format and a set of options. If format is not specified, the default data source configured by spark.sql.sources.default is used.
Syntax
save(path=None, format=None, mode=None, partitionBy=None, **options)
Parameters
| Parameter | Type | Description |
|---|---|---|
| `path` | str, optional | The path in a Hadoop-supported file system. |
| `format` | str, optional | The format used to save (e.g. `'json'`, `'parquet'`, `'csv'`). |
| `mode` | str, optional | The behavior when data already exists at the target path: `'append'` adds the new rows to the existing data, `'overwrite'` replaces the existing data, `'ignore'` silently skips the write, and `'error'` or `'errorifexists'` (the default) raises an exception. |
| `partitionBy` | list, optional | Names of the partitioning columns. |
| `**options` | dict | Additional string options passed to the data source. |
Returns
None
Examples
Write a DataFrame into a JSON file and read it back.
import tempfile
with tempfile.TemporaryDirectory(prefix="save") as d:
    spark.createDataFrame(
        [{"age": 100, "name": "Alice"}]
    ).write.mode("overwrite").format("json").save(d)
    spark.read.format("json").load(d).show()
    # +---+-----+
    # |age| name|
    # +---+-----+
    # |100|Alice|
    # +---+-----+