Loads a JSON file stream and returns the results as a DataFrame. JSON Lines (newline-delimited JSON) is supported by default. For JSON with one record per file, set the multiLine option to true. If schema is not specified, the input schema is inferred from the data.
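To make the two layouts concrete, the sketch below writes a small dataset as JSON Lines and as a single multi-line document, using only the Python standard library. The second sample row, the file names, and the `read_json_stream` helper are illustrative assumptions rather than part of the API. The helper expects an existing SparkSession to be passed in as `spark`.

```python
import json
import os
import tempfile

# Sample rows; the second one is made up purely for illustration.
records = [
    {"age": 100, "name": "Hyukjin Kwon"},
    {"age": 42, "name": "Example User"},
]


def write_json_lines(path, rows):
    """Write one JSON object per line (the layout read by default)."""
    with open(path, "w", encoding="utf-8") as f:
        for row in rows:
            f.write(json.dumps(row) + "\n")


def write_multiline_json(path, row):
    """Write a single pretty-printed record that spans several lines."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(row, f, indent=2)


def read_json_stream(spark, path, schema, multi_line=False):
    """Hypothetical helper: open a JSON stream, optionally with multiLine set."""
    reader = spark.readStream.schema(schema)
    if multi_line:
        # Needed when a record spans several lines of a file.
        reader = reader.option("multiLine", "true")
    return reader.json(path)


with tempfile.TemporaryDirectory(prefix="json") as d:
    lines_file = os.path.join(d, "lines.json")
    multi_file = os.path.join(d, "multi.json")
    write_json_lines(lines_file, records)
    write_multiline_json(multi_file, records[0])

    # JSON Lines: every line parses on its own.
    with open(lines_file, encoding="utf-8") as f:
        parsed_lines = [json.loads(line) for line in f]

    # Multi-line JSON: only the whole file parses as one record.
    with open(multi_file, encoding="utf-8") as f:
        multi_text = f.read()
    parsed_multi = json.loads(multi_text)
```

A directory of files shaped like `lines.json` can be read with the default options, while files shaped like `multi.json` would call for `read_json_stream(spark, d, "age INT, name STRING", multi_line=True)`.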
Syntax
json(path, schema=None, **options)
Parameters
| Parameter | Type | Description |
|---|---|---|
| path | str | Path to the JSON dataset. |
| schema | StructType or str, optional | Schema as a StructType or DDL-formatted string (for example, col0 INT, col1 DOUBLE). |
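As a rough illustration of the DDL-formatted form, the sketch below assembles a schema string from column name and type pairs. The `to_ddl` and `open_json_stream` helpers are hypothetical names introduced here, and `spark` is assumed to be an existing SparkSession supplied by the caller.

```python
def to_ddl(columns):
    """Join (name, type) pairs into a DDL-formatted schema string."""
    return ", ".join(f"{name} {dtype}" for name, dtype in columns)


# Same schema as the string used in the example on this page.
ddl_schema = to_ddl([("age", "INT"), ("name", "STRING")])


def open_json_stream(spark, path, schema=ddl_schema):
    """Hypothetical helper: pass the schema through the schema parameter."""
    # schema may be a DDL string like ddl_schema or a StructType.
    return spark.readStream.json(path, schema=schema)
```

This simple join assumes plain column names; names containing spaces or other special characters would need quoting, which the sketch does not handle.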
Returns
DataFrame
Examples
Load a stream from a temporary JSON file:
import tempfile
import time

with tempfile.TemporaryDirectory(prefix="json") as d:
    # Write a temporary JSON file to read back.
    spark.createDataFrame(
        [(100, "Hyukjin Kwon"),], ["age", "name"]
    ).write.mode("overwrite").format("json").save(d)

    # Start a streaming query that reads the JSON file and prints it to the console.
    q = spark.readStream.schema(
        "age INT, name STRING"
    ).json(d).writeStream.format("console").start()
    time.sleep(3)
    q.stop()