Note
Access to this page requires authorization. You can try signing in or changing directories.
Access to this page requires authorization. You can try changing directories.
Unpivot a DataFrame from wide format to long format, optionally leaving identifier columns set. This is the reverse to groupBy(...).pivot(...).agg(...), except for the aggregation, which cannot be reversed.
Added in Databricks Runtime 11.1
Syntax
unpivot(ids: Union["ColumnOrName", List["ColumnOrName"], Tuple["ColumnOrName", ...]], values: Optional[Union["ColumnOrName", List["ColumnOrName"], Tuple["ColumnOrName", ...]]], variableColumnName: str, valueColumnName: str)
Parameters
| Parameter | Type | Description |
|---|---|---|
ids |
str, Column, tuple, list | Column(s) to use as identifiers. Can be a single column or column name, or a list or tuple for multiple columns. |
values |
str, Column, tuple, list, optional | Column(s) to unpivot. Can be a single column or column name, or a list or tuple for multiple columns. If specified, must not be empty. If not specified, uses all columns that are not set as ids. |
variableColumnName |
str | Name of the variable column. |
valueColumnName |
str | Name of the value column. |
Returns
DataFrame: Unpivoted DataFrame.
Notes
Supports Spark Connect.
Examples
df = spark.createDataFrame(
[(1, 11, 1.1), (2, 12, 1.2)],
["id", "int", "double"],
)
df.show()
# +---+---+------+
# | id|int|double|
# +---+---+------+
# | 1| 11| 1.1|
# | 2| 12| 1.2|
# +---+---+------+
from pyspark.sql import functions as sf
df.unpivot(
"id", ["int", "double"], "var", "val"
).sort("id", sf.desc("var")).show()
# +---+------+----+
# | id| var| val|
# +---+------+----+
# | 1| int|11.0|
# | 1|double| 1.1|
# | 2| int|12.0|
# | 2|double| 1.2|
# +---+------+----+