Bemærk
Adgang til denne side kræver godkendelse. Du kan prøve at logge på eller ændre mapper.
Adgang til denne side kræver godkendelse. Du kan prøve at ændre mapper.
Returns an array of elements for which a predicate holds in a given array. Supports Spark Connect.
For the corresponding Databricks SQL function, see filter function.
Syntax
from pyspark.databricks.sql import functions as dbf
dbf.filter(col=<col>, f=<f>)
Parameters
| Parameter | Type | Description |
|---|---|---|
col |
pyspark.sql.Column or str |
Name of column or expression. |
f |
function |
A function that returns the Boolean expression. Can take one of the following forms: Unary (x: Column) -> Column or Binary (x: Column, i: Column) -> Column where the second argument is a 0-based index of the element. |
Returns
pyspark.sql.Column: filtered array of elements where given function evaluated to True when passed as an argument.
Examples
from pyspark.databricks.sql import functions as dbf
df = spark.createDataFrame(
[(1, ["2018-09-20", "2019-02-03", "2019-07-01", "2020-06-01"])],
("key", "values")
)
def after_second_quarter(x):
return dbf.month(dbf.to_date(x)) > 6
df.select(
dbf.filter("values", after_second_quarter).alias("after_second_quarter")
).show(truncate=False)
+------------------------+
|after_second_quarter |
+------------------------+
|[2018-09-20, 2019-07-01]|
+------------------------+