pyspark.sql.PandasCogroupedOps

class pyspark.sql.PandasCogroupedOps(gd1, gd2)

A logical grouping of two GroupedData, created by GroupedData.cogroup().

New in version 3.0.0.

Changed in version 3.4.0: Support Spark Connect.

Notes

This API is experimental.
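
To make the idea of "a logical grouping of two GroupedData" concrete, the following pandas-only sketch pairs up the groups of two frames by key. It is an illustration of the cogroup concept, not Spark's implementation: the helper name `cogroup_pairs` and the sample data are hypothetical, and the sketch assumes that a key present on only one side is paired with an empty frame on the other.

```python
import pandas as pd


def cogroup_pairs(left, right, key):
    """Yield (key_value, left_group, right_group) for every key in either frame.

    Hypothetical helper for illustration only; it is not part of PySpark.
    """
    left_groups = {k: g for k, g in left.groupby(key)}
    right_groups = {k: g for k, g in right.groupby(key)}
    for k in sorted(set(left_groups) | set(right_groups)):
        # A key missing on one side is paired with an empty frame
        # that keeps that side's columns.
        yield k, left_groups.get(k, left.iloc[0:0]), right_groups.get(k, right.iloc[0:0])


left = pd.DataFrame({"id": [1, 1, 2], "v1": [1.0, 3.0, 2.0]})
right = pd.DataFrame({"id": [1, 3], "v2": ["x", "z"]})

pairs = list(cogroup_pairs(left, right, "id"))
for key_value, left_group, right_group in pairs:
    print(key_value, len(left_group), len(right_group))
```

Each pair corresponds to one cogroup, which is the unit that the methods listed under Methods hand to the user function.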

Methods

applyInArrow(func, schema)

Applies a function to each cogroup using Arrow and returns the result as a DataFrame.

applyInPandas(func, schema)

Applies a function to each cogroup using pandas and returns the result as a DataFrame.
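
Examples

A minimal sketch of applyInPandas, assuming an as-of join is the per-cogroup operation. The function `asof_join` receives one pandas DataFrame from each side of the cogroup and returns a pandas DataFrame matching the declared schema. The helper `run_on_spark`, the column names, and the sample rows are assumptions made for illustration; `run_on_spark` needs an active SparkSession and is not called here, so only the local pandas check runs.

```python
import pandas as pd


def asof_join(left: pd.DataFrame, right: pd.DataFrame) -> pd.DataFrame:
    # Called once per cogroup: both frames hold the rows for a single "id".
    return pd.merge_asof(left, right, on="time", by="id")


def run_on_spark(spark):
    """Hypothetical helper: applies asof_join to two cogrouped Spark DataFrames."""
    df1 = spark.createDataFrame(
        [(20000101, 1, 1.0), (20000101, 2, 2.0), (20000102, 1, 3.0), (20000102, 2, 4.0)],
        ("time", "id", "v1"),
    )
    df2 = spark.createDataFrame(
        [(20000101, 1, "x"), (20000101, 2, "y")],
        ("time", "id", "v2"),
    )
    # GroupedData.cogroup() returns a PandasCogroupedOps.
    cogrouped = df1.groupby("id").cogroup(df2.groupby("id"))
    return cogrouped.applyInPandas(
        asof_join, schema="time int, id int, v1 double, v2 string"
    )


# Local check of the per-cogroup function on the rows for id == 1, without Spark.
left_group = pd.DataFrame({"time": [20000101, 20000102], "id": [1, 1], "v1": [1.0, 3.0]})
right_group = pd.DataFrame({"time": [20000101], "id": [1], "v2": ["x"]})
local_result = asof_join(left_group, right_group)
print(local_result)
```

The function may also be written with three parameters, in which case the grouping key is passed as the first argument. applyInArrow follows the same pattern, except that the function receives and returns pyarrow.Table objects instead of pandas DataFrames.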