Applying Functions in PySpark

PySpark, the Python API for Apache Spark, provides multiple ways to apply functions to DataFrame columns. This flexibility allows data engineers and analysts to perform transformations efficiently.

In PySpark, function application is used to perform various kinds of transformations on DataFrame columns. It lets us transform data either with built-in functions (such as upper(), lower(), trim(), and so on) or with custom user-defined functions (UDFs).

Creating a Spark DataFrame

First, let's create a sample DataFrame to demonstrate function application:

Python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName('SparkByExamples.com').getOrCreate()
columns = ["Seqno", "Name"]  # column names used throughout the examples below
data = [("1", "john jones"),  # sample rows; any lowercase names work here
        ("2", "tracey smith"),
        ("3", "amy sanders")]

df = spark.createDataFrame(data=data, schema=columns)
df.show(truncate=False)

Applying Built-in Functions

1. Using withColumn()

The withColumn() function applies a transformation to a column and returns a new DataFrame with the result stored as a new column (or replacing an existing one of the same name).

Python
from pyspark.sql.functions import upper

df.withColumn("Upper_Name", upper(df.Name)).show()

2. Using select()

Functions can also be applied within select(), returning a DataFrame with only the specified columns.

Python
df.select("Seqno", "Name", upper(df.Name)).show()

3. Using SQL Expressions

Transformations can also be written as Spark SQL queries. To use this approach, first register the DataFrame as a temporary view.

Python
df.createOrReplaceTempView("TAB")
spark.sql("SELECT Seqno, Name, UPPER(Name) FROM TAB").show()

Creating and Using User-Defined Functions (UDFs)

When the built-in functions are not enough, PySpark lets you define custom transformation logic as user-defined functions (UDFs). Note that Python UDFs run slower than built-in functions, since row data must be serialized between the JVM and the Python interpreter, so prefer built-ins where possible.

1. Defining a Custom Function

Python
def upperCase(text):  # avoid naming the parameter "str", which shadows the builtin
    return text.upper()
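One caveat worth noting: if the Name column contains nulls, Spark passes None into the function and calling .upper() on it raises an AttributeError. A null-safe variant (the name upper_case here is illustrative) can guard against that:

```python
def upper_case(value):
    # Guard against null column values: Spark passes None for SQL NULL,
    # and calling .upper() on None would raise an AttributeError
    if value is None:
        return None
    return value.upper()
```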

2. Converting Function to UDF

Python
from pyspark.sql.functions import col, udf
from pyspark.sql.types import StringType

upperCaseUDF = udf(upperCase, StringType())  # pass the function directly; no lambda needed

3. Applying UDF Using withColumn()

Python
df.withColumn("Curated_Name", upperCaseUDF(col("Name"))).show(truncate=False)

4. Applying UDF Using select()

Python
df.select(col("Seqno"), upperCaseUDF(col("Name")).alias("Upper_Name")).show(truncate=False)

5. Registering UDF for SQL Queries

Python
spark.udf.register("upperCaseUDF", upperCaseUDF)
df.createOrReplaceTempView("TAB")
spark.sql("SELECT Seqno, Name, upperCaseUDF(Name) FROM TAB").show()
