withColumn() in PySpark

PySpark withColumn() is a DataFrame transformation used to change the values of an existing column, convert its datatype, or create a new column, among other tasks. It returns a new DataFrame rather than modifying the original.

Python
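from pyspark.sql import SparkSession
from pyspark.sql.functions import col, lit

# Create (or reuse) a session; notebooks and the pyspark shell usually provide `spark` already.
spark = SparkSession.builder.getOrCreate()
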
data = 

columns = 
df = spark.createDataFrame(data=data, schema=columns)
df.show()
