Back to all posts

PySpark SQL Date and Timestamp Functions

🔧 Setup First (Optional for Reference) from pyspark.sql import functions as F from pyspark.sql import types as T data = df = spark.createDataFrame(data, )…

🔧 Setup First (Optional for Reference)

JavaScript
from pyspark.sql import functions as F
from pyspark.sql import types as T

data = 
df = spark.createDataFrame(data, )
df = df.withColumn("input_date", to_date("input_date"))
df.show()

📆 Common Date & Timestamp Functions with Examples

FunctionDescriptionExampleOutput
current_date()Returns current datedf.select(current_date())2025-04-08
current_timestamp()Returns current timestampdf.select(current_timestamp())2025-04-08 14:32:10.123
date_format(col, fmt)Formats date as stringdate_format("input_date", "dd-MM-yyyy")08-04-2024
to_date(col)Converts string to dateto_date("2024-04-08")2024-04-08
to_timestamp(col)Converts string to timestampto_timestamp("2024-04-08 12:30:00")2024-04-08 12:30:00
datediff(end, start)Difference in daysdatediff("2024-04-08", "2023-12-31")99
months_between()Months difference (float)months_between("2024-04-08", "2023-12-08")4.0
add_months(col, n)Add n monthsadd_months("input_date", 2)2024-06-08
date_add(col, n)Add n daysdate_add("input_date", 10)2024-04-18
date_sub(col, n)Subtract n daysdate_sub("input_date", 10)2024-03-29
next_day(col, day)Next specific weekdaynext_day("input_date", "Friday")2024-04-12
last_day(col)Last day of monthlast_day("input_date")2024-04-30
trunc(col, fmt)Truncates to month/yeartrunc("input_date", "MM")2024-04-01
date_trunc(fmt, col)Truncates to level (year, month, day, etc.)date_trunc("month", "input_date")2024-04-01 00:00:00
year(col)Extract yearyear("input_date")2024
month(col)Extract monthmonth("input_date")4
dayofmonth(col)Extract day of monthdayofmonth("input_date")8
dayofweek(col)Extract day of week (1 = Sunday)dayofweek("input_date")2
dayofyear(col)Extract day of yeardayofyear("input_date")99
weekofyear(col)Week numberweekofyear("input_date")15
quarter(col)Quarter of year (1-4)quarter("input_date")2
hour(col)Extract hour from timestamphour(current_timestamp())14
minute(col)Extract minuteminute(current_timestamp())32
second(col)Extract secondsecond(current_timestamp())10
unix_timestamp()Returns current UNIX timestampunix_timestamp()Timestamp as long
from_unixtime(ts)Converts UNIX to timestampfrom_unixtime(1680979200)2023-04-09 00:00:00

✅ Example Code

SQL
from pyspark.sql.functions import *

df.withColumn("formatted", date_format("input_date", "dd-MM-yyyy")) \
  .withColumn("year", year("input_date")) \
  .withColumn("month", month("input_date")) \
  .withColumn("add_10_days", date_add("input_date", 10)) \
  .withColumn("sub_10_days", date_sub("input_date", 10)) \
  .withColumn("next_friday", next_day("input_date", "Friday")) \
  .withColumn("last_day", last_day("input_date")) \
  .show()

💡 Tips

  • Always use to_date() or to_timestamp() before applying date functions if your column is string type.
  • Week starts from Sunday (1) when using dayofweek().
  • trunc() and date_trunc() behave differently: trunc() gives a date, date_trunc() gives a timestamp.
    • Common Usage:
    • trunc(date_column, 'MM')first day of the month
    • trunc(date_column, 'YYYY')first day of the year
    • date_trunc('month', timestamp_column)beginning of the month 00:00:00
    • date_trunc('hour', timestamp_column)beginning of the hour

Keep building your data skillset

Explore more SQL, Python, analytics, and engineering tutorials.