🔧 Setup (Optional, for Reference)
```python
from pyspark.sql import functions as F
from pyspark.sql import types as T

# Sample data: one string date column to demonstrate the functions below
data = [("2024-04-08",)]
df = spark.createDataFrame(data, ["input_date"])
df = df.withColumn("input_date", F.to_date("input_date"))
df.show()
```
📆 Common Date & Timestamp Functions with Examples
| Function | Description | Example | Output |
|---|---|---|---|
| `current_date()` | Returns current date | `df.select(current_date())` | 2025-04-08 |
| `current_timestamp()` | Returns current timestamp | `df.select(current_timestamp())` | 2025-04-08 14:32:10.123 |
| `date_format(col, fmt)` | Formats date as string | `date_format("input_date", "dd-MM-yyyy")` | 08-04-2024 |
| `to_date(col)` | Converts string to date | `to_date("2024-04-08")` | 2024-04-08 |
| `to_timestamp(col)` | Converts string to timestamp | `to_timestamp("2024-04-08 12:30:00")` | 2024-04-08 12:30:00 |
| `datediff(end, start)` | Difference in days | `datediff("2024-04-08", "2023-12-31")` | 99 |
| `months_between()` | Months difference (float) | `months_between("2024-04-08", "2023-12-08")` | 4.0 |
| `add_months(col, n)` | Add n months | `add_months("input_date", 2)` | 2024-06-08 |
| `date_add(col, n)` | Add n days | `date_add("input_date", 10)` | 2024-04-18 |
| `date_sub(col, n)` | Subtract n days | `date_sub("input_date", 10)` | 2024-03-29 |
| `next_day(col, day)` | Next specific weekday | `next_day("input_date", "Friday")` | 2024-04-12 |
| `last_day(col)` | Last day of month | `last_day("input_date")` | 2024-04-30 |
| `trunc(col, fmt)` | Truncates to month/year | `trunc("input_date", "MM")` | 2024-04-01 |
| `date_trunc(fmt, col)` | Truncates to level (year, month, day, etc.) | `date_trunc("month", "input_date")` | 2024-04-01 00:00:00 |
| `year(col)` | Extract year | `year("input_date")` | 2024 |
| `month(col)` | Extract month | `month("input_date")` | 4 |
| `dayofmonth(col)` | Extract day of month | `dayofmonth("input_date")` | 8 |
| `dayofweek(col)` | Extract day of week (1 = Sunday) | `dayofweek("input_date")` | 2 |
| `dayofyear(col)` | Extract day of year | `dayofyear("input_date")` | 99 |
| `weekofyear(col)` | Week number | `weekofyear("input_date")` | 15 |
| `quarter(col)` | Quarter of year (1-4) | `quarter("input_date")` | 2 |
| `hour(col)` | Extract hour from timestamp | `hour(current_timestamp())` | 14 |
| `minute(col)` | Extract minute | `minute(current_timestamp())` | 32 |
| `second(col)` | Extract second | `second(current_timestamp())` | 10 |
| `unix_timestamp()` | Returns current UNIX timestamp | `unix_timestamp()` | Seconds since epoch (long) |
| `from_unixtime(ts)` | Converts UNIX seconds to a timestamp string (uses the session timezone) | `from_unixtime(1680912000)` | 2023-04-08 00:00:00 (UTC session timezone) |
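Several of the example outputs above can be sanity-checked with Python's standard `datetime` module alone — no Spark session needed. This is a sketch that mirrors Spark's semantics, not Spark itself; note that Spark's `dayofweek` counts Sunday = 1, unlike Python's `isoweekday()` which counts Monday = 1:

```python
from datetime import date, datetime, timezone

d = date(2024, 4, 8)

# datediff("2024-04-08", "2023-12-31"): whole days between the two dates
diff_days = (d - date(2023, 12, 31)).days        # 99

# dayofyear("input_date")
day_of_year = d.timetuple().tm_yday              # 99

# dayofweek(): convert Python's Monday=1..Sunday=7 to Spark's Sunday=1..Saturday=7
spark_dayofweek = d.isoweekday() % 7 + 1         # 2 (Monday)

# weekofyear(): ISO week number
week = d.isocalendar()[1]                        # 15

# UNIX conversions are timezone-sensitive; here checked explicitly against UTC
utc_dt = datetime.fromtimestamp(1680912000, tz=timezone.utc)

print(diff_days, day_of_year, spark_dayofweek, week, utc_dt)
```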
✅ Example Code
```python
# Import only the functions used (avoids the name clashes a wildcard import can cause)
from pyspark.sql.functions import (
    date_format, year, month, date_add, date_sub, next_day, last_day
)

df.withColumn("formatted", date_format("input_date", "dd-MM-yyyy")) \
  .withColumn("year", year("input_date")) \
  .withColumn("month", month("input_date")) \
  .withColumn("add_10_days", date_add("input_date", 10)) \
  .withColumn("sub_10_days", date_sub("input_date", 10)) \
  .withColumn("next_friday", next_day("input_date", "Friday")) \
  .withColumn("last_day", last_day("input_date")) \
  .show()
```

💡 Tips
- Always use `to_date()` or `to_timestamp()` before applying date functions if your column is string type.
- Week starts from Sunday (1) when using `dayofweek()`.
- `trunc()` and `date_trunc()` behave differently: `trunc()` returns a date, `date_trunc()` returns a timestamp.
- Common usage:
  - `trunc(date_column, 'MM')` → first day of the month
  - `trunc(date_column, 'YYYY')` → first day of the year
  - `date_trunc('month', timestamp_column)` → beginning of the month (00:00:00)
  - `date_trunc('hour', timestamp_column)` → beginning of the hour
