Understanding show() in PySpark

In PySpark, the .show() method prints the contents of a DataFrame to the console in a tabular format.


Syntax of show()

Python
DataFrame.show(n=20, truncate=True, vertical=False)

Parameters:

  1. n (default = 20) → Number of rows to display.
  2. truncate (default = True) → If True, strings longer than 20 characters are truncated. Pass an integer to truncate to that many characters instead, or False to show full values.
  3. vertical (default = False) → If True, displays rows in a vertical format instead of a table.
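
The truncate rule above can be sketched in plain Python. This is only an illustration of the behavior as I understand it, not Spark's own code: `truncate_cell` is a hypothetical helper, and the detail that the trailing "..." counts toward the character limit is an assumption based on typical show() output.

```python
def truncate_cell(value, truncate=True):
    """Shorten one cell the way show() appears to; a sketch, not Spark code."""
    text = str(value)
    # True means the default limit of 20 characters; False disables truncation.
    if truncate is True:
        limit = 20
    elif truncate is False:
        limit = 0
    else:
        limit = int(truncate)
    if limit <= 0 or len(text) <= limit:
        return text
    # Very small limits leave no room for an ellipsis (assumed behavior).
    if limit < 4:
        return text[:limit]
    return text[: limit - 3] + "..."


# Hypothetical sample value, longer than 20 characters.
sample = "This is a very long string that should be truncated"
print(truncate_cell(sample))         # default limit of 20
print(truncate_cell(sample, 10))     # custom limit of 10
print(truncate_cell(sample, False))  # full content
```

Short values such as "Alice" pass through unchanged, which is why truncation is only noticeable on long text columns.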

Example 1: Using show() without Arguments

Python
from pyspark.sql import SparkSession

# Start Spark Session
spark = SparkSession.builder.appName("ShowExample").getOrCreate()

# Sample Data
data = [(1, "Alice", 25), (2, "Bob", 30), (3, "Charlie", 35)]
columns = ["ID", "Name", "Age"]

# Create DataFrame
df = spark.createDataFrame(data, schema=columns)

# Show DataFrame
df.show()

Output:

Plain Text
+---+-------+---+
| ID|   Name|Age|
+---+-------+---+
|  1|  Alice| 25|
|  2|   Bob | 30|
|  3|Charlie| 35|
+---+-------+---+

Example 2: Display Only 2 Rows

Python
df.show(2)

Output:

Plain Text
+---+-----+---+
| ID| Name|Age|
+---+-----+---+
|  1|Alice| 25|
|  2|  Bob| 30|
+---+-----+---+
only showing top 2 rows

Example 3: Truncating Long Strings

Python
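# Hypothetical sample row for illustration; any string longer than 20 characters works
long_data = [(1, "This is a very long string that should be truncated")]
df_long = spark.createDataFrame(long_data, ["ID", "Description"])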

df_long.show(truncate=True)  # Default truncates strings to 20 characters
df_long.show(truncate=10)  # Truncates strings to 10 characters
df_long.show(truncate=False)  # Shows full content

Output:


Example 4: Displaying in Vertical Format

Python
df.show(vertical=True)

Output:

Plain Text
-RECORD 0--------
 ID   | 1        
 Name | Alice    
 Age  | 25       
-RECORD 1--------
 ID   | 2        
 Name | Bob      
 Age  | 30       
-RECORD 2--------
 ID   | 3        
 Name | Charlie  
 Age  | 35       

This is useful when there are many columns, making the output more readable.
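
To see why the vertical layout scales better with many columns, here is a plain-Python sketch that prints one field per line. `format_vertical` is a hypothetical helper written for illustration, not part of PySpark, and its padding only approximates what show(vertical=True) prints. The sample row is made up.

```python
def format_vertical(rows, columns):
    """Lay out each row as one 'name | value' line per column."""
    name_width = max(len(name) for name in columns)
    lines = []
    for index, row in enumerate(rows):
        # A header line separates one record from the next.
        lines.append(f"-RECORD {index}" + "-" * 8)
        for name, value in zip(columns, row):
            lines.append(f" {name.ljust(name_width)} | {value}")
    return "\n".join(lines)


# Hypothetical wide row: adding columns adds lines, not width.
columns = ["ID", "Name", "Age", "City", "Country", "Email"]
rows = [(1, "Alice", 25, "Paris", "France", "alice@example.com")]
print(format_vertical(rows, columns))
```

Each extra column adds one more line per record rather than making the table wider, so nothing wraps or runs off the edge of the terminal.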

Python
# Default: displays up to 20 rows and
# truncates column values to 20 characters
df.show()

# Display full column contents
df.show(truncate=False)

# Display 2 rows and full column contents
df.show(2, truncate=False)

# Display 2 rows and truncate column values to 25 characters
df.show(2, truncate=25)

# Display 3 rows vertically, truncating values to 25 characters
df.show(n=3, truncate=25, vertical=True)

Conclusion

  • .show() is a handy method for displaying DataFrame contents in PySpark.
  • It lets you control the number of rows, the truncation of long strings, and vertical display.
  • It helps you quickly inspect a few rows when working with large datasets.
