How to Display Full Column Content in a Spark DataFrame?

When you call show on a Spark DataFrame, any cell longer than 20 characters is truncated by default and ends with an ellipsis (...). To see the full content of a column, you need to turn that truncation off or raise the limit. Below is how you can do this in PySpark and Scala.
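
To make the default behaviour concrete, here is a minimal pure-Python sketch of the truncation rule. It is an illustration rather than Spark's own code, and `truncate_cell` is a hypothetical name; it assumes that a limit of 0 means "no truncation", which is what `truncate=False` amounts to in PySpark.

```python
def truncate_cell(value: str, truncate: int = 20) -> str:
    """Shorten a cell the way show() does with its default settings (sketch)."""
    if truncate > 0 and len(value) > truncate:
        # Limits below 4 leave no room for an ellipsis, so the text is simply cut.
        if truncate < 4:
            return value[:truncate]
        # Otherwise keep truncate - 3 characters and append "...".
        return value[: truncate - 3] + "..."
    # A limit of 0 (or a short enough value) leaves the cell unchanged.
    return value


print(truncate_cell("Engineering and Science"))              # Engineering and S...
print(truncate_cell("Arts and Humanities"))                  # Arts and Humanities
print(truncate_cell("Engineering and Science", truncate=0))  # Engineering and Science
```

The second value is 19 characters long, so it fits under the default limit and is printed as is.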

Method 1: Using `show` Method with `truncate` Parameter

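The simplest way to display full column content is to call the show method with the truncate parameter set to False (false in Scala). With truncation off, Spark also left-aligns the cells instead of right-aligning them. If you only want a higher limit rather than no limit, truncate also accepts an integer, for example df.show(truncate=50) in PySpark.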

Example in PySpark


from pyspark.sql import SparkSession

# Initialize SparkSession
spark = SparkSession.builder.appName("Display Full Column").getOrCreate()

# Sample data
data = [("Alice", "Engineering and Science"), ("Bob", "Arts and Humanities")]
columns = ["Name", "Department"]

# Create DataFrame
df = spark.createDataFrame(data, columns)

# Show DataFrame with full content
df.show(truncate=False)

Expected Output:



+-----+-----------------------+
|Name |Department             |
+-----+-----------------------+
|Alice|Engineering and Science|
|Bob  |Arts and Humanities    |
+-----+-----------------------+

Example in Scala


import org.apache.spark.sql.SparkSession

// Initialize SparkSession
val spark = SparkSession.builder.appName("Display Full Column").getOrCreate()

// Sample data
val data = Seq(("Alice", "Engineering and Science"), ("Bob", "Arts and Humanities"))
val columns = Seq("Name", "Department")

// Create DataFrame
val df = spark.createDataFrame(data).toDF(columns: _*)

// Show DataFrame with full content
df.show(truncate = false)

Expected Output:



+-----+-----------------------+
|Name |Department             |
+-----+-----------------------+
|Alice|Engineering and Science|
|Bob  |Arts and Humanities    |
+-----+-----------------------+

Method 2: Using `toPandas` Method in PySpark

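If you are working with PySpark, another way to display the full content of columns is to convert the DataFrame to a Pandas DataFrame with the toPandas() method and print that instead. Keep two things in mind: toPandas() collects the entire DataFrame onto the driver, so it is only suitable for results that fit in driver memory, and pandas applies its own limit when printing (the display.max_colwidth option, 50 characters by default), so very long values can still be shortened unless that option is raised.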

Example in PySpark


# Create DataFrame
df = spark.createDataFrame(data, columns)

# Convert to Pandas DataFrame
pdf = df.toPandas()

# Display DataFrame
print(pdf)

Expected Output:


    Name               Department
0  Alice  Engineering and Science
1    Bob      Arts and Humanities
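
The sample values above are short enough to print in full, but pandas shortens cells longer than 50 characters by default. The sketch below shows one way to lift that limit temporarily. It builds a small pandas DataFrame directly as a stand-in for the result of df.toPandas(), and the long department name is made up for illustration.

```python
import pandas as pd

# Stand-in for the result of df.toPandas(); the long value is invented for this example.
long_text = "Engineering and Science, with a joint appointment in Applied Mathematics"
pdf = pd.DataFrame({"Name": ["Alice"], "Department": [long_text]})

# With default options, pandas shortens cells longer than 50 characters.
default_view = repr(pdf)

# Lift the cell-width limit only inside this block; it is restored on exit.
with pd.option_context("display.max_colwidth", None):
    full_view = repr(pdf)

print(full_view)
```

Using option_context keeps the change local. Calling pd.set_option("display.max_colwidth", None) instead would apply it for the rest of the session. Wide frames may still wrap onto several lines, but each cell is printed in full.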

Method 3: Setting Spark Configuration

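The show method takes its truncation limit from the truncate argument rather than from the session configuration, so truncate=False is still needed on each call. The spark.sql.debug.maxToStringFields setting used below addresses a related problem: it limits how many fields appear when Spark converts wide schemas and query plans to strings in debug output (the default is 25), and anything beyond the limit is replaced with a "... N more fields" placeholder. Raising it helps with very wide DataFrames, but it does not change the width of individual cells.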

Example in PySpark


# Raise the field limit for string representations of wide schemas and plans
# (this does not affect cell width in show)
spark.conf.set("spark.sql.debug.maxToStringFields", "100")

# Create DataFrame
df = spark.createDataFrame(data, columns)

# Display DataFrame
df.show(truncate=False)

Example in Scala


// Raise the field limit for string representations of wide schemas and plans
// (this does not affect cell width in show)
spark.conf.set("spark.sql.debug.maxToStringFields", "100")

// Create DataFrame
val df = spark.createDataFrame(data).toDF(columns: _*)

// Display DataFrame
df.show(truncate = false)
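
Because truncation is chosen per call, one way to get a project-wide default in PySpark is a small wrapper function. The sketch below is only a suggestion: `show_full` is a hypothetical helper name, not part of PySpark, and it simply forwards to DataFrame.show with truncation disabled.

```python
def show_full(df, n=20, vertical=False):
    """Print the first n rows of a DataFrame without truncating any cell.

    show_full is a hypothetical helper, not a PySpark function. It only
    forwards its arguments to DataFrame.show with truncate=False.
    """
    df.show(n=n, truncate=False, vertical=vertical)
```

With the df from the examples above, show_full(df) prints the same table as df.show(truncate=False). Passing vertical=True prints each row as a list of column and value pairs, which can be easier to read when values are very long.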

These are the various methods to display full column content in a Spark DataFrame, each suited to different scenarios and use cases.

