How to Turn Off Info Logging in Spark: A Step-by-Step Guide

Turning off INFO-level logging in Apache Spark cuts log volume so that warnings and errors are easier to spot. This guide shows two ways to do it: editing Spark's Log4j configuration file, and setting the log level programmatically in PySpark, Scala, or Java.

Step-by-Step Guide to Turn Off Info Logging in Spark

Step 1: Modify the log4j.properties File

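To turn off info logging for every application launched from a Spark installation, edit the `log4j.properties` file, which the `log4j` library reads to control logging settings. Spark looks for this file in the `conf` directory of the installation. If only `log4j.properties.template` is present, copy it to `log4j.properties` first. Spark 3.3 and later use Log4j 2 and read `log4j2.properties` instead, where the equivalent setting is `rootLogger.level = info`.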

  1. Open the `log4j.properties` file using any text editor.
  2. Find the line that sets the root logger, which typically looks like this:

log4j.rootCategory=INFO, console

Change `INFO` to `WARN` (or to a stricter level such as `ERROR`). The line should now read:

log4j.rootCategory=WARN, console

This change sets the root logging level to WARN, so INFO-level and lower messages are no longer written. It applies to applications started after the file is saved.
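
For scripted setups, the same edit can be applied automatically. The following is a minimal sketch and not part of Spark: `set_root_level` is a hypothetical helper, it assumes the Log4j 1.x syntax shown above, and it leaves every other line (including commented-out ones) untouched.

```python
import re

# Matches an active Log4j 1.x root logger line such as
# "log4j.rootCategory=INFO, console" and captures everything up to the level.
ROOT_LINE = re.compile(
    r"^([ \t]*log4j\.root(?:Category|Logger)[ \t]*=[ \t]*)\w+",
    re.MULTILINE,
)


def set_root_level(properties_text: str, level: str = "WARN") -> str:
    """Return the properties text with the root logger level replaced."""
    return ROOT_LINE.sub(lambda m: m.group(1) + level, properties_text)


# Made-up file contents, used only to illustrate the rewrite.
sample = "# Set everything to be logged to the console\nlog4j.rootCategory=INFO, console\n"
print(set_root_level(sample))
```

To apply this to a real file, read it with `pathlib.Path(...).read_text()`, pass the text through the helper, and write the result back. The path depends on where Spark is installed.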

Step 2: Programmatically Set Log Level (Optional)

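If you prefer to set the logging level programmatically, call `setLogLevel` on the Spark context inside your application. This overrides the level from the configuration file for that application only. Because the call runs after the Spark context has been created, INFO messages emitted during startup can still appear. Below are code snippets for different languages.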

PySpark


from pyspark.sql import SparkSession

# Create Spark session
spark = SparkSession.builder\
    .appName("TurnOffInfoLogging")\
    .getOrCreate()

# Set the log level on the underlying SparkContext
spark.sparkContext.setLogLevel("WARN")

# Your Spark code here...

spark.stop()
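
`setLogLevel` controls logging on the JVM side. In PySpark, the Python process has its own `logging` configuration, and if the application has set Python's root logger to INFO or lower, messages from the Py4J bridge can still show up. A minimal sketch of quieting them with the standard library, assuming the bridge logs under the logger name `py4j`:

```python
import logging

# Assumption: the Py4J bridge used by PySpark logs under the name "py4j".
# Raising its level hides its INFO/DEBUG messages on the Python side only;
# it does not affect Spark's JVM-side Log4j output.
py4j_logger = logging.getLogger("py4j")
py4j_logger.setLevel(logging.WARNING)
```

This is only needed when Python-side logging has been made more verbose than its default.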

Scala


import org.apache.spark.sql.SparkSession

object TurnOffInfoLogging {
  def main(args: Array[String]): Unit = {
    // Create Spark session
    val spark = SparkSession.builder
      .appName("TurnOffInfoLogging")
      .getOrCreate()

    // Set the log level
    spark.sparkContext.setLogLevel("WARN")

    // Your Spark code here...

    spark.stop()
  }
}

Java


import org.apache.spark.sql.SparkSession;

public class TurnOffInfoLogging {
    public static void main(String[] args) {
        // Create Spark session
        SparkSession spark = SparkSession.builder()
                .appName("TurnOffInfoLogging")
                .getOrCreate();

        // Set the log level
        spark.sparkContext().setLogLevel("WARN");

        // Your Spark code here...

        spark.stop();
    }
}

Step 3: Verify the Changes

After making changes, run your Spark application and check the console output. Lines tagged INFO should no longer appear, leaving only WARN and more severe messages.
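
When the output is long, counting lines per level is quicker than reading it. The sketch below assumes the log has been captured as text and that each line follows the shape `yy/MM/dd HH:mm:ss LEVEL Logger: message`; a customized log pattern would need a different regular expression. `count_levels` is a hypothetical helper and the sample lines are made up.

```python
import re
from collections import Counter

# Assumed line shape: "24/05/01 10:00:00 WARN SomeLogger: message".
LEVEL_RE = re.compile(
    r"^\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2} (TRACE|DEBUG|INFO|WARN|ERROR|FATAL) "
)


def count_levels(log_text: str) -> Counter:
    """Count log lines per level; lines that do not match are ignored."""
    counts = Counter()
    for line in log_text.splitlines():
        match = LEVEL_RE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Made-up captured output, for illustration only.
sample_log = (
    "24/05/01 10:00:00 WARN NativeCodeLoader: Unable to load native-hadoop library\n"
    "24/05/01 10:00:01 ERROR Executor: Exception in task 0.0\n"
)
counts = count_levels(sample_log)
print(dict(counts))
print("INFO lines:", counts["INFO"])
```

A count of zero for `INFO` indicates the new level is in effect for the captured portion of the run.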

By following these steps, you can effectively reduce the verbosity of your Spark logs and focus on more critical issues by turning off INFO logging.
