Databricks: Make Log4J Configurable



The goal of this blog is to describe how to make the Databricks log4j configuration file configurable for debugging purposes.

Using the approaches below, we can easily change the log level (ERROR, INFO, or DEBUG) or change the appender.
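For example, raising a single package to DEBUG is a one-line change in a log4j.properties file (the package name here is only illustrative):

```
log4j.logger.org.apache.spark.sql=DEBUG
```

Everything below is about getting a line like this onto the driver and worker nodes.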

Databricks Approach-1

There is no standard way to overwrite log4j configurations on clusters with custom configurations. You must overwrite the configuration files using init scripts.

The current configurations are stored in two files:

On the driver:
%sh cat /home/ubuntu/databricks/spark/dbconf/log4j/driver/

On the worker:
%sh cat /home/ubuntu/databricks/spark/dbconf/log4j/executor/

To set class-specific logging on the driver or on workers, use the following script:


#!/bin/bash

echo "Executing on Driver: $DB_IS_DRIVER"

if [[ $DB_IS_DRIVER = "TRUE" ]]; then
  LOG4J_PATH="/home/ubuntu/databricks/spark/dbconf/log4j/driver/log4j.properties"
else
  LOG4J_PATH="/home/ubuntu/databricks/spark/dbconf/log4j/executor/log4j.properties"
fi

echo "Adjusting log4j.properties here: ${LOG4J_PATH}"
echo "log4j.<custom-prop>=<value>" >> ${LOG4J_PATH}

Replace <custom-prop> with the property name and <value> with the property value.

Upload the script to DBFS and add it as an init script to a cluster using the cluster configuration UI.

The above script appends the custom log4j configuration to the default file on each node (driver and worker) in the Spark cluster.


  • Whenever you change the script, you need to restart the cluster.
  • This approach depends on an init script, so only users with cluster edit permission can add it.

Databricks Approach-2

Another way to configure log4j is to use the Spark Monitoring library, which can load a custom log4j configuration from DBFS.

With this approach we do not depend on the Data Solutions team to set up an init script on each cluster. We can load the configuration by calling a method in a notebook.


  • Spark Monitoring library set up on the cluster: this library must be installed on the Databricks cluster before it can load the configuration.


Create a custom log4j properties file for any package or class whose logs you want to capture.


The file also needs an appender definition alongside the lines shown in the original post; a minimal sketch (the rolling-policy lines are reconstructed here to match the hourly .gz rollover described below, so adjust classes and paths to your environment) looks like:

log4j.appender.custom=org.apache.log4j.rolling.RollingFileAppender
log4j.appender.custom.rollingPolicy=org.apache.log4j.rolling.TimeBasedRollingPolicy
log4j.appender.custom.rollingPolicy.FileNamePattern=logs/customfile-%d{yy-MM-dd-HH}.log.gz
log4j.appender.custom.rollingPolicy.ActiveFileName=logs/customfile-active.log
log4j.appender.custom.layout=org.apache.log4j.PatternLayout
log4j.appender.custom.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n

log4j.logger.<package>=DEBUG, custom

The above custom file creates a custom appender for your package and stores the logs in logs/customfile-active.log; you can change the appender name and the file name according to your requirements.

We also applied a rollover policy, which rolls the logs over on an hourly basis and produces a .gz file for each hour. These files are stored in the cluster log delivery location specified in the cluster configuration.

Now that we have created the custom file, the next step is to copy it into DBFS.
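A notebook cell for this copy step might look like the following sketch; the source and destination paths are assumptions, so substitute wherever your file actually lives:

```scala
// Hypothetical paths: copy the custom properties file from the driver's
// local filesystem into DBFS so every node can read it via the FUSE mount.
dbutils.fs.cp(
  "file:/databricks/driver/custom-log4j.properties",
  "dbfs:/configs/custom-log4j.properties"
)
```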

After that, you can load the custom file in the notebook itself by calling the library's configuration method.
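Assuming the Spark Monitoring library is attached to the cluster, the call might look like this sketch (the class name `com.microsoft.pnp.logging.Log4jConfiguration` and the DBFS path are assumptions based on that library's samples):

```scala
// Sketch: load the custom log4j.properties at the start of a notebook.
// Log4jConfiguration is provided by the spark-monitoring library; the
// path is wherever you copied the file in DBFS (read via the /dbfs mount).
import com.microsoft.pnp.logging.Log4jConfiguration

Log4jConfiguration.configure("/dbfs/configs/custom-log4j.properties")
```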



Whenever you execute the notebook, it loads the custom log4j properties file for your package and writes the logs at your log level into the file you specified in the configuration.

Set Executor Log Level

To set the log level on all executors, set it inside the JVM on each worker. Run the code below to set it:

sc.parallelize(Seq("")).foreachPartition(x => {
  import org.apache.log4j.{LogManager, Level}
  import org.apache.commons.logging.LogFactory

  // Change the root log level inside each executor JVM.
  LogManager.getRootLogger().setLevel(Level.DEBUG)
  val log = LogFactory.getLog("EXECUTOR-LOG:")
  log.debug("START EXECUTOR DEBUG LOG LEVEL")
})

Run the above code in the notebook; it changes the executor root log level.

Thank you for sticking with me to the end. If you liked this blog, please show your appreciation with a thumbs up, share it, and give me suggestions on how I can improve future posts to suit your needs. Follow me for updates on different technologies.

