Reading SBT dependency tree

In this post, we look at how to read the sbt dependency tree and use it to resolve a real version conflict, with an example.

Fully grown dependency tree

While upgrading library versions in a repository, we often run into issues such as incompatibilities between library versions. In these situations, dependencyTree is one of the tools that lets us peek into the versions of each library our build is currently using.

The tree!

Yes, it's just a tree in which you can see the different branches of libraries and how they are connected to each other.

Running it is straightforward:

sbt dependencyTree
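
If the task is not found, the plugin may not be enabled. Here is a minimal sketch for project/plugins.sbt, assuming sbt 1.4 or later, where the dependency tree plugin ships with sbt and a basic dependencyTree task is available out of the box. The file location and the older plugin coordinates in the comment are assumptions about your setup, not something taken from this post's build.

```scala
// project/plugins.sbt
// sbt 1.4+: enables the full set of dependency tree tasks
// (dependencyTree, whatDependsOn, and related tasks).
addDependencyTreePlugin

// On older sbt 1.x releases, the standalone plugin provides these tasks instead:
// addSbtPlugin("net.virtual-void" % "sbt-dependency-graph" % "0.10.0-RC1")
```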

Consider the following dependencies in our build.sbt:

scalaVersion := "2.12.5"

val sparkVersion = "3.0.2"
val scalaTestVersion = "3.0.5"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-streaming" % sparkVersion,
  "org.apache.spark" %% "spark-sql" % sparkVersion,
  "org.apache.spark" %% "spark-streaming-kafka-0-10" % sparkVersion,
  "org.scalatest" %% "scalatest" % scalaTestVersion % "test",
  "net.manub" %% "scalatest-embedded-kafka" % "2.0.0" % "test",
  "org.apache.kafka" % "kafka-clients" %  "0.10.2.0"
)

With the above dependencies, we end up with two candidate versions of kafka-clients: the one we declare explicitly on the last line (0.10.2.0), and the one that comes in transitively from spark-streaming-kafka-0-10.

If we focus only on kafka-clients in the output below, we can see that our declared version has been evicted in favour of the higher version, 2.4.1, which comes from spark-streaming-kafka-0-10. We can check the other libraries in the same way.

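How to read it?
It is just a parent-child relationship: each indented entry is a dependency of the entry above it. For example, org.xerial.snappy:snappy-java:1.1.7.3 is pulled in by org.apache.kafka:kafka-clients. The suffix (evicted by: 1.1.8.2) means that this version was replaced by another version requested elsewhere in the build, here a higher one, and that other version is the one actually used.
That's it.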

sbt:dedupe_spark_sample> dependencyTree
[warn] There may be incompatibilities among your library dependencies; run 'evicted' to see detailed eviction warnings.
[info] dedupe_spark_sample:dedupe_spark_sample_2.12:0.1 [S]
[info]   +-org.apache.kafka:kafka-clients:2.4.1
[info]   | +-com.github.luben:zstd-jni:1.4.3-1 (evicted by: 1.4.4-3)
[info]   | +-org.lz4:lz4-java:1.6.0 (evicted by: 1.7.1)
[info]   | +-org.slf4j:slf4j-api:1.7.28 (evicted by: 1.7.30)
[info]   | +-org.xerial.snappy:snappy-java:1.1.7.3 (evicted by: 1.1.8.2)
[info]   | 
[info]   +-org.apache.spark:spark-sql_2.12:3.0.2
[info]   | +-com.fasterxml.jackson.core:jackson-databind:2.10.0
[info]   | | +-com.fasterxml.jackson.core:jackson-annotations:2.10.0
[info]   | | +-com.fasterxml.jackson.core:jackson-core:2.10.0
[info]   | | 
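
When the tree is long, it can be easier to ask the reverse question: which of our dependencies pull in a given module? The following build.sbt sketch assumes the full dependency tree plugin is enabled. The alias name whoPullsKafka is hypothetical, and the coordinates are the ones from the output above.

```scala
// build.sbt
// Hypothetical alias: prints the reverse tree for kafka-clients 2.4.1,
// i.e. which of our dependencies request it.
addCommandAlias("whoPullsKafka", "whatDependsOn org.apache.kafka kafka-clients 2.4.1")
```

Running sbt whoPullsKafka is then the same as typing the whatDependsOn command directly in the sbt shell.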

Let's try to run an example.

Here we simply run a sample unit test that checks whether an embedded Kafka server starts.

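import net.manub.embeddedkafka.{EmbeddedKafka, EmbeddedKafkaConfig}
import org.scalatest.{BeforeAndAfterAll, Matchers, WordSpec}

class EmbedSpec extends WordSpec with Matchers with BeforeAndAfterAll with EmbeddedKafka {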

  "Embed" should {

    "runs with embedded kafka on a specific port" should {

      "work" in {
        implicit val config = EmbeddedKafkaConfig(kafkaPort = 12345)

        withRunningKafka {
          // now a kafka broker is listening on port 12345
          publishStringMessageToKafka("topic", "message")
          consumeFirstStringMessageFrom("topic") shouldBe "message"
        }
      }
    }
  }
}

If you run this test now, you will get an error like this:

21/03/24 20:06:29 ERROR LogDirFailureChannel: Failed to create or validate data directory /tmp/kafka123230553677025850.tmp
java.io.IOException: Failed to load /tmp/kafka123230553677025850.tmp during broker startup
	at kafka.log.LogManager.$anonfun$createAndValidateLogDirs$1(LogManager.scala:152)
	at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:59)

Why is that?

If you remember, we declared embedded-kafka and kafka-clients with these versions:

  "net.manub" %% "scalatest-embedded-kafka" % "2.0.0" % "test",
  "org.apache.kafka" % "kafka-clients" %  "0.10.2.0"

We could try updating kafka-clients to 2.0.0 to match embedded-kafka, but we would still get the error, because spark-streaming-kafka-0-10 brings in kafka-clients 2.4.1, and that higher version wins.

The embedded-kafka version used here supports Kafka only up to 2.0.0. So one way out is to pin kafka-clients to the same version as embedded-kafka, but as a dependency override scoped to test, so that the main code keeps the later version.

dependencyOverrides += "org.apache.kafka" % "kafka-clients" %  "2.0.0" % "test"
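
To confirm which kafka-clients version the tests end up with after adding the override, the tree can be inspected for the test configuration specifically. This is a small sketch, assuming sbt 1.x slash syntax. The alias name testTree is hypothetical.

```scala
// build.sbt
// Hypothetical alias: prints the dependency tree as resolved for the Test
// configuration, so the kafka-clients version used by the tests can be checked.
addCommandAlias("testTree", "Test/dependencyTree")
```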

Now, if you run the test again, it will behave as expected.
This is one of the areas where dependencyTree helps you resolve this type of issue.

Thanks for reading. 🌳
