New to Cats? No worries, go through my previous article Getting Started with Scala Cats to understand how amazing Cats is.
In this article, we will cover the concepts and implementation of Semigroups in Cats.
What are Semigroups?
In functional programming terms, a Semigroup is a type class that captures aggregation through an associative binary operation.
The Semigroup type class comes with a combine method, which combines two values of the same data type and must obey the law of associativity.
The combine method is declared as:
trait Semigroup[A] {
def combine(x: A, y: A): A
}
and can be used as:
// The type class definition
import cats.kernel.Semigroup
// Import type class instance for type Int
import cats.instances.int._
// The combine implementation for Int is by default addition
val onePlusTwo = Semigroup[Int].combine(1, 2)
Infix syntax is also available for types that have a Semigroup instance:
import cats.implicits._
1 |+| 2
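The same operator works for any type that has a Semigroup instance in scope. A small sketch, assuming the standard instances brought in by cats.implicits._; the object and value names are made up for illustration:

```scala
import cats.implicits._

object InfixSyntaxSketch {
  // String: concatenation
  val greeting: String = "Hello, " |+| "Cats"

  // List: concatenation
  val numbers: List[Int] = List(1, 2) |+| List(3)

  // Option: combines the inner values when both are present
  val maybeSum: Option[Int] = Option(1) |+| Option(2)

  // Map: merges the keys, combining values that share a key
  val counts: Map[String, Int] = Map("a" -> 1) |+| Map("a" -> 2, "b" -> 3)
}
```

Here counts ends up as Map("a" -> 3, "b" -> 3), because the values under the shared key "a" are themselves combined with the Int Semigroup.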
Associativity is the only law for Semigroup
combine must be associative, which means the following equality must hold for any choice of x, y, and z.
combine(x, combine(y, z)) = combine(combine(x, y), z)
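The law can be spot-checked for particular values. The sketch below is only an illustration: isAssociativeFor is a hypothetical helper, not part of Cats, and a few passing samples do not prove the law for every input.

```scala
import cats.kernel.Semigroup
import cats.implicits._

object AssociativitySketch {
  // Checks the law for one particular choice of x, y and z
  def isAssociativeFor[A](x: A, y: A, z: A)(implicit S: Semigroup[A]): Boolean =
    S.combine(x, S.combine(y, z)) == S.combine(S.combine(x, y), z)

  val intsHold: Boolean = isAssociativeFor(1, 2, 3)
  val stringsHold: Boolean = isAssociativeFor("a", "b", "c")

  // Subtraction compiles as a Semigroup but breaks the law:
  // 1 - (2 - 3) = 2, whereas (1 - 2) - 3 = -4
  val subtraction: Semigroup[Int] = Semigroup.instance[Int](_ - _)
  val subtractionHolds: Boolean = isAssociativeFor(1, 2, 3)(subtraction)
}
```

The compiler cannot enforce associativity, so an instance like subtraction is accepted even though it is not a lawful Semigroup.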
Associativity allows us to partition the data any way we want and potentially parallelize the operations.
For example, given the list of numbers 1, 2, 3, 4, 5, we can do something like this:
// Could run in different threads
val group1 = Semigroup[Int].combine(1, 2)
val group2 = Semigroup[Int].combine(3, 4)
val group3 = Semigroup[Int].combine(group1, group2)
val total = Semigroup[Int].combine(group3, 5)
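The same idea can be written for a list of any length by splitting it into chunks. This is a sequential sketch under simple assumptions (a chunk size of 2, illustrative names); in a real program each chunk could be handed to a different thread:

```scala
import cats.kernel.Semigroup
import cats.implicits._

object ChunkedSumSketch {
  val numbers: List[Int] = List(1, 2, 3, 4, 5)

  // Split into chunks of two: List(1, 2), List(3, 4), List(5)
  val chunks: List[List[Int]] = numbers.grouped(2).toList

  // Each chunk is reduced independently of the others
  val partials: List[Int] = chunks.map(_.reduce(Semigroup[Int].combine(_, _)))

  // The partial results are then combined in their original order
  val total: Int = partials.reduce(Semigroup[Int].combine(_, _))

  // Combining left to right without chunking gives the same answer
  val sequential: Int = numbers.reduce(Semigroup[Int].combine(_, _))
}
```

Associativity is what guarantees that total and sequential agree, however the list is partitioned. The order of the elements still matters, since a Semigroup is not required to be commutative.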
Writing a custom Semigroup
The combine implementation for Int adds the two parameters. What if we want to multiply two numbers instead? We can write our own implementation:
implicit val multiplicationSemigroup: Semigroup[Int] = new Semigroup[Int] {
override def combine(x: Int, y: Int): Int = x * y
}
// uses our implicit Semigroup instance above
// (assuming no other Semigroup[Int] instance is imported into this scope)
val six = Semigroup[Int].combine(2, 3)
Because Semigroup has a single abstract method, Scala (2.12 and later) lets us write the combine implementation more concisely:
implicit val multiplicationSemigroup: Semigroup[Int] = (x: Int, y: Int) => x * y
// or more succinctly
implicit val multiplicationSemigroup: Semigroup[Int] = (x, y) => x * y
// or even
implicit val multiplicationSemigroup: Semigroup[Int] = _ * _
// or, more explicitly, with the Semigroup.instance helper
implicit val multiplicationSemigroup: Semigroup[Int] = Semigroup.instance[Int](_ * _)
In the same way, we can provide our own implementation of combine for Semigroup instances of any of the common types in the Scala ecosystem.
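As a rough sketch of what that can look like, here are two more instances. The names maxSemigroup and allSemigroup are invented for this example, and the instances are deliberately not marked implicit so they cannot clash with any default instance in scope:

```scala
import cats.kernel.Semigroup

object MoreCustomSemigroups {
  // Keeps the larger of two Ints; max is associative
  val maxSemigroup: Semigroup[Int] = Semigroup.instance[Int](_ max _)

  // Logical AND over Booleans; && is associative
  val allSemigroup: Semigroup[Boolean] = Semigroup.instance[Boolean](_ && _)

  // The instances are used explicitly rather than resolved implicitly
  val largest: Int = maxSemigroup.combine(3, 7)
  val bothTrue: Boolean = allSemigroup.combine(true, false)
}
```

Whatever operation we choose, it is our responsibility to make sure it is associative.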
Semigroups with strings
The Cats implementation for String concatenates the two parameters, but it doesn’t include a space between them. What if I want to add a space between the strings?
implicit val customStringSemigroup: Semigroup[String] = Semigroup.instance[String](_ + " " + _)
Semigroup[String].combine("john", "doe")
// results: String = "john doe"
Semigroups with collections
Given the associativity constraint, we can build more useful constructs from the simple combine(x, y) method.
We can use recursion directly or make use of Scala’s foldLeft to operate on a collection of values:
def combineStrings(collection: Seq[String]): String = {
collection.foldLeft("")(Semigroup[String].combine)
}
However, given just a Semigroup, we cannot write the above expression generically.
Limitation of Semigroups
We quickly run into issues if we try to write a generic method combineAll[A](collection: Seq[A]): A for the above expression, because the fallback value depends on the type A ("" for String, 0 for Int, etc).
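With only a Semigroup available, the most a generic version can do is report the empty case to the caller. A minimal sketch, hand-rolled here for illustration (Cats' Semigroup has a similarly named combineAllOption method of its own):

```scala
import cats.kernel.Semigroup
import cats.implicits._

object GenericCombineSketch {
  // There is no value to return for an empty collection,
  // so the result is wrapped in an Option
  def combineAllOption[A](collection: Seq[A])(implicit S: Semigroup[A]): Option[A] =
    collection.reduceOption(S.combine(_, _))

  val strings: Option[String] = combineAllOption(Seq("a", "b", "c"))
  val ints: Option[Int] = combineAllOption(Seq(1, 2, 3))
  val nothing: Option[Int] = combineAllOption(Seq.empty[Int])
}
```

This compiles for any A with a Semigroup, but every caller now has to deal with the None case.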
There is a solution to this problem, though: it’s called the Monoid.
We’ll learn about Monoids in our next article – Diving into Scala Cats – Monoids. Stay tuned!!!
References
The two best Scala Cats resources I know are here:
- The Cats library is available at github.com/typelevel/cats
- The book, Advanced Scala with Cats, is available at underscore.io/books/advanced-scala/