Diving into Scala Cats – Semigroups

New to Cats? No worries, go through my previous article Getting Started with Scala Cats to understand how amazing Cats is.

In this article, we will cover the concepts and implementation of Semigroups in Cats.

What are Semigroups?

In functional programming, a Semigroup is an abstraction that captures aggregation through an associative binary operation.

The Semigroup type class comes with a single method, combine, which combines two values of the same data type and must obey the law of associativity.

The combine method is declared as:

trait Semigroup[A] {
    def combine(x: A, y: A): A
}

and can be used as:

// The type class definition
import cats.kernel.Semigroup

// Import type class instance for type Int
import cats.instances.int._

// The default combine implementation for Int is addition
val onePlusTwo = Semigroup[Int].combine(1, 2) // 3

Infix syntax is also available for types that have a Semigroup instance:

import cats.implicits._

1 |+| 2 // same as Semigroup[Int].combine(1, 2), i.e. 3
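
The same operator works for any type with a Semigroup instance in scope. As a small sketch (the value names are only illustrative, and it assumes the default Cats instances brought in by cats.implicits._):

```scala
import cats.implicits._

// Int: addition
val ints: Int = 1 |+| 2 |+| 3

// String: concatenation
val strings: String = "semi" |+| "group"

// List: concatenation
val lists: List[Int] = List(1, 2) |+| List(3)

// Option: combines the wrapped values using the Semigroup of the inner type
val options: Option[Int] = Option(1) |+| Option(2)

// Map: merges both maps, combining the values stored under a shared key
val maps: Map[String, Int] = Map("a" -> 1) |+| Map("a" -> 2, "b" -> 3)
```

Here ints is 6, strings is "semigroup", lists is List(1, 2, 3), options is Some(3), and maps is Map("a" -> 3, "b" -> 3).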

Associativity is the only law for Semigroup

combine must be associative, which means the following equality must hold for any choice of x, y, and z:

combine(x, combine(y, z)) = combine(combine(x, y), z)
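
To see what the law rules out, here is a small sketch that checks the equality for one particular triple of values. The helper isAssociativeFor is hypothetical (it is not part of Cats), and a passing spot check does not prove the law for every input; it only helps to expose operations that break it, such as subtraction.

```scala
import cats.kernel.Semigroup

// Hypothetical helper: checks the associativity equation for one triple of values.
def isAssociativeFor[A](x: A, y: A, z: A)(implicit S: Semigroup[A]): Boolean =
  S.combine(x, S.combine(y, z)) == S.combine(S.combine(x, y), z)

// Addition: 1 + (2 + 3) = 6 and (1 + 2) + 3 = 6, so the check passes.
val additionOk = isAssociativeFor(1, 2, 3)(Semigroup.instance[Int](_ + _))

// Subtraction: 1 - (2 - 3) = 2 but (1 - 2) - 3 = -4, so the check fails.
val subtractionOk = isAssociativeFor(1, 2, 3)(Semigroup.instance[Int](_ - _))
```

The compiler would accept a Semigroup built from subtraction, so it is up to us to make sure every instance we write is lawful.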

Associativity allows us to partition the data any way we want and potentially parallelize the operations.

For example, for a given list of numbers: 1, 2, 3, 4, 5 we can do something like this:

// Could run in different threads
val group1 = Semigroup[Int].combine(1, 2)
val group2 = Semigroup[Int].combine(3, 4)
val group3 = Semigroup[Int].combine(group1, group2)
val total  = Semigroup[Int].combine(group3, 5)
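
The same idea can be sketched for a list of any length. The helper combineInChunks below is hypothetical and runs sequentially; it only illustrates that splitting the data into chunks, reducing each chunk on its own, and then combining the partial results gives the same answer as one left-to-right pass.

```scala
import cats.kernel.Semigroup
import cats.instances.int._

// Hypothetical helper: combines a non-empty sequence chunk by chunk.
def combineInChunks[A](values: Seq[A], chunkSize: Int)(implicit S: Semigroup[A]): A = {
  require(values.nonEmpty, "a Semigroup has no value to return for an empty input")
  // Each chunk is reduced independently, so these reductions could run in parallel.
  val partials = values.grouped(chunkSize).map(_.reduce(S.combine)).toList
  // Associativity guarantees that combining the partial results gives the same total.
  partials.reduce(S.combine)
}

val numbers = List(1, 2, 3, 4, 5)
val sequential = numbers.reduce(Semigroup[Int].combine)
val chunked = combineInChunks(numbers, chunkSize = 2) // (1 + 2) + (3 + 4) + 5
```

Both sequential and chunked evaluate to 15.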

Writing a custom Semigroup

The combine implementation for Int will add the two parameters. What if we want to multiply two numbers? We can write our own implementation:

implicit val multiplicationSemigroup: Semigroup[Int] = new Semigroup[Int] {
  override def combine(x: Int, y: Int): Int = x * y
}

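// Uses our implicit Semigroup instance above, provided the default Int instance
// (cats.instances.int._ or cats.implicits._) is not also imported into this scope.
// 2 * 3 = 6, whereas the default addition would give 5.
val six = Semigroup[Int].combine(2, 3)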

Cats also lets us implement combine more concisely. Each of the following is an alternative definition of the same instance, so pick one:

implicit val multiplicationSemigroup: Semigroup[Int] = (x: Int, y: Int) => x * y

// or more succinctly
implicit val multiplicationSemigroup: Semigroup[Int] = (x, y) => x * y

// or even
implicit val multiplicationSemigroup: Semigroup[Int] = _ * _

// or, more explicitly, with the Semigroup.instance helper
implicit val multiplicationSemigroup: Semigroup[Int] = Semigroup.instance[Int](_ * _)

In the same way, we can provide our own combine implementation for any of the common types in the Scala ecosystem.
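
The same approach also extends to our own types. As a minimal sketch, assuming a hypothetical case class PageStats that exists only for this illustration, we can combine two values field by field:

```scala
import cats.kernel.Semigroup

// Hypothetical domain type, used only for illustration.
final case class PageStats(views: Int, errors: Int)

// Combine field by field. Integer addition is associative, so this instance is too.
implicit val pageStatsSemigroup: Semigroup[PageStats] =
  Semigroup.instance[PageStats]((a, b) => PageStats(a.views + b.views, a.errors + b.errors))

val combined = Semigroup[PageStats].combine(PageStats(10, 1), PageStats(5, 0))
```

Here combined is PageStats(15, 1).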

Semigroups with strings

The Cats implementation for String concatenates its two arguments without putting a space between them. What if we want a space between the strings?

implicit val customStringSemigroup: Semigroup[String] = Semigroup.instance[String](_ + " " + _)

Semigroup[String].combine("john", "doe")
// result: String = "john doe"

Semigroups with collections

Given the associative constraint we can build more useful constructs from the simple combine(x, y) method.

We can use recursion directly or make use of Scala’s foldLeft to operate on a collection of values:

def combineStrings(collection: Seq[String]): String = {
  collection.foldLeft("")(Semigroup[String].combine)
}

However, given just a Semigroup, we cannot write the above expression generically.

Limitation of Semigroups

We quickly run into issues if we try to write a generic method combineAll[A](collection: Seq[A]): A for the above expression, because the fallback value for an empty collection depends on the type A ("" for String, 0 for Int, etc.).
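
To make the problem concrete, here is a sketch of what we can write with a Semigroup alone. Both helpers are hypothetical names chosen for this example: one returns an Option because an empty collection has no result, and the other forces the caller to supply the fallback value.

```scala
import cats.kernel.Semigroup
import cats.instances.int._
import cats.instances.string._

// Hypothetical helper: with only a Semigroup, an empty input has no result, so return an Option.
def combineOption[A](collection: Seq[A])(implicit S: Semigroup[A]): Option[A] =
  collection.reduceOption(S.combine)

// Hypothetical helper: the caller has to supply the value used for an empty input.
def combineOrElse[A](collection: Seq[A], fallback: A)(implicit S: Semigroup[A]): A =
  combineOption(collection).getOrElse(fallback)

val letters = combineOption(List("a", "b", "c"))
val nothing = combineOption(List.empty[String])
val zero = combineOrElse(List.empty[Int], fallback = 0)
```

Here letters is Some("abc"), nothing is None, and zero is 0. Neither helper can pick the fallback on its own, which is exactly the missing piece.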

There is a solution to this problem, though: the Monoid.

We’ll learn about Monoids in our next article – Diving into Scala Cats – Monoids. Stay tuned!!!

References

The two best Scala Cats resources I know are here: