Introduction
In this article, we will compare the List and Vector data structures in Scala and analyze the pros and cons of using them in different use cases. As we all know List is the most popular data structure in scala and quite simply the most beginner friendly of Scala Collections. But is it the obvious choice for all if not most cases? Let us examine Scala’s Vector vs List and their performance in this article and help you pick the right choice for your use case.
What is a Vector?
Vector
is a general-purpose, immutable data structure. It provides random access and updates in constant time and very fast appends and prepends. Vectors strike a good balance between fast random selections and fast random functional updates. They are currently the default implementation of immutable indexed sequences.
i.e you get the following result in REPL if you instantiate an IndexedSeq
val seq = IndexedSeq(1, 2)
val seq: IndexedSeq[Int] = Vector(1, 2)
Vector
is a base-32 integer trie. A Trie is a type of n-ary search tree data structure that can efficiently store and search data. It is typically used to store strings but can be used to store ordered lists of any type such as digits or bits. Each node contains the value up to that point in the Trie. Each edge is the key we need to insert or search. Here is a sample example of a Trie
Vector
is a similar Trie where the root node has 32 children nodes. Each children node can have 32 children and so on. Hence to hold 2^30 elements a Trie has just 5 levels.
This makes traversing an “effectively constant time” (eC) operation. Further Vector
elements exist in a batch of 32, thus providing a high locality. This means that they are stored in neighboring memory locations. Accessing such elements invokes the cache to store them hence providing quick access.
What is a List?
A List
in scala is an immutable linked-list implementation of ordered elements of type [A]. Hence the splitting of the head and tail is a constant operation. List
is a commonly used data structure in scala and is immutable. Adding elements to List
or removing elements from the List
returns a new one. It is a stack-type data structure optimal for last-in-first-out LIFO access patterns. It is not suited for random access or FIFO-type access patterns.
Vector Vs List in Scala
Both Vectors
and Lists
provide structural sharing. Structural sharing means copying similar prefix or suffix parts to the new memory location (new variable). This offers some optimization in update operations where we create a new structure from the existing one. However, there are some fundamental performance differences.
Head / Tail
Since List
is a linked list, the head or tail operations are incredibly fast and are a constant (C) operation. Whereas in Vector
it is nearly-a-constant-time operation(eC) considering factors such as the maximum size of the Vector
.
Access
Since List
is a LinearSeq
, getting an element at a specified index is an O(k) time operation where k is the position of the element in the List
. Vector
, on the other hand, is an IndexedSeq
and is way faster. It offers fast random access. Vectors
are Tries and just need to do a constant number of operations (up to 7 in real-time use cases) to look up sufficiently large data. Hence List
takes Linear time here while Vector
takes nearly constant time.
Prepend / Append
Lists
are fast for prepend operations. They just update the new node’s pointer to point to the existing List
. Vectors
also, provide fast prepend operations due to their tree-ish structure. Structural sharing helps them copy the existing structure and add a new element on top of it. Lists
prepend in constant (C) time whereas Vectors
prepend in nearly constant (eC) time.
As for append operations, Vectors
are significantly faster than Lists
. Lists
traverse till the last node and then append the new element which takes a linear O(n) time. Vector
lookup takes place in O(1) i.e constant time. Hence appending to the element also takes a nearly constant time. With help from structural sharing, Vectors
can copy the prefix, add the new element, and return the new Vector
.
Updates
Vector
is an IndexedSeq
which modifies the existing element in nearly constant time. It looks up the position to update in constant time. The element is updated. Finally, the prefix/suffix structure up to that point is shared, returning a Vector with the updated element.Lists
on the other hand, have to traverse and find the element before updating them. This takes O(n) time. Hence Lists
take Linear time for updating while Vectors
take nearly constant time.
Iteration
Lists
are a LinearSeq
and are a good choice for iterative traversals. But Vectors
offer equally good traversal time. Iteration in Lists
is following each node’s pointer to its location. This can sometimes result in a cache miss. Addresses of those pointers reside in a different part of the memory and cannot be fetched from the cache.
Traversing elements is slightly more complicated in Vectors
. Still, due to their excellent locality, elements belonging to the same level, especially to a particular node are almost next to each other. Hence Vectors
reduce cache misses by a significant level. Tries traverse in O(n) time, and so Vectors
also traverse in O(n) time.
Parallelism
List
does not work well with parallel algorithms. Since a List
requires O(n) traversals to create a sublist or access an element from it, it is not suitable to use Lists
when implementing parallel algorithms. An example of such an algorithm is splitting a list into sublists and sorting each of the sublists. But this cannot be done in isolation. To traverse each of the sublists, we need to traverse the whole list from the start. Furthermore, Lists themselves contain mutable states in their internal implementation making them gullible to race conditions. A note from the official docs on scala says thus,
Note: Despite being an immutable collection, the implementation uses mutable state internally during construction. These state changes are invisible in single-threaded code but can lead to race conditions in some multi-threaded scenarios.
It also makes sense as to why List
was not included in the parallel collections package in scala. Vector
is much more suitable in this aspect providing fast random access and updates, thus making it useful in parallel algorithms without any race conditions. Scala ParVector
is a specialized data structure that allows for efficient copying and splitting while keeping the data immutable in distributed environments.
As a summarization of the above differences, here are performance benchmarks from the docs, that list the differences between Vector vs List in Scala.

Conclusion
After a detailed analysis of Vector vs List in Scala, it is clear that Vectors
gain the upper hand over Lists
in case of random access, and updates while Lists
are better at the head, tail, and prepend operations. Vector
is the obvious choice when you need an Indexed Sequence
collection and when implementing parallel algorithms.
I hope by considering the above points you will be able to decide which data structure is better suited for your use case. Feel free to add to the discussion by dropping your suggestions, and feedback below. Thanks for reading and I hope you have a good day!