Mike Slinn

Mutable Collections

— Draft —

Published 2014-03-10. Last modified 2019-07-12.
Time to read: 4 minutes.

This lecture explores some of the mutable collections provided by Scala, and discusses threadsafe usage.

This lecture begins by examining the non-parallel versions of the commonly used mutable collection types which were not discussed in the Collections Overview lecture.

    Collection Traits and Subclasses
  Special Foundation Set Sequence Random Access Sequence Map Stack & Queue
Mutable Array
ArrayBuffer
ParArray
Iterable
ParIterable
Traversable
BitSet
HashSet
ListSet
ParHashSet
ParSet
Set
SortedSet
TreeSet
ArraySeq
Buffer
LinearSeq
ParSeq
Seq
IndexedSeq
IndexedSeqView
ListBuffer
HashMap
ListMap
OpenHashMap
ParHashMap
ParMap
WeakHashMap
ArrayStack
PriorityQueue
Stack
Queue

Later in the lecture we will discuss linked collections, then talk about what happened to threadsafe mixins for collections with Scala 2.11. This lecture concludes with a discussion of concurrent maps.

Seq / Buffers

Buffer is a mutable subtrait of Seq. There are two commonly used Buffer implementations:

ArrayBuffer

ArrayBuffer is useful for efficiently building a collection by adding new items to the end; it is not efficient for prepending items. Invoking toBuffer on an instance of a Seq implementation, such as List or Vector, returns an ArrayBuffer. Let’s use toBuffer to convert an immutable List to ArrayBuffer.

Scala REPL
scala> import collection._
import collection._
scala>
immutable.List(1, 2, 3).toBuffer res32: scala.collection.mutable.Buffer[Int] = ArrayBuffer(1, 2, 3)

ListBuffer

A ListBuffer is a mutable List with constant time prepend and append.

Scala REPL
scala> val lb = mutable.ListBuffer(1, 2, 3)
lb: scala.collection.mutable.ListBuffer[Int] = ListBuffer(1, 2, 3) 

Method update is not defined for List because lists are immutable, however ListBuffer does define update.

Scala REPL
scala> lb(0) = lb(0) * 3
scala> lb res33: scala.collection.mutable.ListBuffer[Int] = ListBuffer(3, 2, 3)

We can update the entire ListBuffer by applying a function to each element. To do that, first we need to generate the indices.

Scala REPL
scala> 0 until lb.size
res34: scala.collection.immutable.Range = Range(0, 1, 2)
scala>
0 until lb.size foreach { i => lb(i) = lb(i) * 2 }
scala> lb res35: scala.collection.mutable.ListBuffer[Int] = ListBuffer(6, 4, 6)

Let’s convert this ListBuffer to various types of immutable collections.

Scala REPL
scala> lb.toArray
res36: Array[Int] = Array(6, 4, 6)
scala>
lb.toSeq res37: Seq[Int] = List(6, 4, 6)
scala>
lb.toIndexedSeq res38: scala.collection.immutable.IndexedSeq[Int] = Vector(6, 4, 6)

Queue, Stack and ArrayStack

Scala’s mutable Queue exhibits first-in first-out (FIFO) behavior.

Scala REPL
scala> val queue = mutable.Queue.empty[String]
queue: scala.collection.mutable.Queue[String] = Queue()
scala>
queue += "asdf" res40: queue.type = Queue(asdf)
scala>
queue += "qwer" res41: queue.type = Queue(asdf, qwer)
scala>
queue.dequeue() res42: String = asdf
scala>
queue res43: scala.collection.mutable.Queue[String] = Queue(qwer)

PriorityQueue will be discussed in the lecture on Sorting and Ordered Collections.

Scala’s mutable Stack exhibits last-in first-out (LIFO) behavior, using push and pop. The :+ operator is a synonym for push.

Scala REPL
scala> val stack1 = mutable.Stack("one", "two", "three")
res44: scala.collection.mutable.Stack[String] = Stack(one, two, three)
scala>
stack1.push("blah")
scala> stack1 += "ick"
scala> stack1 res3: scala.collection.mutable.Stack[String] = Stack(ick, blah, four, five, six)
scala>
stack1.pop res4: String = ick
scala>
stack1 res5: scala.collection.mutable.Stack[String] = Stack(blah, four, five, six)

Be careful not to confuse the methods += and +: += is a mutable operation which prepends and element to a mutable collection, and returns the updated collection. +: is an immutable operation which constructs a new collection object with the element prepended and returns the new collection.

Map

The mutable version of the Map trait is easily recognized by its fully qualified name: collection.mutable.Map. Subclasses include Hash­Map, Linked­Hash­Map, List­Map and Open­Hash­Map and Weak­Hash­Map. Both Linked­Hash­Map and List­Map iterate through their data in the order the data was originally provided. List­Map has complexity O(n). Use Linked­Hash­Map if you need a fast insertion and lookup.

LinkedHashMap and LinkedHashSet

Linked lists are mutable sequences of nodes linked by references. Scala’s linked lists do not throw exceptions if the next reference is not directly manipulated. Scala 2.11 deprecated two problematic collection classes: LinkedList and DoubleLinkedList, and Scala 2.13 removed them; those classes are not discussed in this course.

    Collection Traits and Subclasses
  Special Foundation Set Sequence Random Access Sequence Map Stack & Queue
Linked     LinkedHashSet DoubleLinkedList
LinkedList
UnrolledBuffer
  LinkedHashMap  

Let’s play with these classes for a moment, and notice that they maintain the order in which data was added.

Scala REPL
scala> val lhset = mutable.LinkedHashSet(1, 2, 3, 4)
LinkedHashMap and LinkedHashSet lhset: scala.collection.mutable.LinkedHashSet[Int] = Set(1, 2, 3, 4)
scala>
lhset LinkedHashMap and LinkedHashSet res47: scala.collection.mutable.LinkedHashSet[Int] = Set(1, 2, 3, 4)
scala>
lhset -=1 // mutable operation. lhset is updated res48: lhset.type = Set(2, 3, 4)
scala>
lhset res49: scala.collection.mutable.LinkedHashSet[Int] = Set(2, 3, 4)
scala>
lhset - 1 - 3 res50: scala.collection.mutable.LinkedHashSet[Int] = Set(2, 4)
scala>
lhset res51: scala.collection.mutable.LinkedHashSet[Int] = Set(2, 4)
scala>
val mlh = mutable.LinkedHashMap( 1 => "one", 33 => "thirty-three", 4 => "four" ).withDefaultValue("default") mlh: scala.collection.mutable.Map[Int,String] = Map(1 => one, 33 => thirty-three, 4 => four)
scala>
mlh(0) res52: String = default
scala>
mlh(1) res53: String = one
scala>
mlh(0) = "zero"
scala> mlh(0) res54: String = zero
scala>
mlh res55: scala.collection.mutable.Map[Int,String] = Map(1 => one, 33 => thirty-three, 4 => four, 0 => zero)

To mutate the existing collection up updating it to remove an element, use -= When working with mutable collections its really important to be clear on whether you are mutating the collection or not, as this can be a source of many subtle errors.

The Scala collections library helps you out with conventions. As we have seen above methods ending with an = are mutable. Also any given polymorphic method name is either mutable or immutable. The same method name is never immutable with some collections and mutable with others.

WeakHashMap

A weak HashMap is a special kind of HashMap where the garbage collector does not follow links from the map to the keys stored in it. This means that a key and its associated value will disappear from the map if there is no other reference to that key.

Weak HashMaps are useful for tasks such as caching, where you want to re-use an expensive function’s result if the function is called again on the same key.

If keys and function results are stored in a regular HashMap, the map could grow without bounds, and no key would ever become garbage.

Using a weak HashMap avoids this problem. As soon as a key object becomes unreachable, its entry is removed from the weak HashMap.
Scala REPL
scala> mutable.WeakHashMap(1 -> "one", 33 -> "thirty-three", 4 -> "four")
scala> res39: scala.collection.mutable.WeakHashMap[Int,Thang] = Map(4 -> four, 33 -> thirty-three, 1 -> one) 

It seems like the LazyList class should have the option to memoize using a Weak­Hash­Map, so the possibility of consuming all available memory could be avoided, but currently this is not an option.

Threadsafe Mutable Collections

Immutable collections are by definition threadsafe. Mutable collections are not threadsafe by default.

If you need a threadsafe mutable collection, please consider the java.util.concurrent classes: Concurrent­Linked­Queue, Concurrent­Hash­Map, Concurrent­Skip­List­Set and Linked­Blocking­Dequeue.

ConcurrentMaps

Concurrent maps are implemented by the Java runtime library and are wrapped by the Scala runtime library. They are synchronized on each key, thus multiple threads can update or retrieve different keys simultaneously. Concurrent maps have the same API as regular maps.

    Collection Traits and Subclasses
  Special Foundation Set Sequence Random Access Sequence Map Stack & Queue
Concurrent           TrieMap
ParTrieMap
 

You can create a new concurrent map two ways:

  1. by wrapping a Java concurrent map by using JavaConverters
  2. by creating a Scala concurrent map

We will explore both ways.

Converting Java ConcurentMaps to Scala

In the following code, we import JavaConverters so we can wrap the Java ConcurrentSkipListMap and ConcurrentHashMap within a Scala interface so the Java maps behaves just like Scala maps.

Scala REPL
scala> import java.util.concurrent.{ConcurrentHashMap, ConcurrentSkipListMap}
import java.util.concurrent.ConcurrentHashMap
scala>
import collection.JavaConverters._ import scala.collection.JavaConverters._
scala>
val cmap = new ConcurrentHashMap[String, String].asScala cmap: scala.collection.concurrent.Map[String,String] = Map()
scala>
val cslm = new ConcurrentSkipListMap[String, String].asScala cslm: scala.collection.concurrent.Map[String,String] = Map()
scala>
val cslm2 = new ConcurrentSkipListMap[String, String].asScala.withDefaultValue("default") cslm2: scala.collection.concurrent.Map[String,String] = Map()
scala>
cslm2("") res37: String = default

Creating Native Scala ConcurrentMaps

The collection.concurrent.Map trait is implemented by the default concurrent map implementation, TrieMap. It is a is a concurrent, thread-safe, lock-free implementation of a hash array mapped trie. The term trie came from the word retrieval. It was originally intended to be pronounced like ’tree’ but now is usually pronounced like ’try.’ Rene De La Briandais wrote the original 1959 paper entitled "File Searching Using Variable Length Keys" that introduced the concept of tries, although the term was not used in the document.

Scala REPL
scala> import collection.concurrent.TrieMap
import collection.concurrent.TrieMap
scala>
val tm = TrieMap.empty[String, String].withDefaultValue("default") tm: scala.collection.mutable.Map[String,String] = Map()
scala>
tm.put("key1", "value1") res52: Option[String] = None
scala>
tm.put("key2", "value2") res53: Option[String] = None
scala>
tm.get("key1") res54: Option[String] = Some(value1)
scala>
tm("key1") res55: String = value1
scala>
tm("x") res56: String = default

Working with Concurrent HashMaps

Native Scala concurrent.Maps and java.util.concurrent.Maps converted to Scala by JavaConverters behave exactly the same.

Scala REPL
scala> import java.util.concurrent.{ ConcurrentHashMap => JConcurrentHashMap, ConcurrentSkipListMap}
import java.util.concurrent.{ConcurrentHashMap=>JConcurrentHashMap, ConcurrentSkipListMap}
scala>
def ensureKeyIsPresent[U, V](map: collection.concurrent.Map[U, V], key: U, value: V): collection.concurrent.Map[U, V] = { | map.putIfAbsent(key, value) | map | } ensureKeyIsPresent: [U, V](map: scala.collection.concurrent.Map[U,V], key: U, value: V)scala.collection.concurrent.Map[U,V]
scala>
ensureKeyIsPresent(new JConcurrentHashMap[String, String].asScala, "key", "value") res25: scala.collection.concurrent.Map[String,String] = Map(key -> value)
scala>
ensureKeyIsPresent(new TrieMap, "key", "value") res26: scala.collection.concurrent.Map[String,String] = TrieMap(key -> value)
scala>
ensureKeyIsPresent(new ConcurrentSkipListMap().asScala, "key", "value") res27: scala.collection.concurrent.Map[String,String] = Map(key -> value)
scala>
ensureKeyIsPresent(new JConcurrentHashMap().asScala, "key", "value") res28: scala.collection.concurrent.Map[String,String] = Map(key -> value)

* indicates a required field.

Please select the following to receive Mike Slinn’s newsletter:

You can unsubscribe at any time by clicking the link in the footer of emails.

Mike Slinn uses Mailchimp as his marketing platform. By clicking below to subscribe, you acknowledge that your information will be transferred to Mailchimp for processing. Learn more about Mailchimp’s privacy practices.