Immutable Collections

Published 2014-03-10. Last modified 2019-07-09.
Time to read: 11 minutes.

Scala's immutable collections are explored. This information is vital for large-scale computation.

The sample code for this lecture can be found in courseNotes/src/main/scala/collections/CollectionImmutableFun.scala.

This lecture discusses many of the non-parallel versions of the following collection types.

		Collection Traits and Subclasses
	Special	Foundation	Set	Sequence	Random Access Sequence	Map	Stack & Queue
Immutable	`Option` `ParRange` `Range` `Stream`	`Iterable` `ParIterable` `Traversable`	`BitSet` `HashSet` `ListSet` `ParHashSet` `ParSet` `Set` `SortedSet` `TreeSet`	`List` `LinearSeq` `PagedSeq` `ParSeq` `Seq`	`IndexedSeq` `Vector` `ParVector`	`HashMap` `ListMap` `Map` `ParMap` `ParHashMap` `SortedMap` `TreeMap`	`Queue`

List

Scala’s List is an immutable singly-linked list, which means that List is an ordered collection. List is designed for linear access, also known as sequential access.

You can easily make a new List:

Scala REPL

scala> val list1 = List(1, 2, 3)
list1: List[Int] = List(1, 2, 3)

The compiler examined all of the elements of the list and noticed that they were all Ints, \ so list1 has type List[Int].

Nil refers to the empty list.

Scala REPL

scala> Nil
res0: scala.collection.immutable.Nil.type = List()

Nil is not parametric; it is a singleton and represents any and all empty lists.

You can use the cons operator (::) or the prepend operator(+:) to prepend items to a list and thereby create a new list. While :: is specific to List, and +: can be used for any type of sequence, both of these operators are equivalent for List instances. List items are concatenated with either of these operators, and the end of the concatenation is denoted by Nil.

Scala REPL

scala> val list2 = 4 :: 5 :: 6 :: Nil
list2: List[Int] = List(4, 5, 6)

scala> 7 +: 8 +: 9 +: Nil
res0: List[Int] = List(7, 8, 9)

You can concatenate two lists with the ::: and ++ operators. While ::: is specific to List and ++ can be used with any sequence, they both do the same thing for List instances.

Scala REPL

scala> list1 ::: list2
res1: List[Int] = List(1, 2, 3, 4, 5, 6)

scala> list1 ++ list2
res2: List[Int] = List(1, 2, 3, 4, 5, 6)

Appending a element to an immutable.List is inefficient because the whole collection must be copied. For this reason Scala does not provide an append-single-element-to-list operator for immutable Lists. However, prepending an element (with :: or +:) to an immutable.List is not computationally expensive. Because an instance of immutable.List cannot be changed, the prepend operators (:: and +:) and the concatenation operators (::: and ++) return new instances of immutable.List. Also, the concatenation operators have linear complexity (O(n)), which is fine if used once in a while, but does not perform well in a loop.

You can have a List that contains any type of object. For example, let’s say we have a Thing case class.

Scala REPL

scala> case class Thing(i: Int, s: String)
defined class Thing

We could make a collection of Things:

Scala REPL

scala> val thing1 = Thing(1, "z")
thing1: Thing = Thing(1,a)

scala> val thing2 = Thing(2, "y")
thing2: Thing = Thing(2,b)

scala> val thing3 = Thing(3, "x")
thing3: Thing = Thing(3,c)

scala> val things = List(thing3, thing1, thing2)
things: List[Thing] = List(Thing(3,x),Thing(1,z),Thing(2,y))

Be sure to specify the type of items in empty collections

The Scala compiler recognized that all of the elements in the things list were of type Thing, so things has type List[Thing]. Notice what happens if I add a String to the list.

Scala REPL

scala> "Hi" :: things
res3: List[java.io.Serializable] = List(Hi, Thing(3,x), Thing(1,z), Thing(2,y))

The type of the new collection is List[Serializable], which is rather generic. I could have specified an even more generic type.

Scala REPL

scala> val x: List[Any] = "Hi" :: things
x: List[Any] = List(Hi, Thing(3,x), Thing(1,z), Thing(2,y))

To guard against inadvertently adding something of the wrong type into a collection, be sure to specify the type that you require. The compiler will then be able to notify you of elements with mismatched types.

Scala REPL

scala> val x: List[Thing] = "Hi" :: things
<console>:14: error: type mismatch;
found : String
required: Thing
val x: List[Thing] = "Hi" :: things

Like we saw for Array, there are several ways to create empty collections like List. The compiler needs to know the type of item contained in the collection, and there are two ways you can provide that type hint.

Scala REPL

scala> List.empty[Double]
res3: List[Double] = Seq()

scala> val emptyList: List[Double] = List.empty
emptyList: List[Double] = List()

scala> val emptyList: List[Double] = List()
emptyList: List[Double] = List()

Notice what happens if you do not provide a type hint to the compiler to indicate the element type in the List.

Scala REPL

scala> val badList = List.empty
badList: List[Nothing] = Seq()

As you can see, the element type in the collection is inferred to be Nothing, defeating type safety. Be sure to specify a type hint to the right or the left of the equals sign.

More About Nil

The empty list is represented by the Nil object. Nil is implemented as a singleton, and therefore cannot be created with a type specifier.

Scala REPL

scala> Nil[Int]
<console>:8: error: object Nil does not take type parameters.
Nil[Int]
^

Look what happens when you apply a combinator to Nil and no type hints are provided to the compiler – the type of element in the collection is inferred to be Nothing.

Scala REPL

scala> Nil.headOption
res1: Option[Nothing] = None

You should always provide a type hint with Nil. Here is an example of providing a type hint on the left side of the equals sign, so the Scala compiler infers the element type to be Int.

Scala REPL

scala> val x: Option[Int] = Nil.headOption
x: Option[Int] = None

Here is another example, where the type hint is provided on the right side of the equals sign. The Scala compiler knows that List("hi") is of type List[String], so it coerces the type of the empty list denoted by Nil to match.

Scala REPL

scala> val y = Nil ::: List("hi")
y: List[String] = List(hi)

Right-Associative Operators

You may have noticed that the :: and +: operators act unusually, in that the item to be prepended precedes the List instance. This is possible because Scala treats all operators that end in a colon character as right-associative when written using infix notation. In other words, this expression.

Scala REPL

scala> 5 :: Nil
res4: List[Int] = List(5)

... is equivalent to:

Scala REPL

scala> Nil.::(5)
res5: List[Int] = List(5)

In this expression, the :: operator is a method of the empty List denoted by Nil. Because :: is right-associative, the binding is backwards from what you may be used to. The argument 5 is passed to List.:: method.

These expressions are also equivalent.

Scala REPL

scala> 5 +: Nil
res6: List[Int] = List(5)

scala> Nil.+:(5)
res7: List[Int] = List(5)

Here is another example of building a List[Int].

Scala REPL

scala> 7 +: 8 +: 9 +: Nil
res1: List[Int] = List(7, 8, 9)

Vector

The immutable.Vector class is also an immutable ordered collection that provides a compromise between indexed and linear access. It has both effectively constant time indexing overhead and constant time linear access overhead. Because if this, Vector is a good foundation for mixed access patterns where both indexed and linear accesses are used. The :: and ::: List operations are not supported by Vector, but you can use the +: and ++: equivalents.

Scala REPL

scala> val vector1 = Vector(1, 2, 3)
res8: scala.collection.immutable.Vector[Int] = Vector(1, 2, 3)

scala> 0 +: vector1
res9: scala.collection.immutable.Vector[Int] = Vector(0, 1, 2, 3)

scala> Vector(thing1, thing2, thing3)
res10: scala.collection.immutable.Vector[Thing] = Vector(Thing(1,z), Thing(2,y), Thing(3,z))

If you cannot decide if you should use List or Vector, use Vector if possible.

Scala REPL

scala> val vector2 = Vector(4, 5, 6)
res11: scala.collection.immutable.Vector[Int] = Vector(4, 5, 6)

scala> vector1 ++ vector2
res12: Vector[Int] = Vector(1, 2, 3, 4, 5, 6)

Seq

Seq is the base trait for ordered sequences and lists such as List and Vector. Unlike the Iterator trait, the Seq trait enables collections with a definite order. Seq is a generalization of sequences, so you should use Seq to define interfaces and use List or Vector for defining variables, like this.

Scala REPL

scala> def doSomething(seq: Seq[Int]) = seq.foreach(println)
doSomething: (seq: Seq[Int])Unit

scala> doSomething(List(1, 2, 3))
1
2
3

scala> doSomething(Vector(4, 5, 6))
4
5
6

In other words, Scala’s Seq is similar to Java’s List interface, and Scala’s List is similar to an immutable version of Java’s LinkedList concrete class.

BTW, the Seq constructor actually returns a List, which implements the Seq trait. It is not good practice to use this constructor, even though many Scala programmers write code like this.

Scala REPL

scala> val seq = Seq(1, 2, 3)
seq: Seq[Int] = List(1, 2, 3)

The old Scala Collections documentation has more useful information on Seq.

Sorting on Demand

You can sort on demand using one or more properties of an ordered collection’s elements. Of course, for this to work all elements in the collection must have the required properties. This means that the declared type of the items in the collection will define sortable properties, and you would not sort collection types such as HashMap or HashSet.

Scala REPL

scala> things.sortBy(_.i)
res13: List[Thing] = List(Thing(1,z), Thing(2,y), Thing(3,x))

scala> things.sortBy(_.s)
res14: List[Thing] = List(Thing(3,x), Thing(2,y), Thing(1,z))

In order to sort by more than one property, specify them as a tuple. The order of the sort elements defines the sort key order.

Scala REPL

scala> things.sortBy(t => (t.i, t.s))
res15: List[Thing] = List(Thing(1,z), Thing(2,y), Thing(3,x))

You will often want to define a natural ordering for your collection. We will discuss that topic in the lecture on Sorting and Ordered Collections.

You can sort tuples in a similar manner:

Scala REPL

scala> val tuples = Vector((4, "z"), (2, "q"), (5, "b"))
tuples: scala.collection.immutable.Vector[(Int, String)] = Vector((4,z), (2,q), (5,b))

scala> tuples.sortBy(_._1)
res16: scala.collection.immutable.Vector[(Int, String)] = Vector((2,q), (4,z), (5,b))

scala> tuples.sortBy(_._2)
res17: scala.collection.immutable.Vector[(Int, String)] = Vector((5,b), (2,q), (4,z))

scala> tuples.sortBy(t => (t._1, t._2))
res18: scala.collection.immutable.Vector[(Int, String)] = Vector((2,q), (4,z), (5,b))

Avoid calling head, tail and last because they throw an Exception when used on an empty collection

head, tail and the Remainder of a Collection

head, init, tail, last, headOption and lastOption are defined in IterableOps, which means there are available throughout the Scala collection classes. They can be applied to ordered and unordered collections, and also to mutable and immutable collections.

head returns the first item in a collection, or throws a NoSuchElementException if the collection is empty.

Scala REPL

scala> val seq: Seq[Int] = Vector(1, 2, 3)
seq: Seq[Int] = Vector(1, 2, 3)

scala> seq.head
res19: Int = 1

scala> Vector[Int].empty.head
java.util.NoSuchElementException: head of empty list

scala> Nil.head
java.util.NoSuchElementException: head of empty list

scala> List.empty[Int].head
java.util.NoSuchElementException: head of empty list
at scala.collection.immutable.Nil$.head(List.scala:417)
at scala.collection.immutable.Nil$.head(List.scala:414)
...
27 elided

headOption returns an Option (Some or None) and never throws an exception.

Scala REPL

scala> seq.headOption
res20: Option[Int] = Some(1)

scala> Vector[Int].empty.headOption
res21: Option[Int] = None

scala> Nil.headOption
res22: Option[Nothing] = None

scala> List.empty[Int].headOption
res23: Option[Int] = None

tail returns the remainder of the collection after the head is removed. If the collection is empty an exception is thrown.

Scala REPL

scala> seq.tail
res24: Seq[Int] = Vector(2, 3)

scala> Nil.tail
java.lang.UnsupportedOperationException: tail of empty list

last returns the last element of the collection, or throws an exception if the collection is empt.

Scala REPL

scala> seq.last
res24: Int = 3

scala> Nil.last
java.util.NoSuchElementException

lastOption returns an Option and never throws an exception.

Scala REPL

scala> seq.lastOption
res25: Option[Int] = Some(3)

scala> Nil.lastOption
res26: Option[Nothing] = None

scala> List.empty[Int].lastOption
res27: Option[Int] = None

init returns all but the last element, or throws an exception if the collection is empty. If there is only one element in the collection, the empty collection is returned.

Scala REPL

scala> seq.init
res28: Seq[Int] = Vector(1, 2)

scala> Vector(1).init
res29: Vector[Int] = List()

scala> Nil.init
java.lang.UnsupportedOperationException: empty.init

There is no initOption combinator. You should avoid using init. Instead, consider using take, like this, or wrap this code into a method (which you will do shortly!).

Scala REPL

scala> seq.take(seq.size-1)
res30: Vector[Int] = Vector(1, 2)

The Create a Failure-Proof init Method exercise in the Parametric Variance and Bounds lecture explores how to write this method in a generic manner for all collections.

take, drop and takeWhile

take returns up to the specified number of elements from the beginning of the collection. drop returns the remainder of the collection after the specified number of elements have been removed. If you try to take or drop more elements from the collection than actually exist, only the number of elements requested are processed and no exception is raised.

Scala REPL

scala> seq.take(2)
res31: Seq[Int] = Vector(1, 2)

scala> seq.take(0)
res32: Seq[Int] = Vector()

scala> seq.take(4)
res33: Seq[Int] = Vector(1, 2, 3)

scala> seq.drop(2)
res34: Seq[Int] = Vector(3)

scala> seq.drop(0)
res35: Seq[Int] = Vector(1, 2, 3)

scala> seq.drop(-4)
res36: Seq[Int] = Vector(1, 2, 3)

scala> seq.drop(4)
res37: Seq[Int] = Vector()

takeWhile is a version of take that accepts a predicate instead of a constant value.

Maps

Immutable.Map is very similar to mutable.Map, with the obvious difference being that immutable.Maps cannot be modified. However, two immutable.Maps can have binary operations performed, such as concatenation and difference, however the result is a new immutable.Map.

SeqMap

Scala 2.13 introduced a new type of immutable.Map trait, SeqMap, and two concrete implementations: TreeSeqMap and VectorMap.

TreeSeqMap

TreeSeqMap is similar in behavior to a ListMap, but it performs better for larger collections and uses more memory.

VectorMap

VectorMap is similar in behavior to ListMap, but provides constant lookup time while using more memory and generally running slower.

Iterator

Iterator defines some powerful methods that are inherited by all Scala collections.

Iterator.iterate

The iterate method starts by initializing its accumulator with the initial value, then repeatedly applies the accumulated value back into the input of the given function. Here I invoke iterate with a zero initial value, and then generate values that are computed by adding 1 to the accumulated value:

Scala REPL

scala> Iterator.iterate(0)(_ + 1) take(10) foreach println
0
1
2
3
4
5
6
7
8
9

This is similar, with the computation doubling the accumulated value then adding one:

Scala REPL

scala> Iterator.iterate(0)(_ * 2 + 1) take(10) foreach println
0
1
3
7
15
31
63
127
255
511

Warning! This never stops.

Scala REPL

scala> Iterator.iterate(0)(_ + 1) foreach println

Iterator.fill

Iterator.fill produces a collection containing the results of a computation a specified number of times. Variants of fill are defined for 1-dimensional data through 5-dimensional data. This example uses fill to retrieve the elements of a PriorityQueue and sort them in ascending order.

Scala REPL

scala> val q = PriorityQueue(1289, 12, 123, 894, 1)(Ordering.Int.reverse)
q: scala.collection.mutable.PriorityQueue[Int] = PriorityQueue(1, 12, 123, 894, 1289)

scala> Iterator.fill(q.size)(q.dequeue) foreach println
1
12
123
894
1289

Iterator.tabulate

Iterator.tabulate produces a collection containing values of a given function over a range of integer values starting from 0. Variants of tabluate are defined for 1-dimensional data through 5-dimensional data. Here is an example.

Scala REPL

scala> Array.tabulate(10)(_.toDouble)
res17: Array[Double] = Array(0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0)

Here is a 2-dimensional example:

Scala REPL

scala> Array.tabulate(3, 3)((a: Int, b: Int) => a.toDouble * b)
res20: Array[Array[Double]] = Array(Array(0.0, 0.0, 0.0), Array(0.0, 1.0, 2.0), Array(0.0, 2.0, 4.0))

Iterator.continually

We can use takeWhile in conjunction with Iterator.continually to obtain unique values until a user indicates they have enough.

Scala REPL

scala> import collection.Iterator
import collection.Iterator

scala> Iterator.
  continually(System.nanoTime).
  takeWhile(_ => !io.StdIn.readLine("\nMore? <Y/n>: ").
  toLowerCase.
  startsWith("n")
  ).
  foreach(println)
More? <Y/n>: 23773991672470
More? <Y/n>: 23775269319438
More? <Y/n>:

Iterator.unfold

Iterator.unfold, introduced in Scala 2.13, is to producing collections as fold is to consuming collections. unfold provides a purely functional way of producing a collection of which we don’t know the size and elements a priori. When using unfold, the elements are computed one after the other, until a None termination signal is computed, otherwise the elements continue infinitely.

This example defines a method that uses unfold to build a list of lists containing similar values.

Scala REPL

scala> def segments[T](xs: List[T]): List[List[T]] =
  List.unfold(xs)(xs =>
    if (xs.isEmpty) None
    else Some(xs.span(_ == xs.head))
  )
segments: [T](xs: List[T])List[List[T]]

scala> segments(List(1,1,2,3,3,3,4,5,5))
res0: List[List[Int]] = List(List(1, 1), List(2), List(3, 3, 3), List(4), List(5, 5))

Another example is the Fibonacci series, implemented with LazyList.unfold.

Scala REPL

scala> val fib =
  LazyList.unfold((1, 1)) { case (a, b) => Some((a, (b, a + b))) }
fib: scala.collection.immutable.LazyList[Int] = LazyList(<not computed>)

scala> fib.take(10).force
res6: scala.collection.immutable.LazyList[Int] = LazyList(1, 1, 2, 3, 5, 8, 13, 21, 34, 55)

unfold is more powerful than Iterator.apply (for example, the default apply method is invoked when writing List(1, 1, 2, 3)). apply constrains the user to enumerate all the elements of the collection when invoked. unfold is more powerful than tabulate, which constrains the user to provide the collection length and is limited to collections whose elements can be computed from their index. unfold is also more powerful than iterate, which is limited to collections whose n^th element can be computed from the n-1^th element.

We can prove that by implementing iterate, tabulate, and apply in terms of unfold.

Scala code

def apply[A](as: A*): List[A] =
  List.unfold(as) { case h +: t => Some((h, t)) case _ => None }

def tabulate[A](length: Int)(f: Int => A): List[A] =
  List.unfold(length)(n => if (n > 0) Some((f(n), n - 1)) else None)

def iterate[A](start: A, length: Int)(f: A => A): List[A] =
  List.unfold((length, start)) {
    case (0, _) => None
    case (n, a) => Some((a, (n - 1, f(a))))
  }

Use unfold when the iteration starts from some hidden state.

Scala 2.13 defines Iterator.unfold. The version for List is essentially the same method, is much simpler than the general case, and works for all Scala versions.

Scala code

def unfold[A, S](start: S)
                (op: S => Option[(A, S)]): List[A] =
  Iterator.
    iterate(op(start))(_.flatMap{ case (_, s) => op(s) }).
    map(_.map(_._1)).
    takeWhile(_.isDefined).
    flatten.
    toList

Exercise – Compute SHA of a file

This exercise replicates the functionality of sha1sum.

Git uses the SHA-1 algorithm to obtain the digest of each file that it checks in. SHA-1 is not secure but is inexpensive to compute. Write a console application that reports the SHA-1 digest of the contents of /etc/passwd. Do not write this application for maximum efficiency; instead, use it as practice for processing a java.io.InputStream of data. Even though the source of the data is a file, this algorithm works on a java.io.InputStream so it is appropriate for an input of unknown length.

Implement a method with this signature as part of the solution; you can provide a default value for the algorithm if you like. For this exercise, the value of algorithm must be "SHA-1".

def digest(inStream: java.io.InputStream, algorithm: String): Array[Byte]

Hints.

In case you have encountered Scala Streams before, they have no relationship to Java Streams. Scala Streams are not required for this solution, only Java Streams and Scala’s Iterator.
You can create a buffer to hold the accumulated result by calling new Array[Byte](1024).
You can get the SHA-1 message digest algorithm by calling java.security.MessageDigest.getInstance("SHA-1")
java.io.FileInputStream.read returns the number of bytes read, or -1 to signal no more data.
Use LazyList.continually and takeWhile to read successive byte arrays from an InputFileStream until FileInputStream.read signals no more data.
Iterate using foreach over MessageDigest.update to calculate an updated digest in the MessageDigest, using the data in the buffer as input. The third argument must be set to the number of bytes read for this iteration. One way of expressing this is:
Scala code, or whatever
```
(whatever).foreach { b => md.update(buffer, 0, b) }
```
Another way, using shorthand, is:
Scala code, or whatever
```
(whatever).foreach { md.update(buffer, 0, _) }
```

Output should be something like:

Output

SHA-1 of /etc/passwd is 2112088146319-7-11555-1126870-43-105-63-21-5028-43-98

Solution

Shell

package solutions

import java.security.MessageDigest
import java.io.{FileInputStream, InputStream}

@main object SHA1 {
  def digest(fileName: String, algorithm: String): Array[Byte] = {
    val inStream = new FileInputStream(fileName)
    digest(inStream, algorithm)
  }

  def digest(inStream: InputStream, algorithm: String="SHA-256"): Array[Byte] = {
    val md = MessageDigest.getInstance(algorithm)
    val buffer = new Array[Byte](1024)
    LazyList.continually { inStream.read(buffer) }
      .takeWhile(_ != -1)
      .foreach { md.update(buffer, 0, _) }
    md.digest
  }

  println(s"""SHA-1 of /etc/passwd is ${digest("/etc/passwd", "SHA-1").mkString}""")
}

To run this solution, type:

Shell

$ sbt "runMain solutions.SHA1"

Views

Most Scala collections are strict – that is, all values of the collection are computed and stored, even if they are not retrieved. In contrast, the elements of a lazy collection are computed only when accessed. Ranges, which we first discussed in the Learning Scala Using The REPL 1/3 lecture of the Introduction to Scala course, are examples of lazy collections, specifically lazy ordered sequences.

Scala REPL

scala> val range: Range.Inclusive = 3 to 8
range: Range.Inclusive = Range 3 to 8

scala> range(0)
res0: Int = 3

Making Lazy Collections from Strict Collections

Lazy collections require less computation than strict collections for many or most circumstances because collection elements are only computed when required.

The SeqOps trait is available to all collections. The SeqOps.view method can convert an otherwise strict collection to a lazy collection. The view method returns a subclass of SeqOps. Even though Ranges are lazy and not strict, the view method is defined because it implements IndexedSeqOps, which implements SeqOps, which implements IterableOps, which declares view.

Scala REPL

scala> val view = range.view
view: scala.collection.IndexedSeqView[Int] = IndexedSeqView(<not computed>)

scala> view(0)
res1: Int = 3

LazyList

Prior to Scala 2.13, Stream was used to model a lazily evaluated list.

An evaluation strategy which delays the evaluation of an expression until its value is needed (non-strict evaluation) and which also avoids repeated evaluations.

– Wikipedia defines lazy evaluation

Scala 2.13 deprecated Stream in favor of LazyList. As its name suggests, a LazyList is a linked list whose elements are lazily evaluated. An important semantic difference with Stream is that in LazyList both the head and the tail are lazy, whereas in Stream only the tail is lazy.

Scala lazy lists are unrelated to Java streams. Instead, Scala lazy lists are subtypes of immutable.Seq. LazyList is commonly used to handle subsets of infinite sequences, and for sequences that are expensive to compute. Predef imports collection.immutable.LazyList so it is always available to any program.

Lazy lists are parametric, or generic, which means elements of a lazy list conform to a specific type. LazyList[T] is a lazy collection whose elements of type T are memoized, or cached.

Because Stream is memoized, given a stream with sufficient unique values, the memoization cache will eventually fill all available memory. We will discuss memoization in the Memoization in Depth lecture.

BTW, Scala 2.13 renamed Stream.append to LazyList.lazyAppendedAll.

LazyList.continually creates a Scala LazyList, which can be considered as a lazy Seq which contains elements that are only evaluated when they are needed. For example, here is how one might compute a LazyList of unique Longs.

Scala REPL

scala> import collection.immutable.LazyList
import collection.immutable.LazyList

scala> val ids = LazyList.continually(System.nanoTime)
ids: scala.collection.immutable.LazyList[Long] = Stream(86653342711375, ?)

We can examine the first 5 elements of the stream this way. Note that toVector forces evaluation.

Scala REPL

scala> ids.take(5)
res37: scala.collection.immutable.LazyList[Long] = LazyList(86549694623903, ?)

scala> ids.take(5).toVector
res38: Vector[Long] = Vector(86549694623903, 86592186252578, 86592186259051, 86592186261332, 86592186263348)

This demonstrates that LazyList memoizes values that have been generated.

Scala REPL

scala> ids.head
res39: Long = 72091815908327

LazyList.continually can be used to write a similar program as the one I showed you a moment ago that used Iterator.continually, but should only be used if the previous values need to be accessed as the program executes. Notice that I put dots at the end of each line of the REPL input, except for the last line. This is so the REPL knows to expect a multi-line expression. If I had put the dots at the start of the last two lines instead, the REPL would not have read all three lines as one expression.

Scala REPL

scala> LazyList.continually(System.nanoTime).
  takeWhile(_ => !io.StdIn.readLine("\nMore? <Y/n>: ")
                              .toLowerCase
                              .startsWith("n")
  ).foreach(println)
More? <Y/n>: 24282594984003
More? <Y/n>: 24283649919676
More? <Y/n>:

Iterator.continually provides a similar mechanism to LazyList.continually that also supports lazy sequences, but without the memory overhead of memoization. You should use Iterator.continually when you do not need to memoize the values.

About foreach

Here is how you can discover the acceptable expressions that can be passed into the foreach higher-order method above.

The collection.Iterator Scaladoc shows the method signature for continually. We see that it returns an Iterator[A] where A is defined by the type streamed from continually. In the above example, System.nanoTime returns a Long, so A is of type Long.

Scala code

def continually[A](elem: => A): Iterator[A]

The Scaladoc for Iterator.foreach shows the method signature.

Scala code

def foreach( f: (A) => Unit): Unit

Thus Iterator.foreach accepts a Function1[A, Unit], which can also be written as A => Unit. In other words, this foreach accepts any Function1 that accepts an instance of A, and does not return a value (returns Unit).

Acceptable examples include:

Scala code
```
x => println(x) // println returns Unit
```
Scala code
```
println(_) // shorthand for above
```
Scala code
```
println // shorter shorthand for above
```

Scala code

x =>
val y = callSomeMethod(x)
doSomethingWithY(y)
() // explicitly return Unit

Scala code

def myMethod[T]: Unit = {
  doSomethingWithT(t)
  () // explicitly return Unit
}

Iterator
  .continually(System.nanoTime)
  .takeWhile(_ => !io.StdIn.readLine("\nMore? <Y/n>: ")
                           .toLowerCase
                           .startsWith("n")
  ).foreach(myMethod) // myMethod is automatically lifted to a Function1

Streaming Binary I/O

Here is a more efficient way to read in a binary file into a byte array, than the sample code shown in the the Higher-Order Functions lecture. This version does not do pointless character encoding.

Scala REPL

scala> import scala.language.postfixOps
import scala.language.postfixOps

scala> import sys.process._
import sys.process._

scala> val fileName = ("which scalac".!!).trim
fileName: String = /usr/local/bin/scalac

scala> val bis = new java.io.BufferedInputStream(new java.io.FileInputStream(fileName))
bis: java.io.BufferedInputStream = java.io.BufferedInputStream@3c04ddda

scala> val bArray = Stream.continually(bis.read).
  takeWhile(-1 !=).
  map(_.toByte).
  toArray
bArray: Array[Byte] = Array(35, 33, 47, 117, 115, 114, 47, 98, 105, 110, 47, 101, 110, 118, 32, 98, 97, 115, 104, 10, 35, 10, 35, 35, 35, 35, 35, 35,
35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35,
35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 10, 35,
32, 67, 111, 112, 121, 114, 105, 103, 104, 116, 32, 50, 48, 48, 50, 45, 50, 48, 49, 51, 32, 76, 65, 77, 80, 47, 69, 80, 70, 76, 10, 35, 10, 35, 32,
84, 104, 105, 115, 32, 105, 115, 32, 102, 114, 101, 101, 32, 115, 111, 102, 116, 119, 97, 114, 101, 59, 32, 115, 101, 101, 32, 116, 104, 101, 32, 100,
105, 115, 116, 114, 105, 9...)
scala> 

scala> bis.close()

Several variations of this code are provided in HigherFun.scala. You can run this program by typing:

Shell

$ sbt "runMain BinaryIO"

Here is an explanation.

BufferedInputStream.read returns the next byte of data, or -1 if the end of the input stream is reached.
LazyList.continually creates a Scala Stream, which is a lazy Seq where elements are only evaluated when they are needed. Stream memoizes elements as they are computed.
takeWhile returns the beginning elements of a Seq that satisfy a predicate.
Stream.continually(bis.read).takeWhile(_.!=(-1)) returns a Stream[Int]
Stream.continually(bis.read).takeWhile(-1 !=).map(_.toByte) returns a Stream[Byte]
The function value -1 != is extremely terse because shorthand is combined with infix notation. It is equivalent to each of these incantations:
1. item => item != -1
2. item => item.!=(-1)
3. _.!=(-1)
4. _ != -1 This is my preferred style because it is reasonably terse while remaining understandable to most programmers.
5. -1 != _

Previous lecture: Hash Tables

Next lecture: Mutable Collections

© Copyright 1994-2024 Michael Slinn. All rights reserved.
If you would like to request to use this copyright-protected work in any manner, please send an email.

This website was made using Jekyll and Mike Slinn’s Jekyll Plugins.