Published 2014-03-10.
Last modified 2019-07-09.
Time to read: 11 minutes.
Scala's immutable collections are explored. This information is vital for large-scale computation.
The sample code for this lecture can be found in
courseNotes/
.
This lecture discusses many of the non-parallel versions of the following collection types.
Collection Traits and Subclasses | |||||||
---|---|---|---|---|---|---|---|
Special | Foundation | Set | Sequence | Random Access Sequence | Map | Stack & Queue | |
Immutable |
Option ParRange Range Stream
|
Iterable ParIterable Traversable
|
BitSet HashSet ListSet ParHashSet ParSet Set SortedSet TreeSet
|
List LinearSeq PagedSeq ParSeq Seq
|
IndexedSeq Vector ParVector
|
HashMap ListMap Map ParMap ParHashMap SortedMap TreeMap
| Queue
|
List
Scala’s List
is an immutable singly-linked list, which means that List
is an ordered collection.
List
is designed for linear access, also known as sequential access.
You can easily make a new List
:
scala> val list1 = List(1, 2, 3) list1: List[Int] = List(1, 2, 3)
The compiler examined all of the elements of the list and noticed that they were all Int
s, \
so list1
has type List[Int]
.
Nil
refers to the empty list.
scala> Nil res0: scala.collection.immutable.Nil.type = List()
Nil
is not parametric; it is a singleton and represents any and all empty lists.
You can use the cons operator (::
) or the prepend operator(+:
)
to prepend items to a list and thereby create a new list.
While ::
is specific to List
, and +:
can be used for any type of sequence,
both of these operators are equivalent for List
instances.
List items are concatenated with either of these operators, and the end of the concatenation is denoted by Nil
.
scala> val list2 = 4 :: 5 :: 6 :: Nil list2: List[Int] = List(4, 5, 6)
scala> 7 +: 8 +: 9 +: Nil res0: List[Int] = List(7, 8, 9)
You can concatenate two lists with the :::
and ++
operators.
While :::
is specific to List and ++
can be used with any sequence,
they both do the same thing for List
instances.
scala> list1 ::: list2 res1: List[Int] = List(1, 2, 3, 4, 5, 6)
scala> list1 ++ list2 res2: List[Int] = List(1, 2, 3, 4, 5, 6)
Appending a element to an immutable.List
is
inefficient
because the whole collection must be copied.
For this reason Scala does not provide an append-single-element-to-list operator for immutable List
s.
However, prepending an element (with ::
or +:
) to an immutable.List
is not computationally expensive.
Because an instance of immutable.List
cannot be changed, the prepend operators (::
and +:
) and the
concatenation operators (:::
and ++
) return new instances of immutable.List
.
Also, the concatenation operators have linear complexity (O(n))
, which is fine if used once in a while,
but does not perform well in a loop.
You can have a List
that contains any type of object.
For example, let’s say we have a Thing
case class.
scala> case class Thing(i: Int, s: String) defined class Thing
We could make a collection of Thing
s:
scala> val thing1 = Thing(1, "z") thing1: Thing = Thing(1,a)
scala> val thing2 = Thing(2, "y") thing2: Thing = Thing(2,b)
scala> val thing3 = Thing(3, "x") thing3: Thing = Thing(3,c)
scala> val things = List(thing3, thing1, thing2) things: List[Thing] = List(Thing(3,x),Thing(1,z),Thing(2,y))
The Scala compiler recognized that all of the elements in the things
list were of type Thing, so things
has type
List[Thing]
.
Notice what happens if I add a String
to the list.
scala> "Hi" :: things
res3: List[java.io.Serializable] = List(Hi, Thing(3,x), Thing(1,z), Thing(2,y))
The type of the new collection is List[Serializable]
, which is rather generic.
I could have specified an even more generic type.
scala> val x: List[Any] = "Hi" :: things
x: List[Any] = List(Hi, Thing(3,x), Thing(1,z), Thing(2,y))
To guard against inadvertently adding something of the wrong type into a collection, be sure to specify the type that you require. The compiler will then be able to notify you of elements with mismatched types.
scala> val x: List[Thing] = "Hi" :: things <console>:14: error: type mismatch; found : String required: Thing val x: List[Thing] = "Hi" :: things
Like we saw for Array
, there are several ways to create empty collections like List
.
The compiler needs to know the type of item contained in the collection, and there are two ways you can provide that type hint.
scala> List.empty[Double] res3: List[Double] = Seq()
scala> val emptyList: List[Double] = List.empty emptyList: List[Double] = List()
scala> val emptyList: List[Double] = List() emptyList: List[Double] = List()
Notice what happens if you do not provide a type hint to the compiler to indicate the element type in the List
.
scala> val badList = List.empty badList: List[Nothing] = Seq()
As you can see, the element type in the collection is inferred to be Nothing
, defeating type safety.
Be sure to specify a type hint to the right or the left of the equals sign.
More About Nil
The empty list is represented by the Nil
object.
Nil
is implemented as a singleton, and therefore cannot be created with a type specifier.
scala> Nil[Int] <console>:8: error: object Nil does not take type parameters. Nil[Int] ^
Look what happens when you apply a combinator to Nil
and no type hints are provided to the compiler – the type of element in the
collection is inferred to be Nothing
.
scala> Nil.headOption res1: Option[Nothing] = None
You should always provide a type hint with Nil
.
Here is an example of providing a type hint on the left side of the equals sign, so the Scala compiler infers the element type to be
Int
.
scala> val x: Option[Int] = Nil.headOption x: Option[Int] = None
Here is another example, where the type hint is provided on the right side of the equals sign.
The Scala compiler knows that List("hi")
is of type List[String]
, so it coerces the type of the empty list
denoted by Nil
to match.
scala> val y = Nil ::: List("hi") y: List[String] = List(hi)
Right-Associative Operators
You may have noticed that the ::
and +:
operators act unusually, in that the item to be prepended precedes the
List
instance.
This is possible because Scala treats all operators that end in a colon character as right-associative when written using infix notation.
In other words, this expression.
scala> 5 :: Nil res4: List[Int] = List(5)
... is equivalent to:
scala> Nil.::(5) res5: List[Int] = List(5)
In this expression, the ::
operator is a method of the empty List
denoted by Nil
.
Because :: is right-associative, the binding is backwards from what you may be used to.
The argument 5 is passed to List.::
method.
These expressions are also equivalent.
scala> 5 +: Nil res6: List[Int] = List(5)
scala> Nil.+:(5) res7: List[Int] = List(5)
Here is another example of building a List[Int]
.
scala> 7 +: 8 +: 9 +: Nil res1: List[Int] = List(7, 8, 9)
Vector
The immutable.Vector
class is
also an immutable ordered collection that provides a compromise between indexed and linear access.
It has both effectively constant time indexing overhead and constant time linear access overhead.
Because if this, Vector
is a good foundation for mixed access patterns where both indexed and linear accesses are used.
The ::
and :::
List
operations are not supported by Vector
, but you can use the +:
and ++:
equivalents.
scala> val vector1 = Vector(1, 2, 3) res8: scala.collection.immutable.Vector[Int] = Vector(1, 2, 3)
scala> 0 +: vector1 res9: scala.collection.immutable.Vector[Int] = Vector(0, 1, 2, 3)
scala> Vector(thing1, thing2, thing3) res10: scala.collection.immutable.Vector[Thing] = Vector(Thing(1,z), Thing(2,y), Thing(3,z))
If you cannot decide if you should use List
or Vector
, use Vector
if possible.
scala> val vector2 = Vector(4, 5, 6) res11: scala.collection.immutable.Vector[Int] = Vector(4, 5, 6)
scala> vector1 ++ vector2 res12: Vector[Int] = Vector(1, 2, 3, 4, 5, 6)
Seq
Seq
is the base trait for ordered sequences and lists such as List
and Vector
.
Unlike the Iterator
trait, the Seq
trait enables collections with a definite order.
Seq
is a generalization of sequences, so you should use Seq
to define interfaces and use List
or
Vector
for defining variables, like this.
scala> def doSomething(seq: Seq[Int]) = seq.foreach(println) doSomething: (seq: Seq[Int])Unit
scala> doSomething(List(1, 2, 3)) 1 2 3
scala> doSomething(Vector(4, 5, 6)) 4 5 6
In other words, Scala’s Seq
is similar to Java’s List
interface, and Scala’s List
is similar to an immutable version of Java’s
LinkedList
concrete class.
BTW, the Seq
constructor actually returns a List
, which implements the Seq
trait.
It is not good practice to use this constructor, even though many Scala programmers write code like this.
scala> val seq = Seq(1, 2, 3) seq: Seq[Int] = List(1, 2, 3)
The old Scala Collections documentation has
more useful information on Seq
.
Sorting on Demand
You can sort on demand using one or more properties of an ordered collection’s elements.
Of course, for this to work all elements in the collection must have the required properties.
This means that the declared type of the items in the collection will define sortable properties, and you would not sort collection types such as
HashMap
or HashSet
.
scala> things.sortBy(_.i) res13: List[Thing] = List(Thing(1,z), Thing(2,y), Thing(3,x))
scala> things.sortBy(_.s) res14: List[Thing] = List(Thing(3,x), Thing(2,y), Thing(1,z))
In order to sort by more than one property, specify them as a tuple. The order of the sort elements defines the sort key order.
scala> things.sortBy(t => (t.i, t.s)) res15: List[Thing] = List(Thing(1,z), Thing(2,y), Thing(3,x))
You will often want to define a natural ordering for your collection. We will discuss that topic in the lecture on Sorting and Ordered Collections.
You can sort tuples in a similar manner:
scala> val tuples = Vector((4, "z"), (2, "q"), (5, "b")) tuples: scala.collection.immutable.Vector[(Int, String)] = Vector((4,z), (2,q), (5,b))
scala> tuples.sortBy(_._1) res16: scala.collection.immutable.Vector[(Int, String)] = Vector((2,q), (4,z), (5,b))
scala> tuples.sortBy(_._2) res17: scala.collection.immutable.Vector[(Int, String)] = Vector((5,b), (2,q), (4,z))
scala> tuples.sortBy(t => (t._1, t._2)) res18: scala.collection.immutable.Vector[(Int, String)] = Vector((2,q), (4,z), (5,b))
head, tail and the Remainder of a Collection
head
, init
, tail
, last,
headOption
and lastOption
are defined in
IterableOps
,
which means there are available throughout the Scala collection classes.
They can be applied to ordered and unordered collections, and also to mutable and immutable collections.
head
returns the first item in a collection, or throws a NoSuchElementException
if the collection is empty.
scala> val seq: Seq[Int] = Vector(1, 2, 3) seq: Seq[Int] = Vector(1, 2, 3)
scala> seq.head res19: Int = 1
scala> Vector[Int].empty.head java.util.NoSuchElementException: head of empty list
scala> Nil.head java.util.NoSuchElementException: head of empty list
scala> List.empty[Int].head java.util.NoSuchElementException: head of empty list at scala.collection.immutable.Nil$.head(List.scala:417) at scala.collection.immutable.Nil$.head(List.scala:414) ... 27 elided
headOption
returns an Option
(Some
or None
) and never throws an exception.
scala> seq.headOption res20: Option[Int] = Some(1)
scala> Vector[Int].empty.headOption res21: Option[Int] = None
scala> Nil.headOption res22: Option[Nothing] = None
scala> List.empty[Int].headOption res23: Option[Int] = None
tail
returns the remainder of the collection after the head is removed.
If the collection is empty an exception is thrown.
scala> seq.tail res24: Seq[Int] = Vector(2, 3)
scala> Nil.tail java.lang.UnsupportedOperationException: tail of empty list
last
returns the last element of the collection, or throws an exception if the collection is empt.
scala> seq.last res24: Int = 3
scala> Nil.last java.util.NoSuchElementException
lastOption
returns an Option
and never throws an exception.
scala> seq.lastOption res25: Option[Int] = Some(3)
scala> Nil.lastOption res26: Option[Nothing] = None
scala> List.empty[Int].lastOption res27: Option[Int] = None
init
returns all but the last element, or throws an exception if the collection is empty.
If there is only one element in the collection, the empty collection is returned.
scala> seq.init res28: Seq[Int] = Vector(1, 2)
scala> Vector(1).init res29: Vector[Int] = List()
scala> Nil.init java.lang.UnsupportedOperationException: empty.init
There is no initOption
combinator.
You should avoid using init
.
Instead, consider using take
, like this, or wrap this code into a method (which you will do shortly!).
scala> seq.take(seq.size-1) res30: Vector[Int] = Vector(1, 2)
The Create a Failure-Proof init Method exercise in the Parametric Variance and Bounds lecture explores how to write this method in a generic manner for all collections.
take, drop and takeWhile
take
returns up to the specified number of elements from the beginning of the collection.
drop
returns the remainder of the collection after the specified number of elements have been removed.
If you try to take
or drop
more elements from the collection than actually exist,
only the number of elements requested are processed and no exception is raised.
scala> seq.take(2) res31: Seq[Int] = Vector(1, 2)
scala> seq.take(0) res32: Seq[Int] = Vector()
scala> seq.take(4) res33: Seq[Int] = Vector(1, 2, 3)
scala> seq.drop(2) res34: Seq[Int] = Vector(3)
scala> seq.drop(0) res35: Seq[Int] = Vector(1, 2, 3)
scala> seq.drop(-4) res36: Seq[Int] = Vector(1, 2, 3)
scala> seq.drop(4) res37: Seq[Int] = Vector()
takeWhile
is a version of take
that accepts a predicate instead of a constant value.
Maps
Immutable.Map
is very similar to mutable.Map
, with the obvious difference being that immutable.Map
s cannot be
modified.
However, two immutable.Map
s can have binary operations performed, such as concatenation and difference, however the result is a new
immutable.Map
.
SeqMap
Scala 2.13 introduced a new type of immutable.Map
trait, SeqMap
, and two concrete implementations:
TreeSeqMap
and VectorMap
.
TreeSeqMap
TreeSeqMap
is similar in behavior to a ListMap
, but it performs better for larger collections and uses more memory.
VectorMap
VectorMap
is similar in behavior to ListMap
, but provides constant lookup time while using more memory and generally running slower.
Iterator
Iterator defines some powerful methods that are inherited by all Scala collections.
Iterator.iterate
The iterate
method starts by initializing its accumulator with the initial value,
then repeatedly applies the accumulated value back into the input of the given function.
Here I invoke iterate with a zero initial value,
and then generate values that are computed by adding 1 to the accumulated value:
scala> Iterator.iterate(0)(_ + 1) take(10) foreach println 0 1 2 3 4 5 6 7 8 9
This is similar, with the computation doubling the accumulated value then adding one:
scala> Iterator.iterate(0)(_ * 2 + 1) take(10) foreach println 0 1 3 7 15 31 63 127 255 511
Warning! This never stops.
scala> Iterator.iterate(0)(_ + 1) foreach println
Iterator.fill
Iterator.fill
produces a collection containing the results of a computation a specified number of times.
Variants of fill are defined for 1-dimensional data through 5-dimensional data.
This example uses fill
to retrieve the elements of a PriorityQueue
and sort them in ascending order.
scala> val q = PriorityQueue(1289, 12, 123, 894, 1)(Ordering.Int.reverse) q: scala.collection.mutable.PriorityQueue[Int] = PriorityQueue(1, 12, 123, 894, 1289)
scala> Iterator.fill(q.size)(q.dequeue) foreach println 1 12 123 894 1289
Iterator.tabulate
Iterator.tabulate
produces a collection containing values of a given function over a range of integer values starting from 0.
Variants of tabluate
are defined for 1-dimensional data through 5-dimensional data.
Here is an example.
scala> Array.tabulate(10)(_.toDouble) res17: Array[Double] = Array(0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0)
Here is a 2-dimensional example:
scala> Array.tabulate(3, 3)((a: Int, b: Int) => a.toDouble * b) res20: Array[Array[Double]] = Array(Array(0.0, 0.0, 0.0), Array(0.0, 1.0, 2.0), Array(0.0, 2.0, 4.0))
Iterator.continually
We can use takeWhile
in conjunction with
Iterator.continually
to obtain unique values until a user indicates they have enough.
scala> import collection.Iterator import collection.Iterator
scala> Iterator. continually(System.nanoTime). takeWhile(_ => !io.StdIn.readLine("\nMore? <Y/n>: "). toLowerCase. startsWith("n") ). foreach(println) More? <Y/n>: 23773991672470 More? <Y/n>: 23775269319438 More? <Y/n>:
Iterator.unfold
Iterator.unfold
, introduced in Scala 2.13, is to producing collections as fold
is to consuming collections.
unfold
provides a purely functional way of producing a collection of which we don’t know the size and elements a priori.
When using unfold
, the elements are computed one after the other, until a None
termination signal is computed, otherwise
the elements continue infinitely.
This example defines a method that uses unfold
to build a list of lists containing similar values.
scala> def segments[T](xs: List[T]): List[List[T]] = List.unfold(xs)(xs => if (xs.isEmpty) None else Some(xs.span(_ == xs.head)) ) segments: [T](xs: List[T])List[List[T]]
scala> segments(List(1,1,2,3,3,3,4,5,5)) res0: List[List[Int]] = List(List(1, 1), List(2), List(3, 3, 3), List(4), List(5, 5))
Another example is the Fibonacci series, implemented with LazyList.unfold
.
scala> val fib = LazyList.unfold((1, 1)) { case (a, b) => Some((a, (b, a + b))) } fib: scala.collection.immutable.LazyList[Int] = LazyList(<not computed>)
scala> fib.take(10).force res6: scala.collection.immutable.LazyList[Int] = LazyList(1, 1, 2, 3, 5, 8, 13, 21, 34, 55)
unfold
is more powerful than Iterator.apply
(for example,
the default apply
method is invoked when writing List(1, 1, 2, 3)
).
apply
constrains the user to enumerate all the elements of the collection when invoked.
unfold
is more powerful than tabulate
,
which constrains the user to provide the collection length and is limited to collections whose elements can be computed from their index.
unfold
is also more powerful than iterate
,
which is limited to collections whose n
th element can be computed from the n-1th element.
We can prove that by implementing iterate
, tabulate
, and apply
in terms of unfold
.
def apply[A](as: A*): List[A] = List.unfold(as) { case h +: t => Some((h, t)) case _ => None }
def tabulate[A](length: Int)(f: Int => A): List[A] = List.unfold(length)(n => if (n > 0) Some((f(n), n - 1)) else None)
def iterate[A](start: A, length: Int)(f: A => A): List[A] = List.unfold((length, start)) { case (0, _) => None case (n, a) => Some((a, (n - 1, f(a)))) }
Use unfold
when the iteration starts from some hidden state.
Scala 2.13 defines Iterator.unfold
.
The version for List is essentially the same method, is much simpler than the general case, and works for all Scala versions.
def unfold[A, S](start: S) (op: S => Option[(A, S)]): List[A] = Iterator. iterate(op(start))(_.flatMap{ case (_, s) => op(s) }). map(_.map(_._1)). takeWhile(_.isDefined). flatten. toList
Exercise – Compute SHA of a file
This exercise replicates the functionality of sha1sum
.
Git uses the SHA-1
algorithm to obtain the digest of each file that it checks in.
SHA-1
is not secure but is inexpensive to compute.
Write a console application that reports the SHA-1 digest of the contents of /etc/passwd
.
Do not write this application for maximum efficiency;
instead, use it as practice for processing a java.io.InputStream
of data.
Even though the source of the data is a file,
this algorithm works on a java.io.InputStream
so it is appropriate for an input of unknown length.
Implement a method with this signature as part of the solution;
you can provide a default value for the algorithm if you like.
For this exercise, the value of algorithm must be "SHA-1"
.
def digest(inStream: java.io.InputStream, algorithm: String): Array[Byte]
Hints.
-
In case you have encountered Scala
Stream
s before, they have no relationship to JavaStream
s. ScalaStream
s are not required for this solution, only JavaStream
s and Scala’sIterator
. - You can create a buffer to hold the accumulated result by calling
new Array[Byte](1024)
. -
You can get the SHA-1 message digest algorithm by calling
java.
("SHA-1")security. MessageDigest. getInstance -
java.
returns the number of bytes read, or -1 to signal no more data.io. FileInputStream. read -
Use
LazyList.continually
andtakeWhile
to read successive byte arrays from anInputFileStream
untilFileInputStream.read
signals no more data. -
Iterate using
foreach
overMessageDigest.update
to calculate an updated digest in theMessageDigest
, using the data in the buffer as input. The third argument must be set to the number of bytes read for this iteration. One way of expressing this is:Scala code, or whatever(whatever).foreach { b => md.update(buffer, 0, b) }
Scala code, or whatever(whatever).foreach { md.update(buffer, 0, _) }
Output should be something like:
SHA-1 of /etc/passwd is 2112088146319-7-11555-1126870-43-105-63-21-5028-43-98
Solution
package solutions
import java.security.MessageDigest import java.io.{FileInputStream, InputStream}
@main object SHA1 { def digest(fileName: String, algorithm: String): Array[Byte] = { val inStream = new FileInputStream(fileName) digest(inStream, algorithm) }
def digest(inStream: InputStream, algorithm: String="SHA-256"): Array[Byte] = { val md = MessageDigest.getInstance(algorithm) val buffer = new Array[Byte](1024) LazyList.continually { inStream.read(buffer) } .takeWhile(_ != -1) .foreach { md.update(buffer, 0, _) } md.digest }
println(s"""SHA-1 of /etc/passwd is ${digest("/etc/passwd", "SHA-1").mkString}""") }
To run this solution, type:
$ sbt "runMain solutions.SHA1"
Views
Most Scala collections are strict – that is,
all values of the collection are computed and stored,
even if they are not retrieved.
In contrast, the elements of a lazy collection are computed only when accessed.
Range
s,
which we first discussed in the Learning Scala Using The REPL 1/3 lecture of the
Introduction to Scala course,
are examples of lazy collections, specifically lazy ordered sequences.
scala> val range: Range.Inclusive = 3 to 8 range: Range.Inclusive = Range 3 to 8
scala> range(0) res0: Int = 3
Making Lazy Collections from Strict Collections
Lazy collections require less computation than strict collections for many or most circumstances because collection elements are only computed when required.
The SeqOps
trait is available to all collections.
The SeqOps.view
method can convert an otherwise strict collection to a lazy collection.
The view
method returns a subclass of SeqOps
.
Even though Range
s are lazy and not strict, the view
method is defined because it implements
IndexedSeqOps
,
which implements
SeqOps
,
which implements
IterableOps
,
which declares view
.
scala> val view = range.view view: scala.collection.IndexedSeqView[Int] = IndexedSeqView(<not computed>)
scala> view(0) res1: Int = 3
LazyList
Prior to Scala 2.13,
Stream
was used to model a lazily evaluated list.
Scala 2.13 deprecated Stream
in favor of
LazyList
.
As its name suggests, a LazyList
is a linked list whose elements are lazily evaluated.
An important semantic difference with Stream
is that in LazyList
both the head and the tail are lazy, whereas in
Stream
only the tail is lazy.
Scala lazy lists are unrelated to Java streams.
Instead, Scala lazy lists are subtypes of immutable.Seq
.
LazyList
is commonly used to handle subsets of infinite sequences, and for sequences that are expensive to compute.
Predef
imports collection.immutable.LazyList
so it is always available to any program.
Lazy lists are parametric, or generic, which means elements of a lazy list conform to a specific type.
LazyList[T]
is a lazy collection whose elements of type T
are
memoized, or cached.
Because Stream
is memoized, given a stream with sufficient unique values,
the memoization cache will eventually fill all available memory.
We will discuss memoization in the
Memoization in Depth lecture.
BTW, Scala 2.13 renamed Stream.append
to
LazyList.lazyAppendedAll
.
LazyList.continually
creates a Scala LazyList
, which can be considered as a lazy Seq
which contains elements that
are only evaluated when they are needed.
For example, here is how one might compute a LazyList
of unique Long
s.
scala> import collection.immutable.LazyList import collection.immutable.LazyList
scala> val ids = LazyList.continually(System.nanoTime) ids: scala.collection.immutable.LazyList[Long] = Stream(86653342711375, ?)
We can examine the first 5 elements of the stream this way.
Note that toVector
forces evaluation.
scala> ids.take(5) res37: scala.collection.immutable.LazyList[Long] = LazyList(86549694623903, ?)
scala> ids.take(5).toVector res38: Vector[Long] = Vector(86549694623903, 86592186252578, 86592186259051, 86592186261332, 86592186263348)
This demonstrates that LazyList
memoizes values that have been generated.
scala> ids.head res39: Long = 72091815908327
LazyList.continually
can be used to write a similar program as the one I showed you a moment ago that used
Iterator.continually
, but should only be used if the previous values need to be accessed as the program executes.
Notice that I put dots at the end of each line of the REPL input, except for the last line.
This is so the REPL knows to expect a multi-line expression.
If I had put the dots at the start of the last two lines instead, the REPL would not have read all three lines as one expression.
scala> LazyList.continually(System.nanoTime). takeWhile(_ => !io.StdIn.readLine("\nMore? <Y/n>: ") .toLowerCase .startsWith("n") ).foreach(println) More? <Y/n>: 24282594984003 More? <Y/n>: 24283649919676 More? <Y/n>:
Iterator.continually
provides a similar mechanism to LazyList.continually
that also supports lazy sequences,
but without the memory overhead of memoization.
You should use Iterator.
when you do not need to memoize the values.
About foreach
Here is how you can discover the acceptable expressions that can be passed into the foreach
higher-order method above.
The collection.Iterator
Scaladoc shows the method signature for continually
.
We see that it returns an Iterator[A]
where A
is defined by the type streamed from continually
.
In the above example,
System.Long
, so A
is of type Long
.
def continually[A](elem: => A): Iterator[A]
The Scaladoc for
Iterator.
shows the method signature.
def foreach( f: (A) => Unit): Unit
Thus Iterator.foreach
accepts a Function1[A, Unit]
, which can also be written as A => Unit
.
In other words, this foreach
accepts any Function1
that accepts an instance of A
,
and does not return a value (returns Unit
).
Acceptable examples include:
-
Scala code
x => println(x) // println returns Unit
-
Scala code
println(_) // shorthand for above
-
Scala code
println // shorter shorthand for above
-
Scala code
x => val y = callSomeMethod(x) doSomethingWithY(y) () // explicitly return Unit
-
Scala code
def myMethod[T]: Unit = { doSomethingWithT(t) () // explicitly return Unit }
Iterator .continually(System.nanoTime) .takeWhile(_ => !io.StdIn.readLine("\nMore? <Y/n>: ") .toLowerCase .startsWith("n") ).foreach(myMethod) // myMethod is automatically lifted to a Function1
Streaming Binary I/O
Here is a more efficient way to read in a binary file into a byte array, than the sample code shown in the the Higher-Order Functions lecture. This version does not do pointless character encoding.
scala> import scala.language.postfixOps import scala.language.postfixOps
scala> import sys.process._ import sys.process._
scala> val fileName = ("which scalac".!!).trim fileName: String = /usr/local/bin/scalac
scala> val bis = new java.io.BufferedInputStream(new java.io.FileInputStream(fileName)) bis: java.io.BufferedInputStream = java.io.BufferedInputStream@3c04ddda
scala> val bArray = Stream.continually(bis.read). takeWhile(-1 !=). map(_.toByte). toArray bArray: Array[Byte] = Array(35, 33, 47, 117, 115, 114, 47, 98, 105, 110, 47, 101, 110, 118, 32, 98, 97, 115, 104, 10, 35, 10, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 10, 35, 32, 67, 111, 112, 121, 114, 105, 103, 104, 116, 32, 50, 48, 48, 50, 45, 50, 48, 49, 51, 32, 76, 65, 77, 80, 47, 69, 80, 70, 76, 10, 35, 10, 35, 32, 84, 104, 105, 115, 32, 105, 115, 32, 102, 114, 101, 101, 32, 115, 111, 102, 116, 119, 97, 114, 101, 59, 32, 115, 101, 101, 32, 116, 104, 101, 32, 100, 105, 115, 116, 114, 105, 9...) scala>
scala> bis.close()
Several variations of this code are provided in HigherFun.scala
.
You can run this program by typing:
$ sbt "runMain BinaryIO"
Here is an explanation.
-
BufferedInputStream.read
returns the next byte of data, or -1 if the end of the input stream is reached. -
LazyList.continually
creates a ScalaStream
, which is a lazySeq
where elements are only evaluated when they are needed.Stream
memoizes elements as they are computed. -
takeWhile
returns the beginning elements of aSeq
that satisfy a predicate. Stream.continually(bis.read).takeWhile(_.!=(-1))
returns aStream[Int]
Stream.continually(bis.read).takeWhile(-1 !=).map(_.toByte)
returns aStream[Byte]
-
The function value
-1 !=
is extremely terse because shorthand is combined with infix notation. It is equivalent to each of these incantations:item => item != -1
item => item
.!=(-1)
_.!=(-1)
-
_ != -1
This is my preferred style because it is reasonably terse while remaining understandable to most programmers. -1 != _
© Copyright 1994-2024 Michael Slinn. All rights reserved.
If you would like to request to use this copyright-protected work in any manner,
please send an email.
This website was made using Jekyll and Mike Slinn’s Jekyll Plugins.