Published 2019-07-02.
Time to read: 8 minutes.
We covered pattern matching and extractors in the
Pattern Matching and
Sealed Classes and Extractors
lectures of the Introduction to Scala course.
This lecture expands on that material and applies it to collections like
Array
, Seq
, List
, IndexedSeq
and subtypes.
Although pattern matching is powerful and cool, it is not always the most efficient or expressive way to write code.
Alternative expressions are shown where appropriate.
This lecture discusses pattern matching operations on Array
s and Seq
and its subtypes such as
List
and Vector
. However, Array
does not support all of the
techniques shown in this lecture, so in order to use those techniques you will need to convert the Array
to a List
.
The Array.toList
method can do the job; this conversion will allow you to use all of the techniques shown in this lecture on Array
s.
As an aside, when the collection classes were reworked for Scala 2.13, they became a horrible mess.
The Array
class
inheritance diagram
is no help in determining where Array.toList
is defined because it is incomplete.
Array.toList
is provided by an implicit conversion from Array[T]
to toIndexedSeq[T]
performed by the
copyArrayToImmutableIndexedSeq
method in scala.LowPriorityImplicits2
;
Predef
is responsible for LowPriorityImplicits
and LowPriorityImplicits2
.
The sample code for this lecture can be found in
courseNotes/
.
These examples work with Array
s and any type of collection that extends Seq
.
Match an Empty Collection
Use the
SeqOps.isEmpty
method which is available for Array
and subclasses of Seq
:
scala> Nil.isEmpty val res0: Boolean = true
scala> Array().isEmpty val res1: Boolean = true
scala> List().isEmpty val res2: Boolean = true
scala> Seq().isEmpty val res3: Boolean = true
scala> Vector().isEmpty val res4: Boolean = true
scala> List.empty.isEmpty val res5: Boolean = true
scala> Seq.empty.isEmpty val res6: Boolean = true
scala> Vector.empty.isEmpty val res7: Boolean = true
scala> Array(1, 2).isEmpty val res8: Boolean = false
scala> List(1, 2, 3).isEmpty val res9: Boolean = false
scala> Seq(1, 2, 3).isEmpty val res10: Boolean = false
scala> Vector().isEmpty val res11: Boolean = false
To test for a non-empty collection, use
SeqOps.nonEmpty
.
scala> Nil.nonEmpty val res0: Boolean = false
scala> Array().nonEmpty val res1: Boolean = false
scala> List().nonEmpty val res2: Boolean = false
scala> Seq().nonEmpty val res3: Boolean = false
scala> Vector().nonEmpty val res4: Boolean = false
scala> List.empty.nonEmpty val res5: Boolean = false
scala> Seq.empty.nonEmpty val res6: Boolean = false
scala> Vector.empty.nonEmpty val res7: Boolean = false
scala> Array(1, 2).nonEmpty val res8: Boolean = true
scala> List(1, 2, 3).nonEmpty val res9: Boolean = true
scala> Seq(1, 2, 3).nonEmpty val res10: Boolean = true
scala> Vector().nonEmpty val res11: Boolean = true

Match Specified Elements
This section shows you several ways of ensuring that a collection starts with a zero element. This test is not generally useful, but the techniques shown are.
The first two ways show interesting ways of working with Scala, but the third way is actually the best practice because it uses standard combinators.
(1) Use Placeholder Splat Syntax
The Scala notation for matching all remaining elements of a Seq
or Array
is _*
.
This is a new usage of the placeholder splat syntax introduced in Splatting a Seq section of the
Classes Part 2 lecture of the
Introduction to Scala course.
When used in a case
statement as shown below the notation _*
matches zero or more remaining elements in the collection.
Placeholders used in match statements do not capture the matched values.
def hasLeadingZero(seq: Seq[Int]): Boolean = seq match { case Seq(0, _*) => true case _ => false }
The hasLeadingZero
method above checks if a collection begins with a zero,
and uses _*
to match all elements that might follow the leading zero.
Placeholder splat syntax will match an empty collection as well as a non-empty collection.
We can write similar code for an Array
:
def hasLeadingZero(array: Array[Int]): Boolean = array match { case Array(0, _*) => true case _ => false }
Both methods have the same name, but accept different parameter types. If both methods are defined in the same scope, Scala’s polymorphism mechanism will invoke the appropriate method based on the argument type passed.
The REPLs for Scala 2 and Scala 3 differ slightly. Scala 3’s REPL allows both methods to be pasted together:
$ scala Welcome to Scala 3.4.2 (11.0.23, Java OpenJDK 64-Bit Server VM). Type in expressions for evaluation. Or try :help.
scala> def hasLeadingZero(seq: Seq[Int]): Boolean = | seq match { | case Seq(0, _*) => true | case _ => false | } | | def hasLeadingZero(array: Array[Int]): Boolean = | array match { | case Array(0, _*) => true | case _ => false | } | def hasLeadingZero(seq: Seq[Int]): Boolean def hasLeadingZero(array: Array[Int]): Boolean
You cannot paste in these methods one at a time into a Scala 2 REPL;
instead, use the REPL’s :paste
command to define them together.
We discussed the REPL’s :paste
command in the
Scala Imports and Packages lecture of the
Introduction to Scala course.
Let’s test these methods:
scala> hasLeadingZero(List(0, 1, 2)) res9: Boolean = true
scala> hasLeadingZero(List(1, 2, 3)) res10: Boolean = false
scala> hasLeadingZero(List(0)) res10: Boolean = true
scala> hasLeadingZero(List(1)) res11: Boolean = false
Now let’s test Array
s:
scala> hasLeadingZero(Array(0, 1, 2)) res12: Boolean = true
scala> hasLeadingZero(Array(1, 2, 3)) res13: Boolean = false
scala> hasLeadingZero(Array(0)) res14: Boolean = true
scala> hasLeadingZero(Array(1)) res15: Boolean = false
(2) Use a Variable Name to Capture the Remaining Sequence
If you wanted to capture the value matched by the placeholder you could use a variable name instead of _*
.
The variable will match whatever is found, which will be a collection of the same type as the original input.
This means that the variable might contain an empty collection.
Here I use a variable called remainder
to hold the remaining part of seq
that is ignored.
def hasLeadingZero2(seq: Seq[Int]): Boolean = seq match { case Seq(0, remainder) => true case _ => false }
We can write similar code for an Array
:
def hasLeadingZero2(array: Array[Int]): Boolean = array match { case Array(0, remainder) => true case _ => false }
Again, you must use :paste
to define them together in the REPL. Let’s test these methods:
scala> hasLeadingZero2(List 1, 2)) res16: Boolean = true
scala> hasLeadingZero2(List(1, 2, 3)) res17: Boolean = false
scala> hasLeadingZero2(Array(0, 1, 2)) res18: Boolean = true
scala> hasLeadingZero2(Array(1, 2, 3)) res19: Boolean = false
(3) The Best Way
Here is a better way to write hasLeadingZero
.
This expression uses headOption
to ensure that the expression does not fail if the collection is empty.
If the resulting Option[Int]
contains Some(value)
, the combinator contains
tests
that the Option
contains a zero.
If either the predicate fails or headOption
returned None
, the resulting value is false
.
def hasLeading0(seq: Seq[Int]): Boolean = seq.headOption.contains(0)
This improved version can be invoked with both List
s and Array
s:
scala> hasLeading0(List(0, 1, 2)) res20: Boolean = true
scala> hasLeading0(List(1, 2, 3)) res21: Boolean = false
scala> hasLeading0(Array(0, 1, 2)) res22: Boolean = true
scala> hasLeading0(Array(1, 2, 3)) res23: Boolean = false
You can run this code by typing:
$ sbt "runMain collections.HasLeadingZero"
Extra Goodness with List and Array
Moving on from pattern matching with Seq
,
here are some examples of pattern matching that can only be done with List
or Array
.
Extracting Members from a Collection of Unknown Size
(1) Use cons
Here is one way to print out the first three tokens of a string, or "Nope"
if the string had less than three tokens. This
example uses the List
constructor method
::
(also known as cons
), which was discussed in the
Immutable Collections lecture earlier in this course.
def extract[T](list: List[T]): String = list match { case x1 :: x2 :: x3 :: rest => s"x1=$x1, x2=$x2, x3=$x3" case _ => "Nope" }
def extract[T](array: Array[T]): String = extract(array.toList)
We can invoke extract with an Array[String]
resulting from splitting a String
into tokens delimited by spaces:
scala> extract("one two three blah blah".split(" ")) res25: String = x1=one, x2=two, x3=three
scala> extract("one two three".split(" ")) res26: String = x1=one, x2=two, x3=three
scala> extract("one two".split(" ")) res27: String = Nope
Inside the extract
method, rest
contains a List
containing the rest of the tokens, so for the first
invocation of extract
it has the value List("blah", "blah")
; for the second invocation it has the
value Nil
.
(2) Use Placeholder Splat
def extract2[T](list: List[T]): String = list match { case List(x1, x2, x3, _*) => s"x1=$x1, x2=$x2, x3=$x3" case _ => "Nope" }
def extract2[T](array: Array[T]): String = array match { case Array(x1, x2, x3, _*) => s"x1=$x1, x2=$x2, x3=$x3" case _ => "Nope" }
Again both methods are defined in the same scope,
but because they accept different parameter types, Scala’s support for polymorphism invokes the appropriate method.
As before, you must use the REPL’s :paste
command in order to define both polymorphic methods at once:
scala> extract2("one two three blah blah".split(" ")) res28: String = x1=one, x2=two, x3=three
scala> extract2("one two three blah blah".split(" ").toList) res29: String = x1=one, x2=two, x3=three
scala> extract2("one two three".split(" ").toList) res30: String = x1=one, x2=two, x3=three
scala> extract2("one two".split(" ").toList) res31: String = Nope
(3) Use a Variable Name to Capture the Remaining Sequence
Instead of using placeholder splat syntax, which causes the matched remainder of the collection to be inaccessible,
you can capture it by using a variable, in this case in a variable called remainder
.
Although remainder
’s type is not declared,
it has the same type as the type of the variable that was used to match
:
def extract3[T](list: List[T]): String = list match { case List(x1, x2, x3, remainder) => s"x1=$x1, x2=$x2, x3=$x3" case _ => "Nope" }
def extract3[T](array: Array[T]): String = extract3(array.toList)
Output is as you would expect, plus you could inspect the value of remainder
within the code for the first case
expression.
You can run this program by typing:
$ sbt "runMain collections.ListExtractAnyLen"
Output is as shown above.
Extracting Members While Asserting Collection Size
This code matches against a String
that has exactly three tokens,
and either returns a formatted String
containing the tokens or or "Nope"
otherwise:
"one two three".split(" ").toList match { case x1 :: x2 :: x3 :: Nil => s"x1=$x1, x2=$x2, x3=$x3" case _ => "Nope" }
Here is an alternative implementation that uses Array
’s unapply
method:
"one two three".split(" ") match { case Array(x1, x2, x3) => s"x1=$x1, x2=$x2, x3=$x3" case _ => "Nope" }
You can run this code by typing:
$ sbt "runMain collections.ListExtractAssertLen" x1=one, x2=two, x3=three
Output for all of these implementations is the same.
Extra Goodness with IndexedSeq / Vector
Recall from the
Collections Overview
lecture that the IndexedSeq
trait is optimized for random access,
and Vector
is an implementation of IndexedSeq
.
All of the following examples with work with Array
, List
and other Seq
trait implementations,
but they will take a long time for large collections compared to IndexedSeq
trait implementations such as Vector
.
For that reason I do not show versions of these examples that accept Array
.
Match head and tail
To get the head
and tail
, use the Seq
method called
+:
(prepend),
which was discussed in the
Immutable Collections lecture earlier in this course.
def maybeHeadTail[A](vector: Vector[A]): Option[(A, Vector[A])] = vector match { case head +: tail => Some(head -> tail) case _ => None }
We can call maybeHeadTail
with a nonEmpty
Vector
.
Notice how the REPL shows the matching type, which is Option[Tuple2[Int, Vector[Int]]]
in this case:
scala> maybeHeadTail(Vector(1, 2, 3, 4)) res32: Option[(Int, Vector[Int])] = Some((1,Vector(2, 3, 4)))
Let’s call maybeHeadTail
with a Vector
that only has one element,
which yields a tail with no elements:
scala> maybeHeadTail(Vector(1)) res33: Option[(Int, Vector[Int])] = Some((1,Vector()))
Let’s call maybeHeadTail
with a Vector
that has no elements,
which yields None
instead of a head and a tail.
Notice that the matching type is Option[Tuple2[Nothing, Vector[Nothing]]]
,
which does not match the type that the compiler expects for the first case statement.
scala> maybeHeadTail(Vector()) res34: Option[(Nothing, Vector[Nothing])] = None
You can run this code by typing:
$ sbt "runMain collections.VectorHeadTail"
Match init and last
The Immutable Collections lecture discussed init
and last
.
Notice how similar this method is to the preceding.
It uses the :+
(append) operator instead of the prepend operator.
def maybeInitLast[A](vector: Vector[A]): Option[(Vector[A], A)] = vector match {
case init :+ last => Some(init -> last)
case _ => None
}
We can call maybeInitLast
with a nonEmpty
Vector
. Notice how the REPL shows the matching type,
which is Option[Tuple2[Vector[Int], Int]]
in this case:
scala> maybeInitLast(Vector(1, 2, 3, 4)) res35: Option[(Vector[Int], Int)] = Some((Vector(1, 2, 3),4))
Let’s call maybeInitLast
with a Vector
that only has one element,
which yields an init with no elements:
scala> maybeInitLast(Vector(1)) res36: Option[(Vector[Int], Int)] = Some((Vector(),1))
Let’s call maybeInitLast
with a Vector
that has no elements,
which yields None
instead of a head and a tail.
Again, notice that the matching type is Option[Tuple2[Vector[Nothing], Nothing]]
,
which does not match the type that the compiler expects for the first case statement.
scala> maybeInitLast(Vector()) res37: Option[(Vector[Nothing], Nothing)] = None
You can run this code by typing:
$ sbt "runMain collections.VectorInitLast"
Scala 2.13’s Pattern Matching on String
This is very cool!
Scala 2.13 introduced pattern matching on String using
scala.StringContext.unapplySeq()
.
The unapplySeq
method is a version of unapply
for collections,
and Scala treats String
like a collection of Char
.
Here are some simple examples, inspired by the code examples in the Scaladoc, and pasted into a REPL.
The matching substring allow the non-matching substring to be captured into a new variable called name
:
scala> val s"Hello, $name" = "Hello, James" name: String = James
A MatchError
results if the strings don’t match at all:
scala> val s"Goodbye, $name" = "Hello, James" scala.MatchError: Hello, James (of class java.lang.String) ... 28 elided
You might want to use this with a match or partial function so MatchError
s are handled invisibly.
scala> "Hello, James" match { case s"Hello, $name" => println(s"Greeted $name") case x => println(s"Could not parse ’$x’") } Greeted James
Sadly, a compiler bug means this capability is not yet ready for production code. The following blows up the compiler:
scala> "Goodbye, James" match {
case s"Hello, $name" => println(s"Greeted $name")
case x => println(s"Could not parse ’$x’")
}
Scala 2.13.1 will fix this bug.
Multiple matches are supported:
scala> val s"$greeting, $name" = "Hello, James" greeting: String = Hello name: String = James
List.unapplySeq
also in concert with String
’s pattern matching:
scala> List("Not a match", "name: suffix", "Who goosed the moose?").map { case s"name: $value" => value case s"Who goosed the $animal?" => animal case _ => "" } res4: List[String] = List("", suffix, moose)
Regular expressions also work:
scala> val TimeSplitter = "([0-9]+)[.:]([0-9]+)".r TimeSplitter: scala.util.matching.Regex = ([0-9]+)[.:]([0-9]+)
scala> val s"The time is ${ TimeSplitter(hours, mins) }" = "The time is 10.50" hours: String = 10 mins: String = 50
Again, using match
provides a way to handle parse errors gracefully:
scala> "The time is 10.50" match { case s"The time is ${ TimeSplitter(hours, mins) }" => Some((hours, mins)) case _ => None } res6: Option[(String, String)] = Some((10,50))
Aliases and Sequence Element Matching
A partial function can express complex logic. Aliases and sequence matching can make the logic much easier to understand. The Sealed Classes and Extractors lecture of the Introduction to Scala course introduced match aliases. Let’s have a little fun while exploring how to use aliases with sequence matching.
Your mission is to eat dessert, but you are on a diet.
Meals with only 1 course do not have dessert, so you can only eat dessert as part of a 3 course meal.
You can feed the first course to the dog if you don’t like it, but the dog cannot eat spinach since that is poisonous to dogs.
You need to be able to determine which meals you can order that either are low calorie,
or have a tasty enough item that is worth breaking the diet for.
Let’s start by defining a Food
case class:
case class Food(name: String, calories: Int, yumminess: Int) { def dogCanEat = name!="Spinach" }
And let’s define a Menu
case class:
case class Menu(foods: Food*) { def totalCalories = foods.map(_.calories).sum
def mostYucky = foods.sortBy(_.yumminess).head
def mostTasty = foods.sortBy(_.yumminess).last
def isWorthOrdering: Boolean = { //println(s"totalCalories=$totalCalories; mostYucky.yumminess=${mostYucky.yumminess}; mostTasty.yumminess=${mostTasty.yumminess}") totalCalories<200 || mostTasty.yumminess>=9 }
override def toString = s"${foods.map(_.name).mkString("_")}: totalCalories=$totalCalories; mostYucky=${mostYucky.name}; mostTasty=${mostTasty.name}" }
Now we can define the menu:
val course1a = Food("Spinach", 10, 2) val course2a = Food("Turnips", 10, 1) val course1b = Food("Peas", 50, 5) val course2b = Food("Potatoes", 110, 6) val dessert = Food("BelgianChocolate", 135, 9)
val menu1 = Menu(course1a) val menu2 = Menu(course1a, course2a) val menu3 = Menu(course1a, course2a, dessert) val menu4 = Menu(course1b, course2a, dessert) val menu5 = Menu(course1b, course2b, dessert)
Now we can get to the point of this example.
Notice how the alias allows the entire matched item to be given a name (menu
),
while still ensuring that the matched list has exactly three elements, and that the remaining constraints are satisfied.
This complex logic is easy to understand and maintain.
We start by creating a List[Menu]
containing all available Menu
instances and filter
the List
.
Filter is a higher-order FunctionN
that accepts a lambda or named FunctionN
as discussed in the
Higher-Order Functions lecture earlier in this course.
As we learned in the Partial Functions lecture,
you can supply a partial function whenever a regular FunctionN
is expected.
This partial function has two cases: one returns true
and the other returns false
.
The first case is the interesting one.
It features an alias called menu
of type Menu
.
The Menu constructor is a varargs Array[Menu]
, and this case will only match for menus that have exactly three courses,
named c1
, c2
and c3
.
Then the menu item that is currently being matched is worth ordering, or if we can feed the first course to the dog.
If Menu.unapply
matches because the Menu
has three courses and the guard passes then this case returns true
.
val result = List(menu1, menu2, menu3, menu4, menu5) filter { case menu @ Menu(c1, c2, c3) if menu.isWorthOrdering || c1.dogCanEat => true case _ => false }
println(result.mkString("\n"))
You can run this example by typing:
$ sbt "runMain collections.MatchAlias"
Output is as follows. Notice that menu1
and menu2
were filtered out because they did not contain 3 courses.
Spinach_Turnips_BelgianChocolate: totalCalories=155; mostYucky=Turnips; mostTasty=BelgianChocolate Peas_Turnips_BelgianChocolate: totalCalories=195; mostYucky=Turnips; mostTasty=BelgianChocolate Peas_Potatoes_BelgianChocolate: totalCalories=295; mostYucky=Peas; mostTasty=BelgianChocolate
Exercise – Extract File Type from Path
Write a method that accepts a String
containing an arbitrary file name, which may or may not include a path (subdirectories).
Use list pattern matching to parse out the file type.
Because the file name might just consist of a dot, the method should return an Option[String]
.
For example:
- Given
prefix.page.html
, returnSome(html)
. - Given
blah.png
, returnSome(png
). - Given
dir/path/to/a.b.c.d
, returnSome(d)
.
Hints
Solution
This solution works for any number of periods in a file name:
def findExtension(s: String): Option[String] = s.reverse.split("\\.") match { case Array(ext, _*) if ext.nonEmpty && !s.endsWith(".") && s.contains(".") => Some(ext.reverse) case _ => None }
The string s
is first reversed, and then split wherever a period occurred into an Array[String]
,
which is matched against a case using placeholder splat syntax.
The pattern requires at least two strings in the Array[String
], and the first String
must not be empty;
this had to be tested for because of how split
works for various types of input.
The guard also ensures that the original string did not end with a period and that there was at least one period .
Try it in the REPL to see:
scala> "prefix.page.html".split("\\.") res0: Array[String] = Array(prefix, page, html)
scala> ".page.html".split("\\.") res1: Array[String] = Array("", page, html)
scala> "page.html".split("\\.") res2: Array[String] = Array(page, html)
scala> ".html".split("\\.") res3: Array[String] = Array("", html)
scala> "page.".split("\\.") res4: Array[String] = Array(page)
scala> "page".split("\\.") res5: Array[String] = Array(page)
The extracted file name extension is then reversed back to its original direction. Let’s try it in the REPL:
scala> findExtension("html") res6: Option[String] = None
scala> findExtension(".html") res7: Option[String] = Some(html)
scala> findExtension("page.html") res8: Option[String] = Some(html)
scala> findExtension("path/prefix.page.html") res9: Option[String] = Some(html)
To run this program, type:
$ sbt "runMain solutions.ExtractFileType"
© Copyright 1994-2024 Michael Slinn. All rights reserved.
If you would like to request to use this copyright-protected work in any manner,
please send an email.
This website was made using Jekyll and Mike Slinn’s Jekyll Plugins.