Published 2015-03-04.
Last modified 2015-08-09.
Time to read: 7 minutes.
The material in this course relies heavily upon the material presented in these preceding lectures:
- Functions are First Class
- Parametric Types (aka generics)
- Lambda Review & Drill
- Immutable Collections
- Mutable Collections
This lecture shows you progressively more sophisticated memoization techniques,
starting with a baby memoization that generates output to show how it works,
to a Memoize
trait,
a Memoizer
class and two techniques for memoizing methods and functions that accept more than one parameter.
This lecture also shows how a memoized lambda could have unit tests written for it, thereby addressing one of the primary complaints about lambdas.
The sample code for this lecture can be found in
courseNotes/
.
Review
The Recursion and Functional Programming lecture of the
Introduction to Scala
course showed the following code that memoized the results of the fn
method.
It uses an immutable.HashMap
(discussed in the Immutable Collections lecture)
to store values that needed to be permanently stored, and a mutable.WeakhashMap
(discussed in the Mutable Collections lecture)
to store a potentially large number of computed values for quick access.
object FibMem extends App { val defaultMap = immutable.HashMap(0 -> BigInt(0), 1 -> BigInt(1)) val cache = mutable.WeakHashMap[Int, BigInt]().withDefault(defaultMap)
def fn(n: Int): BigInt = { @tailrec def fibIter(count: Int, fibN1: BigInt, fibN2: BigInt): BigInt = if (count == n) { cache += n -> fibN1 fibN1 } else fibIter(count + 1, fibN2, fibN1 + fibN2)
fibIter(0, 0L, 1L) }
def fibMem(n: Int): BigInt = cache.getOrElse(n, fn(n))
Time.fibTest(fibMem) }
The memoization code is intertwined with the other logic; this was necessary because we had not yet discussed the material presented in the Higher-Order Functions and Parametric Types lectures. This lecture demonstrates techniques that factors out memoization from the rest of the program.
The code examples shown in the remainder of this lecture assume that all values are equally important, so only a WeakHashMap
is employed throughout.
If you need to memoize a recursive method which starts from fixed values then you should use an
immutable.HashMap
to store the initial value(s), as shown in the above example.
Baby Steps
Normally only Function1
s are memoized, which is to say that methods and functions that accept only one parameter are commonly memoized.
We’ll explore two techniques later in this lecture to memoize methods and Function
s that accept more than one parameter.
N
Here is an example of a MemoizeBaby
trait that can be mixed into any object or class that contains a
method or FunctionN
that you would like to memoize.
It just consists of one method, memoize
, that accepts two parameters: a Function1
to be memoized,
and a name for the method that will be output to ease your understanding while learning.
trait MemoizeBaby { /** Version just for learning. * Transform given Function1 instance into another Function1 instance backed by a WeakHashMap for memoization of parameter/result pairs. * @param f Must be Function1 (single argument) * @param name displayed for this memoized Function, only present for learning purposes; remove for production code */ def memoize[Key, Value](f: Key => Value, name: String="") = { /** Each memoized Function has its own cache */ val cache = collection.mutable.WeakHashMap.empty[Key, Value] (key: Key) => { if (cache.keySet.contains(key)) println(s"Retrieving $name($key)=${cache(key)}") cache.getOrElseUpdate(key, f(key)) } } }
An empty cache is created, then a Function1[Key, Value]
is defined that references the cache,
and because the Function1
is the last expression in the method, it is returned as the value of the method.
Key
and Value
are parametric types, and could have been written as single letters
like [T, U]
, but I wrote it them using full words because it made their purposes more clear.
The Function1
returned by the memoize
method has a reference to the cache created within the method,
so the cache will remain in memory after memoize
returns.
This style of programming can lead to memory leaks, but this is not a mistake, it was done on purpose.
Just beware that the cache will remain until all references to it are eventually destroyed.
Let’s say we want to memoize the method1
method, defined within a console app that has the MemoizeBaby
trait mixed in.
object Memo extends App with MemoizeBaby { def method1(i: Int): Int = { val result = i * 2 + 1 println(s"Computing f1($i)=$result") result } }
We learned about method lifting in the
Functions are First Class
lecture of the Introduction to Scala course, so we know that
method1
can be lifted from the object
, trait
or class
that it was defined in and embedded in a new instance of a Function1[Int, Int]
.
Scala automatically lifts methods as required, so when we write the following, method1
is lifted into a
Function1
called f1
.
scala> val f1 = Memo.memoize(method1, "f1") f1: Int => Int = <function1>
The type returned by memoize is Int => Int
, or written in long form, Function1[Int, Int]
.
If this seems unfamiliar to you, please review the Functions are First Class and
More Fun With Functions lectures of the
Introduction to Scala course.
If you just need practice on recognizing these forms, please work through the
Lambda Review & Drill lecture from that course.
The f1
instance of Function1
is a memoized version of method1
.
We can invoke a memoized version just the same as the original method or Function1
.
scala> f1(1) Computing f1(1)=3 res0: Int = 3
scala> f1(1) Retrieving f1(1)=3 res1: Int = 3
The first time f1
was invoked with an argument of 1, the returned value was computed.
All subsequent invocations with that argument are satisfied by returning the cached value, instead of recomputing it.
The memoization cache is created by the memoize
method each time it is invoked, which means that any number of methods in the given class or object can be memoized by mixing in the MemoizeBaby
trait.
If we add the method2
method to the Memo
object, we can memoize it also.
def method2(i: Int): Int = { val result = i * 6 - 21 println(s"Computing f2($i)=$result") result }
We can see that the caches for each memoized method are separate.
scala> val f2 = Memo.memoize(method2, "f2") f2: Int => Int = <function1>
scala> f2(1) Computing f2(1)=-15 res0: Int = -15
scala> f1(1) Retrieving f1(1)=3 res1: Int = 3
You can run the above code by typing:
$ sbt "runMain Memo"
Production Version of Memoize Trait
This version does away with the training wheels,
in that the second parameter to the previous version of memoize
is not present any more.
trait Memoize { /** Transform given `Function1` instance into another `Function1` instance backed by a `WeakHashMap` for memoization of parameter/result pairs. * @param f Must be `Function1` (single argument) */ def memoize[Key, Value](f: Key => Value) = { /** Each memoized Function has its own cache */ val cache = collection.mutable.WeakHashMap.empty[Key, Value] (key: Key) => { cache.getOrElseUpdate(key, f(key)) } } }
Exercise – Mixing in the Memoize Trait
Memoize method1
and method2
above using the production version of the Memoize
trait.
You can reference these methods in your code as cache.Memo.method1
and cache.Memo.method2
.
Test the resulting memoized methods with a variety of input values.
Solution
package solutions
object MemoProd extends App with cache.Memoize { import cache.Memo.{method1, method2}
val f1: Int => Int = memoize(method1) val f2: Int => Int = memoize(method2)
// Compute values; side effect: memoizes results into caches 1 to 3 foreach { i => println(s"f1(i)=${f1(i)}") println(s"f2(i)=${f2(i)}") }
// Fetch first 3 results from memo caches and compute the other values 1 to 6 foreach { i => println(s"f1(i)=${f1(i)}") println(s"f2(i)=${f2(i)}") } }
You can run this solution by typing:
$ sbt "runMain solutions.MemoProd"
The output of f1
and f2
shows when a value is computed for the first time.
Computingf1(1)=3 f1(i)=3 Computing f2(1)=-15 f2(i)=-15 Computing f1(2)=5 f1(i)=5 Computing f2(2)=-9 f2(i)=-9 Computing f1(3)=7 f1(i)=7 Computing f2(3)=-3 f2(i)=-3 f1(i)=3 f2(i)=-15 f1(i)=5 f2(i)=-9 f1(i)=7 f2(i)=-3 Computing f1(4)=9 f1(i)=9 Computing f2(4)=3 f2(i)=3 Computing f1(5)=11 f1(i)=11 Computing f2(5)=9 f2(i)=9 Computing f1(6)=13 f1(i)=13 Computing f2(6)=15 f2(i)=15
Memoizing Arbitrary Methods
If you don’t want to mix in the Memoize
trait into the enclosing class or trait that contains the
method that you wish to memoize, or you can’t because you don’t have access to the source code or because it is a lambda, fear not!
You can use the technique shown in this section to memoize any Function1
, including anonymous Function1
s.
Recall that another name for an anonymous FunctionN
is lambda function, or simply lambda
(discussed in the Functions are First Class lecture of the
Introduction to Scala course).
The private[this]
access modifier was discussed in the
Deceptively Familiar: Access Modifiers lecture of the same course.
class Memoizer[Key, Value](f: Key => Value) { private[this] val vals = collection.mutable.Map.empty[Key, Value] def apply(x: Key): Value = if (vals.keySet.contains(x)) { vals(x) } else { val y = f(x) vals += x -> y y } }
object Memoizer { def apply[Key, Value](function1: Key => Value) = new Memoizer(function1) }
The Memoizer
object has an apply
method, which we have seen used as a factory method to create class instances many times before.The Memoizer
class also has an apply
method, which means that an instance of the class can invoke that apply
default method.
For example, let’s create an instance of the Memoizer
class called memoizedDoubler
which memoizes a lambda and then implicitly invoke the Memoizer
class’s apply
method.
scala> val memoizedDoubler = Memoizer((_:Int) * 2) memoize: Memoizer[Int,Int] = Memoizer@28e19366
scala> memoizedDoubler(3) res2: Int = 6
Because the memoized lambda is stored in a variable, we could write unit tests for it, which removes one of the primary complaints about lambdas.
Exercise – Wrapping a Method in the Memoizer Class
The following code uses the same calculatePiFor
and time
methods we’ve seen before in
order to compute Pi
to arbitrary precision and to measure the length of time it takes to perform the computation.
Your task is to use the Memoizer
class to compute piMemoized
.
object MemoDemo extends App { val calculatePiFor: Int => Double = (decimals: Int) => { var acc = 0.0 for (i <- 0 until decimals) acc += 4.0 * (1 - (i % 2) * 2) / (2 * i + 1) acc }
/** Measure execution time of the given block of code */ def time[T](msg: String)(block: => T): T = { val t0 = System.nanoTime() val result: T = block val elapsedMs = (System.nanoTime() - t0) / 1000000 println(s" Elapsed time for $msg: " + elapsedMs + "ms") result }
val piMemoized = ???
def doIt(n: Int): Double = time(s"pi to $n decimals")(piMemoized(n))
doIt(2000000) doIt(2000000) doIt(3000000) doIt(3000000) }
Solution
The solution may seem almost trivial, but do you understand it?
val piMemoized = Memoizer(calculatePiFor)
You can run this solution by typing:
$ sbt "runMain solutions.MemoDemo"
Memoizing Higher-Arity FunctionNs
If you have a method or FunctionN
that requires multiple arguments, you have two choices:
-
Use a tuple of the arguments as the key for one of the
memoize
functions we used earlier. This approach effectively folds an extra dimension’s worth of requests and responses into a 2D space. Lookups are a bit slower than for scalar keys, and memory usage could explode geometrically, but the results are still obtained much faster than without memoization. -
Create a partially applied function from the
FunctionN
to be memoized, and only memoize the partially applied function. This means that the remaining parameters must be provided If you need a review of partially applied functions and how they relate toFunctionN
s, see the Partially Applied Functions lecture before going any further.
Let’s explore each of these approaches. Each of them is fancy in its own special way.
If you need to memoize a method or FunctionN
with arity > 2 you could combine both of these approaches.
Memoizing Using a Tuple as a Key
The type parameter Key
could just as well be a TupleN[Type1, Type2...
TypeN]
instead of a scalar type like Int
or String
.
For example, let’s memoize a method that accepts three parameters.
def dupChop(string: String, times: Int, maxLength: Int): String = { val result = string * times result.substring(0, math.min(result.length, maxLength)) }
The dupChop
method can be converted to a Function1
that accepts a tuple quite easily.
As we learned in the Functions are First Class lecture,
we can lift the method to a Function3
by following it with a space, then an underscore, like this.
scala> dupChop _ res3: (String, Int, Int) => String = <function3>
Note that a Function3[String, Int, Int, String]
was returned,
but we wanted a Function1[Tuple3[String, Int, Int], String]
,
which can be written in shorthand as ((String, Int, Int)) => String
.
To do this, simply call the tupled
method on the Function3
.
scala>scala> (dupChop _).tupled
res4: ((String, Int, Int)) => String = <function3>
The syntax is easier to read when written using postfix notation, so I’ll enable that syntax as discussed in the Scala Imports and Packages lecture of the Introduction to Scala course.
scala> import scala.language.postfixOps import scala.language.postfixOps
scala> val tupledDupChop = dupChop _ tupled tupledDupChop: ((String, Int, Int)) => String = <function1>
Now that we have a Function1
called tupledDupChop
that accepts a
Tuple3[String, Int, Int]
and returns a String
.
Lets create an instance of the necessary type of tuple and then pass it to tupledDupChop
.
scala> val tuple = ("hi ", 3, 10) tuple: (String, Int, Int) = ("hi ",3,10)
scala> tupledDupChop(tuple) res5: String = "hi hi hi "
As you can see, we got the expected result from invoking tupledDupChop
: "hi hi hi"
.
We can memoize this tupled Function1
in two ways.
- By mixing the
Memoize
trait into the enclosingobject
,trait
orclass
- By wrapping
tupledDupChop
in aMemoizer
instance.
Here is an example of the latter approach:
scala> val tdc = Memoizer(tupledDupChop) tdc: Memoizer[(String, Int, Int),String] = Memoizer@75bd28d
scala> tdc(tuple) res6: String = "hi hi hi "
scala> tdc(("bye ", 5, 20)) res7: String = "bye bye bye bye bye "
You can run this code by typing:
$ sbt "runMain MemoizeTuple"
Memoizing Partially Applied Functions
By binding a FunctionN
parameter we can create a partially applied function.
Let’s define a method called method
that accepts two parameters.
scala> def method(i: Int, string: String): String = string * i method1: (i: Int, string: String)String
As you will see next, there is no difference in how memoization works regarding how the parameters are bound.
Binding a Parameter to a Variable
Now we’ll bind the first parameter to the value of key1
and lift it into a
partially applied Function1
using placeholder syntax.
scala> val key1 = 1 key1: Int = 1
scala> val pf1 = method(key1, _: String) pf1: String => String = <function1>
scala> pf1("hi ") res8: String = "hi "
When we invoked the partially applied function pf1
we just supplied the value of the second parameter,
and the result was returned.
Now lets memoize the partially bound Function1
and invoke it.
scala> val memoizedF1 = memoize(pf1) memoizedF1: String => String = <function1>
scala> memoizedF1("hi ") res10: String = "hi "
Binding a Parameter to a Constant
Let’s try that again, but instead of providing the value to be bound via a variable, we’ll provide a constant.
scala> val pf2 = method(2, _: String) pf2: String => String = <function1>
scala> pf2("bye ") res8: String = "bye bye "
Now lets memoize the partially bound Function1
and invoke it.
scala> val memoizedF2 = memoize(pf2) memoizedF2: String => String = <function1>
scala> memoizedF2("bye ") res11: String = "bye bye "
Binding a Parameter to the Result of a Computation
Let’s try that again, but this time provide a computation that will provide the value for the bound variable.
scala> val pf42 = method(key1*42, _: String) pf42: String => String = <function1>
scala> pf42("42 ") res9: String = "42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 "
Now lets memoize the partially bound Function1
and invoke it.
scala> val memoizedF42 = memoize(pf42) memoizedF42: String => String = <function1>
scala> memoizedF1("42 ") res12: String = "42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 "
© Copyright 1994-2024 Michael Slinn. All rights reserved.
If you would like to request to use this copyright-protected work in any manner,
please send an email.
This website was made using Jekyll and Mike Slinn’s Jekyll Plugins.