Mike Slinn

Closures

— Draft —

Published 2015-03-10. Last modified 2015-11-19.
Time to read: 5 minutes.

Scala closures are not well understood by many programmers, but understanding them is crucial to writing correct programs. This information is necessary in order to work through the Higher-Order Functions lecture of the Intermediate Scala course.

A solid understanding of closures is also essential in order to work with multithreaded code as described in the Intermediate Scala course. The Future Bad Habits and Exercise lecture from that same course discusses closures further.

The sample code for this lecture can be found in courseNotes/src/main/scala/Closures.scala.

Definition

A closure is a function reference or a lambda function together with the variables it references from the lexical scope in which the function is defined. A free variable (also known as an open binding) is a variable used in a method or function that is neither a locally defined variable nor a parameter of that method or function; in other words, a free variable / open binding is a variable used within a method or FunctionN, but defined in an enclosing scope. When the closure is created, its free variables / open bindings are bound to the environment as it exists at that point in time: a val is bound to its value, and a var is bound to its storage location, so later assignments to the var are visible to the closure.

To elaborate on the definition I just gave, a Scala closure is actually a data structure that stores a FunctionN together with an environment; the closure environment is a mapping of the FunctionN’s free variables to the value or storage location the variable names were bound to at the time the closure was created. A FunctionN is not a closure if its implementation only references its parameters, which means that it has no free variables / open bindings.
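
The distinction between a closure and a plain FunctionN can be shown with a minimal sketch. The names FreeVariableSketch, factor, scale and double are hypothetical and are not part of the course's sample code: scale has one free variable (factor), so it is a closure, while double references only its parameter and is not.

```scala
object FreeVariableSketch {
  // Defined in the enclosing scope, outside of any function body.
  val factor: Int = 3

  // `factor` is a free variable of this Function1, so `scale` is a closure.
  val scale: Int => Int = (x: Int) => x * factor

  // This Function1 references only its own parameter, so it has no
  // free variables and is not a closure.
  val double: Int => Int = (x: Int) => x * 2

  def main(args: Array[String]): Unit = {
    println(scale(4))  // 12
    println(double(4)) // 8
  }
}
```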

Here is a more succinct definition: a closure is a FunctionN whose open bindings (free variables) have been closed by (or bound to) the lexical environment, resulting in a closed expression, or closure.

Objects are data structures with functions.
Closures are functions with data.
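
As a rough illustration of that saying, the sketch below implements the same counter twice: once as an object and once as a closure. The names CounterSketch, Counter and makeCounter are hypothetical and chosen only for this example.

```scala
object CounterSketch {
  // An object: a data structure (count) with a function (next) attached.
  class Counter {
    private var count = 0
    def next(): Int = { count += 1; count }
  }

  // A closure: a function (the returned Function0) with data (count) attached.
  // Each call to makeCounter creates a fresh count variable.
  def makeCounter(): () => Int = {
    var count = 0
    () => { count += 1; count }
  }

  def main(args: Array[String]): Unit = {
    val counter = new Counter
    val next = makeCounter()
    println(counter.next()) // 1
    println(counter.next()) // 2
    println(next())         // 1
    println(next())         // 2
  }
}
```

In both versions the mutable state is private: only the method, or the returned function, can reach it.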

Example

Let’s look at an example, where we define a Function1[Int, String] called repeat:

Scala REPL
scala> val msg = "Blah. "
msg: String = "Blah. "

scala> val repeat: Int => String = msg * (_: Int)
repeat: Int => String = <function1>

scala> println(repeat(3))
Blah. Blah. Blah.

msg is a free variable that is bound to repeat because it is referenced in the body of that function; we say that repeat captures or closes over msg. The binding to the outer msg is established when the Function1 is created, and because msg is a val, repeat always sees the value that msg had at that moment.

A closure often references a variable in an outer scope.

Scala REPL
scala> object Outer {
     |   val msg2 = "One more time! "
     |   object Inner {
     |     val repeat2: Int => String = msg2 * (_: Int)
     |     println(repeat2(3))
     |   }
     | }
defined object Outer

scala> Outer.Inner
One more time! One more time! One more time!
res0: Outer.Inner.type = Outer$Inner$@71ba266a

Here we see that repeat2 closes over msg2. As we shall learn in the Futures & Promises lecture in the Intermediate Scala course, you need to be careful with closures that execute on another thread.

You can run this code by typing:

Shell
$ sbt "runMain Closures"

Wikipedia has a good article on closures.

Closure Improvements with Scala 2.12

This section is paraphrased from the Scala 2.12.0-M3 release notes.

Scala 2.12 emits closures (as JVM byte code) in the same style as Java 8.

For each lambda, the compiler generates a method containing the lambda body. At runtime, this method is passed as an argument to the LambdaMetafactory provided by the JDK, which creates a closure object.

Compared to Scala 2.11, the new scheme has the advantage that the compiler does not generate an anonymous class for each lambda anymore. This leads to significantly smaller JAR files, less memory usage and faster code.

Q&A (Supplemental)

This is an edited transcript of a Q & A with a student about closures.

Question 1

I read this StackOverflow thread about closures. I understand everything until this:

Scala solves this problem by replacing var token in the closure by reference objects. Given a class:

Scala code from Stackoverflow
class Ref[A](var content: A)

The code is first replaced by:

Scala code
val iRef = new Ref[Int](0)
l.foreach { a =>
  println(iRef.content + ": " + a);
  iRef.content += 1
}
println(s"There are ${iRef.content} elements in the list")

This is done of course only to var that happens to be taken by a closure, not to every var. Doing that, the var has been replaced by a val, the actual variable value has been moved into the heap. Now, the closure can be done as usual, and it works.

Scala code
class SomeFreshName(iRef: Ref[Int]) ...

What is going on here?

Didier Dupont, the author of the second answer that you referenced, is very good with Scala, but his English is weak, so his explanations can be confusing. Daniel Sobral, who commented on Didier’s answer, is also very strong with Scala, and his English is excellent. This StackOverflow answer, and the one before it, are both good discussions about closures.

The block in question contains an instance of Ref. As with Java, when an instance of a class is passed as a parameter into a Scala method, only a reference to the class instance is received by the method, not a copy of the entire object. This means that the properties which can be accessed through that reference within the method are actually the properties of the original object, not of a copy. As long as a closure holds that reference, the original object cannot be garbage collected, because it must remain in memory in order for its properties to be available for inspection.
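
To make the quoted rewrite concrete, here is a hand-written approximation of it. It is only a sketch: the body of SomeFreshName, the countElements helper and the RefSketch wrapper are assumptions made for illustration and are not the compiler's actual output (scalac uses its own reference cells, such as scala.runtime.IntRef, rather than a user-defined Ref).

```scala
object RefSketch {
  // Hand-written stand-in for the reference cell in the quoted answer.
  class Ref[A](var content: A)

  // Hand-written stand-in for the class that the quoted answer calls
  // SomeFreshName: it holds the Ref and implements Function1.
  class SomeFreshName(iRef: Ref[Int]) extends (String => Unit) {
    def apply(a: String): Unit = {
      println(s"${iRef.content}: $a")
      iRef.content += 1
    }
  }

  // Hypothetical helper: visits every element and returns how many it saw.
  def countElements(l: List[String]): Int = {
    val iRef = new Ref[Int](0)         // the counter now lives on the heap
    l.foreach(new SomeFreshName(iRef)) // the function object shares the same Ref
    iRef.content                       // updates made inside apply are visible here
  }

  def main(args: Array[String]): Unit = {
    val n = countElements(List("a", "b", "c"))
    println(s"There are $n elements in the list")
  }
}
```

Because the method and the function object both hold a reference to the same Ref instance, updates made inside apply are visible after foreach returns.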

I suggest you write a short console application to test it out. Warning: the Scala REPL’s handling of closures is significantly different from how closures are handled in a real Scala program. Worksheets resemble the REPL in this regard. If instead you use a debugger (IntelliJ IDEA or Visual Studio Code with Metals) on a real Scala program, and place breakpoints at various locations, you will be able to examine the stack at each of them. This will show you the variable references held by closures. You will be able to disambiguate the variable references in the debugger by paying attention to their hex addresses.

Question 2

What is going on here?

Scala REPL
scala> var msg2 = "Three "
msg2: String = "Three "

scala> val test: Int => String = msg2 * (_: Int)
test: Int => String = <function1>

scala> println(test(3))
Three Three Three

scala> msg2 = "Two "
msg2: String = "Two "

scala> println(test(2))
Two Two

The above shows how a Function1 (test) references external state (msg2) as a closure. msg2 is a free variable because it is defined outside the scope of the test body. Each time test is evaluated the current value of msg2 is referenced.

As we will learn in the Future Bad Habits and Exercise lecture of the Intermediate Scala course, accessing mutable free variables from a closure in a different thread context often introduces data value inconsistencies.
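
One common way to reduce this risk is to copy the current value of the var into an immutable val and close over the val instead. The sketch below is single-threaded and only illustrates the difference in binding; the names SnapshotSketch, greeting, live and frozen are hypothetical.

```scala
object SnapshotSketch {
  var greeting: String = "Three "

  // Closes over the var itself: every call reads the current value.
  val live: Int => String = (n: Int) => greeting * n

  // Copies the current value into an immutable local val first, then
  // closes over that val: later assignments to `greeting` are not seen.
  val frozen: Int => String = {
    val snapshot = greeting
    (n: Int) => snapshot * n
  }

  def main(args: Array[String]): Unit = {
    greeting = "Two "
    println(live(2))   // built from the new value "Two "
    println(frozen(2)) // built from the original value "Three "
  }
}
```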

Question 3

What is going on here?

Scala REPL
scala> class Cell(var x: Int)
defined class Cell

scala> var c = new Cell(1)
c: Cell = Cell@2e385cce

scala> val f1 = () => c.x
f1: () => Int = <function0>

scala> println(f1())
1

scala> c = new Cell(10)
c: Cell = Cell@5e21e98f

scala> println(f1())
10

The Scala compiler recognizes that the Function0 (f1) references a property of a class instance (c.x). As in the previous example, the compiler binds the Function0 to the variable c instead of a copy of c. Each time f1 is evaluated, c is dereferenced and the property x from the Cell reference is retrieved.
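
The following sketch separates the two kinds of change that a closure like f1 can observe: rebinding the variable, and mutating the object it refers to. The names CellSketch, demo, original, viaVar and viaVal are hypothetical and introduced only for this illustration.

```scala
object CellSketch {
  class Cell(var x: Int)

  // Returns the four values observed, in order, so they can be inspected.
  def demo(): List[Int] = {
    var c = new Cell(1)
    val original = c              // immutable reference to the first Cell

    val viaVar = () => c.x        // dereferences whatever c refers to now
    val viaVal = () => original.x // always dereferences the first Cell

    c = new Cell(10)              // rebinding c: only viaVar notices
    val afterRebind = List(viaVar(), viaVal())

    original.x = 5                // mutating the first Cell: only viaVal notices
    val afterMutate = List(viaVar(), viaVal())

    afterRebind ++ afterMutate
  }

  def main(args: Array[String]): Unit =
    println(demo()) // List(10, 1, 10, 5)
}
```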

BTW, if this example were executed in a multithreaded environment, such that f1 was created in one execution context and evaluated in another, the explanation would be more complex, the results could be different, and the code would be a source of hard-to-find bugs. I discuss this in the MultiThreading lecture of the Intermediate Scala course.

