Mike Slinn

Functions are First Class

— Draft —

Published 2014-01-13. Last modified 2016-07-14.
Time to read: 6 minutes.

This lecture lays the foundation for Scala's functional programming capability. It covers a lot of ground, and is referred to by many lectures in the Intermediate Scala course. You need to know this material well if you want to become a good Scala programmer.

Scala is a unique blend of object-oriented programming and functional programming. In Scala, functions are first-class objects. We will make a distinction between methods and functions in a moment.

The sample code for this lecture can be found in courseNotes/src/main/scala/Fun.scala.

Function Literals

A function literal is one possible syntax for defining a Scala function, and is an example of syntactic sugar. This syntax is useful for passing a function as an argument to a method. Here is an example of a function literal.

Scala REPL
scala> (a: Int, b: Double) => (a * b).toString
res0: (Int, Double) => String = <function2>

scala> res0(2, 3)
res1: String = 6.0

As you can see, a function literal is defined by listing the arguments it accepts, followed by a right arrow, followed by the implementation. The type of the function above is (Int, Double) => String. We can define a type alias to make that type easier to reference.

Scala REPL
scala> type IntDblToStr = (Int, Double) => String
defined type alias IntDblToStr 
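
As a minimal sketch of passing a function literal to a method, the following reuses the IntDblToStr alias. The names FunLiteralDemo and applyOp are hypothetical and exist only for this illustration.

```scala
object FunLiteralDemo {
  // Type alias from the lecture: takes an Int and a Double, returns a String
  type IntDblToStr = (Int, Double) => String

  // Hypothetical method that accepts any function of that type and applies it
  def applyOp(op: IntDblToStr, a: Int, b: Double): String = op(a, b)

  // Pass a function literal directly as the first argument
  val product: String = applyOp((a: Int, b: Double) => (a * b).toString, 2, 3)

  // The parameter types can be omitted because applyOp's signature supplies them
  val sum: String = applyOp((a, b) => (a + b).toString, 2, 3)
}
```

Here product is "6.0" and sum is "5.0", because the Int is widened to a Double before the arithmetic is performed.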

Binding to a variable

You can bind a function literal to a variable. Now you can reference the function and pass it parameters when you invoke it.

Scala REPL
scala> val mulStr = (a: Int, b: Double) => (a * b).toString
mulStr: (Int, Double) => String = <function2>

scala> mulStr(1, 2)
res3: String = 2.0

We could declare the type of the variable that references the function if we like:

Scala REPL
scala> val mulStr2: IntDblToStr = (a: Int, b: Double) => (a * b).toString
mulStr2: (Int, Double) => String = <function2>

scala> mulStr2(1, 2)
res3: String = 2.0

Placeholder Syntax

If you write an expression with an underscore in it, Scala assumes that you are defining a function literal and that the underscore stands for a parameter whose value is supplied when the function is invoked. You must provide the type of the parameter unless the compiler can infer it from context. The examples below declare the type explicitly.

Scala REPL
scala> val addFour = (_: Int) + 4
addFour: Int => Int = <function1>

scala> addFour(10)
res18: Int = 14

scala> val multiplyThree = (_: Int) * 3
multiplyThree: Int => Int = <function1>

scala> multiplyThree(20)
res19: Int = 60
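
The following sketch contrasts placeholders whose type is inferred from context with a placeholder whose type must be written out. The object name PlaceholderDemo is hypothetical.

```scala
object PlaceholderDemo {
  // The list holds Ints, so the placeholder's type is inferred
  val plusFour: List[Int] = List(1, 2, 3).map(_ + 4)

  // Each underscore stands for a different parameter, in order
  val total: Int = List(1, 2, 3).reduce(_ + _)

  // With no surrounding context, the type must be written out
  val addFour = (_: Int) + 4
}
```

In the first two definitions the compiler already knows the element type, so no annotation is needed.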

Desugared Syntax

The mulStr2 function above is desugared to:

Scala REPL
scala> val mulStr3: IntDblToStr =
  new Function2[Int, Double, String] {
    def apply(a: Int, b: Double): String = (a * b).toString
  }
mulStr3: (Int, Double) => String = <function2> 

The desugared function literal type is Function2[Int, Double, String], which is equivalent to (Int, Double) => String. Scaladoc uses the latter convention. The return type is shown last.

As we learned in the Learning Scala Using The REPL 1/3 lecture, the REPL will display the type of an expression following the :type command. Notice that the types of each of the following expressions are the same.

Scala REPL
scala> :type (a: Int, b: Double) => (a * b).toString
(Int, Double) => String

scala> :type new Function2[Int, Double, String] { def apply(a: Int, b: Double): String = (a * b).toString }
(Int, Double) => String

Function1 Shorthand

The Scaladoc hierarchy diagram for Function1 shows (T1) => R instead of writing Function1[T1, R]. This is consistent throughout all Scaladoc, not just the documentation for Function1 – although the names of the parametric types are arbitrary and therefore may change.

These three type definitions are synonyms:

(T1) => R
T1 => R
Function1[T1, R]

The Lambda Review & Drill lecture will give you additional practice on recognizing these forms.
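
As a small illustration that the three forms name the same type, the sketch below assigns one function to values declared with each form. The object and value names are hypothetical.

```scala
object Function1Demo {
  // Three ways of writing the same type
  val f1: (Int) => String = n => "#" + n
  val f2: Int => String = f1
  val f3: Function1[Int, String] = f2

  // Calling a function is shorthand for calling its apply method
  val viaSugar: String = f3(7)
  val viaApply: String = f3.apply(7)
}
```

The assignments compile without any conversion because the compiler treats all three spellings as Function1[Int, String].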

Methods vs. Functions

Scala distinguishes between methods and functions. Methods are part of an object definition, and do not stand alone. Functions are first-class objects, which means they can be passed around.

You can think of a Scala object (ultimately derived from Any) as a container for holding properties and methods. Functions are a subset of Scala objects, which can be passed around as parameters and manipulated. Functions can have arity 0 through arity 22, represented by the traits Function0 through Function22. (Note that the Scaladoc only shows Function1 and Function2, but the Scaladoc for the other FunctionN definitions also exists.) Although Scala 2.11 removes the arity 22 limitation for case classes, Functions are unaffected by this improvement.

Methods can accept functions as arguments, but methods cannot take methods as arguments. However, you can convert a method to a function through a process called lifting, which wraps the method within a new Function instance. We will discuss lifting in more detail in a moment.
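
The distinction can be sketched as follows. The names MethodVsFunctionDemo, incMethod, incFunction and twice are hypothetical.

```scala
object MethodVsFunctionDemo {
  // A method: belongs to the enclosing object and is not itself a value
  def incMethod(x: Int): Int = x + 1

  // A function: an object (an instance of Function1) held in a val
  val incFunction: Int => Int = x => x + 1

  // Hypothetical higher-order method that expects a function argument
  def twice(f: Int => Int, x: Int): Int = f(f(x))

  val a: Int = twice(incFunction, 3) // a function value is passed directly
  val b: Int = twice(incMethod, 3)   // the compiler lifts the method to a function here
}
```

Both calls yield 5. In the second call the method is not passed as such; because a function type is expected, the compiler wraps it in a Function1 first.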

No-Argument Function Literals

The syntax for a no-argument function literal is somewhat intuitive. Here is the definition for a function that accepts no arguments, expressed as an empty parameter list, (). Although its return type is not specified, the Scala compiler knows that System.getProperty returns String, so it assigns the same return type to the function.

Scala REPL
scala> val ud = () => System.getProperty("user.dir")
ud: () => String = <function0>

scala> ud()
res4: String = /home/mslinn

The desugared syntax for the above uses Function0 because no arguments are accepted by the anonymous function.

Scala REPL
scala> val ud2 = new Function0[String] {
  def apply(): String = System.getProperty("user.dir")
}
ud2: () => String = <function0>

scala> ud2()
res5: String = /home/mslinn

Notice that because the function does not accept arguments, only the return type String is shown as a parametric type.
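
One consequence worth illustrating is that the body of a no-argument function runs each time the function is invoked, not when it is defined. The sketch below uses hypothetical names (NoArgDemo, calls, next) to show this.

```scala
object NoArgDemo {
  var calls = 0

  // Nothing runs when the function is defined; the body runs on each invocation
  val next: () => Int = () => { calls += 1; calls }

  val first: Int = next()
  val second: Int = next()
}
```

Here first is 1 and second is 2, because each invocation of next() evaluates the body again.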

Method lifting is also known as eta-expansion

Lifting a Method to a Function

A method can be lifted into an equivalent FunctionN that accepts the same arguments. The result has the same kind of type that the REPL displayed for the function literals above. First let’s define a singleton object that has a method, repeat, which repeats a string a given number of times.

Scala REPL
scala> object R1 {
  def repeat(string: String, times: Int): String = string * times
}
defined module R1

scala> R1.repeat("asdf ", 3)
res6: String = asdf asdf asdf

Scala 2 allowed an underscore after a method name to lift the method from its original object container into a FunctionN container. The resulting value could then be stored in a val instead of requiring a def.

Scala 3 performs eta-expansion automatically and discourages lifting via a trailing underscore. Instead of writing R1.repeat _ to lift a method, you can write an explicit function literal that forwards its arguments: (string: String, times: Int) => R1.repeat(string, times). The code in src/main/scala/Fun.scala is written this way so it compiles with Scala 2 and 3:

Scala REPL
scala> val liftedFunction = (string: String, times: Int) => R1.repeat(string, times)
liftedFunction: (String, Int) => String = <function2>

scala> liftedFunction("asdf ", 3)
res7: String = asdf asdf asdf

As you can see, the results of running a method and its equivalent FunctionN are identical.
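
As a sketch of two ways to lift a method that should compile under both Scala 2 and Scala 3, consider the following. The names LiftDemo, lifted1 and lifted2 are hypothetical; R1.repeat is the method defined above.

```scala
object LiftDemo {
  object R1 {
    def repeat(string: String, times: Int): String = string * times
  }

  // Explicit function literal that forwards its arguments to the method
  val lifted1 = (string: String, times: Int) => R1.repeat(string, times)

  // With a declared function type, the compiler lifts the method automatically
  val lifted2: (String, Int) => String = R1.repeat

  val direct: String = R1.repeat("asdf ", 3)
  val viaLifted1: String = lifted1("asdf ", 3)
  val viaLifted2: String = lifted2("asdf ", 3)
}
```

All three results are the same string, which is the point of lifting: the function behaves exactly like the method it wraps.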

To define a FunctionN without having to lift it from an object:

  1. Write the function type in terms of the argument types and return type, without variable names.
  2. Repeat the pattern with the variable names that correspond to each type, followed by the implementation.

In the example below, the FunctionN type, (String, Int) => String (which includes the return type, String), is on the left side of the equals sign and the implementation is on the right.

Scala REPL
scala> val f2: (String, Int) => String = (arg1, arg2) => arg1 * arg2
f2: (String, Int) => String = <function2>

scala> f2("asdf ", 4)
res10: String = asdf asdf asdf asdf

The above definition for f2 is somewhat redundant; we can use placeholder syntax to simplify the code to:

Scala code
val f2: (String, Int) => String = _ * _

The REPL displays the type of f2 using syntactic sugar. Another way to write the type is Function2[String, Int, String] as shown below. Using this syntax, the input parametric types precede the return type.

Scala REPL
scala> val f3: Function2[String, Int, String] = (arg1, arg2) => arg1 * arg2
f3: (String, Int) => String = <function2>

scala> f3("asdf ", 4)
res11: String = asdf asdf asdf asdf

The above definition for f3 is somewhat redundant; we can use placeholder syntax to simplify the code to:

Scala REPL
scala> val f3: Function2[String, Int, String] = _ * _
f3: (String, Int) => String = $$Lambda$856/0x0000000840552040@71c17a57 

We’ll discuss currying in the Partially Applied Functions lecture of the Intermediate Scala course. Partially applied functions use lifting extensively.
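
Because a Function2 is an object, it carries methods of its own in addition to apply. The sketch below, with the hypothetical name Function2Demo, shows only that the standard curried and tupled methods exist and give the same result as a direct call; currying itself is left to the lecture mentioned above.

```scala
object Function2Demo {
  val f2: (String, Int) => String = _ * _

  // Methods defined on Function2 itself
  val curried: String => Int => String = f2.curried
  val tupled: ((String, Int)) => String = f2.tupled

  val a: String = f2("ab", 2)
  val b: String = curried("ab")(2)
  val c: String = tupled(("ab", 2))
}
```

Each of a, b and c is "abab".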

Exercise – Passing Functions as parameters

Given the following program:

Scala code
package solutions

object FunSel extends App {
  type StringOp = (String, Int) => String

  def blackBox(f: StringOp, string: String, n: Int): String = f(string, n)

  val fn1: StringOp = _ substring _    // infix syntax
  //val fn1: StringOp = _.substring(_) // dot notation
  val fn2: StringOp = _ * _

  println(s"""fn1 supplied with "good/bad dog" and 5 gives: ${fn1("bad/good dog", 4)}""")
  println(s"""fn2 supplied with "arf " and 3 gives: ${fn2("arf ", 3)}""")

  println(s"""blackBox(fn1, "string", 3) = ${blackBox(fn1, "string", 3)}""")
  println(s"""blackBox(fn2, "string", 3) = ${blackBox(fn2, "string", 3)}""")
}

  1. What does the program do?
  2. How can you modify it so the string YourAnswerHere is replaced with some code that invokes fn1 and fn2?

Solution

StringOp defines the signature for the functions passed to blackBox; they accept a String and an Int parameter and return a String. blackBox executes whatever function it receives and passes in two parameters: a String and an Int. blackBox was written to work with any function that has the proper signature.

Two functions are defined, both written with a combination of infix and placeholder syntax. Infix syntax should only be used for purely functional methods (methods with no side-effects).

  1. fn1 takes a substring of the first parameter it is passed, which must be a String (remember that StringOp defines the parameters), using the Java String.substring method. The second parameter specifies where to start the substring, which continues to the end of the string. fn1 could also be written using dot notation like this:
    Scala code
    val fn1: StringOp = _.substring(_)
  2. fn2 repeats the first parameter it is passed, which must be a String. The second parameter, an Int, specifies the number of times to repeat the string.

The two println statements should be rewritten as follows:

Scala code
println(s"""fn1 supplied with "good/bad dog" and 5 gives: ${fn1("bad/good dog", 4)}""")
println(s"""fn2 supplied with "arf " and 3 gives: ${fn2("arf ", 3)}""")

You can run the solution as follows:

Shell
$ sbt "runMain solutions.FunSel"
fn1 supplied with "good/bad dog" and 5 gives: good dog
fn2 supplied with "arf " and 3 gives: arf arf arf 

SAM Types

The main reason Scala 2.12+ requires Java 8 is the much greater efficiency with which functions, especially lambdas, can be represented. Scala programs that use a lot of lambdas will be significantly smaller and faster when run on Scala 2.12-M5 or later. As an example, the popular scalaz functional programming library for Scala is 55% smaller when compiled on Scala 2.12-M5: 5 MB instead of 11 MB when compiled on Scala 2.11.8. It also runs faster and has less memory churn.

Prior to Scala 2.12-M5, the Scala compiler implemented function literals by creating an anonymous class for each function literal. Newer versions of the Scala compiler implement function literals using Java 8’s native lambda support. The Scala 2.12-M5 release notes say:

For each lambda the compiler generates a method containing the lambda body, and emits an invokedynamic that will spin up a lightweight class for this closure using the JDK’s LambdaMetaFactory.

Compared to Scala 2.11, the new scheme has the advantage that, in most cases, the compiler does not need to generate an anonymous class for each closure. This leads to significantly smaller JAR files.

Here is some background. The runtime representation of a lambda is an instance of a functional interface, which is an interface that defines a single abstract method (SAM type). Java 8 includes a number of functional interfaces, such as Runnable and Comparable; for more information see the java.util.function package Javadoc. Java 8’s lambda support is defined in JSR 335, also known as Project Lambda. It relies on the JVM’s invokedynamic instruction, introduced in Java 7, which allows lambdas to be implemented without creating anonymous inner classes.
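
As a minimal sketch, assuming Scala 2.12 or later, a function literal can be used wherever a Java SAM type is expected. The names SamDemo, runs, task and byLength are hypothetical; Runnable and java.util.Comparator are standard Java interfaces.

```scala
object SamDemo {
  var runs = 0

  // Runnable declares one abstract method, run(), so a function literal can implement it
  val task: Runnable = () => runs += 1

  // Comparator is another Java functional interface; parameter types come from the SAM
  val byLength: java.util.Comparator[String] =
    (a, b) => Integer.compare(a.length, b.length)

  task.run()
  val shorterFirst: Boolean = byLength.compare("a", "bbb") < 0
}
```

After task.run() the counter is 1, and shorterFirst is true because "a" is shorter than "bbb".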

The More Fun With Functions lecture will show you SAM types in action.

