Saturday, December 3, 2016

Tagless final effects à la Ermine Writers

This is the first of a four-part series on tagless-final effects.

“Finally tagless” notation is a nice way to get many of the benefits of free monads without paying so dearly in allocation of steps.

Watching John DeGoes’s 2015 presentation of free applicatives, it struck me that I didn’t quite like the notation of “finally tagless” demonstrated therein. To me, threading an algebra compares unfavorably to having an algebra in the program’s scope. The latter is the approach taken by the implementation of the Ermine Writers.

I think the style of effectful programs written like this will be appealing to programmers transitioning from out-of-control side effects to functional programming. By avoiding the sequence of intermediate command data structures that characterizes free monad effects, it saves significant runtime cost; the implementation of the interpreter should also be more obvious to the newcomer. On the other hand, it preserves the idea of multiple interpreter-dependent output types, and with it the testability benefit.

While this style does away with the abstract command structures, it preserves the limitations on available effects provided by good effect libraries. It does this by means of type-level abstraction rather than by means of a specific effect structure. This means that the abstraction is enforced, but erased; the concrete structures are the same as the interpreter output, though the code choosing effects can’t tell what that is.

There’s no library for me to mandate, or that you have to adopt. I recommend that you have a library like Scalaz or Cats with Monad, IO (for interpreters), and related functionality available, but it’s not a requirement. The code involved in adopting this style is specific to your use case.

While this is a good alternative to free monads, eff, and the like, it integrates well with them, too. You can combine these effects with other systems as convenient, either to implement interpreters or to produce effects within effect-abstract code, especially if you incorporate Monad.

How can this all be accomplished? With higher-kinded types, that is, abstraction over type constructors.

Declaring an algebra

As with other designs for constrained effects, you need to declare the various effects you’re going to support. For the sake of a little exoticism, I’m going to declare a simple filesystem interface.

trait FSAlg {
  // I'm using an abstract type constructor (i.e. FSAlg
  // is higher-kinded) for the effect type, but this
  // converts readily to a type parameter F[_] on FSAlg,
  // like Scanner in Ermine
  type F[A]

  def listDirectory(pathname: String): F[List[String]]

  def readFile(pathname: String): F[String]

  def writeFile(pathname: String, contents: String): F[Unit]
}

Writing an effectful program

Instances of FSAlg supply concrete operations for the F type constructor, and so must choose a concrete F as well. Concrete programs that choose which effects to perform should be abstract in F under this design approach.

Instances of FSAlg can be defined in typeclass style (in which case you should use a type parameter for F instead of a type member), or passed as normal arguments.

The other thing ‘effectful programs’ do in this style is return an F; specifically, the F associated with the algebra instance being passed in. In this way, this style radically departs from “dependency injection” or “the strategy pattern”—the “dependency” has a concrete influence on the public return type.

Organizing the program as a set of standalone methods makes it easy to use path-dependent style to return the correct type of effect, when using a type member.

def write42(alg: FSAlg)(pathname: String): alg.F[Unit] =
  alg.writeFile(pathname, "42")

A typeclass version would look more like

def write42[F[_]](pathname: String)(implicit alg: FSAlg[F]): F[Unit] =
  alg.writeFile(pathname, "42")

To avoid passing around the alg everywhere, you might put a whole group of methods under a class, and have the class take the algebra as a constructor parameter. This is a straightforward translation for the typeclass or simple type parameter approach.

class MyProg[F[_]](implicit alg: FSAlg[F]) {
  // several methods using F and alg
}

This is the organization style of Ermine Writers; each individual Writer class (e.g. HTMLWriter) is similar to MyProg.

Doing this with a type member is a little trickier; if you simple put a alg: FSAlg argument in the constructor, you’ll “forget” the F, existential-style. You can either put a bounded type parameter on the class for the alg:

class MyProg[Alg <: FSAlg](alg: Alg)

or a higher-kinded parameter, via the Aux pattern.

// object FSAlg
  type Aux[F0[_]] = FSAlg {type F[X] = F0[X]}

// replacing MyProg
class MyProg[F[_]](alg: FSAlg.Aux[F])

I think the latter yields easier-to-understand method types and type errors, but all of the above alternatives have equal power. So choose whatever seems nice, and change it later if you like.

Writing an interpreter

When writing the effectful program, you’re condemned to be free: you have to choose the effects to perform. The “interpreter”, which “executes” the effect, is more of a guided exercise. You must extend the algebra trait, FSAlg, implementing all of the abstract members.

object IOFulFSAlg extends FSAlg {
  import java.io.File, java.nio.file.{Files, Paths}

  type F[A] = () => A  // your choice!

  def listDirectory(pathname: String): () => List[String] =
    () => new File(pathname).list().toList

  def readFile(pathname: String): () => String =
    () => new String(Files readAllBytes (Paths get pathname))

  def writeFile(pathname: String, contents: String): () => Unit =
    () => Files write (Paths get pathname, contents.getBytes)
}

The key is to choose an F type—you can almost consider it an implementation detail of the class—that will allow you to implement the methods without side-effecting when they are called. I’ve made a good starter choice above, but the real magic happens when I choose more interesting Fs.

A test interpreter

With F abstract in “real” programs, we can choose different ones for different interpreters. Here we use one that allows simulation of the algebra methods without performing any side effects or mutation.

final case class Directory(
  listing: Map[String, Either[String, Directory]])

final case class Err(msg: String)

object TestMapAlg extends FSAlg {
  type F[A] = Directory => (Either[Err, A], Directory)

  private def splitPath(p: String) =
    p.split('/').toList

  private def readLocation(dir: Directory, p: List[String])
      : Option[Either[String, Directory]] =
    p match {
      case List() => Some(Right(dir))
      case k +: ks =>
        dir.listing get k flatMap {
          case r@Left(_) =>
            if (ks.isEmpty) None else Some(r)
          case Right(subd) => readLocation(subd, ks)
        }
    }

  def listDirectory(p: String): F[List[String]] =
    dir => (readLocation(dir, splitPath(p)) match {
              case None => Left(Err(s"No such file or directory $p"))
              case Some(Left(_)) =>
                Left(Err(s"$p is not a directory"))
              case Some(Right(Directory(m))) => Right(m.keys.toList)
            }, dir)

  def readFile(pathname: String): F[String] =
    dir => (readLocation(dir, splitPath(pathname)) match {
              case None => Left(Err(s"No such file or directory $pathname"))
              case Some(Right(_)) =>
                Left(Err(s"$pathname is a directory"))
              case Some(Left(c)) => Right(c)
            }, dir)

  def writeFile(pathname: String, contents: String): F[Unit] =
    dir => {
      def rec(subdir: Directory, path: List[String]): Either[Err, Directory] = 
        path match {
          case List(filename) => 
            Right(Directory(subdir.listing + ((filename, Left(contents)))))
          case dirname +: subpath =>
            val subsubdir = subdir.listing get dirname
            subsubdir match {
              case Some(Left(_)) =>
                Left(Err(s"$dirname is not a directory"))
              case Some(Right(d)) =>
                rec(d, subpath)
              case None =>
                rec(Directory(Map()), subpath)
            }
        }
      rec(dir, splitPath(pathname)) match {
        case Left(e) => (Left(e), dir)
        case Right(newdir) => (Right(()), newdir)
      }
    }
}

Executing the effects

The implementations of effectful programs in this scheme can’t tell what F is. But the code that chooses the interpreter and passes it to that abstract program does know. Accordingly, the type returned by invoking the program will change according to the F type of the interpreter you pass in.

scala> write42(IOFulFSAlg)("hello.txt")
res1: IOFulFSAlg.F[Unit] = <function0>

scala> res1()
// hello.txt appears on my disk. Guess what's in it?

scala> write42(TestMapAlg)("hello.txt")
res2: TestMapAlg.F[Unit] = <function1>

scala> res2(Directory(Map()))
res4: (Either[Err,Unit], Directory) =
  (Right(()),Directory(Map(hello.txt -> Left(42))))

As the invoker of the interpreter, this very concrete level—the first truly concrete segments of code I’ve shown in this post—is responsible for supplying the “execution environment”. It’s here that side effects—if any!—happen. For the first example, we can invoke the zero-argument function and watch the side effects happen. In the second case, we can make up a test Directory and inspect the resulting tuple for the final Directory state, and error if any.

Otherwise, the usual rules of the interpreter pattern apply; by inventing new instances of FSAlg, we can choose different things that should happen in the effects of the various algebra methods.

Effects must be delayed

You may be tempted to use an F like this:

type F[A] = A

and then do something like the IOFul implementation without the leading () =>. This will seem to work, but effectively prevent effectful programs from doing functional programming.

We can see why via a simple counterexample. Consider this simple program,

readFile("hello.txt")
// ...
writeFile("hello.txt", "33")

According to the rules of FP, the below program must always do the same thing as the above program.

val wf = writeFile("hello.txt", "33")
readFile("hello.txt")
// ...
wf

If calling writeFile performs a side effect right away, this will not hold true.

For a similar reason, naive memoization of the side effects’ results will also break FP. Consider this program:

readFile("hello.txt")
// ...
writeFile("hello.txt", "33")
// ...
readFile("hello.txt")

In FP, I can factor these two readFile calls to a val. If readFile memoizes with a local variable, though, the second use of that val gives the wrong file contents. (Of course, if you don’t have any effects in your algebra that can change the results of later effects, this is no problem!)

Otherwise, you don’t have to be wildly principled about the purity of your interpreters’ F choices, while still granting the benefits of pure FP to your effectful programs. Ermine writers’ interpreters use something like

type F[A] = java.sql.Connection => A

and it’s perfectly fine.

Relaxed rules in the interpreter from abstraction in the program

“Local mutation” is a broadly accepted way to implement pure functions. Even Haskell supports it, still without breaking the rules of the language, via the ST abstraction (covered in chapter 14 of Functional Programming in Scala). This is usually taken to refer strictly to mutation local to a certain dynamic scope, though; the universal quantification trick in ST is precisely meant to enforce the dynamic scope of mutable variables.

The Ermine Writers show us a different story, though. When using a concrete type for hiding side effects, like IO, you must be very careful to hide the runner, lest the side-effect be exposed.

Type-level abstraction such as in tagless-final changes both of these parts of the typical ‘local mutation’ method of design.

  1. Instead of being dynamically scoped, control over side effects is statically, or lexically scoped, to the code in the interpreter. This lends a new dimension to the idea of a side-effecting “shell” for a pure program—the “shell” is syntactic, not based on the patterns of function calls at runtime.
  2. The only care required to not break a type-level abstraction is to not pass something that would break it in the algebra, and to follow the Scalazzi Safe Scala Subset and avoid use of reified type information. (This will be discussed in more detail in part 3, “Working with the abstract F.)

The degree to which you can “break the rules” in your interpreter is directly proportional to the degree to which you enforce abstraction in effectful programs. Following the approach of examples in this article, there remains a great deal of freedom to experiment with interpreters that exploit your favorite mutation techniques to speed up interpretation. For example, a less naive memoization of readFile and listDirectory is admissible under the current algebra.

By contrast, if you expose too much detail about the F functor to effectful programs, then your interpreter becomes severely constrained. Suppose you define in the abstract algebra

def asReader[A](fa: F[A]): () => A

This may be expedient, but effectively demands that every interpreter works like IOFul; others cannot be safely implemented.

Still to come

  1. The role of Monad;
  2. Working with the abstract F;
  3. How much is this “dependency injection”?

Also, Adelbert Chang is covering “Monadic EDSLs in Scala” in a series over on Typelevel blog; he’s taking a different route to many of the same ideas as this series. I suggest checking out both his series and this one to find the most comfortable route for you.

This article was tested with Scala 2.12.0.

No comments: