Damian Nadales Damian Nadales - 29 days ago 9
Scala Question

Practical free monads for system-tests DSL: concurrency and error handling

I'm trying to write a DSL for writing system tests in Scala. In this DSL I don't want to expose the fact that some operations might take place asynchronously (because they are implemented using the web-service under test for instance), or that errors might occur (because the web-service might not be available, and we want the test to fail). In this answer this approach is discouraged, but I don't completely agree with this in the context of a DSL for writing tests. I think the DSL will get unnecessary polluted by the introduction of these aspects.

To frame the question, consider the following DSL:

type Elem = String

sealed trait TestF[A]
// Put an element into the bag.
case class Put[A](e: Elem, next: A) extends TestF[A]
// Count the number of elements equal to "e" in the bag.
case class Count[A](e: Elem, withCount: Int => A) extends TestF[A]

def put(e: Elem): Free[TestF, Unit] =
Free.liftF(Put(e, ()))

def count(e: Elem): Free[TestF, Int] =
Free.liftF(Count(e, identity))

def test0 = for {
_ <- put("Apple")
_ <- put("Orange")
_ <- put("Pinneaple")
nApples <- count("Apple")
nPears <- count("Pear")
nBananas <- count("Banana")
} yield List(("Apple", nApples), ("Pears", nPears), ("Bananas", nBananas))


Now assume we want to implement an interpreter that makes use of our service under test to put and count the elements in the store. Since we make use of the network, I'd like that the
put
operations take place asynchronously. In addition, given that network errors or server errors can occur, I'd like the program to stop as soon as an error occurs. To give an idea of what I want to achieve, here is an example of mixing the different aspects in Haskell by means of monad transformers (that I cannot translate to Scala).

So my question is, which monad
M
would you use for an interpreter that satisfies the requirements above:

def interp[A](cmd: TestF[A]): M[A]


And in case M is a monad tranformer, how would you compose them using the FP library of your choice (Cats, Scalaz).

Answer

Task (scalaz or better fs2) should satisfy all of the requirements, it doesn't need monad-transformer as it's already has Either inside (Either for fs2, \/ for scalaz). It also has a fast-fail behavior you need, same as right-biased disjunction/xor.

Here are several implementations that are known to me:

Regardless of monad-transformer absence, you still kinda need lifting when using Task:

  • from value to Task or
  • from Either to Task

But yes, it does seem to be simpler than monad transformers especially in respect to the fact monads are hardly composable - in order to define monad transformer you have to know some other details about your type besides being a monad (usually it requires something like comonad to extract value).

Just for advertising purposes, I would also add that Task represents stack-safe trampolined computation.

However, there are some projects focused on extended monadic composition, like Emm-monad: https://github.com/djspiewak/emm, so you can compose monad transformers with Future/Task, Either, Option, List and so on and so force. But, IMO, it's still limited in comparison with Applicative composition - cats provides universal Nested data type that allows to easily compose any Applicative, you can find some examples in this answer - the only disadvantage here is that it's hard to build a readable DSL using Applicative.

For example, as there is no FutureT/TaskT transformer you can't build effects like type E = Option |: Task |: Base (Option from Task) as such flatMap would require extraction of value from the Future/Task.

As a conclusion, I can say that from my experience Task really comes in hand for do-notation based DSLs: I had a complex external rule-like DSL for async computations and when I decided to migrate it all to Scala-embedded version Task really helped - I literally converted external-DSL to Scala's for-comprehension. Another thing we considered is having some custom type, like ComputationRule with a set of type classes defined over it along with conversions to Task/Future or whatever we need, but this was because we didn't use Free-monad explicitly.


You might even not need Free-monad here assuming you don't need an ability to switch interpreters (which might be true for just system tests). In that case Task might be the only thing you need - it's lazy (in comparison with Future), truly functional and stack-safe:

 trait DSL {
   def put[E](e: E): Task[Unit]
   def count[E](e: E): Task[Int]
 }

 object Implementation1 extends DSL {

   ...implementation
 }

 object Implementation2 extends DSL {

   ...implementation
 }


//System-test script:

def test0(dsl: DSL) = {
  import dsl._
  for {
    _ <- put("Apple")
    _ <- put("Orange")
    _ <- put("Pinneaple")
    nApples <- count("Apple")
    nPears <- count("Pear")
    nBananas <- count("Banana")
  } yield List(("Apple", nApples), ("Pears", nPears), ("Bananas", nBananas))
 }

So you can switch implementation by passing different "interpreter" here:

test0(Implementation1).unsafeRun
test0(Implementation2).unsafeRun

Differences/Disadvantages (in comparison with http://typelevel.org/cats/datatypes/freemonad.html):

  • you stuck with Task type, so you can't collapse it to some other monad easily.
  • implementation is resolved in runtime when you pass an instance of DSL-trait (instead of natural transformation), you can easily abstract it using eta-expansion: test0 _. Polymorphic methods (put, count) are naturally supported by Java/Scala, but poly functions aren't so it's easier to pass instance of DSL containing T => Task[Unit] (for put operation) than making synthetic polymorphic function DSLEntry[T] => Task[Unit] using natural-transform DSLEntry ~> Task.

  • no explicit AST as instead of pattern matching inside natural transformation - we use static dispatch (explicitly calling a method, which will return lazy computation) inside DSL trait

Actually, you can even get rid of Task here:

 trait DSL[F[_]] {
   def put[E](e: E): F[Unit]
   def count[E](e: E): F[Int]
 }

 def test0[M[_]: Monad](dsl: DSL[M]) = {...}

So here it might even become a matter of preference especially when you're not writing an open-source library.

Putting it all together:

import cats._
import cats.implicits._

trait DSL[F[_]] {
   def put[E](e: E): F[Unit]
   def count[E](e: E): F[Int]
 }

def test0[M[_]: Monad](dsl: DSL[M]) = {
    import dsl._
    for {
      _ <- put("Apple")
      _ <- put("Orange")
      _ <- put("Pinneaple")
      nApples <- count("Apple")
      nPears <- count("Pear")
      nBananas <- count("Banana")
    } yield List(("Apple", nApples), ("Pears", nPears), ("Bananas", nBananas))
 }

object IdDsl extends DSL[Id] {
   def put[E](e: E) = ()
   def count[E](e: E) = 5
}

Note that cats have a Monad defined for Id, so:

scala> test0(IdDsl)
res2: cats.Id[List[(String, Int)]] = List((Apple,5), (Pears,5), (Bananas,5))

simply works. Of course, you can choose Task/Future/Option or any combination if you prefer. As a matter of fact, you can use Applicative instead ofMonad`:

def test0[F[_]: Applicative](dsl: DSL[F]) = 
  dsl.count("Apple") |@| dsl.count("Pinapple apple pen") map {_ + _ }

scala> test0(IdDsl)
res8: cats.Id[Int] = 10

|@| is a parallel operator, so you can use cats.Validated instead of Xor, be aware that |@| for Task isn't executed (at least in older scalaz version) in parallel (parallel operator not equals parallel computation). You can also use a combination of both:

import cats.syntax._

def test0[M[_]:Monad](d: DSL[M]) = {
    for {
      _ <- d.put("Apple")
      _ <- d.put("Orange")
      _ <- d.put("Pinneaple")
      sum <- d.count("Apple") |@| d.count("Pear") |@| d.count("Banana") map {_ + _ + _}
    } yield sum
 }

scala> test0(IdDsl)
res18: cats.Id[Int] = 15