Yet Another Monad Tutorial (part 3: The Monad Laws)

In the previous article I talked about the two fundamental monadic operations in the Monad type class: the bind operator (>>=) and the return function. In this article I'll complete the definition of the Monad type class and talk about monad laws.

The full Monad type class

Let's look at the entire definition of the Monad constructor class:

  class Monad m where
    (>>=) :: m a -> (a -> m b) -> m b
    return :: a -> m a
    (>>) :: m a -> m b -> m b
    fail :: String -> m a

We see the familiar >>= operator and the return function, with the expected types, but there is also one other operator called >> and one other function called fail. What do they mean?

The fail function is basically a very primitive kind of error reporting function. It is called when the >>= operator can't bind the value of type a to the input of the function of type a -> m b because of a pattern match error. I don't want to go into the details of this here because it's boring; see the documentation on the Haskell web site for more on this if you care. Most of the time you don't need to concern yourself with fail.
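
If you'd like to see it in action anyway, here's a tiny sketch using the Maybe monad, whose fail simply ignores its error message and returns Nothing:

  -- fail in the Maybe monad discards its message and produces Nothing:
  --   (fail "oops" :: Maybe Int)  ==  Nothing

The compiler inserts calls like this automatically when a pattern match fails in monadic code (specifically, in the do-notation we'll meet at the end of this article).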

The >> operator is a bit more interesting. It has the type:

  (>>) :: m a -> m b -> m b

This operator acts as a monadic sequencing operator: it's a version of monadic apply (>>= or "bind") that throws away the unpacked value of type a before executing the "action" of type m b. It's defined as follows:

  mv1 >> mv2  =  mv1 >>= (\_ -> mv2)

From this we can see that whatever value is unpacked from the monadic value mv1 is discarded, and then the final monadic value mv2 is returned. This turns out to be useful when the unpacked value has type (), i.e. the unit type. A good example is the putStrLn function:

  putStrLn :: String -> IO ()

Imagine if you wanted to print two strings, one after another, with newlines at the end of each string. You can write this as follows:

  (putStrLn "This is string 1.") >> (putStrLn "This is string 2.")

It turns out that the >> operator has lower precedence than ordinary function application (which is written by simply juxtaposing a function and its arguments), so we can leave off the parentheses:

  putStrLn "This is string 1." >> putStrLn "This is string 2."

In any case, why does this work? Let's look at the types:

  (putStrLn "This is string 1.") :: IO ()
  (putStrLn "This is string 2.") :: IO ()

So what the >> operator is doing is combining two monadic values of type IO () to generate one resulting monadic value of type IO (). Let's take the >> operator and specialize it for this case:

  (>>) :: m a -> m b -> m b

Here, m is IO and a and b are both (), so in this case:

  (>>) :: IO () -> IO () -> IO ()

This makes it at least plausible that what >> is doing is to run the two string printing "actions" one after another.
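
If you want to try this out yourself, here's a minimal complete program wrapping the expression above (main is just the usual program entry point):

  main :: IO ()
  main = putStrLn "This is string 1." >> putStrLn "This is string 2."

Running it prints the two strings on successive lines.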

Here's a more complicated example. We want to read a line of text from the terminal and print it back out twice. We can do it as follows:

 readAndPrintLineTwice :: IO ()
 readAndPrintLineTwice = getLine >>= (\s -> (putStrLn s >> putStrLn s))

Because of operator precedences, we can write this without parentheses as:

 readAndPrintLineTwice :: IO ()
 readAndPrintLineTwice = getLine >>= \s -> putStrLn s >> putStrLn s

Let's figure out what this means. getLine is a monadic value ("action") which reads a line of text from the terminal. The >>= operator "unpacks" this line of text from the monadic value and binds it to the name s (because \s -> putStrLn s >> putStrLn s is the monadic function that is the second argument to >>=). Then the string called s is used by the monadic value putStrLn s >> putStrLn s, which prints out the string twice in sequence.
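
If it helps, here are the types of each piece of that definition (these annotations are mine, just for orientation):

  getLine                                       :: IO String
  (\s -> putStrLn s >> putStrLn s)              :: String -> IO ()
  getLine >>= (\s -> putStrLn s >> putStrLn s)  :: IO ()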

If this still seems mysterious to you, it's not your fault. There is something odd going on under the covers here, but I won't be able to explain exactly what until I talk about state monads later. But at least you should be able to follow what is happening, even if you aren't clear on exactly how it all happens.

Right now, I want to take a step back and look at monad laws, which have a big impact on what the definitions of the >>= operator and the return function are going to be for any given monad. After this, we'll get back to more practical matters.

The Three Laws of Monadics

Lots of important laws come in groups of three: Newton's three laws of motion, the three laws of thermodynamics, Asimov's Three Laws of Robotics, Kepler's three laws of planetary motion, etc. etc. Monads are no different, except that the "Three Laws of Monadics" are, of course, far more important than the others ;-)

In order for definitions of the >>= operator and the return function to be valid for a particular monad, they have to have the correct types for that monad. So, for instance, the definitions of >>= and return for the Maybe monad will have the types:

  (>>=) :: Maybe a -> (a -> Maybe b) -> Maybe b
  return :: a -> Maybe a

and for the IO monad they would have the types:

  (>>=) :: IO a -> (a -> IO b) -> IO b
  return :: a -> IO a

However, this isn't enough. These operators/functions are also required to satisfy three "monad laws". The monad laws are actually quite simple: they guarantee that monadic function composition behaves in a reasonable way. I'll give you the "nice" version of the monad laws first and then show you the (ugly) way in which they are usually described. (You'll thank me; the nice way is really much easier to understand.)

The nice version

Here is the nice definition of the three monad laws, in terms of monadic function composition (recall that the (>=>) operator is the monadic function composition operator):

  1. return >=> f       ==    f
  2. f >=> return       ==    f
  3. (f >=> g) >=> h    ==    f >=> (g >=> h)

What do these laws tell us?

Laws 1 and 2 tell us what return is supposed to be: it's an identity for monadic function composition (law 1 says that return is a left identity, and law 2 says that return is a right identity). In other words, composing a monadic function f with return (on either side) just gives you f back. This is analogous to the fact that 0 is an identity under addition of integers and 1 is an identity under multiplication of integers; in each case, the identity element combined with an arbitrary value using the appropriate operation just gives you the value back.

Law 3 tells us that monadic function composition is associative: when we want to compose together three monadic functions (f, g, and h), it doesn't matter which one we compose first. This is analogous to the fact that addition or multiplication are associative when applied to integers.
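
To make the laws concrete, here's a small runnable sketch in the Maybe monad; the functions f, g, and h are ones I've made up for illustration, and >=> comes from Control.Monad:

  import Control.Monad ((>=>))

  f, g, h :: Int -> Maybe Int
  f x = Just (x + 1)
  g x = Just (x * 2)
  h x = if x > 0 then Just x else Nothing

  main :: IO ()
  main = print (((f >=> g) >=> h) 3)   -- Just 8
      >> print ((f >=> (g >=> h)) 3)   -- Just 8, the same, by law 3
      >> print ((return >=> f) 3)      -- Just 4, same as f 3 (law 1)
      >> print ((f >=> return) 3)      -- Just 4, same as f 3 (law 2)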

Why are these laws plausible? Let's look at the corresponding "laws" that regular function composition obeys:

  1. id . f       ==  f
  2. f  . id      ==  f
  3. (f . g) . h  ==  f . (g . h)

where id is the identity function. Notice the similarity? Composing a function with the identity function on either the right or left side gives the original function back, and function composition is associative. Similarly, monadic function composition should be associative, and return should be the monadic equivalent of the identity function, in order for monadic function composition to be as well-behaved as regular function composition.

What's the significance of these laws from the programmer's standpoint? Since we want our monads to behave in a sensible way, we want our definitions of return and >>= for each monad to obey these laws. This often can help us figure out what the correct definitions of return and >>= have to be for a given monad, and we'll see this later. [Note that the monad laws as I've defined them are in terms of the >=> operator and not the >>= operator, but we'll see a version using the >>= operator below that is equivalent.]

However, there is a catch: Haskell does not enforce the monad laws! The only thing Haskell cares about is that the definitions of return and >>= for a particular monad have the correct types. Whether they also obey the monad laws or not is up to the programmer.

Many people have thus asked "Why can't Haskell enforce the monad laws?" And the answer is simple: Haskell isn't powerful enough! To get a programming language and environment powerful enough to allow you to state and enforce monad laws from within the language, you would need to have something like a theorem prover. Theorem provers are fascinating, and for all I know they may represent the future of programming, but they are much more complicated than conventional programming languages. If you're interested in this, look up the well-respected theorem prover Coq. But in Haskell, it's up to the programmer who writes the definition of any particular monad to make sure that the monad laws are upheld in that definition.
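
What you can do, though, is test the laws on random inputs. Here's a rough sketch using the QuickCheck library (the property names and the helper function f are my own inventions; note that passing tests make the laws plausible but don't prove them):

  import Test.QuickCheck

  f :: Int -> Maybe Int
  f n = if n >= 0 then Just (n * 2) else Nothing

  -- Law 1: return x >>= f  ==  f x
  prop_leftIdentity :: Int -> Bool
  prop_leftIdentity x = (return x >>= f) == f x

  -- Law 2: mv >>= return  ==  mv
  prop_rightIdentity :: Maybe Int -> Bool
  prop_rightIdentity mv = (mv >>= return) == mv

  main :: IO ()
  main = quickCheck prop_leftIdentity >> quickCheck prop_rightIdentity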

The ugly version

The problem with the nice version of the monad laws is that we don't define the >=> operator directly when defining a monad as an instance of the Monad type class; instead we define the >>= operator, and >=> follows from that as I've shown above. So if we want to constrain our definitions of return and >>= for any particular monad, we should have a version of the monad laws that only involves return and >>=. And this is what most Haskell textbooks and online documentation usually refer to when they discuss monad laws, even though it's far less intuitive than the definitions I gave in the last section.

In terms of the >>= operator and the return function, the three monad laws are:

  1. return x >>= f    ==  f x
  2. mv >>= return     ==  mv
  3. (mv >>= f) >>= g  ==  mv >>= (\x -> (f x >>= g))

where the types of the various values are:

  mv :: m a
  f  :: a -> m b
  g  :: b -> m c

for some types a, b, and c and some monad m.
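
To see what these look like in a specific monad, here's how laws 1 and 2 read in the Maybe monad, where return is Just (the particular function is made up for illustration):

  -- Law 1:  return 3 >>= (\x -> Just (x + 1))
  --           ==  Just 3 >>= (\x -> Just (x + 1))
  --           ==  Just 4
  --           ==  (\x -> Just (x + 1)) 3
  --
  -- Law 2:  Just 3 >>= return   ==  Just 3
  --         Nothing >>= return  ==  Nothing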

Deriving the ugly version of the monad laws from the nice version

For fun, let's see how we can take the nice version of the monad laws and derive the ugly version from it. We will use this definition in the derivations:

  f >=> g  =  \x -> (f x >>= g)

which I discussed above.

Law 1:

  return >=> f            ==    f
  \x -> (return x >>= f)  ==    \x -> f x
  return x >>= f          ==    f x           -- Q.E.D.

Note that \x -> f x is identical to f, as described above (in functional programming lingo, the two functions are eta-equivalent).

Law 2:

  f >=> return            ==    f
  \x -> (f x >>= return)  ==    \x -> f x
  f x >>= return          ==    f x
  let mv = f x
  mv >>= return           ==    mv            -- Q.E.D.

Law 3:

  (f >=> g) >=> h                 ==    f >=> (g >=> h)
  \x -> ((f >=> g) x >>= h)       ==    \x -> (f x >>= (g >=> h))
  (f >=> g) x >>= h               ==    f x >>= (g >=> h)
  (\y -> (f y >>= g)) x >>= h     ==    f x >>= (\y -> (g y >>= h))
  evaluate (\y -> (f y >>= g)) x on LHS to: (f x >>= g)
  (f x >>= g) >>= h               ==    f x >>= (\y -> (g y >>= h))
  let mv = f x
  (mv >>= g) >>= h                ==    mv >>= (\y -> (g y >>= h))
  substitute f for g, g for h
  (mv >>= f) >>= g                ==    mv >>= (\y -> (f y >>= g)) 
  substitute x for y in RHS
  (mv >>= f) >>= g                ==    mv >>= (\x -> (f x >>= g))   -- Q.E.D.

The evaluate (\y -> ...) x step just applies the function (\y -> ...) to x by substituting x for y in the body of the function (the body is the part marked ...). In functional programming lingo, this is called beta-reduction, which is the basic way functions get evaluated. The last step, where x is substituted for the bound variable y, is correct in the same way that these two functions:

  \x -> f x
  \y -> f y

are the same (the name of the formal argument is irrelevant). In functional programming lingo, we say that the two functions are alpha-equivalent. The other steps should be straightforward.

What's the point?

The monad laws can occasionally be useful when writing code; you can always take the longer form of an expression and rewrite it as the shorter form (for instance, replace all expressions of the form return x >>= f with just f x). However, the main utility of monad laws is in deriving the definitions of return and >>= for particular monads, which we'll see later in this series.
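
For instance, here's a small made-up before-and-after that uses law 1:

  -- Before: wrap s in return, then immediately unpack it again.
  getLine >>= \s -> return s >>= putStrLn

  -- After: law 1 (return x >>= f == f x) cuts out the middleman...
  getLine >>= \s -> putStrLn s

  -- ...and \s -> putStrLn s is just putStrLn (eta-equivalence again):
  getLine >>= putStrLn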

To wrap up this article, I want to show you a neat syntactic shorthand that can make writing monadic code a whole lot more pleasant.

The "do-notation"

Recall the readAndPrintLineTwice function we defined above:

 readAndPrintLineTwice :: IO ()
 readAndPrintLineTwice = getLine >>= \s -> putStrLn s >> putStrLn s

The good thing about this function is that it can be defined on a single line. The bad thing is that it's not exactly the most readable line in the world. The Haskell language designers noticed that monadic definitions were often hard to read and thought up a really nice bit of syntactic sugar to make these kinds of definitions more readable.

The basis of this syntactic sugar is the observation that a large number of operations in monadic code have one of the following two forms:

  -- Form 1.
  -- mv :: m a
  -- f  :: a -> m b

  mv >>= \x -> f x

  -- Form 2.
  -- mv  :: m a
  -- mv2 :: m b

  mv >> mv2

Therefore, the language designers came up with a notation that makes both of these forms very readable. It starts with the keyword do followed by some monadic operations. Here are our two examples in the do notation:

  -- Form 1, do-notation.

  do v <- mv
     f v

  -- Form 2, do-notation.

  do mv
     mv2

In form 1, the first line says that we take the monadic value mv and "unpack" it into a regular value called v. The second line just says that we then compute f v. The result of f v is the result of the entire expression.

In form 2, the first line says that we "perform" the monadic value ("action") mv. The second line says that we "perform" the monadic value ("action") mv2. So this is just a notation to describe the sequencing of mv and mv2 using the >> operator.

What the Haskell compiler does is convert the do-notation versions of Form 1 and Form 2 into the versions without do. This is just a syntactic transformation, and the meaning is identical. Furthermore, you can mix the two forms and use each form as many times as you like. For instance:

  -- mv  :: m a
  -- v1  :: a
  -- f   :: a -> m b
  -- v2  :: b
  -- mv3 :: m c

  do v1 <- mv
     v2 <- f v1
     mv3
     return v2

This means exactly the same thing as:

  mv >>= (\v1 ->
    (f v1 >>= (\v2 ->
       (mv3 >> 
          (return v2)))))

Or, without the parentheses, as:

  mv >>= \v1 ->
    f v1 >>= \v2 ->
      mv3 >> return v2

You can imagine that as the monadic expressions get larger, the do form will continue to be easy to read, while the form without do (called the "desugared" form) will often be very hard to read. That's why this notation exists. What's cool is that this notation works for all monads, not just for any particular one.
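
For instance, here's the same notation at work in the Maybe monad (addMaybes is a name I've made up for illustration):

  -- If either argument is Nothing, the whole result is Nothing;
  -- otherwise we unpack both values and add them.
  addMaybes :: Maybe Int -> Maybe Int -> Maybe Int
  addMaybes mx my = do x <- mx
                       y <- my
                       return (x + y)

  -- addMaybes (Just 2) (Just 3)  ==  Just 5
  -- addMaybes (Just 2) Nothing   ==  Nothing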

You're also allowed to mix do-notation with the desugared notation, like this:

  do v1 <- mv
     v2 <- f v1
     mv3 >> return v2

Sometimes this is useful, but it can also often lead to ugly hard-to-read code.

Let's look at what our earlier examples would look like with the do-notation.

  -- Read a line, then print it.
  readAndPrintLine :: IO ()
  readAndPrintLine =
    do line <- getLine 
       putStrLn line

  -- Print two lines, one after another.
  -- (A bare expression, not a function definition.)
  do putStrLn "This is string 1."
     putStrLn "This is string 2."

  -- Read a line and print it twice.
  readAndPrintLineTwice :: IO ()
  readAndPrintLineTwice =
    do line <- getLine
       putStrLn line
       putStrLn line

In all three cases, the do-notation makes the code significantly easier to read. Interestingly, it has the added benefit (or drawback, depending on your perspective) of making Haskell code look like imperative code! If we read a do-expression top to bottom, it looks like it was written in an imperative language that uses <- as an assignment statement. For instance, readAndPrintLine can be described as: "use getLine to read a line, which you put into the variable line; then use putStrLn to print out that variable." This is emphatically not what's actually happening (for instance, line is not a variable), but you can read the code as if it is. When you're dealing with code that does a lot of input and output, it can be very convenient to be able to write it this way.

The do-notation includes other features as well. For instance, you can use let bindings and case expressions inside the body of a do expression, and this is often handy. I won't go into the details here, because they're pretty routine (see any Haskell tutorial for more), but the example below gives a quick taste.
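
Here's a minimal sketch (greet is a made-up name) showing a let binding inside an IO do block. Note that in this position the let takes no in keyword; it simply makes msg available to the rest of the block:

  -- Bind a pure (non-monadic) value in the middle of a do block.
  greet :: IO ()
  greet = do name <- getLine
             let msg = "Hello, " ++ name ++ "!"
             putStrLn msg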

Next time

In the next installment I'll start deriving monads, starting with the Maybe monad (for computations that can fail) and continuing with the list monad (for computations that can return multiple results).
