In the previous article I talked about the two fundamental monadic operations in
the Monad
type class: the bind operator (>>=
) and the return
function. In
this article I'll complete the definition of the Monad
type class and talk
about monad laws.
The full Monad type class
Let's look at the entire definition of the Monad
constructor class:
class Monad m where
(>>=) :: m a -> (a -> m b) -> m b
return :: a -> m a
(>>) :: m a -> m b -> m b
fail :: String -> m a
We see the familiar >>=
operator and the return
function, with the
expected types, but there is also one other operator called >>
and one
other function called fail
. What do they mean?
The fail
function is basically a very primitive kind of error reporting
function. It is called when the >>=
operator can't bind the value of type a
to the input of the function of type a -> m b
because of a pattern match
error. I don't want to go into the details of this here because it's boring;
see the documentation on the
Haskell web site
for more on this if you care. Most of the time you don't need to concern
yourself with fail
.
The >>
operator is a bit more interesting. It has the type:
(>>) :: m a -> m b -> m b
What this operator does is act as a monadic sequencing operator. Specifically,
it's a version of monadic apply (>>=
or "bind") which throws away the unpacked
value of type a
before executing the "action" of type m b
. It's defined as
follows:
mv1 >> mv2 = mv1 >>= (\_ -> mv2)
From this we can see that whatever value is unpacked from the monadic value
mv1
, it is discarded and then the final monadic value mv2
is returned. This
turns out to be useful when the unpacked value has type ()
i.e. the unit
type. A good example is the putStrLn
function:
putStrLn :: String -> IO ()
Imagine if you wanted to print two strings, one after another, with newlines at the end of each string. You can write this as follows:
(putStrLn "This is string 1.") >> (putStrLn "This is string 2.")
It turns out that the >>
operator has lower precedence than normal function
application (via juxtaposition of the arguments to the function), so we can
leave off the parentheses:
putStrLn "This is string 1." >> putStrLn "This is string 2."
Regardless, why does this work? Let's look at the types:
(putStrLn "This is string 1.") :: IO ()
(putStrLn "This is string 2.") :: IO ()
So what the >>
operator is doing is combining two monadic values of type IO
()
to generate one resulting monadic value of type IO ()
. Let's take the
>>
operator and specialize it for this case:
(>>) :: m a -> m b -> m b
Here, m
is IO
and a
and b
are both ()
, so here,
(>>) :: IO () -> IO () -> IO ()
This makes it at least plausible that what >>
is doing is to run the two
string printing "actions" one after another.
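As a minimal runnable sketch, wrapping this pair of actions up as a program's main action gives a complete program that prints the two strings in order:
main :: IO ()
main = putStrLn "This is string 1." >> putStrLn "This is string 2."
Compile and run it, and the two strings appear on separate lines.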
Here's a more complicated example. We want to read a line of text from the terminal and print it back out twice. We can do it as follows:
readAndPrintLineTwice :: IO ()
readAndPrintLineTwice = getLine >>= (\s -> (putStrLn s >> putStrLn s))
Because of operator precedences, we can write this without parentheses as:
readAndPrintLineTwice :: IO ()
readAndPrintLineTwice = getLine >>= \s -> putStrLn s >> putStrLn s
Let's figure out what this means. getLine
is a monadic value ("action") which
reads a line of text from the terminal. The >>=
operator "unpacks" this line
of text from the monadic value and binds it to the name s
(because \s ->
putStrLn s >> putStrLn s
is the monadic function that is the second argument to
>>=
). Then the string called s
is used by the monadic value putStrLn s >>
putStrLn s
, which prints out the string twice in sequence.
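For comparison, here is a sketch of the simpler one-print version; it's just the desugared form of the readAndPrintLine function that we'll revisit in do-notation form later in this article.
readAndPrintLine :: IO ()
readAndPrintLine = getLine >>= \s -> putStrLn s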
If this still seems mysterious to you, it's not your fault. There is something odd going on under the covers here, but I won't be able to explain exactly what until I talk about state monads later. But at least you should be able to follow what is happening, even if you aren't clear on exactly how it all happens.
Right now, I want to take a step back and look at monad laws, which have a big
impact on what the definitions of the >>=
operator and the return
function
are going to be for any given monad. After this, we'll get back to more
practical matters.
The Three Laws of Monadics
Lots of important laws come in groups of three: Newton's three laws of motion, the three laws of thermodynamics, Asimov's Three Laws of Robotics, Kepler's three laws of planetary motion, etc. etc. Monads are no different, except that the "Three Laws of Monadics" are, of course, far more important than the others ;-)
In order for the >>=
operator and the return
function to be a valid
definition of these operators/functions for a particular monad, they have to
have the correct types for that monad. So, for instance, the definitions of
>>=
and return
for the Maybe
monad will have the types:
(>>=) :: Maybe a -> (a -> Maybe b) -> Maybe b
return :: a -> Maybe a
and for the IO
monad they would have the types:
(>>=) :: IO a -> (a -> IO b) -> IO b
return :: a -> IO a
However, this isn't enough. These operators/functions are also required to satisfy three "monad laws". The monad laws are actually very simple; they simply guarantee that monadic function composition behaves in a reasonable way. I'll give you the "nice" version of the monad laws first and then show you the (ugly) way in which they are usually described. (You'll thank me — the nice way is really much easier to understand.)
The nice version
Here is the nice definition of the three monad laws, in terms of monadic
function composition (recall that the (>=>)
operator is the monadic
function composition operator):
1. return >=> f == f
2. f >=> return == f
3. (f >=> g) >=> h == f >=> (g >=> h)
What do these laws tell us?
Laws 1 and 2 tell us what return
is supposed to be: it's an identity for
monadic function composition (rule 1 says that return
is a left identity,
and rule 2 says that return
is a right identity). In other words,
composing a monadic function f
with return
(on either side) just gives
you f
back. This is analogous to the fact that 0
is an identity under
addition of integers or 1
is an identity under multiplication of integers;
in each case, the identity element combined with an arbitrary value using the
appropriate operation just gives you the value back.
Law 3 tells us that monadic function composition is associative: when we want
to compose together three monadic functions (f
, g
, and h
), it doesn't
matter which one we compose first. This is analogous to the fact that
addition or multiplication are associative when applied to integers.
Why are these laws plausible? Let's look at the corresponding "laws" that regular function composition obeys:
1. id . f == f
2. f . id == f
3. (f . g) . h == f . (g . h)
where id
is the identity function. Notice the similarity? Composing a
function with the identity function on either the right or left side gives the
original function back, and function composition is associative. Similarly,
monadic function composition should be associative, and return
should be the
monadic equivalent of the identity function, in order for monadic function
composition to be as well-behaved as regular function composition.
What's the significance of these laws from the programmer's standpoint? Since
we want our monads to behave in a sensible way, we want our definitions of
return
and >>=
for each monad to obey these laws. This often can help us
figure out what the correct definitions of return
and >>=
have to be for a
given monad, and we'll see this later. [Note that the monad laws as I've
defined them are in terms of the >=>
operator and not the >>=
operator, but
we'll see a version using the >>=
operator below that is equivalent.]
However, there is a catch: Haskell does not enforce the monad laws! The
only thing Haskell cares about is that the definitions of return
and >>=
for
a particular monad have the correct types. Whether they also obey the monad
laws or not is up to the programmer.
Many people have thus asked, "Why can't Haskell enforce the monad laws?" And the answer is simple: Haskell isn't powerful enough! To get a programming language and environment powerful enough to allow you to state and enforce monad laws from within the language, you would need to have something like a theorem prover. Theorem provers are fascinating, and for all I know they may represent the future of programming, but they are much more complicated than conventional programming languages. If you're interested in this, look up the well-respected theorem prover Coq. But in Haskell, it's up to the programmer who writes the definition of any particular monad to make sure that the monad laws are upheld in that definition.
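To see why this matters, here is a minimal sketch of bind- and return-like functions that have exactly the right types but break law 1. Counted, returnC, and bindC are made-up names for illustration only; the point is that Haskell's type checker accepts them without complaint.
newtype Counted a = Counted (Int, a) deriving Show

-- A law-abiding "return" would pair the value with a count of 0;
-- this one sneaks in a count of 1.
returnC :: a -> Counted a
returnC x = Counted (1, x)

-- "bind": unpack the value, apply the function, add the counts together.
bindC :: Counted a -> (a -> Counted b) -> Counted b
bindC (Counted (n, x)) f =
  let Counted (m, y) = f x
  in  Counted (n + m, y)

-- returnC 5 `bindC` f yields a count one higher than f 5 alone,
-- so "return x >>= f == f x" fails even though everything type-checks.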
The ugly version
The problem with the nice version of the monad laws is that we don't define the
>=>
operator directly when defining a monad as an instance of the Monad
type
class; instead we define the >>=
operator and then >=>
follows from that as
I've shown above. So if we want to constrain our definitions of return
and
>>=
for any particular monad, we should have a version of the monad laws that
only involve return
and >>=
. And this is what most Haskell textbooks and
online documentation usually refer to when they discuss monad laws, even though
it's far less intuitive than the definitions I gave in the last section.
In terms of the >>=
operator and the return
function, the three monad laws
are:
1. return x >>= f == f x
2. mv >>= return == mv
3. (mv >>= f) >>= g == mv >>= (\x -> (f x >>= g))
where the types of the various values are:
mv :: m a
f :: a -> m b
g :: b -> m c
for some types a
, b
, and c
and some monad m
.
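Using the halve and decrement functions from the earlier sketch, we can spot-check this form of the laws too, for instance by evaluating these expressions in GHCi (again, examples rather than a proof):
return 10 >>= halve                         -- Just 5, same as halve 10   (law 1)
Just 10 >>= return                          -- Just 10                    (law 2)
(Just 10 >>= halve) >>= decrement           -- Just 4
Just 10 >>= (\x -> halve x >>= decrement)   -- Just 4, same as above      (law 3)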
Deriving the ugly version of the monad laws from the nice version
For fun, let's see how we can take the nice version of the monad laws and derive the ugly version from it. We will use this definition in the derivations:
f >=> g = \x -> (f x >>= g)
which I discussed above.
Law 1:
return >=> f == f
\x -> (return x >>= f) == \x -> f x
return x >>= f == f x -- Q.E.D.
Note that \x -> f x
is identical to f
as described above.
Law 2:
f >=> return == f
\x -> (f x >>= return) == \x -> f x
f x >>= return == f x
let mv = f x
mv >>= return == mv -- Q.E.D.
Law 3:
(f >=> g) >=> h == f >=> (g >=> h)
\x -> ((f >=> g) x >>= h) == \x -> (f x >>= (g >=> h))
(f >=> g) x >>= h == f x >>= (g >=> h)
(\y -> (f y >>= g)) x >>= h == f x >>= (\y -> (g y >>= h))
evaluate (\y -> (f y >>= g)) x on LHS to: (f x >>= g)
(f x >>= g) >>= h == f x >>= (\y -> (g y >>= h))
let mv = f x
(mv >>= g) >>= h == mv >>= (\y -> (g y >>= h))
substitute f for g, g for h
(mv >>= f) >>= g == mv >>= (\y -> (f y >>= g))
substitute x for y in RHS
(mv >>= f) >>= g == mv >>= (\x -> (f x >>= g)) -- Q.E.D.
The evaluate (\y -> ...) x
step just applies the (\y -> ...)
function to x
to get the result. This works by substituting x
for y
in the body of (\y
-> ...)
(the body is the part marked ...
) to give the result. In functional
programming lingo, this is called beta-reduction, which is the basic way
functions get evaluated. The last step, where x
is substituted for the bound
variable y
, is correct in the same way that these two functions:
\x -> f x
\y -> f y
are the same (the name of the formal argument is irrelevant). In functional programming lingo, we say that the two functions are alpha-equivalent. The other steps should be straightforward.
What's the point?
The monad laws can occasionally be useful when writing code; you can always take
the longer form of an expression and rewrite it as the shorter form (for
instance, replace all expressions of the form return x >>= f
with just f x
).
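As a small example of that kind of rewrite in the IO monad (shout and shout' are made-up names for illustration), law 1 lets us delete a needless return:
-- Before: a needless use of return.
shout :: IO ()
shout = getLine >>= \s -> return (s ++ "!") >>= putStrLn

-- After: law 1 (return x >>= f == f x) lets us drop the return.
shout' :: IO ()
shout' = getLine >>= \s -> putStrLn (s ++ "!")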
However, the main utility of monad laws is in deriving the definitions of
return
and >>=
for particular monads, which we'll see later in this series.
To wrap up this article, I want to show you a neat syntactic shorthand that can make writing monadic code a whole lot more pleasant.
The "do-notation"
Recall the readAndPrintLineTwice
function we defined above:
readAndPrintLineTwice :: IO ()
readAndPrintLineTwice = getLine >>= \s -> putStrLn s >> putStrLn s
The good thing about this function is that it can be defined on a single line. The bad thing is that it's not exactly the most readable line in the world. The Haskell language designers noticed that monadic definitions were often hard to read and thought up a really nice bit of syntactic sugar to make these kinds of definitions more readable.
The basis of this syntactic sugar is the observation that a large number of operations in monadic code have one of the following two forms:
-- Form 1.
-- mv :: m a
-- f :: a -> m b
mv >>= \x -> f x
-- Form 2.
-- mv :: m a
-- mv2 :: m b
mv >> mv2
Therefore, a notation was figured out that made both of these forms very
readable. It starts with the keyword do
followed by some monadic
operations. Here are our two examples in the do
notation:
-- Form 1, do-notation.
do v <- mv
f v
-- Form 2, do-notation.
do mv
mv2
In form 1, the first line says that we take the monadic value mv
and "unpack"
it into a regular value called v
. The second line just says that we then
compute f v
. The result of f v
is the result of the entire expression.
In form 2, the first line says that we "perform" the monadic value ("action")
mv
. The second line says that we "perform" the monadic value ("action")
mv2
. So this is just a notation to describe the sequencing of mv
and mv2
using the >>
operator.
What the Haskell compiler does is convert the do
-notation versions of Form
1 and Form 2 into the versions without do
. This is just a syntactic
transformation, and the meaning is identical. Furthermore, you can mix the
two forms, and have multiple occurrences of each form. For instance:
-- mv :: m a
-- v1 :: a
-- f :: a -> m b
-- v2 :: b
-- mv3 :: m c
do v1 <- mv
v2 <- f v1
mv3
return v2
This means exactly the same thing as:
mv >>= (\v1 ->
(f v1 >>= (\v2 ->
(mv3 >>
(return v2)))))
Or, without the parentheses, as:
mv >>= \v1 ->
f v1 >>= \v2 ->
mv3 >> return v2
You can imagine that as the monadic expressions get larger, the do
form
will continue to be easy to read, while the form without do
(called the
"desugared" form) will often be very hard to read. That's why this notation
exists. What's cool is that this notation works for all monads, not just
for any particular one.
You're also allowed to mix do
-notation with the desugared notation, like
this:
do v1 <- mv
v2 <- f v1
mv3 >> return v2
Sometimes this is useful, but it can also often lead to ugly hard-to-read code.
Let's look at what our earlier examples would look like with the
do
-notation.
-- Read a line, then print it.
readAndPrintLine :: IO ()
readAndPrintLine =
do line <- getLine
putStrLn line
-- Print two lines, one after another.
-- Not a function.
do putStrLn "This is string 1."
putStrLn "This is string 2."
-- Read a line and print it twice.
readAndPrintLineTwice :: IO ()
readAndPrintLineTwice =
do line <- getLine
putStrLn line
putStrLn line
In all three cases, the do
-notation makes the code significantly easier to
read. Interestingly, it has the added benefit (or drawback, depending on your
perspective) of making Haskell code look like imperative code! If we read a
do
-expression top to bottom, it looks like it was written in an imperative
language that uses <-
as an assignment statement. For instance,
readAndPrintLine
can be described as: "use getLine
to read a line, which you
put into the variable line
; then use putStrLn
to print out that variable."
This is emphatically not what's actually happening (for instance, line
is
not a variable), but you can read the code as if it is. When you're dealing
with code that does a lot of input and output, it can be very convenient to be
able to write it this way.
The do
-notation includes other features as well. For instance, you can
incorporate let
statements and case
statements into the body of a do
expression, and this is often handy. I won't go into this here, though,
because it's pretty routine — see any Haskell tutorial for more
details.
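For the record, here is a minimal sketch of what that looks like; greet is a made-up example, and both the let and the case appear directly as statements inside the do expression.
greet :: IO ()
greet =
  do name <- getLine
     let greeting = "Hello, " ++ name ++ "!"
     case name of
       "" -> putStrLn "You didn't type anything."
       _  -> putStrLn greeting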
Next time
In the next installment I'll start deriving monads, starting with the Maybe
monad (for computations that can fail) and continuing with the list monad (for
computations that can return multiple results).