Road to Haskeller #21 - Weak Head Normal Form

Last Edited: 7/27/2024

The blog post introduces how lazy evaluation and seq are implemented in Haskell.

Haskell & WHNF

Reduction and Normal Form

The process of evaluating an expression until all the function applications have been expanded is called reduction, and the expression resulting from reduction that cannot be further reduced is called normal form. To understand what I mean by that, let's look at an example:

square :: Int -> Int
square x = x * x
 
fst :: (a, b) -> a
fst (x, y) = x

Now that we've defined the necessary functions, let's look at how reduction could be performed on a specific expression:

fst (sqare 3, square 4)
-- -> fst (3*3, square 4) (square)
-- -> fst (9, square 4) (*)
-- -> fst (9, 4*4) (square)
-- -> fst (9, 16) (*)
-- -> 9 (fst)

The above process of resulting in 9, which is in a normal form that cannot be further reduced, is an example of reduction. To be more precise, the above is an example of innermost reduction, which reduces all the arguments of the functions before applying the functions. However, we can notice that such a reduction strategy is not the best as it calculates square 4 unnecessarily. Here, we could have used outermost reduction for fewer reduction steps.

-- Outermost Reduction
fst (sqare 3, square 4)
-- -> square 3 (fst)
-- -> 3*3 (square)
-- -> 9 (*)

As the above reduction steps illustrate, outermost reduction aims to apply the function definition first before reducing the arguments of the function. One great thing about outermost reduction is that the Church-Rosser Theorem states that if there is one terminating reduction, then outermost reduction will terminate, too. This means that using outermost reduction guarantees us that we can terminate reduction on all expressions that have a way of terminating. To get an intuitive sense of the above, let's look at an example:

loop = 1 + loop
 
-- Outermost Reduction
fst (4, loop)
-- -> 4 (fst)
 
-- Innermost Reduction
fst (4, loop)
-- -> fst (4, (1+loop)) (loop)
-- -> fst (4, (1+(1+loop))) (loop)
-- :
-- Never Terminates!

The above demonstrates how there are multiple cases where outermost reduction can terminate while other reduction strategies cannot, due to how outermost reduction can often discard unnecessary arguments.

Graph Outermost Reduction

Although we would like to use outermost reduction due to the Church-Rosser Theorem, there are certain cases where innermost reduction leads to fewer computations:

-- Innermost Reduction
square (1+2)
-- -> square 3 (+)
-- -> 3*3 (square)
-- -> 9 (*)
 
-- Outermost Reduction
square (1+2)
-- -> (1+2)*(1+2) (square)
-- -> 3*(1+2) (+)
-- -> 3*3 (+)
-- -> 9 (*)

As we can see from the above, the outermost reduction has one extra step compared to innermost reduction because outermost reduction sometimes has to perform computations separately on the same argument. To make sure that outermost reduction performs at least the same steps as innermost reduction, we can share the outcome of the computation when the argument is the same (sharing), using graph outermost reduction.

-- Graph Outermost Reduction
square (1+2)
--     sqare
--      / \
--     _   _
--      \ /
--       +
--      / \
--    1     2

By recognizing the case where arguments are replicated and using a thunk _ (just like how we use let bindings to store a value in a variable and use that in multiple places), you can store the result of the computation and share that result accordingly, so that we can eliminate unnecessary computations. Haskell uses this graph outermost reduction strategy to ensure reduction termination and lazy evaluation with the fewest computations.

Weak Head Normal Form

Now that we understand how lazy evaluation is implemented, let's try to understand how the seq function is implemented. In the last article on strictness, the seq function was defined such that the first argument is evaluated before returning the second argument. However, to be more precise, the seq function is defined such that the first argument is in weak head normal form (WHNF) before returning the second argument, not in normal form.

For the data values, an expression is in WHNF when an expression is fully evaluated up to at least the first data constructor. Hence, the following shows how we convert data values into WHNF:

1+1 -- -> 2
(1, 2+2) -- -> (1, 2+2) 
-- We don't need to evaluate inside of tuple
[True && False] -- -> _ : _
-- We don't need to evaluate inside of list (_ is thunk)
data myData = Data Int
Data (2+2) -- -> A _
-- We don't need to evaluate inside of Data (_ is thunk)

As tuples and lists are both data constructors, we do not need to evaluate further. This also applies to your custom datatype, as in the case of myData. The functions can also be in WHNF when they are fully expressed in the form of an anonymous function.

f x = <expr> --- Not WHNF
f = \x -> <expr> --- WHNF
 
-- Not WHNF
f [] = 0
f x:xs = x + f xs
 
-- WHNF
f = \xs -> 
  case xs of 
  [] -> 0
  x:xs' -> x + f xs'

As seq forces the first argument to be evaluated only until it is in WHNF, seq behaves as follows. (Note: the :sprint command in GHCi can print an expression without evaluating it.)

swap :: (a, b) -> (b, a)
swap (x, y) = (y, x)
 
x = 1 + 2 : Int
a = (x, x+1)
b = swap a
 
-- ghci> :sprint a
-- a = _
 
-- ghci> :sprint b
-- b = _
 
-- ghci> seq a ()
-- a = (_, _)
 
-- ghci> seq b ()
-- b = (_, _)
 
-- ghci> seq a ()
-- a = (_, _)

At first, a and b are not in WHNF as they are not evaluated at all. Hence, if you call seq on them, it will just evaluate once to arrive at (_, _) for both in WHNF. However, even if we run seq again on a or b, (_, _) is already in WHNF, so it does not evaluate further. It is only until we print it out normally that the result is needed and everything gets evaluated.

Exercises

This is an exercise section where you can test your understanding of the material introduced in the article. I highly recommend solving these questions by yourself after reading the main part of the article. You can click on each question to see its answer.

Resources