It has been almost half a year since my last blog post on OCaml. In fact, I haven't even touched OCaml in all that time. My current job (in Python) got much busier. But finally, things ~~camls~~ calm down a bit now. I can at least find time to sit in front of my 2008 MacBook Pro, open a terminal, and do

```
opam update
opam upgrade
opam switch 4.02.3
eval `opam config env`
```

Hmmmm, I like this feeling.

**typeocaml** is back now. I dug my typeocaml notebook out of one of our 78 packing boxes, and went through the list of *many things about OCaml* that I planned to write about. Those items still look familiar, but they are obviously not very clear in my mind any more. One should never leave a vacuum around things beloved; they fade.

Anyway, enough chit-chat, back to the topic.

This post is about visualising the randomness of random number generators. It is lightweight and just something I did for fun with OCaml. We will see a good way to test randomness and also learn how to create a *jpg* picture using an OCaml image library: **camlimages**. I hope it will be a tiny bit cool.

Say we got a function `val ran_gen : int -> int` from nowhere. It claims to be a good random integer generator with uniform distribution: it takes a bound and "perfectly" generates a random integer within [0, bound). The usage is simple, but the question is: *can you trust it?* Or, *will the numbers it generates really be random?*

For example, here is a rather simple `ran_via_time`:

```
let decimal_only f = f -. (floor f)

let ran_via_time bound =
  ((Unix.gettimeofday () |> decimal_only) *. 100_000_000. |> int_of_float) mod bound
(*
  Unix.gettimeofday () returns a float in seconds, like 1447920488.92716193.
  We take the decimal part and amplify it to an integer,
  i.e., 1447920488.92716193 ==> 0.92716193 ==> 92716193.
  Then we mod it by bound to get the final "random" number.
*)
```

This generator is based on the *timestamp* at the moment it is called, taken modulo the *bound*. Will it be a good generator? My gut says *not really*. If the calls to the function follow some pattern, the numbers can easily become constant.

For example, if the *bound* is 10 and I make the first call at time *t*, I will get `t mod 10`. If I call again at *t + 10*, I will get `(t+10) mod 10`, which is actually `t mod 10`. So if I call it every 10 seconds, I get a constant. Thus this generator's randomness would not be as good as it claims to be.
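This aliasing is easy to check with plain integer arithmetic. A tiny sketch (the timestamp value below is made up, not from the post):

```ocaml
(* calls spaced exactly 10 apart give the same result modulo 10 *)
let () =
  let t = 92716193 in  (* a made-up "amplified timestamp" *)
  assert (t mod 10 = (t + 10) mod 10);
  assert (t mod 10 = (t + 20) mod 10)
```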

For any given random number generator, we have to make sure its randomness is good enough to suit our goal, especially when we rely on it for trading strategies, gaming applications, online gambling, etc.

However, we also need to be aware that most random generators are not perfect (see *Random number generator* and *Pseudorandom number generator*). What we often do is simply check whether the randomness of a given generator reaches a certain level or not.

The Chi-squared test is a fairly simple and common methodology to test randomness mathematically.

Say we have a generator producing random integers in [0, 10) with uniform distribution. Ideally, if the generator were perfect and we ran it 1000 times, there would be 100 *0*s, 100 *1*s, ..., 100 *9*s, right? For the test, we can just run the generator *N* times and see whether the frequency of each generated number matches, or is close to, *N / bound*.

Of course, the frequencies of the numbers will not exactly match the expectation, so we need some mathematics to measure the level of matching.

- *k* is the number of possible candidates, e.g., *k = 10* for [0, 10)
- *Ei* is the expected frequency of each candidate, *i = 1, 2, ..., k*
- *Oi* is the observed frequency of each candidate produced by the generator, *i = 1, 2, ..., k*

Thus we can get `X^2 = (O1-E1)^2/E1 + ... + (Ok-Ek)^2/Ek`.

Essentially, the bigger `X^2` is, the less likely the match. If we really want to see how unlikely or likely they match each other, we need to check `X^2` against the *chi square distribution table* like below:

- The *Degrees of Freedom* is our `k-1` (if *k = 1*, then our freedom is *0*, which means we have just one choice all the time and don't have any freedom).
- The bold column headings are the probabilities that the observations match the expectations.
- The numbers in the cells are values of `X^2`.

For example, say in our case above, `k = 10`, so the *degree of freedom is 9*.

If we get `X^2` as *2*, which is less than *2.088*, then we can say there is more than a *0.99* probability that the observations match the expectations, i.e., our random generator follows the uniform distribution very well.

If we get `X^2` as *23*, which is bigger than *21.666*, then the probability of matching is less than *0.01*, so our generator is at least not following the uniform distribution.

This is roughly how we can use the Chi-squared test for randomness checking. Of course, the description above is amateur; I am just trying to put it in a way that is easy to understand.
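As a side note, the statistic itself is easy to compute in OCaml. Here is a minimal sketch (not from the original post; the helper name `chi_squared` is my own), following the formula above:

```ocaml
(* Compute the chi-squared statistic X^2 for a generator ran_f producing
   values in [0, bound), run n times; the expected frequency of each
   candidate is n / bound. *)
let chi_squared ~ran_f ~bound ~n =
  let observed = Array.make bound 0 in
  for _i = 1 to n do
    let v = ran_f bound in
    observed.(v) <- observed.(v) + 1
  done;
  let expected = float_of_int n /. float_of_int bound in
  Array.fold_left
    (fun acc o ->
       let d = float_of_int o -. expected in
       acc +. (d *. d) /. expected)
    0. observed
```

For instance, `chi_squared ~ran_f:Random.int ~bound:10 ~n:1000` should usually stay well below *21.666*, the *0.01* cutoff at 9 degrees of freedom.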

Let's admit one thing first: math can be boring. Although math describes things most precisely, it does not feel intuitive, or say, straightforward, to people like me. If I can directly see problems and their solutions with my eyes, I am much happier, and this is why I tried to visualise the randomness of generators.

The way of the visualisation is fairly simple.

- We create a canvas.
- Each pixel on the canvas is a candidate random number that the generator can produce.
- We run the generator for lots of times.
- The more times a pixel gets hit, the deeper the color we draw on it.
- In the end, we can directly feel the randomness depending on the color distribution on the canvas.

Initially we have such a canvas.

We use the random generator to generate numbers. If a slot gets a hit, we put a color on it.

If a slot keeps being hit, we put a deeper and deeper color on it.

When the generator finishes, we can get a final image.

From the resulting image, we can see that several numbers are much deeper than others, and we directly get a feeling about the generator.

Of course, this is just a trivial example. Normally, we would use a much bigger picture and see the landscape of the randomness, instead of just some spots. Let's get our hands dirty then.

Assuming the essentials of OCaml, such as *ocaml* itself, *opam* and *ocamlbuild*, have been installed, the only 3rd party library we need to get now is camlimages.

Before invoking `opam install camlimages`, we need to make sure *libjpeg* is installed in the system first. Basically, *camlimages* relies on system libs such as *libjpeg* and *libpng* to save image files in the respective formats. In this post, we will save our images as `.jpg`, so depending on the operating system, we can just directly try installing by the keyword *libjpeg*.

For example,

- *mac* -> `brew install libjpeg`
- *ubuntu* -> `apt-get install libjpeg`
- *redhat* -> `yum install libjpeg`

After *libjpeg* is installed, we can then invoke `opam install camlimages`.

In addition, for easier compiling or testing, *ocamlbuild* (`opam install ocamlbuild`) and *utop* (`opam install utop`) could be installed, but that is up to you.

```
brew install libjpeg
opam install camlimages
opam install ocamlbuild
opam install utop
```

The representation of an *RGB* image in memory is a bitmap, which is fairly simple: just a 2-d array (*width* x *height*) with each slot holding a color value, in a form like *(red, green, blue)*. Once we have such a bitmap, we can save it to disk in various formats (different compression techniques).

So the process could be like this:

- Create a matrix array of a certain size (*width* and *height*)
- Manipulate the values (colors) of all slots via randomly generated numbers
- Convert the matrix bitmap to a real image
- Save the image to a *jpg* file

First, let's create a matrix.

```
open Camlimages
(* w is the width and h is the height *)
let bitmap = Array.make_matrix w h {Color.r = 255; g = 255; b = 255}
(*
Color is a type in camlimages for presenting RGB colors,
and initially white here.
*)
```

When we generate a random number, we need to map it to a slot. Our image is a rectangle, i.e., it has rows and columns. Our random numbers are within a bound *[0, b)*, i.e., 1-dimensional. A usual way to convert from 1-d to 2-d is to divide the number by the width to get the row, and take the number modulo the width to get the column.

```
let to_row_col ~w ~v = v / w, v mod w
```
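As a quick sanity check of this mapping (values picked arbitrarily):

```ocaml
let to_row_col ~w ~v = v / w, v mod w

(* with width 4, value 10 lands in row 2, column 2 *)
let () = assert (to_row_col ~w:4 ~v:10 = (2, 2))
```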

After we get a random number, we need to make its corresponding slot darker. Initially, each slot is pure white, i.e., `{Color.r = 255; g = 255; b = 255}`. In order to make it darker, we simply make the *rgb* components equally smaller.

```
let darker {Color.r = r; g = g; b = b} =
  let d c = if c - 30 >= 0 then c - 30 else 0 in
  {Color.r = d r; g = d g; b = d b}
(*
  The decrement `30` really depends on how many times you would like to run
  the generator and also how obvious you want the color difference to be.
*)
```

And now we can integrate the main random number generation.

```
(*
  ran_f: the random number generator function, producing numbers within [0, bound)
         int -> int
  w, h: the width and the height
         int
  n: the expected number of times each number is generated
         int
  Note that in total, the generator will be called w * h * n times.
*)
let random_to_bitmap ~ran_f ~w ~h ~n =
  let bitmap = Array.make_matrix w h {Color.r = 255; g = 255; b = 255} in
  let to_row_col ~w ~v = v / w, v mod w in
  let darker {Color.r = r; g = g; b = b} =
    let d c = if c - 30 >= 0 then c - 30 else 0 in
    {Color.r = d r; g = d g; b = d b}
  in
  for _i = 1 to w * h * n do
    let row, col = to_row_col ~w ~v:(ran_f (w * h)) in
    bitmap.(row).(col) <- darker bitmap.(row).(col)
  done;
  bitmap
```

We will use the module *Rgb24* in *camlimages* to map the matrix to an image.

```
let bitmap_to_img ~bitmap =
  let w = Array.length bitmap in
  let h = if w = 0 then 0 else Array.length bitmap.(0) in
  let img = Rgb24.create w h in
  for i = 0 to w - 1 do
    for j = 0 to h - 1 do
      Rgb24.set img i j bitmap.(i).(j)
    done
  done;
  img
```

Module *Jpeg* will do the trick perfectly, as long as you remembered to install *libjpeg* before *camlimages* arrives.

```
let save_img ~filename ~img = Jpeg.save filename [] (Images.Rgb24 img)
```

By grouping them all together, we get our master function.

```
let random_plot ~filename ~ran_f ~w ~h ~n =
  let bitmap = random_to_bitmap ~ran_f ~w ~h ~n in
  let img = bitmap_to_img ~bitmap in
  save_img ~filename ~img
```

All source code is here: https://github.com/MassD/typeocaml_code/tree/master/visualise_randomness

The OCaml standard library provides a random integer generator:

```
val int : int -> int
Random.int bound returns a random integer between 0 (inclusive) and bound (exclusive). bound must be greater than 0 and less than 2^30.
```

Let's have a look at what it looks like:

```
let _ = random_plot ~filename:"random_plot_int.jpg" ~ran_f:Random.int ~w:1024 ~h:1024 ~n:5
```

Is it satisfying? Well, I guess so.

Let's try it on the `ran_via_time` generator we invented before:

```
let decimal_only f = f -. (floor f)

let ran_via_time bound =
  ((Unix.gettimeofday () |> decimal_only) *. 100_000_000. |> int_of_float) mod bound

let _ = random_plot ~filename:"random_plot_time.jpg" ~ran_f:ran_via_time ~w:1024 ~h:1024 ~n:5
```

Is it satisfying? Surely not.

Apparently, there are quite some patterns in the images. Can you identify them?

For example,

One pattern is the diagonal lines (parallel to the red lines I've added).

Of course, visualisation of randomness is nowhere near an accurate assessment of the quality of random generators. It is just a fun way to get a feel for the randomness.

I hope you enjoy it.

As pointed out by Ono-Sendai on Hacker News, it might be better to use `png` rather than `jpg`.

I've tried, and the results for the bad `ran_via_time` generator are like:

They do not seem that different to my eyes, but anyway it was a good point.

https://github.com/MassD/typeocaml_code/blob/master/visualise_randomness/plot_random.ml has been updated for `png` support. **Please remember to install libpng-devel for your OS for png saving support**.

@h4nnes has suggested that I try the *fortuna generator* from nocrypto.

```
opam install nocrypto
```

and

```
let _ = Nocrypto_entropy_unix.initialize()
let _ = random_plot ~filename:"random_plot_fortuna" ~ran_f:Nocrypto.Rng.Int.gen ~w:1024 ~h:1024 ~n:5
```

and we get

It is a very nice generator, isn't it?

In this post, we will talk about producing permutations using OCaml. Generating permutations was actually one of my first self-assigned homeworks when I started to learn OCaml years ago. It is a good exercise to train our skills with *list*, *recursion*, and the fundamental *fold* and *map* in OCaml. It also shows the conciseness of OCaml as a language.

We will first present **2 common approaches** for generating all permutations of a list of elements.

Then we introduce the Johnson Trotter algorithm, which enables us to generate one permutation at a time.

Although the Johnson Trotter algorithm uses an imperative array, it gives us the opportunity to implement a stream (using a stream list or the built-in `Stream` module) of permutations, i.e., we generate a permutation only when we need one. We will describe that last.

We can generate permutations using the *recursion* technique. As we described in Recursion Reloaded, let's first assume we already have a function that can produce all permutations.

What exactly `f` will generate is not our concern for now, but we are sure that given a list (say, of 3 distinct elements), `f` will produce a list of permutations (6 in total in this example).

So now, what if our original list has one more new element?

What should we do to combine the new element with the old list of permutations, in order to generate a new list of permutations?

Let's first take a look at how to combine the new element with one permutation.

A good way, like shown above, is to insert the new element into all possible positions. Easy, right?

So for the new list of permutations, we just insert the new element into all possible positions of all old permutations.

First let's implement the function that combines an element with a permutation (which is actually a normal list).

```
(* note that in order to preserve a certain order and also show the conciseness
   of the implementation, it is not tail-recursive *)
let ins_all_positions x l =
  let rec aux prev acc = function
    | [] -> (prev @ [x]) :: acc |> List.rev
    | hd :: tl as l -> aux (prev @ [hd]) ((prev @ [x] @ l) :: acc) tl
  in
  aux [] [] l
```
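As a quick check of the behaviour (the definition is repeated here so the snippet is self-contained): inserting `1` into every position of `[2; 3]` gives three lists.

```ocaml
let ins_all_positions x l =
  let rec aux prev acc = function
    | [] -> (prev @ [x]) :: acc |> List.rev
    | hd :: tl as l -> aux (prev @ [hd]) ((prev @ [x] @ l) :: acc) tl
  in
  aux [] [] l

(* 1 goes in front, in the middle, and at the end *)
let () =
  assert (ins_all_positions 1 [2; 3] = [[1; 2; 3]; [2; 1; 3]; [2; 3; 1]])
```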

Now the main permutations generator.

```
let rec permutations = function
  | [] -> []
  | x :: [] -> [[x]] (* we must specify this edge case *)
  | x :: xs ->
    List.fold_left (fun acc p -> acc @ ins_all_positions x p) [] (permutations xs)
```

Here is the Gist.

There is another way to look at the permutations.

For a list of 3 elements, each element can be the head of all permutations of the rest elements. For example, *blue* is the head of the permutations of *green* and *yellow*.

So what we can do is

- Get an element out
- Generate all permutations of all the other elements
- Stick the element as the head of every permutation
- Repeat the above until all elements have had their chance to be the head

First we need a function to remove an element from a list.

```
let rm x l = List.filter ((<>) x) l
```

The `rm` above is not a must, but it will make the meaning of the following `permutations` clearer.

```
let rec permutations = function
  | [] -> []
  | x :: [] -> [[x]]
  | l ->
    List.fold_left
      (fun acc x -> acc @ List.map (fun p -> x :: p) (permutations (rm x l)))
      [] l
(* The List.fold_left gives every element its opportunity to be the head *)
```
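A quick self-contained sanity check of this second implementation (definitions repeated from above): a list of *n* distinct elements should yield *n!* permutations.

```ocaml
let rm x l = List.filter ((<>) x) l

let rec permutations = function
  | [] -> []
  | x :: [] -> [[x]]
  | l ->
    List.fold_left
      (fun acc x -> acc @ List.map (fun p -> x :: p) (permutations (rm x l)))
      [] l

let () =
  (* 4! = 24 permutations, and the reversed list must be among them *)
  assert (List.length (permutations [1; 2; 3; 4]) = 24);
  assert (List.mem [4; 3; 2; 1] (permutations [1; 2; 3; 4]))
```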

Here is the Gist

So far in all our previous posts, tail recursion has always been considered in the code. However, when talking about permutations, tail recursion has been ignored. There are two reasons:

First, it is not possible to make the recursion tail-recursive everywhere in the two solutions. The best we can do is make certain parts tail-recursive.

Secondly, permutation generation is inherently expensive: there are *n!* permutations, so it is slow. If one tries to generate all permutations at once for a huge list, it would not be feasible. Thus, when talking about permutations here, we assume no long list is given as an argument; hence, we assume no stack overflow will ever occur.

In addition, ignoring tail recursion makes the code cleaner.

If a list is huge indeed and we have to somehow use possibly all its permutations, then we can think of making the permutations as a *stream*.

Each time we generate just one permutation, and then we apply our `use_permutation` function, whatever it may be. If we need to continue, we ask the stream for one more permutation. If we find what we want in the middle, we don't need to generate more permutations, and time is saved.

If we still have to go through all the permutations, the process will still cost us much time. However, we avoid keeping all the permutations in memory, and a potential stack overflow.

In order to achieve *a stream of permutations*, we need Johnson Trotter algorithm and a stream.

The advantage of this algorithm is its ability to generate a new permutation based on the previous one, via simple `O(n)` operations (the very first permutation is the list itself). This is ideal for our stream adoption.

The disadvantage, especially for OCaml, is that it needs a mutable array. Fortunately, we can encapsulate the array in a module or inside a function, without exposing it to the outside world. Thus, a certain level of safety is maintained.

Personally I think this algorithm is very clever. Johnson must have spent quite some time observing the changes across all permutations and setting up a group of well-defined rules to make the changes happen naturally.

The first assumption of this algorithm is that the array of elements is initially sorted in ascending order (*[1]*).

If in some context we cannot sort the original array, we can attach additional keys, such as simple integers starting from `1`, to every element, and carry on the algorithm based on those keys.

For example, if we have `[|e1; e2; e3; e4|]` and we do not want to sort it, we just put an integer in front of each element, like `[|(1, e1); (2, e2); (3, e3); (4, e4)|]`. All the following steps can then target the key, and only when returning a permutation do we output the `e` in the tuple.

For simplicity, we will use an example array `[|1; 2; 3; 4|]`, which is already sorted.

The key idea behind the algorithm is to move an element (or say, switch two elements) at a time and after the switching, we get our new permutation.

For any element, it might be able to move either *Left* or *Right*, i.e., switch position with either *Left* neighbour or *Right* one.

So we will attach a direction - *L* (initially) or *R* - to every element.

Even if an element has a direction, it might not be able to move in that direction. Only if the element has a **smaller** neighbour in that direction can it move.

For example,

`4` and `2` are movable, because the neighbours on their *left* are smaller.

`3` is not movable, because `4` is not smaller.

`1` is not movable, because it doesn't have any neighbour on its *left*.

As we described before, the algorithm makes a new permutation by moving an element, i.e., switch an element with the neighbour on its *direction*.

What if more than one element is movable? We will choose the largest one.

Each time, when we are about to generate a new permutation, we simply scan the array, find the largest movable element, and move it.

If we cannot find such an element, it means we have generated all possible permutations and we can end now.

For example,

Although in the above case `2`, `3` and `4` are all movable, we will move only `4`, since it is the largest.

The whole process will end if no element is movable.

Note that **this scan is before the movement**.

**After we make a movement**, immediately we need to scan the whole array and flip the directions of elements that are larger than the element which is just moved.

**Direction**

```
type direction = L | R
let attach_direction a = Array.map (fun x -> (x, L)) a
```

**Move**

```
let swap a i j =
  let tmp = a.(j) in
  a.(j) <- a.(i);
  a.(i) <- tmp

let is_movable a i =
  let x, d = a.(i) in
  match d with
  | L -> i > 0 && x > fst a.(i-1)
  | R -> i < Array.length a - 1 && x > fst a.(i+1)

let move a i =
  let _, d = a.(i) in
  if is_movable a i then
    (match d with
     | L -> swap a i (i-1)
     | R -> swap a i (i+1))
  else
    failwith "not movable"
```

**Scan for the largest movable element**

```
let scan_movable_largest a =
  let rec aux acc i =
    if i >= Array.length a then acc
    else if not (is_movable a i) then aux acc (i+1)
    else
      let x, _ = a.(i) in
      match acc with
      | None -> aux (Some i) (i+1)
      | Some j -> aux (if x < fst a.(j) then acc else Some i) (i+1)
  in
  aux None 0
```

**Scan to flip larger**

```
let flip = function | L -> R | R -> L

let scan_flip_larger x a =
  Array.iteri (fun i (y, d) -> if y > x then a.(i) <- (y, flip d)) a
```

**Permutations generator**

```
let permutations_generator l =
  let a = Array.of_list l |> attach_direction in
  let r = ref (Some l) in
  let next () =
    let p = !r in
    (match scan_movable_largest a with (* find the largest movable *)
     | None -> r := None (* no more permutations *)
     | Some i ->
       let x, _ = a.(i) in
       move a i; (* move *)
       scan_flip_larger x a; (* after the move, scan to flip *)
       r := Some (Array.map fst a |> Array.to_list));
    p
  in
  next

(* an example of a permutation generator for [1;2;3]:
   every time it is called, generator () will give either
   the next permutation or None *)
let generator = permutations_generator [1;2;3]
> generator();;
> Some [1; 2; 3]
> generator();;
> Some [1; 3; 2]
> generator();;
> Some [3; 1; 2]
> generator();;
> Some [3; 2; 1]
> generator();;
> Some [2; 3; 1]
> generator();;
> Some [2; 1; 3]
> generator();;
> None
```

Here is the Gist.

As said before, although we use an array and a `ref` for the implementation, we can hide them behind the interface `permutations_generator`. This makes our code less fragile, which is good. However, for OCaml code with imperative parts, we should not forget `Mutex` locks for thread safety.

Now it is fairly easy to produce a stream of permutations via built-in Stream.

```
let stream_of_permutations l =
  let generator = permutations_generator l in
  Stream.from (fun _ -> generator ())
```

**[1]**. The array can be in descending order, which means later on we need to set all initial directions to *R*.

Our easter egg happens to be Pearl 3.

A function $ f $ can have the following properties:

- $ f $ takes two arguments: $ x $ and $ y $
- Both $ x $ and $ y $ are natural numbers, i.e., non-negative integers
- $ f $ also returns natural numbers
- $ f $ is strictly increasing in each argument. This means if $ x $ increases or decreases, the corresponding result of $ f $ will also increase or decrease. The same applies to $ y $.

Now we are given such a function $ f $ and a natural number `z`, and we want to find all pairs of $ x $ and $ y $ that make $ f (x, y) = z $.

In the OCaml world, this problem requires us to write a function `let find_all f z = ...` which returns a list of `(x, y)` that satisfy `f x y = z`, assuming the supplied `f` meets the requirements above.

This problem seems not that difficult. A trivial brute-force solution comes out immediately. We can simply try every possible value for $ x $ and $ y $ on $ f $, and accumulate all $ (x, y) $ satisfying $ f (x, y) = z $ to the list. There is just one silly question:

Can we try all infinite $ x $ and $ y $?

Of course we cannot. We should set a range for both $ x $ and $ y $; otherwise, our solution would run forever. Then how? With some simple math analysis, we can conclude something about $ f $, even though we don't know exactly what $ f $ is.

Here are what we know so far:

- $ x >= 0 $
- $ y >= 0 $
- $ z >= 0 $
- $ x, y, z $ are all integers, which means the unit of increment or decrement is $ 1 $
- $ f $ goes up or down whenever $ x $ or $ y $ goes up or down.
- $ f (x, y) = z $

From the first 4 points, if we pick a value $ X $ for $ x $, we know we can decrease $ x $ from $ X $ at most $ X $ times. The same is true for $ y $ and $ z $.

Then let's assume $ X $ and $ Y $ make $ f $ equal to $ Z $ ($ X, Y, Z $ are assumed to be concrete values), i.e.,

$$ f (X, Y) = Z $$

Let's now fix $ Y $ and try decreasing $ x $ to $ X-1 $. From point 5, we know

$$ f (X-1, Y) = Z_{X-1} < f (X, Y) = Z $$

We can continue to decrease $ x $ until $ 0 $:

$$ f (0, Y) = Z_0 < f (1, Y) = Z_1 < ... < f (X-2, Y) = Z_{X-2} < f (X-1, Y) = Z_{X-1} < f (X, Y) = Z $$

Together with point 3 ($ z >= 0 $), we can simplify the above inequalities like:

$$ 0 <= Z_0 < Z_1 < ... < Z_{X-2} < Z_{X-1} < Z $$

We can see that through this decreasing of $ x $, the number of results ($ Z_0, Z_1, ..., Z_{X-1} $) is $ X $. How many possible values are there between $ 0 $ (inclusive) and $ Z $ (exclusive)? Of course the answer is $ Z $, right? So we get $$ 0 <= X <= Z $$ and similarly, $$ 0 <= Y <= Z $$.

Thus, for any given $ x, y, z $, their relationship is

$$ 0 <= x, y <= Z $$

We obtained the range and now our brute-force solution will work.

It is simple enough.

```
let find_all_bruteforce f z =
let rec aux acc x y =
if y > z then aux acc (x+1) 0
else if x > z then acc
else
let r = f x y in
if r = z then aux ((x, y)::acc) x (y+1)
else aux acc x (y+1)
in
aux [] 0 0
```
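As a self-contained check (definition repeated from above), we can run the brute-force search on the example used later in the post, $ f (x, y) = x * y $ with $ z = 6 $:

```ocaml
let find_all_bruteforce f z =
  let rec aux acc x y =
    if y > z then aux acc (x+1) 0
    else if x > z then acc
    else
      let r = f x y in
      if r = z then aux ((x, y) :: acc) x (y+1)
      else aux acc x (y+1)
  in
  aux [] 0 0

let () =
  (* 6 = 1*6 = 2*3 = 3*2 = 6*1, so there are exactly 4 pairs *)
  let pairs = find_all_bruteforce ( * ) 6 in
  assert (List.length pairs = 4);
  assert (List.mem (2, 3) pairs)
```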

The time complexity is $ O(z^2) $. To be exact, we need $ (z + 1)^2 $ evaluations of $ f $. This is a bit slow, and we should seek a better solution.

The fact that $ f $ is an increasing function on $ x $ and $ y $ has been used to retrieve the range of $ x $ and $ y $. We actually can extract more from it.

If we fix some value for $ y $ and increase $ x $ from $ 0 $, the results of $ f $ will increase and naturally be sorted. If we fix $ x $ and increase $ y $ from $ 0 $, the same happens: $ f $ increases and is sorted.

We also know $ 0 <= x, y <= Z $; thus, we can create a matrix that has $ z + 1 $ rows and $ z + 1 $ columns. And each cell will be the result of $ f (x, y) $, where $ x $ is the row number and $ y $ is the column number.

The best thing from this matrix is that **all rows are sorted and so are all columns**. The reason is that $ f $ is a strictly increasing function on $ x $ and $ y $.

This matrix converts the original problem to the one that **now we have a board of unrevealed cards and we need to seek for an efficient strategy to find all cards that we want**.

The trivial solution in the previous section has a simplest strategy: simply reveal all cards one by one and collect all that are satisfying.

Let's take an example simple function $ f (x, y) = x * y $ and assume $ z = 6 $.

If we employ the trivial solution, then of course we will find all cards we want.

However, we can see that we need to reveal $ 36 $ cards for $ 4 $ targets.

What a waste! But how can we improve it?

Maybe just start from somewhere, reveal one card, and depending on its value, decide which card to reveal next?

Let's just try starting from the most natural place - the top-left corner, where $ x = 0, y = 0 $.

- We get $ R $.
- If $ R > z $, our problem is solved. This is because the 3 possible next cards must all be bigger than $ R $, as the bigger $ x $ or $ y $ is, the larger the result. So we do not need to move any more.
- What if $ R < z $? Then we have to try to reveal all 3 next cards, which does not help much, since we may still end up revealing all the cards anyway.

We need a better way.

Before we continue to think, let's try another simpler problem first.

Given a value `k` and two sorted lists of integers (all being distinct), find all pairs of integers `(i, j)` where `i + j = k`.

For example, given `[1; 2; 8; 12; 20]` and `[3; 6; 9; 14; 19]` and `k = 21`, we should return `[(2, 19); (12, 9)]`.

Of course, we can just scan all possible pairs. But that is not efficient, and we would waste the excellent hint: *sorted*.

Since sorted input is the fundamental condition for a *merge*, how about we try something similar? Let's put the two sorted lists in parallel; for the first two elements, $ 1 $ and $ 3 $, we know $ 1 + 3 = 4 < 21 $.

It is too small at this moment; what should we do? We know we should move rightwards to increase the sum, but do we move along `list 1`, `list 2`, or both?

We don't actually know, because each possible movement may give us a chance to find good pairs. If we take all possible movements, it makes no sense, as in the end we would just try every possible pair. Hence, we need to find a way to restrict our choices of movements.

How about we put one list in its natural order and put the other in its reversed order?

Now, $ 1 + 19 = 20 < 21 $. It is again too small. What shall we do? Can we move along `list 2`? We cannot, because the next element there is definitely smaller, and if we move along it, we will get an even smaller sum. So moving along `list 1` is our only option.

If `i + j = k`, then good: we collect `(i, j)`. What about the next move? We cannot move along just a single list, because then the future sums would forever be either bigger or smaller than `k`. Thus we move along both, because only increasing `i` and decreasing `j` at the same time may give us a chance to find the target sum again.

What if `i + j > k`? It is easy to see that the only option for us is the next element in `list 2`.
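The movement rules above can be sketched as a two-pointer walk; a minimal sketch (the helper name `find_pairs` is my own), walking `list 1` forward and `list 2` backward by reversing it first:

```ocaml
(* Given a target k and two sorted lists of distinct integers, find all
   pairs (i, j) with i + j = k, following the three movement rules above. *)
let find_pairs k l1 l2 =
  let rec aux acc xs ys =
    match xs, ys with
    | [], _ | _, [] -> List.rev acc
    | x :: xs', y :: ys' ->
      let s = x + y in
      if s = k then aux ((x, y) :: acc) xs' ys'  (* hit: advance both *)
      else if s < k then aux acc xs' ys          (* too small: advance list 1 *)
      else aux acc xs ys'                        (* too big: advance reversed list 2 *)
  in
  aux [] l1 (List.rev l2)

(* the example from the problem statement *)
let () =
  assert (find_pairs 21 [1; 2; 8; 12; 20] [3; 6; 9; 14; 19] = [(2, 19); (12, 9)])
```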

Let's come back to our *sorted* matrix problem. We no longer have two simple sorted lists, but it is not difficult to see that we should **start from the bottom-left corner** (rows are descending and columns are ascending). Depending on what we get from $ R = f (x, y) $, we change direction (up, right-up, or right) and move towards the top-right corner:

- We move *right* along $ y $ until we reach the *ceiling* (the smallest number that is bigger than $ z $), and then change to *up*.
- We move *up* along $ x $ until we reach the *floor* (the biggest number that is smaller than $ z $), and then change to *right*.
- Whenever we reach $ R = f (x, y) = z $, we change to *right-up*.

We can use an example to explain why we want the *ceiling* (similarly explaining why the *floor*).

When we reach the *ceiling*, we know all cells to its left can be discarded, because those are definitely smaller than $ z $.

In this way, the card revealing process for $ f (x, y) = x * y $ looks like this:

```
let find_all_zigzag f z =
  let rec aux acc x y =
    if x < 0 || y > z then acc
    else
      let r = f x y in
      if r < z then aux acc x (y+1)
      else if r > z then aux acc (x-1) y
      else aux ((x, y) :: acc) (x-1) (y+1)
  in
  aux [] z 0
```
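As a self-contained check (repeating the definition, with the pairs `(x, y)` collected as the problem statement requires), the zig-zag search on $ f (x, y) = x * y $ with $ z = 6 $ finds the same 4 pairs as the brute force:

```ocaml
let find_all_zigzag f z =
  let rec aux acc x y =
    if x < 0 || y > z then acc
    else
      let r = f x y in
      if r < z then aux acc x (y+1)       (* too small: move right *)
      else if r > z then aux acc (x-1) y  (* too big: move up *)
      else aux ((x, y) :: acc) (x-1) (y+1)  (* hit: move right-up *)
  in
  aux [] z 0

let () =
  let pairs = find_all_zigzag ( * ) 6 |> List.sort compare in
  assert (pairs = [(1, 6); (2, 3); (3, 2); (6, 1)])
```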

The time complexity is $ O(z) $, and in the worst case we need $ 2 * z + 1 $ evaluations of $ f $.

This solution is a linear one and there is still room for improvement.

There is a quite popular interview question which is very similar to the pearl 3.

We have a matrix of numbers. All rows are sorted and so are all columns. Find the coordinates of a given number.

Although it seems all the numbers are already computed and put in the matrix, this problem and pearl 3 are actually identical.

The zig-zag solution actually scans along the $ x $ axis and the $ y $ axis by turns. The scan on each turn is linear. But wait, each row and column is sorted, right? Does that ring a bell?

When a sequence of elements is sorted and we have a target to find, we can of course try *binary search*.

Simply put, to improve the *zig-zag* solution, we just replace the *linear scan* part with *binary search*.

For each round of binary search, we cannot just search for the given value $ z $ only. Instead,

- our target is the **ceiling** when we binary search **horizontally**, i.e., along $ y $, from left to right;
- our target is the **floor** when we binary search **vertically**, i.e., along $ x $, from bottom to top;
- during the search, if we find a target card, we can simply stop.
The next question is *when does a round of binary search stop?*

When we reach a card equal to $ z $, of course we can stop immediately.

If we never reach an ideal card, the search will still stop, because eventually there will be no room on either the left side or the right side to continue. And that stop point will definitely be the *ceiling* or the *floor* of $ z $.

Again, because we need the *ceiling* for the *horizontal* direction and the *floor* for the *vertical* one, we need to adjust the result after we finish the binary search.

```
type direction = H | V

let rec binary_search z g p q =
  if p >= q then p (* stop at p >= q so the boundary index is still examined *)
  else
    let m = (p + q) / 2 in
    let r = g m in
    if r = z then m
    else if r > z then binary_search z g p (m-1)
    else binary_search z g (m+1) q

(* adjust ceiling or floor *)
let next_xy z f x y direction =
  let r = f x y in
  match direction with
  | _ when r = z -> x, y
  | H when r > z -> x-1, y
  | H -> x-1, y+1
  | V when r < z -> x, y+1
  | V -> x-1, y+1

let find_all_zigzag_bs f z =
  let rec aux acc (x, y) =
    if x < 0 || y > z then acc
    else
      let r = f x y in
      if r = z then aux (r::acc) (x-1, y+1)
      else if r < z then
        let k = binary_search z (fun m -> f x m) y z in
        aux acc (next_xy z f x k H)
      else
        let k = binary_search z (fun m -> f m y) 0 x in
        aux acc (next_xy z f k y V)
  in
  aux [] (z, 0)
```
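As a sanity check, here is a self-contained run of the binary-search zig-zag on $ f (x, y) = x * y $ with $ z = 12 $. Note that the binary search below stops at `p >= q`; with a stop condition of `p + 1 >= q` the boundary index can be skipped, and the solution at $ (12, 1) $ in this example would be missed:

```ocaml
type direction = H | V

let rec binary_search z g p q =
  if p >= q then p
  else
    let m = (p + q) / 2 in
    let r = g m in
    if r = z then m
    else if r > z then binary_search z g p (m-1)
    else binary_search z g (m+1) q

(* adjust ceiling or floor *)
let next_xy z f x y direction =
  let r = f x y in
  match direction with
  | _ when r = z -> x, y
  | H when r > z -> x-1, y
  | H -> x-1, y+1
  | V when r < z -> x, y+1
  | V -> x-1, y+1

let find_all_zigzag_bs f z =
  let rec aux acc (x, y) =
    if x < 0 || y > z then acc
    else
      let r = f x y in
      if r = z then aux (r::acc) (x-1, y+1)
      else if r < z then
        let k = binary_search z (fun m -> f x m) y z in
        aux acc (next_xy z f x k H)
      else
        let k = binary_search z (fun m -> f m y) 0 x in
        aux acc (next_xy z f k y V)
  in
  aux [] (z, 0)

let () =
  (* same six cards as the linear zig-zag finds *)
  assert (List.length (find_all_zigzag_bs (fun x y -> x * y) 12) = 6)
```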

The reason why this pearl is called *Saddleback Search* is (quoted from the book),

> (It is) an important search strategy, dubbed saddleback search by David Gries...
>
> I imagine Gries called it that because the shape of the three-dimensional plot of f, with the smallest element at the bottom left, the largest at the top right and two wings, is a bit like a saddle.

For example, if we plot $ f (x, y) = x * y $ , we can see this

Does it look like a saddle?

As we described in the previous post, leftist tree is a binary tree based functional heap. It manipulates the tree structure so that the left branches are always the longest and operations follow the right branches only. It is a clever and simple data structure that fulfills the purpose of heap.

In this post, we present another functional heap called Binomial Heap (*[1]*). Instead of being a single tree structure, it is a list of *binomial tree*s and it provides better performance than leftist tree on *insert*.

However, the reason I would like to talk about it here is not its efficiency. The fascination of binomial heap is that **if there are N elements inside it, then it will have a determined shape, no matter how it ended up with the N elements**. Personally I find this pretty cool. This is not common in tree related data structures. For example, we cannot predict the shape of a leftist tree with N elements, and the form of a binary search tree can be arbitrary even if it is somehow balanced.

Let's have a close look at it.

Binomial tree is the essential element of binomial heap. Its special structure is why binomial heap exists. Understanding binomial tree makes it easier to understand binomial heap.

Binomial tree's definition does not involve the values associated with the nodes, but just the structure:

- It has a rank *r* and *r* is a natural number.
- Its form is *a root node* with *a list of binomial trees*, whose ranks are strictly *r-1*, *r-2*, ..., *0*.
- A binomial tree with rank *0* has only one root, with an empty list.

Let's try producing some examples.

From point 3, we know the base case:

Now, how about *rank 1*? It should be a root node with a sub binomial tree with rank `1 - 1 = 0`:

Let's continue for *rank 2*, which should have *rank 1* and *rank 0*:

Finally *rank 3*, can you draw it?

If we pull up the left most child of the root, we can see:

This means a binomial tree with rank *r* can be seen as *two* binomial trees with the same rank *r-1*. Furthermore, because of that doubling, and because rank *0* has one node, in terms of the number of nodes, a binomial tree with **rank r must have $ 2^r $ nodes, no more, no less**.

For example, rank *0* has 1 node. Rank *1* is two *rank 0*s, so rank *1* has $ 2 * 1 = 2 $ nodes. Rank *2* then has $ 2 * 2 = 4 $ nodes, and so on and so forth.

Note that $ 2^r = 1 + 2^{r-1} + 2^{r-2} + ... + 2^0 $ and we can see that a rank *r* tree's structure fits exactly to this equation (the *1* is the root and the rest is the children list).

The definition tells us that a rank *r* tree is a root plus a list of trees of rank *r-1*, *r-2*, ..., and *0*. So if we have a binomial tree with an arbitrary rank, can we just insert it to another target tree to form a rank *r* tree?

For example, suppose we have a rank *1* tree, can we insert it to the target tree below for a rank *3* tree?

Unfortunately we cannot, because such a target tree would not be able to exist in the first place; it is not a valid binomial tree, is it?

Thus in order to have a rank *r* tree, we must have two *r-1* trees and link them together. When linking, we need to decide which root becomes the new root, depending on the context. For the purpose of building a *min heap* later, we always let the node with the *smaller* key be the root of the new tree.

Defining a binomial tree type is easy:

```
(* Node of key * child_list * rank *)
type 'a binomial_t = Node of 'a * 'a binomial_t list * int
```

Also we can have a function for a singleton tree with rank *0*:

```
let singleton_tree k = Node (k, [], 0)
```

Then we must have a `link` function which promotes two trees with the same rank to a tree with a higher rank.

```
let link ((Node (k1, c1, r1)) as t1) ((Node (k2, c2, r2)) as t2) =
  if r1 <> r2 then failwith "Cannot link two binomial trees with different ranks"
  else if k1 < k2 then Node (k1, t2::c1, r1+1)
  else Node (k2, t1::c2, r2+1)
```

One possibly interesting problem can be:

> Given a list of $ 2^r $ elements, how to construct a binomial tree with rank *r*?

We can borrow the idea of *merging from bottom to top* for this problem.

```
let link_pair l =
  let rec aux acc = function
    | [] -> acc
    | _::[] -> failwith "the number of elements must be 2^r"
    | t1::t2::tl -> aux (link t1 t2 :: acc) tl
  in
  aux [] l

let to_binomial_tree l =
  let singletons = List.map singleton_tree l in
  let rec aux = function
    | [] -> failwith "Empty list"
    | t::[] -> t
    | l -> aux (link_pair l)
  in
  aux singletons
```
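As a quick sanity check (a self-contained snippet repeating the definitions above), building a tree from $ 2^3 = 8 $ keys should give rank *3* with the smallest key at the root:

```ocaml
(* Node of key * child_list * rank *)
type 'a binomial_t = Node of 'a * 'a binomial_t list * int

let singleton_tree k = Node (k, [], 0)

let link ((Node (k1, c1, r1)) as t1) ((Node (k2, c2, r2)) as t2) =
  if r1 <> r2 then failwith "Cannot link two binomial trees with different ranks"
  else if k1 < k2 then Node (k1, t2::c1, r1+1)
  else Node (k2, t1::c2, r2+1)

let link_pair l =
  let rec aux acc = function
    | [] -> acc
    | _::[] -> failwith "the number of elements must be 2^r"
    | t1::t2::tl -> aux (link t1 t2 :: acc) tl
  in
  aux [] l

let to_binomial_tree l =
  let rec aux = function
    | [] -> failwith "Empty list"
    | t::[] -> t
    | l -> aux (link_pair l)
  in
  aux (List.map singleton_tree l)

let () =
  match to_binomial_tree [5; 3; 8; 1; 9; 2; 7; 4] with
  | Node (k, _, r) -> assert (k = 1 && r = 3)
```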

If we split a binomial tree into levels and pay attention to the number of nodes on each level, we can see:

So from top to bottom, the numbers of nodes on levels are *1*, *3*, *3* and *1*. They happen to be the coefficients of $ (x+y)^3 $.

Let's try rank *4*:

They are *1*, *4*, *6*, *4* and *1*, which are the coefficients of $ (x+y)^4 $ .

The number of nodes on level *k* (0 <= *k* <= *r*) matches $ {r}\choose{k} $, which in turn is **the kth binomial coefficient of $ (x+y)^r $. This is where the name binomial tree came from**.

A binomial heap is essentially a list of binomial trees with distinct ranks. It has two characteristics:

- If a binomial heap has *n* nodes, then its shape is determined, no matter what operations it has gone through.
- If a binomial heap has *n* nodes, then the number of trees inside is `O(logn)`.

The reason for the above points is explained as follows.

As we already knew, a binomial tree with rank *r* has $ 2^r $ nodes. If we move to the context of the binary representation of numbers, then a rank *r* tree stands for the case where there is a list of bits with only the *rth* slot turned on.

Thus, *n* nodes can be expressed as a list of binomial trees with distinct ranks, because the number *n* is actually a list of bits with various slots being *1*. For example, suppose we have 5 nodes (ignoring their values for now); mapping to a list of binomial trees, we will have:

This is where binomial heap comes from.

- Since a number *n* has a determined binary representation, a binomial heap also has a fixed shape as long as it has *n* nodes.
- In addition, because *n* has `O(logn)` effective bits, a binomial heap has `O(logn)` binomial trees.
- If we keep each binomial tree having the min as the root, then for a binomial heap, the overall minimum element is one of those roots.
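For instance, the ranks present in a heap of *n* nodes are exactly the positions of the set bits of *n*. A tiny illustrative helper (`ranks_for` is a hypothetical name, not part of the heap implementation):

```ocaml
(* ranks of the binomial trees in a heap of n nodes = positions of 1-bits in n *)
let ranks_for n =
  let rec aux r n acc =
    if n = 0 then List.rev acc
    else aux (r+1) (n/2) (if n mod 2 = 1 then r::acc else acc)
  in
  aux 0 n []

let () =
  assert (ranks_for 5 = [0; 2]);     (* 5 = 101b: a rank 0 tree and a rank 2 tree *)
  assert (ranks_for 7 = [0; 1; 2])   (* 7 = 111b: ranks 0, 1 and 2 *)
```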

Let's now implement it.

It is easy.

```
type 'a binomial_heap_t = 'a binomial_t list
```

When we *insert* a key `k`, we just create a singleton binomial tree and try to insert the tree into the heap list. The rule is like this:

- If the heap doesn't have a rank *0* tree, then directly insert the new singleton tree (with rank *0*) at the head of the list.
- If the heap has a rank *0* tree, then the two rank *0* trees need to be linked and promoted to a new rank *1* tree, and we have to continue trying to insert the rank *1* tree into the rest of the list, which potentially starts with an existing rank *1* tree.
- If there is already a rank *1* tree, then link and promote to rank *2*... and so on and so forth, until the newly promoted tree has a slot to fit in.

Here are two examples:

The *insert* operation is actually the addition between *1* and *n* in binary representation, in a reversed order.

```
let insert k h =
  let rec aux acc (Node (_, _, r1) as bt1) = function
    | [] -> List.rev (bt1::acc)
    | (Node (_, _, r2) as bt2)::tl ->
      if r1 = r2 then aux acc (link bt1 bt2) tl
      else if r1 < r2 then List.rev_append acc (bt1::bt2::tl)
      else aux (bt2::acc) bt1 tl
  in
  aux [] (singleton_tree k) h
```

If the heap is full, as in having a consecutive series of tree ranks starting from rank *0*, we need `O(logn)` operations to finish the *insert*. However, once that is done, most of the lower rank slots are empty (like shown in the above figure), and later new *insert*s won't need `O(logn)` any more. Thus, the time complexity of *insert* seems to be `O(logn)`, but it is **actually amortised O(1)**.

Note the above *insert* description is just for demonstration purpose. Like in Leftist tree, *merge* is the most important operation for binomial heap and *insert* is just a simpler *merge*.

The *merge* is like this:

- Get the two heads (`bt1` and `bt2`) out of the two heaps (`h1` and `h2`).
- If `rank bt1 < rank bt2`, then `bt1` leaves first and we continue to merge `the rest of h1` and `h2`.
- If `rank bt1 > rank bt2`, then `bt2` leaves first and we continue to merge `h1` and `the rest of h2`.
- If `rank bt1 = rank bt2`, then `link bt1 bt2`, add the new tree to `the rest of h1` and merge the new `h1` and `the rest of h2`.

I will skip the diagram and directly present the code here:

```
let rec merge h1 h2 =
  match h1, h2 with
  | h, [] | [], h -> h
  | (Node (_, _, r1) as bt1)::tl1, (Node (_, _, r2) as bt2)::tl2 ->
    if r1 < r2 then bt1::merge tl1 h2
    else if r1 > r2 then bt2::merge h1 tl2
    else merge (link bt1 bt2::tl1) tl2

(* a better and simpler version of insert *)
let insert' k h = merge [singleton_tree k] h
```

The time complexity is `O(logn)`.

We just need to scan all roots and get the min key.

```
let get_min = function
  | [] -> failwith "Empty heap"
  | Node (k1, _, _)::tl ->
    List.fold_left (fun acc (Node (k, _, _)) -> min acc k) k1 tl
```

To achieve `O(1)`, we can attach a `minimum` property to the heap's type. It will always record the min and can be returned immediately if requested. However, we then need to update this property in *insert*, *merge* and *delete_min*. Like every other book does, this modification is left to the readers as an exercise.

*delete_min* appears a little bit troublesome but is actually very neat.

- We need to locate the binomial tree with the *min*.
- Then we need to merge `the trees on its left` and `the trees on its right` to get a new list.
- It is not done yet as we need to deal with the *min* binomial tree itself.
- We are lucky that **a binomial tree's child list is a heap indeed** (once reversed into ascending rank order). So we just need to merge `the child list` with the new list from point 2.

```
let key (Node (k, _, _)) = k
let child_list (Node (_, c, _)) = c

let split_by_min h =
  let rec aux pre (a, m, b) = function
    | [] -> List.rev a, m, b
    | x::tl ->
      if key x < key m then aux (x::pre) (pre, x, tl) tl
      else aux (x::pre) (a, m, b) tl
  in
  match h with
  | [] -> failwith "Empty heap"
  | bt::tl -> aux [bt] ([], bt, tl) tl (* the initial triple must carry tl, so siblings are kept when the head is already the min *)

let delete_min h =
  let a, m, b = split_by_min h in
  merge (merge a b) (child_list m |> List.rev)
```
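Putting the pieces together, here is a self-contained check (repeating the definitions above; note the initial accumulator of `split_by_min` carries the tail `tl`, so sibling trees are not lost when the min is already at the head). Keys inserted in arbitrary order come back out ascending:

```ocaml
type 'a binomial_t = Node of 'a * 'a binomial_t list * int

let singleton_tree k = Node (k, [], 0)

let link ((Node (k1, c1, r1)) as t1) ((Node (k2, c2, r2)) as t2) =
  if r1 <> r2 then failwith "Cannot link two binomial trees with different ranks"
  else if k1 < k2 then Node (k1, t2::c1, r1+1)
  else Node (k2, t1::c2, r2+1)

let rec merge h1 h2 =
  match h1, h2 with
  | h, [] | [], h -> h
  | (Node (_, _, r1) as bt1)::tl1, (Node (_, _, r2) as bt2)::tl2 ->
    if r1 < r2 then bt1::merge tl1 h2
    else if r1 > r2 then bt2::merge h1 tl2
    else merge (link bt1 bt2::tl1) tl2

let insert' k h = merge [singleton_tree k] h

let get_min = function
  | [] -> failwith "Empty heap"
  | Node (k1, _, _)::tl ->
    List.fold_left (fun acc (Node (k, _, _)) -> min acc k) k1 tl

let key (Node (k, _, _)) = k
let child_list (Node (_, c, _)) = c

let split_by_min h =
  let rec aux pre (a, m, b) = function
    | [] -> List.rev a, m, b
    | x::tl ->
      if key x < key m then aux (x::pre) (pre, x, tl) tl
      else aux (x::pre) (a, m, b) tl
  in
  match h with
  | [] -> failwith "Empty heap"
  | bt::tl -> aux [bt] ([], bt, tl) tl

let delete_min h =
  let a, m, b = split_by_min h in
  merge (merge a b) (child_list m |> List.rev)

let heap_of l = List.fold_left (fun h k -> insert' k h) [] l

(* repeatedly take the min: a heapsort-style drain *)
let rec drain h acc =
  match h with
  | [] -> List.rev acc
  | _ -> drain (delete_min h) (get_min h :: acc)

let () = assert (drain (heap_of [5; 3; 8; 1; 9; 2]) [] = [1; 2; 3; 5; 8; 9])
```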

| | get_min | insert | merge | delete_min |
|---------------|-----------------------------------------|----------------|---------|------------|
| Leftist tree | O(1) | O(logn) | O(logn) | O(logn) |
| Binomial heap | O(logn), but can be improved to be O(1) | Amortised O(1) | O(logn) | O(logn) |

**[1]** Binomial Heap is also introduced in Purely Functional Data Structures.

Heap is one of the most important data structures, where the minimum of all elements can always be easily and efficiently retrieved.

In imperative world, binary heap (implemented via *array*) is frequently used. Here is an example:

The (min) heap on the right hand side is a full binary tree indeed. The root is always the min and recursively, a root (of any sub tree) is always smaller than its two children. Note that **we just need to keep partial order for a heap**, not total order like in a binary search tree. For example, `1` needs to be smaller than its two children, but the relationship between the two children does not matter. Thus, the left child and right child can be either `10` and `3` or `3` and `10`. Moreover, the big node `17` can be in the right branch of `1` while its parent `3` is smaller than `10`.

The array based implementation is actually beautiful. Essentially, it uses two tricks to simulate the binary heap tree:

- If the index of a root is `i`, then its left child index is `2*i+1` and its right child index is `2*i+2`.
- The relationship of left child and right child is not important, so the two child slots in the array for a root can be fully used. (*Can we use an array to implement a binary search tree?*)
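A tiny sketch (not from the post) of that index arithmetic, using an OCaml array; the array below is one layout consistent with a min heap like the one drawn above:

```ocaml
(* children of index i live at 2*i+1 and 2*i+2; the parent of i is (i-1)/2 *)
let left i = 2 * i + 1
let right i = 2 * i + 2
let parent i = (i - 1) / 2

(* check the (min) heap property: every root <= its existing children *)
let is_min_heap a =
  let n = Array.length a in
  let ok i c = c >= n || a.(i) <= a.(c) in
  let rec aux i = i >= n || (ok i (left i) && ok i (right i) && aux (i + 1)) in
  aux 0

let () =
  assert (parent (left 3) = 3 && parent (right 3) = 3);
  assert (is_min_heap [| 1; 3; 10; 17; 8; 12 |]);   (* a valid min heap *)
  assert (not (is_min_heap [| 5; 1 |]))             (* 5 > its left child 1 *)
```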

The time complexities of the operations are:

- *get_min*: `O(1)`
- *insert*: `O(logn)`
- *delete_min*: `O(logn)`
- *merge*: `O(n)`

Although its `merge` is not satisfying, this data structure is adorable and gives us the most compact space usage on array.

However, unfortunately we cannot have it in the pure functional world, where mutable arrays are not recommended.

I would like to first explore two possible functional approaches for heap (the list based in this section and the direct binary tree based in next section) . They are trivial or even incomplete, but the ideas behind may lead to the invention of leftist tree.

List is basically the replacement of array in functional world and of course, we can use it for heap. The idea is simple:

- When we *insert* `x`, linearly compare `x` with the existing elements one by one and insert it into the appropriate place, where the element on its left hand side is smaller and the element on its right is equal or bigger.
- *get_min* is as simple as returning the head.
- *delete_min* removes the head.
- *merge* is a standard *merge* like in *mergesort*.
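The four operations above can be sketched in a few lines (a minimal sketch; the heap is simply a list kept in ascending order):

```ocaml
(* list-based heap: the whole list stays sorted ascending *)
let rec insert x = function
  | [] -> [x]
  | hd::tl as l -> if x <= hd then x::l else hd :: insert x tl

let get_min = function [] -> failwith "empty" | hd::_ -> hd
let delete_min = function [] -> failwith "empty" | _::tl -> tl

(* standard merge of two sorted lists, as in mergesort *)
let rec merge l1 l2 =
  match l1, l2 with
  | [], l | l, [] -> l
  | h1::t1, h2::t2 -> if h1 <= h2 then h1 :: merge t1 l2 else h2 :: merge l1 t2

let () =
  assert (insert 3 [1; 2; 5] = [1; 2; 3; 5]);
  assert (get_min [1; 2; 3] = 1);
  assert (delete_min [1; 2; 3] = [2; 3]);
  assert (merge [1; 4] [2; 3] = [1; 2; 3; 4])
```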

The time complexities are:

- *get_min*: `O(1)`
- *insert*: `O(n)`
- *delete_min*: `O(1)`
- *merge*: `O(n)`

We can see that in this design, the list is fully sorted all the time. It is also a tree, just one that always has a single branch.

Of course, the performance of *insert* and *merge* is `O(n)`, which is unacceptable. This is due to the adoption of just one leg, which is crowded with elements, and all elements have to be in their natural order to fit in (remember a parent must be smaller than its kid(s)?). Eventually, the fact that *total order is not necessary* is not leveraged at all.

Hence, we need **at least two branches** so that we may be able to spread some elements into different branches and the comparisons of the children of a root may be avoided.

Since the array based binary heap is a binary tree, how about implementing it directly in a binary tree form?

We can define a binary tree heap like this:

```
type 'a bt_heap_t =
| Leaf
| Node of 'a * 'a bt_heap_t * 'a bt_heap_t
```

The creation of the type is easy, but it is hard to implement those operations. Let's take *insert* as an example.

The *attempt 1* in the diagram above illustrates the idea behind the array based binary heap. When we insert an element, ideally we should put it in the last available place at the bottom level or the first place at a brand new level if already full, then we do a *pull up* by comparing and swapping with parents one by one. Obviously, we cannot do it in a pure tree structure since it is not efficient to find the initial place.

So let's try *attempt 2*, by directly trying to insert from top to bottom. In the example, `2 < 6`, so `2` stays and `6` should continue going down. However, the question now is **which branch should it take?** `10` or `3`? This is not trivial to decide and we have to find a good rule.

We will also have problem in *delete_min*.

If we delete the root (min), then we will have two binary trees. Then the question is how to merge them? Do we just take all elements of one of the trees, and insert every element into the other tree? Will this way be efficient even if we are able to design a good *insert*?

If we consider *insert* as *merge* with a single node tree, then the problem of *insert* becomes the problem of *merge*, just like *delete_min*. **Should we design a good merge so that both insert and delete_min are naturally solved**?

Bearing all those questions in mind, we can now start to look at leftist tree.

The concerns in answering those questions in the previous section are mainly about performance. We are dealing with a tree. How the tree gets structured or transformed along the way is very important and affects the performance a lot.

For example, for the question of *which branch to take when inserting*, if we just always go left, then the left branch will get longer and longer as we insert lots of elements, and the time complexity becomes `O(n)`.

Also when merging, the two trees are already heaps, i.e., the elements and structure of either tree obey the principle that a parent must be smaller than its children. Why do we need to destroy one tree completely? Why not try to keep the structures of the two trees untouched as much as possible? If we can somehow attach a root from one tree directly under a root of another tree, then all children under the first root can stay still and no operations are necessary for them.

Let's see how leftist tree is designed to ensure good performance.

Leftist tree always keeps the left branches of all roots being the longer ones, and in the worst case, they are as long as the right branches. In other words, all right branches of all roots are shortest. This makes sense: when we decide which branch to go down, if we know one branch is always shorter than the other, then we should take it. Shorter branch means potentially fewer nodes, and the number of future operations tends to be smaller.

In order to maintain this property, each node has a **rank**, which indicates **the length of the path between the node and the right most leaf**. For example,

The way of keeping ranks up-to-date is like this:

- We always assume `the rank of the right child <= the rank of the left child`.
- So we always go right.
- If we reach a leaf, then we replace the leaf with our *element* and update the ranks of all the parents back to the root. Note that the *element* in this context is not necessarily the original new element being inserted; we will explain this soon.
- When updating the rank of a parent, we first compare the rank of the left child `rank_left` and the rank of the right child `rank_right`. If `rank_left < rank_right`, then we swap the left child and the right child and update the rank of the parent to be `rank_left + 1`; otherwise `rank_right + 1`.
A diagram is better than hundreds of words. Let's build a leftist tree by inserting two elements.

We can see that always putting the higher rank on the left keeps the right branch shortest all the time. Actually, this strategy is how this design of tree based heap got the name *leftist*: more nodes tend to be on the left side.

But this is not the full story of leftist. Actually, although the diagram above describes *insert*, it is just for the purpose of demonstrating how ranks are updated.

In leftist tree, the most important operation is *merge* and an *insert* is just a *merge* with a singleton tree that has only the new element and two leaves.

*Merge* is a recursive operation.

- We have two trees to merge: `merge t1 t2`.
- Compare the two roots; if `k1 > k2`, we simply switch the two trees by `merge t2 t1`. This is to make sure the tree on the left always has the smaller key, for convenience.
- Since `t1` has the smaller key, its root should stay as the new root.
- Because `t1`'s right branch `r` is always shortest, we then do `merge r t2`.
- If one of the two trees is a leaf, we simply return the other, and begin the rank updating process back to the root.
- The rank updating is the same as we described in the previous section.

Again, let's see an example.

**type**

```
type 'a leftist =
| Leaf
| Node of 'a leftist * 'a * 'a leftist * int
```

**essentials**

```
let singleton k = Node (Leaf, k, Leaf, 1)
let rank = function Leaf -> 0 | Node (_,_,_,r) -> r
```

**merge**

```
let rec merge t1 t2 =
  match t1, t2 with
  | Leaf, t | t, Leaf -> t
  | Node (l, k1, r, _), Node (_, k2, _, _) ->
    if k1 > k2 then merge t2 t1 (* switch merge if necessary *)
    else
      let merged = merge r t2 in (* always merge with right *)
      let rank_left = rank l and rank_right = rank merged in
      if rank_left >= rank_right then Node (l, k1, merged, rank_right+1)
      else Node (merged, k1, l, rank_left+1) (* left becomes right due to being shorter *)
```

**insert, get_min, delete_min**

```
let insert x t = merge (singleton x) t

let get_min = function
  | Leaf -> failwith "empty"
  | Node (_, k, _, _) -> k

let delete_min = function
  | Leaf -> failwith "empty"
  | Node (l, _, r, _) -> merge l r
```
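With all pieces in place, a quick check (a self-contained snippet repeating the definitions above): insert keys in arbitrary order, then repeatedly take the min.

```ocaml
type 'a leftist =
  | Leaf
  | Node of 'a leftist * 'a * 'a leftist * int

let singleton k = Node (Leaf, k, Leaf, 1)
let rank = function Leaf -> 0 | Node (_,_,_,r) -> r

let rec merge t1 t2 =
  match t1, t2 with
  | Leaf, t | t, Leaf -> t
  | Node (l, k1, r, _), Node (_, k2, _, _) ->
    if k1 > k2 then merge t2 t1
    else
      let merged = merge r t2 in
      let rank_left = rank l and rank_right = rank merged in
      if rank_left >= rank_right then Node (l, k1, merged, rank_right + 1)
      else Node (merged, k1, l, rank_left + 1)

let insert x t = merge (singleton x) t

let get_min = function
  | Leaf -> failwith "empty"
  | Node (_, k, _, _) -> k

let delete_min = function
  | Leaf -> failwith "empty"
  | Node (l, _, r, _) -> merge l r

(* repeatedly take the min: the result comes out sorted *)
let rec drain t acc =
  match t with
  | Leaf -> List.rev acc
  | _ -> drain (delete_min t) (get_min t :: acc)

let () =
  let t = List.fold_left (fun t x -> insert x t) Leaf [5; 3; 8; 1; 9; 2] in
  assert (drain t [] = [1; 2; 3; 5; 8; 9])
```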

- *get_min*: `O(1)`
- *insert*: `O(logn)`
- *delete_min*: `O(logn)`
- *merge*: `O(logn)`

Leftist tree's performance is quite good. Compared to the array based binary heap, it has a much better *merge*. Moreover, its design and implementation are simple enough. Thus, leftist tree can be an excellent choice for heap in the functional world.

In a list of unsorted numbers (not necessarily distinct), such as

The surpassers of an element are all the elements whose indices are bigger and whose values are larger. For example, the element `1`'s surpassers are `9`, `5`, `5`, and `6`, so its number of surpassers is 4.

We can also see that `9` doesn't have any surpassers, so its number of surpassers is 0. So the problem of this pearl is:

> Given an unsorted list of numbers, find the max number of surpassers; O(nlogn) is required.

The answer for the above example is 5, for the element `-10`.

As usual, let's put a trivial solution on the table first (*[1]*). It is straightforward:

- For every element, we scan all elements behind it and maintain an `ns` as its number of surpassers.
- If an element behind is larger, then we increase the `ns`.
- After we finish on all elements, the `max` of all the `ns`es is what we are looking for.

The diagram below demonstrates the process on `1`.

```
let num_surpasser p l = List.fold_left (fun c x -> if x > p then c+1 else c) 0 l

let max_num_surpasser l =
  let rec aux ms l =
    match ms, l with
    | _, [] -> ms
    | None, p::tl -> aux (Some (p, num_surpasser p tl)) tl
    | Some (_, c), p::tl ->
      let ns = num_surpasser p tl in
      if ns > c then aux (Some (p, ns)) tl
      else aux ms tl
  in
  aux None l
(* note: wrapping the answer in an `option` may make the code seem more complicated, but it is not a bad practice, as for an empty list there is no max number of surpassers *)
```

The solution should be easy enough to obtain but its time complexity is `O(n^2)`, which is worse than the required `O(nlogn)`.
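The example list itself is shown in a figure above; for a concrete check, here is one hypothetical list consistent with the counts described (`1` has 4 surpassers, `9` has none, and `-10` has the maximum, 5):

```ocaml
let num_surpasser p l = List.fold_left (fun c x -> if x > p then c+1 else c) 0 l

let max_num_surpasser l =
  let rec aux ms l =
    match ms, l with
    | _, [] -> ms
    | None, p::tl -> aux (Some (p, num_surpasser p tl)) tl
    | Some (_, c), p::tl ->
      let ns = num_surpasser p tl in
      if ns > c then aux (Some (p, ns)) tl
      else aux ms tl
  in
  aux None l

let () =
  (* a hypothetical list matching the counts described in the text *)
  let l = [1; -10; 9; 5; 5; 6; 0] in
  assert (num_surpasser 1 (List.tl l) = 4);     (* 1 is surpassed by 9, 5, 5, 6 *)
  assert (max_num_surpasser l = Some (-10, 5))  (* -10 has the max, 5 *)
```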

The algorithm design technique *divide and conquer* was mentioned in Recursion Reloaded. I believe it is a good time to properly introduce it now, as it provides an elegant approach towards a better solution for pearl 2.

Suppose we want to replace the dragon lady and become the king of the land below (*[2]*).

We are very lucky that we have got a strong army and now the only question is how to conquer the realm.

One "good" plan is *no plan*. We believe in our troops so much that we can just let all of them enter the land and pwn everywhere.

Maybe our army is very good in terms of both *quality* and *quantity* and eventually this plan will lead us to win. However, is it really a good plan? Some soldiers may march to places that have already been conquered; some soldiers may leave too soon for more wins after one victory and have to come back due to local rebellions... the whole process won't be efficient and it costs too much gold on food for men and horses.

Fortunately, we have a better plan.

We divide the land into smaller regions, and further smaller ones inside, until unnecessary. For each small region, we put an ideal amount of soldiers there for battles. After soldiers finish their assigned region, they don't need to move and just make sure the region stays with us. This is more organised and more efficient in terms of both gold and time. After all, if we conquer all the tiny regions, who would say we were not the king?

*divide and conquer* in algorithm design is not a universal solution provider, but a problem attacking strategy or paradigm. Moreover, although this classic term has been lasting for quite a long time, personally I would like to add one more action - **accumulate** - to make it appear more complete. Let's check the 3 actions one by one to see how we can apply the technique.

Conceptually this action is simple and we know we need to divide a big problem into smaller ones. But *how to* is non-trivial and really context-sensitive. Generally we need to ask ourselves 2 questions first:

What are the sizes of the smaller sub-problems?

Normally we intend to halve the problem because it can lead us to have a `O(logn)` factor in our final time complexity.

But it is not a must. For example, 3-way quicksort divides the problem set into 3 smaller ones. 3-way partition can let quicksort have O( $ \log_3{n} $ ) levels of recursion. However, do not forget that the number of comparisons also increases, as we need to check equality during partition.

Moreover, sometimes we may have to just split the problem into a sub-problem of size 1 and another sub-problem with the rest, like what we did for the `sum` function. This kind of *divide* won't give us any performance boost and it turns out to be a normal recursion design.

Do we directly split the problem, or we somehow reshape the problem and then do the splitting?

In *mergesort*, we simply split the problem into 2; while in *quicksort*, we use *partition* to rearrange the list and then obtain the 2 sub-problems.

The point of this question is to bear in mind that we do not have shortcuts. We can have a very simple splitting, but later on we need to face probably more complicated *accumulate*. Like *mergesort* relies on *merge*. Or, we can do our important work during *divide* phase and have a straight *accumulate* (*quicksort* just needs to concatenate the solutions of the two sub-problems with the *pivot* in the middle).

This action implies two things:

- Recursion. We divided the problem, then we need to conquer. How to conquer? We apply *divide and conquer and accumulate* again until we are not able to divide any more.
- Edge cases. If we cannot divide further, then it is time to really give a solution. For example, say our target is a list. If we reach an empty list or a one-element list, then what shall we do? Normally, if this happens, we do not need to *accumulate* and just return the answer based on the edge cases.

I believe the *conquer* part in the original *divide and conquer* term also implies the *accumulate*. I separate *accumulate* out, as explained next.

After we conquer every little area of the land, we should now somehow combine all our accomplishments and really build a kingdom out of them. This is the step of *accumulate*.

A key way to figure out *how to accumulate* is to **start from small**. In *mergesort*, if each of the 2 sub-problems just has one element, then the according answer is a list having that element and we have finished the *conquer* step. Now we have two lists each of which has one element, how can we accumulate them to have a single sorted list? Simple, smaller element goes first into our resulting list. What if we have two sorted list each of which has two elements? The same, smaller element goes first again.

If we decide to divide the problem in a fairly simple way, then *accumulate* is normally non-trivial and also dominates the time complexity. Figuring out a cost-efficient approach of *accumulate* is very important.

Again, *divide and conquer and accumulate* is just a framework for solving an algorithm problem. All the concrete solutions are problem context based and can spread into all 3 steps.

In addition, a fundamental hint for using this technique: if we are given a problem, and we know the future solution is not anyhow related to the problem size, then we should try *divide and conquer and accumulate*.

Although pearl 2 asks us to get the max number of surpassers, we can

- Get the number of surpassers for every element (we anyway need to)
- Then do a linear scan for the max one.

The second step is `O(n)`. If we can achieve the first step in `O(nlogn)`, the overall time complexity stays `O(nlogn)`.

So our current goal is to use *divide and conquer and accumulate* to get all numbers of surpassers.

We have such a list and we want to get a new list that has the same elements, where each element is associated with the number of its surpassers. Now we want to divide the original list (problem set).

Can we directly halve the list?

As pearl 2 stated, an element only cares about the elements behind it. So if we split the list in the middle, we know the numbers of surpassers for all elements in `sub-problem 2` do not need any special operations and the answers can directly be part of the future resulting list.

For the elements inside `sub-problem 1`, the answers are not fully accomplished yet, as they will be affected by the elements in `sub-problem 2`. But hey, how we obtain full answers for `sub-problem 1` with the help of the solutions of `sub-problem 2` should be the job of *accumulate*, right? For now, I believe halving the problem is a good choice for *divide*, as at least we already solve half of the problem directly.

We of course will use recursion. For sub-problems, we further halve them into smaller sub-problems until we are not able to, which means we reach the edge cases.

There are possibly two edge cases we need to conquer:

- Empty list
- A list with only one element

For an empty list, we just need to return an empty list, as there is no element for us to count surpassers. For a one-element list, we also return a one-element resulting list where the only element has `0` surpassers.

Now we have finished dividing and conquering, like below, and it is time to accumulate (take `sub-problem 1` only for illustration).

It is easy to combine the solutions of `sp 111` and `sp 112`: just compare `8` from `sp 111` with `1` from `sp 112`, update `8`'s number of surpassers if necessary, and we can leave `sp 112` alone, as we talked about during *divide*. The same way can be applied on `sp 121` and `sp 122`. Then we get:

Now both `sp 11` and `sp 12` have more than one element. In order to get the solution for `sp 1`, `sp 12` can stay put. How about `sp 11`? An obvious approach is just to let every element in `sp 11` compare with every element in `sp 12`, and update their numbers of surpassers accordingly. This could be a candidate for the *accumulate* action; however, it is `O(n^2)`. We need to accumulate better.

We said in the very beginning of this post, during our trivial solution, that the original order of the list matters. However, is the order still sensitive after we get the solution (for a sub-problem)?

As we can see, once the answer of `sp 11` is obtained, the order between `8` and `1` doesn't matter any more, as they don't rely on each other for their numbers of surpassers.

If we can obtain the solution in a sorted manner, it will help us a lot. For example, assume the resulting lists for `sp 11` and `sp 12` are sorted like this:

Then we can avoid comparing every pair of elements by using *merge*, like this:

We can see that `8` in the left hand side list doesn't have to be compared to `-10` any more. However, this example has not shown the full picture yet. **If we keep tracking the length of the resulting list on the right hand side, we can save more comparisons.** Let's assume both `sp 1` and `sp 2` have been solved as sorted lists with their lengths attached.

We begin our *merge*.

Have you noticed the fascinating part? Because `-10 < -2`, without going further down the resulting list on the right hand side, we can directly update the number of surpassers of `-10` and get it out. Why? Because `-2` is the smallest element on the right, and if it is bigger than `-10`, then the rest of the elements on the right must all be bigger than `-10`, right? Through only one comparison (instead of 4), we get the number of surpassers.

Thus, **as long as the solutions of all sub-problems are sorted lists with their lengths attached, we can accumulate like this**:

1. Compare the heads `hd1` and `hd2` of the two sorted resulting lists `l1` and `l2`, respectively.
2. If `hd1 >= hd2`, then `hd2` gets out; go to 1 with an updated length for `l2`.
3. If `hd1 < hd2`, then `hd1` gets out, and its `ns` gets updated by adding the length of `l2` to the existing value; go to 1 with an updated length for `l1`.
The full process of accumulating `sp 1` and `sp 2` is illustrated as follows:

Two things might need to be clarified:

- Although we assumed the resulting lists of sub-problems to be sorted, they will naturally become sorted anyway, because we are doing *smaller goes first* merging.
- We need to attach the lengths to the resulting lists on the right and keep updating them, because scanning a list for its length takes `O(n)`.

Obviously this way of accumulation costs `O(n)`. Because we can divide at most `O(logn)` times, our *divide and conquer and accumulate* solution will be `O(nlogn)`.

At first, we *divide*. Note that this version of *divide* is actually a kind of *splitting from the middle*, as the original order of the elements matters before we get any solution.

```
(* we have a parameter n to indicate the length of l.
   it will be passed by the caller so that
   we do not need to scan l for its length every time.
   it returns left, right and the length of the right. *)
let divide l n =
  let m = n / 2 in
  let rec aux left i = function
    | [] -> List.rev left, [], 0
    | right when i >= m -> List.rev left, right, n-i
    | hd::tl -> aux (hd::left) (i+1) tl
  in
  aux [] 0 l
```

Now *accumulate*. We put it before writing *conquer* because *conquer* calls it, so it must be defined first.

```
let accumulate l1 l2 len2 =
  let rec aux acc len2 = function
    | l, [] | [], l -> List.rev_append acc l
    | ((hd1,ns1)::tl1 as left), ((hd2,ns2)::tl2 as right) ->
      if hd1 >= hd2 then aux ((hd2,ns2)::acc) (len2-1) (left, tl2)
      else aux ((hd1,ns1+len2)::acc) len2 (tl1, right)
  in
  aux [] len2 (l1, l2)
```

*conquer* is the controller.

```
(* note the list parameter is a list of tuples, i.e., (x, ns) *)
let rec conquer n = function
  | [] | _::[] as l -> l
  | l ->
    let left, right, right_len = divide l n in
    let solution_sp1 = conquer (n-right_len) left in
    let solution_sp2 = conquer right_len right in
    accumulate solution_sp1 solution_sp2 right_len
```

So if we are given a list of numbers, we can now get all numbers of surpassers:

```
let numbers_of_surpassers l =
  List.map (fun x -> x, 0) l |> conquer (List.length l)
```

Are we done? No! We should find the max number of surpassers out of them:

```
(* we should always consider the situation where no answer is possible
   by using **option**, although it is a bit troublesome *)
let max_num_surpassers = function
  | [] -> None
  | l ->
    let nss = numbers_of_surpassers l in
    Some (List.fold_left (fun max_ns (_, ns) -> max max_ns ns) 0 nss)
```
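Pulling the pieces together, here is a self-contained sketch that exercises the whole pipeline; the functions above are repeated verbatim so the snippet runs standalone, and the sample list and expected counts are my own:

```ocaml
(* self-contained copy of the divide / accumulate / conquer pipeline *)
let divide l n =
  let m = n / 2 in
  let rec aux left i = function
    | [] -> List.rev left, [], 0
    | right when i >= m -> List.rev left, right, n-i
    | hd::tl -> aux (hd::left) (i+1) tl
  in
  aux [] 0 l

let accumulate l1 l2 len2 =
  let rec aux acc len2 = function
    | l, [] | [], l -> List.rev_append acc l
    | ((hd1,ns1)::tl1 as left), ((hd2,ns2)::tl2 as right) ->
      if hd1 >= hd2 then aux ((hd2,ns2)::acc) (len2-1) (left, tl2)
      else aux ((hd1,ns1+len2)::acc) len2 (tl1, right)
  in
  aux [] len2 (l1, l2)

let rec conquer n = function
  | [] | _::[] as l -> l
  | l ->
    let left, right, right_len = divide l n in
    let solution_sp1 = conquer (n-right_len) left in
    let solution_sp2 = conquer right_len right in
    accumulate solution_sp1 solution_sp2 right_len

let numbers_of_surpassers l =
  List.map (fun x -> x, 0) l |> conquer (List.length l)

let max_num_surpassers = function
  | [] -> None
  | l ->
    let nss = numbers_of_surpassers l in
    Some (List.fold_left (fun max_ns (_, ns) -> max max_ns ns) 0 nss)

let () =
  (* in [7; 2; 9; 3], 2 is surpassed by both 9 and 3, the maximum count *)
  assert (max_num_surpassers [7; 2; 9; 3] = Some 2);
  assert (max_num_surpassers [] = None)
```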

**[1]** Unless I can see an optimal solution instantly, I always tend to think of the most straightforward one, even though it sometimes sounds stupid. I believe this is not a bad habit. After all, many good solutions come out of brute-force ones. As long as we have a solution anyway, we can work further based on it and see whether we can make it better.

**[2]** Yes, I believe the dragon lady will end the game and win the throne. It is the circle and fate.

Prof. Richard Simpson Bird is a Supernumerary Fellow of Computation at Lincoln College, Oxford, England, and former director of the Oxford University Computing Laboratory. His research interests include:

- The algebra of programming
- The calculation of algorithms from their specification
- Functional programming
- Algorithm design

In 2010, he wrote the book *Pearls of Functional Algorithm Design*.

This book presents **30 algorithm problems and their functional solutions**. The reason they are called *pearls* is quoted below:

Just as natural pearls grow from grains of sand that have irritated oysters, these programming pearls have grown from real problems that have irritated programmers. The programs are fun, and they teach important programming techniques and fundamental design principles.

These pearls are all classic, and together with the well presented functional approaches, they are very helpful for anyone who really wishes to sharpen their functional programming on algorithms. I have so far solved / fully understood 17 pearls and hmm... the rest are indeed difficult, at least for me. Nevertheless, I would like to share my journey of studying this book with you via a series of posts, each of which covers one pearl in the book. All the posts will differ from the book in the following ways:

- **Only OCaml is used** for the implementations (for the obvious reason: I am a fan of OCaml).
- **More general (functional) analysis techniques** are adopted. The book heavily focuses on the algebraic approach, namely *design by calculation*, which honestly is too much for me. I mean, I can understand the methodology, but I cannot master it. So during my study, I did not replicate the algebraic thinking; instead, I tried to sniff around the functional spirit behind the curtain and treat the algebraic part as a source of hints.
- **Tail-recursive implementations** will be provided where possible. The Haskell implementations in the book do not care much about tail recursion because a) Haskell is lazy; b) Haskell's compiler does clever things to keep recursion from blowing the stack. OCaml is different: it is not lazy and does not rely on the compiler to optimise away a potential *stack overflow*. So as OCaml developers, we need to care about tail recursion explicitly.

Ok. Let's get started.

Given an unordered list of distinct natural numbers, find the minimum natural number that is not in the list.

For example, if the list is [8; 2; 3; 0; 12; 4; 1; 6], then 5 is the minimum natural number that is missing.

An O(n) solution is desired.

The description of an algorithm problem specifies the input, the output and the constraints. Yet it does more than just tell us what to achieve. Most of the time, the wording provides hints about possible solutions. Let's break down the description first:

- unordered list
- distinct
- natural numbers
- minimum and missing

The input list is not sorted. If this is specified explicitly, it implies that ordering is important here. In other words, if the list was sorted, then the problem would not be a problem any more or at least much easier to solve.

There are no duplicates inside.

All numbers are non-negative integers, i.e., 0, 1, 2, ... This puts a lower boundary on the possible numbers in the input list.

There might be unlimited numbers not in the input list, but we just need to find the smallest one. When our goal is to locate something with a certain characteristic, it is normally a *selection problem*. Moreover, problems related to *min* or *max* can normally be solved by somehow bringing in sorting; however, sorting heavily involves moving all the numbers around. As our target is only one number, sorting can be an overkill.

From the analysis above, it is obvious that sorting can solve our problem quite easily. We can simply:

1. Sort the input list.
2. If the first number is not `0`, then the result is `0`.
3. Scan every consecutive pair (x, y).
4. If `y - x > 1`, then the result is `x + 1`.
5. If point 4 never happens, then the result is the last number plus 1.

```
let min_missing_trivial l =
  let sl = List.sort compare l in
  let rec find = function
    | [] -> None
    | x::[] -> Some (x+1)
    | x::y::_ when y - x > 1 -> Some (x+1)
    | _::tl -> find tl
  in
  (* adding -1 covers point 2 and makes find work for all cases *)
  find ((-1)::sl)
```

Have we solved the problem? Let's have a look. The solution above can indeed achieve the goal. However, its time complexity is `O(nlogn)`, since the sorting bit dominates, which is worse than the required `O(n)`.

We have to try harder.

So, if we do a sorting, we can obtain the answer but it is too slow. Let's have a look again at what we get after sorting.

If the list is sorted, then it gives us a chance to check the consecutiveness of the numbers, and the first gap is what we want. There are two questions, though:

Q1. Shall we care about the consecutiveness after `6`?

The answer is *no*. Since we are chasing the *minimum*, i.e., the first missing one, the order of the numbers after the gap doesn't matter any more. For example, even if `12` came before `8`, the result wouldn't be affected.

Q2. Is the order of all the numbers before the gap important?

Let's randomly mess up the order of `0`, `1`, `2`, and `3` a little:

It seems fine, as the messed-up order of those 4 numbers does not affect the positions of `4` and `6`. But hang on a minute, something is not right there.

We relied on the consecutiveness of `0`, `1`, `2`, and `3` to locate the first gap, and that consecutiveness was checked via the numbers being sorted. Hence, if the order before the gap were not maintained, how could we scan for consecutiveness and find the gap in the first place? It sounds like a chicken-and-egg thing.

So can we check the consecutiveness of the numbers without sorting them?

Yes, we can, and now it is time to ask for help from *distinct natural numbers*.

As we described before, *natural numbers* are integers equal to or larger than `0`. This lower bound `0`, plus the constraint of *no duplicates*, gives us the opportunity to check consecutiveness without requiring all the numbers to be sorted. Let's first see a perfect consecutive sequence (starting from 0) of natural numbers:

They are sorted, of course. Is there any other characteristic? Or say, for a number inside the sequence, how many other numbers are less than it (on its left side)?

For the number `4`, there are exactly `4` natural numbers less than it. The same applies to any number, as long as they all belong to a perfect consecutive sequence starting from `0`.

This is also a two-way mapping, i.e., if we are told that there are `4` numbers less than `4` and all of them are before `4`, we can be sure that all five numbers form a consecutive sequence. Most importantly, now whether all the numbers are in order or not does not matter any more.

What does perfect consecutiveness imply?

It implies that no number in the sequence is missing, and that if anything is missing, it must be on the right side of the max of the sequence.

What if, for a number, the count of smaller ones does not match its own value?

It means the sequence up to that number is not consecutive, which implies that there must be at least one natural number missing, right?

Now we have the weapon we want. In order to check consecutiveness, or say, to know the region where the min missing natural number is, we don't need complete sorting any more. Instead, we just:

1. Pick a number `x` from the sequence.
2. Put all other smaller numbers to its left and count them in `num_less`.
3. Put all larger ones to its right.
4. If `num_less = x`, then `x`'s left branch is perfect and the missing one must be in its right branch. We repeat the whole process on the right-hand side.
5. Otherwise, we repeat on the left branch. Note that due to *distinct*, `num_less > x` is impossible.

In this way, we cannot identify the min missing one in one go, but we can narrow down the range where it might be through each iteration, and eventually we will find it when the range cannot be narrowed any more.

Sounds like a good plan? Let's try to implement it.

Does the process above feel familiar? It is actually the classic `partition` step, which is the key part of the *quicksort* we talked about.

Of course, in this problem it won't be a standard `partition`, as we need to record `num_less`. The code should be easy enough to write:

```
let partition x l =
  let f (num_less, left, right) y =
    if y < x then num_less+1, y::left, right
    else num_less, left, y::right
  in
  List.fold_left f (0, [], []) l
```

Also, our solver function should be straightforward:

```
let min_missing l =
  let rec find lo = function
    | [] -> lo
    | x::tl ->
      let num_less, left, right = partition x tl in
      if num_less + lo = x then find (x+1) right
      else find lo left
  in
  find 0 l
  (* Why do we introduce `lo`?
     I will leave this question to the readers *)
```
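As a quick sanity check on the example from the problem statement, here is a standalone sketch; `partition` and `min_missing` are repeated from above so the snippet compiles on its own:

```ocaml
let partition x l =
  let f (num_less, left, right) y =
    if y < x then num_less+1, y::left, right
    else num_less, left, y::right
  in
  List.fold_left f (0, [], []) l

let min_missing l =
  let rec find lo = function
    | [] -> lo
    | x::tl ->
      let num_less, left, right = partition x tl in
      if num_less + lo = x then find (x+1) right
      else find lo left
  in
  find 0 l

let () =
  (* the example from the problem statement *)
  assert (min_missing [8; 2; 3; 0; 12; 4; 1; 6] = 5);
  (* an empty list is missing 0 *)
  assert (min_missing [] = 0)
```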

For the very first iteration, we need to scan all the numbers, so this step costs `O(n)`. But ideally we can rule out half of the numbers.

So the 2nd iteration costs `O(n*(1/2))`. Again, a further half will be ruled out.

...

So the total cost would be `O(n * (1 + 1/2 + 1/4 + 1/8 + ...)) = O(n * 2) = O(n)`.

We made it!

This is the first pearl, and it is not that difficult, especially if you knew the *quickselect* algorithm before.

Actually, many of the pearls involve certain fundamental algorithms, and what one needs to do is peel the layers off one by one and eventually solve the problem via the essentials. If you are interested in the pearls, please pay attention to the tags attached to each pearl, as they show what each pearl is really about.

When I wrote the section *When we need later substitution* in Mutable, I struggled. I found out that I didn't fully understand the recursive memoize myself, so what I had to do was just copy the knowledge from Real World OCaml. Luckily, after the post was published, *glacialthinker* commented on reddit:

(I had honestly never thought before that a recursive function could be split like this. I don't know how to derive such a technique and can't explain more. I guess we just learn it as it is and continue. More description of it is in the book.)

This is "untying the recursive knot". And I thought I might find a nice wikipedia or similiar entry... but I mostly find Harrop. :) He actually had a nice article on this many years back in his OCaml Journal. Anyway, if the author swings by, searching for that phrase may turn up more material on the technique.

It greatly enlightened me. Hence, in this post, I will share with you my further understanding of *recursive memoize*, together with the key technique, *untying the recursive knot*, that makes it possible.

We talked about the simple *memoize* before. It takes a non-recursive function and returns a new function which has exactly the same logic as the original function but with the new ability of caching the *argument, result* pairs.

```
let memoize f =
  let table = Hashtbl.Poly.create () in
  let g x =
    match Hashtbl.find table x with
    | Some y -> y
    | None ->
      let y = f x in
      Hashtbl.add_exn table ~key:x ~data:y;
      y
  in
  g
```

The greatness of `memoize` is its flexibility: as long as `f` takes a single argument, `memoize` can make a memo version out of it without touching anything inside `f`.

This means that while we create `f`, we don't need to worry about the ability of caching but can just focus on its own correct logic. After we finish `f`, we simply let `memoize` do its job. *Memoization* and *functionality* are perfectly separated.
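The code above uses Core's `Hashtbl.Poly`. As a sketch of the same separation using only the standard library (the `calls` counter and `slow_double` are mine, purely for demonstration), we can check that a repeated call never re-runs `f`:

```ocaml
(* stdlib-only memoize, same shape as the Core-based version above *)
let memoize f =
  let table = Hashtbl.create 16 in
  fun x ->
    match Hashtbl.find_opt table x with
    | Some y -> y
    | None ->
      let y = f x in
      Hashtbl.add table x y;
      y

let calls = ref 0
let slow_double x = incr calls; x * 2
let fast_double = memoize slow_double

let () =
  assert (fast_double 21 = 42);
  assert (fast_double 21 = 42);  (* cached: slow_double is not called again *)
  assert (!calls = 1)
```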

**Unfortunately, the simple memoize cannot handle recursive functions.** If we try to do `memoize f_rec`, we will get this: `f_rec` is a recursive function, so it calls itself inside its body. `memoize f_rec` will produce `f_rec_memo`, which is a little similar to the previous `f_memo`, yet with the difference that it is not a simple single call of `f_rec arg` like we did with `f arg`. Instead, `f_rec arg` may call `f_rec` again and again with new arguments.

Let's look at it more closely with an example. Say `arg` in the recursive process is always decreased by `1` until it reaches `0`.

1. Let's first do `f_rec_memo 4`.
2. `f_rec_memo` will check `4` against the `Hashtbl` and it is not in.
3. So `f_rec 4` will be called for the first time.
4. Then `f_rec 3`, `f_rec 2`, `f_rec 1` and `f_rec 0`.
5. After the 5 calls, the result is obtained. Then the *4, result* pair is stored in the `Hashtbl` and returned.
6. Now let's do `f_rec_memo 3`. What will happen? Obviously, `3` won't be found in the `Hashtbl`, as only `4` was stored in step 5.
7. But should the *3, result* pair be found? Yes, of course it should, because we dealt with `3` in step 4, right?
8. Why has `3` been computed but not stored?
9. Ah, it is because we did `f_rec 3`, not `f_rec_memo 3`, while only the latter has the caching ability.

Thus, we can use `memoize f_rec` to produce a memoized version out of `f_rec` anyway, but it changes only the surface, not the `f_rec` inside, and hence is not that useful. How can we make it better then?
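We can observe this surface-only behaviour directly. In the sketch below (a stdlib `Hashtbl` version of memoize instead of Core's, with a call counter added for illustration), calling `fib_memo 9` after `fib_memo 10` still triggers a fresh computation, because the inner `fib 9` result was never cached:

```ocaml
(* stdlib-only memoize, for a self-contained demonstration *)
let memoize f =
  let table = Hashtbl.create 16 in
  fun x ->
    match Hashtbl.find_opt table x with
    | Some y -> y
    | None ->
      let y = f x in
      Hashtbl.add table x y;
      y

let count = ref 0
let rec fib n =
  incr count;
  if n <= 1 then 1 else fib (n-1) + fib (n-2)

let fib_memo = memoize fib

let () =
  ignore (fib_memo 10);
  let after_first = !count in
  ignore (fib_memo 9);  (* 9 was computed inside fib 10, but never cached *)
  assert (!count > after_first)
```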

What we really want for memoizing a recursive function is to blend the *memo* ability deep inside, like this:

Essentially we have to replace the `f_rec` inside with `f_rec_memo`:

Only in this way can `f_rec` be fully memoized. However, we have one problem: **it seems that we have to change the internals of `f_rec`**.

If we could modify `f_rec` directly, we could solve it easily. For instance, for *fibonacci*:

```
let rec fib_rec n =
  if n <= 1 then 1
  else fib_rec (n-1) + fib_rec (n-2)
```

we can make the memoized version:

```
let fib_rec_memo_trivial n =
  let table = Hashtbl.Poly.create () in
  let rec fib_rec_memo x =
    match Hashtbl.find table x with
    | Some y -> y
    | None ->
      (* the base case must stay inside, otherwise the recursion never stops *)
      let y =
        if x <= 1 then 1
        else fib_rec_memo (x-1) + fib_rec_memo (x-2)
      in
      Hashtbl.add_exn table ~key:x ~data:y;
      y
  in
  fib_rec_memo n
```

In the above solution, we replaced the original `fib_rec` inside with `fib_rec_memo`; however, we also changed the declaration to `fib_rec_memo` completely. In fact, `fib_rec` is now totally ditched, and `fib_rec_memo` is a new function that blends the logic of *memoize* with the logic of `fib_rec`.

Well, yes, `fib_rec_memo_trivial` can achieve our goal, but only for `fib_rec` specifically. If we need to make a memoized version of another recursive function, then we need to change the body of that function again. This is not what we want. **We wish for a memoize_rec that can turn f_rec directly into a memoized version, just like what the simple memoize can do for f.**

So we don't have a shortcut. Here is what we need to achieve:

- We have to replace the `f_rec` inside the body of `f_rec` with `f_rec_memo`.
- We have to keep the declaration of `f_rec`.
- We must assume we can't know the specific logic inside `f_rec`.

It sounds a bit hard. It is like being given a compiled function without source code and being asked to modify its content. More importantly, your solution must be generalised.

Fortunately, we have a great solution to create our `memoize_rec` without any *hacking* or *reverse-engineering*, and **untying the recursive knot is the key**.

To me, this term sounds quite fancy. In fact, I had never heard of it before *2015-01-21*. After I dug into it a little, I found it actually quite simple but very useful (this recursive memoize case is an ideal demonstration). Let's have a look at what it is.

Every recursive function somehow follows a similar pattern where it calls itself inside its body:

Once a recursive function application starts, it is out of our hands and we know it will continue and continue by calling itself until the *STOP* condition is satisfied. **What if the users of our recursive function need some more control even after it gets started?**

For example, say we provide our users `fib_rec` without source code. What if the users want to print out a detailed trace of each iteration? They are not able to, unless they ask us for the source code and make a new version with printing. That is not convenient.

So if we don't want to give out our source code, we need to reform our `fib_rec` a little and give our users room to insert whatever they want into each iteration.

```
let rec fib_rec n =
  if n <= 1 then 1
  else fib_rec (n-1) + fib_rec (n-2)
```

Look at the above `fib_rec` carefully again: the logic of `fib_rec` is already determined during the binding; it is the `fib_rec`s inside that control the iteration. What if we rename the `fib_rec`s within the body to `f` and add it as an argument?

```
let fib_norec f n =
  if n <= 1 then 1
  else f (n-1) + f (n-2)
(* we actually should now change the name of fib_norec
   to something like fib_alike_norec as it is not necessarily
   doing fibonacci anymore, depending on f *)
```

So now `fib_norec` won't automatically repeat unless `f` tells it to. Moreover, `fib_norec` becomes a pattern which returns `1` when `n` is `<= 1` and otherwise adds `f (n-1)` and `f (n-2)`. As long as you think this pattern is useful for you, you can inject your own logic into it by providing your own `f`.

Going back to the printing requirement, a user can now build their own version of `fib_rec_with_trace` like this:

```
let rec fib_rec_with_trace n =
  Printf.printf "now fibbing %d\n" n;
  fib_norec fib_rec_with_trace n
```
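And if a user just wants the original behaviour back, tying the knot again recovers it. A minimal check (the expected value 89 follows from this post's convention that `fib 0 = 1`):

```ocaml
let fib_norec f n =
  if n <= 1 then 1
  else f (n-1) + f (n-2)

(* tying the knot back gives the plain recursive fibonacci *)
let rec fib_rec n = fib_norec fib_rec n

let () = assert (fib_rec 10 = 89)
```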

Untying the recursive knot is a functional design pattern. It turns the recursive call inside the body into a new parameter `f`. In this way, it breaks the iteration and turns a recursive function into a pattern into which new or additional logic can be injected via `f`.

It is very easy to untie the knot of a recursive function: you just add a parameter `f` and replace `f_rec` everywhere inside with it. For example, for `quicksort`:

```
(* partition_fold (from the earlier quicksort post) splits rest into
   elements smaller and larger than pivot *)
let quicksort_norec f = function
  | [] | _::[] as l -> l
  | pivot::rest ->
    let left, right = partition_fold pivot rest in
    f left @ (pivot::f right)

let rec quicksort l = quicksort_norec quicksort l
```

There are more examples in Martin's blog, though they are not in OCaml. A formalized description of this topic is in the article *Tricks with recursion: knots, modules and polymorphism* from The OCaml Journal.

Now let's come back to *recursive memoize* problem with our new weapon.

At first, we can require that every recursive function `f_rec` must be supplied to `memoize_rec` in the untied form `f_norec`. This is not a harsh requirement, since it is easy to transform `f_rec` into `f_norec`.

Once we get `f_norec`, we of course cannot apply `memoize` (the non-rec version) on it directly, because `f_norec` now takes two parameters: `f` and `arg`.

Although we could create `f_rec` in the way of `let rec f_rec arg = f_norec f_rec arg`, we won't do that here, as it makes no sense to reproduce exactly the same recursive function. Instead, we can for now do something like `let f_rec_tmp arg = f_norec f arg`.

We still do not know what `f` will be, but `f_rec_tmp` is non-recursive, so we can apply `memoize` to it: `let f_rec_tmp_memo = memoize f_rec_tmp`.

`f_rec_tmp_memo` now has the logic of `f_norec` and the ability of memoization. If `f` can be `f_rec_tmp_memo`, then our problem is solved. This is because `f` is inside `f_norec` controlling the iteration, and we wished it to be memoized.

The magic that can help us here is **making f mutable**. Because `f` needs to be known beforehand while `f_rec_tmp_memo` is created afterwards, we can temporarily define `f` as a trivial function, and later, after we create `f_rec_tmp_memo`, change `f` to be `f_rec_tmp_memo`. Let's use a group of bindings to demonstrate:

```
(* trivial initial function; it should never be applied in this state *)
let f = ref (fun _ -> assert false)
let f_rec_tmp arg = f_norec !f arg
(* memoize is the simple non-rec version *)
let f_rec_tmp_memo = memoize f_rec_tmp
(* the later substitution, made possible by f being mutable *)
let () = f := f_rec_tmp_memo
```

The final code for `memoize_rec` (called `memo_rec` below) is:

```
let memo_rec f_norec =
  let f = ref (fun _ -> assert false) in
  let f_rec_memo = memoize (fun x -> f_norec !f x) in
  f := f_rec_memo;
  f_rec_memo
```
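A quick end-to-end check of `memo_rec`; a stdlib `Hashtbl` version of memoize is used instead of Core's so the snippet is self-contained, and the fibonacci example follows this post's `fib 0 = 1` convention:

```ocaml
(* stdlib-only memoize *)
let memoize f =
  let table = Hashtbl.create 16 in
  fun x ->
    match Hashtbl.find_opt table x with
    | Some y -> y
    | None ->
      let y = f x in
      Hashtbl.add table x y;
      y

let memo_rec f_norec =
  let f = ref (fun _ -> assert false) in
  let f_rec_memo = memoize (fun x -> f_norec !f x) in
  f := f_rec_memo;
  f_rec_memo

let fib_norec f n =
  if n <= 1 then 1
  else f (n-1) + f (n-2)

let fib_memo = memo_rec fib_norec

let () =
  assert (fib_memo 10 = 89);
  (* thanks to the cache this is linear, not exponential *)
  ignore (fib_memo 80)
```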

While OCaml is a functional programming language and emphasises the pure functional style, it allows *mutable* variables and values, and hence an imperative programming style. The reason is stated in Real World OCaml:

Imperative code is of fundamental importance to any practical programming language, because real-world tasks require that you interact with the outside world, which is by its nature imperative. Imperative programming can also be important for performance. While pure code is quite efficient in OCaml, there are many algorithms that can only be implemented efficiently using imperative techniques.

How to write imperative code in OCaml is well introduced in OCaml's documentation and user's manual, as well as in Real World OCaml. The imperative logic flow is very similar to that of other imperative languages. For example, we can write an OCaml *binary search* like this:

```
let binary_search a x =
  let len = Array.length a in
  let p = ref 0 and q = ref (len-1) in
  let loc = ref None in
  while !p <= !q && !loc = None && !p >= 0 && !q < len do
    let mi = (!p + !q) / 2 in
    if x = a.(mi) then loc := Some mi
    else if x > a.(mi) then p := mi + 1
    else q := mi - 1
  done;
  !loc
```
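A quick check of the function above (repeated here so the snippet is self-contained); indices are 0-based, and the array must already be sorted:

```ocaml
let binary_search a x =
  let len = Array.length a in
  let p = ref 0 and q = ref (len-1) in
  let loc = ref None in
  while !p <= !q && !loc = None && !p >= 0 && !q < len do
    let mi = (!p + !q) / 2 in
    if x = a.(mi) then loc := Some mi
    else if x > a.(mi) then p := mi + 1
    else q := mi - 1
  done;
  !loc

let () =
  assert (binary_search [|1; 3; 5; 7; 9|] 7 = Some 3);
  assert (binary_search [|1; 3; 5; 7; 9|] 4 = None);
  assert (binary_search [||] 1 = None)
```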

*Mutable* or *imperative* can be a snake. People who are *forced* to use OCaml (for work or course assignments) *while not yet convinced by its functional power* may be happy when they learn that stuff can be written like in Java. They may also tend to (secretly) write imperative code in OCaml as much as possible, just to get things done.

It sounds bad and it indeed is. What is the point to code in a heavily imperative way with a functional programming language?

However, I won't be worried even if one does that, because I know it is just initial avoidance and it won't last long. Why? Because writing imperative code is a bit troublesome anyway. Let's revisit the `binary_search` we wrote before.

You can see that nothing is straightforward there.

- When we define a mutable variable, we have to use `ref`;
- When we get the value out, we have to use `!` everywhere;
- When we assign a value, we can't forget the `:` before `=`;
- We even need to add `.` before `()` when we access arrays.

Even after we finish the function, it appears quite verbose and a little uncomfortable to read, right? This is why I am sure that, in the longer term, one will give up the imperative style, truly learn the functional style and eventually understand OCaml's beautiful side.

So if we will not / should not write everything imperatively, when should we leverage the benefits of *mutable*?

To manipulate a sequence of elements, we normally use an `array` in imperative languages; however, in a pure functional language, we prefer a `list`, as it is immutable.

The best thing `array` offers us is the *O(1)* speed of accessing an element via its index. We lose this advantage if we use `list`, i.e., we have to do a linear scan, which is `O(n)`, to get the `ith` value. Sometimes this is too slow. To demonstrate, let's have a look at the problem of *shuffle*:

Given a sequence, randomize the order of all the elements inside.

A typical algorithm for *shuffle* can be:

1. `i` is initially `0` and `len` is the length of the sequence.
2. Randomly generate an integer `r` within `[i, len)`.
3. Swap the element at `i` and the one at `r`.
4. Increase `i` for the next target element.
5. Repeat from step 2.

If we use `list` in OCaml, then we will have:

```
let rm_nth n l =
  let rec aux acc i = function
    | [] -> List.rev acc
    | _::tl when i = n -> List.rev_append acc tl
    | x::tl -> aux (x::acc) (i+1) tl
  in
  aux [] 0 l

(* a slight modification when using list:
   we do not swap; instead, we remove the element at the randomised index
   and put it into the new list. *)
let shuffle l =
  Random.self_init ();
  let len = List.length l in
  let rec aux i acc l =
    if i < len then
      let r = Random.int (len - i) in
      aux (i+1) (List.nth l r :: acc) (rm_nth r l)
    else acc
  in
  aux 0 [] l
```

We have to scan the list twice in each iteration: once to get the element at the randomised index `r` and once to remove it. In total we have `n` iterations; thus the time complexity is *O(n^2)*.

If we use `array`, then it is much faster, with a time complexity of *O(n)*, as locating an element and swapping two elements in an array cost only *O(1)*.

```
let swap a i j =
  let tmp = a.(i) in
  a.(i) <- a.(j);
  a.(j) <- tmp

let shuffle_array a =
  Random.self_init ();
  let len = Array.length a in
  for i = 0 to len-1 do
    let r = i + Random.int (len - i) in
    swap a i r
  done
```
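A property-style check of `shuffle_array` (the functions are repeated so the snippet runs standalone): shuffling permutes the elements in place, so sorting the result must give back the original array.

```ocaml
let swap a i j =
  let tmp = a.(i) in
  a.(i) <- a.(j);
  a.(j) <- tmp

let shuffle_array a =
  Random.self_init ();
  let len = Array.length a in
  for i = 0 to len-1 do
    let r = i + Random.int (len - i) in
    swap a i r
  done

let () =
  let a = [|1; 2; 3; 4; 5|] in
  shuffle_array a;
  (* whatever the random order, the multiset of elements is unchanged *)
  Array.sort compare a;
  assert (a = [|1; 2; 3; 4; 5|])
```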

In addition to `array`, some other imperative data structures can also improve performance. For example, `Hashtbl` is the traditional *hash table* with `O(1)` time complexity, while the functional `Map` module uses an *AVL* tree and thus provides `O(logN)`. If one really cares about such a difference, `Hashtbl` can be given higher priority.

Note that we *should not use a potential performance boost as an excuse* to use imperative code wildly in OCaml. In many cases, functional data structures and algorithms have comparable performance; for example, the *2-list* based functional queue offers amortised `O(1)`.

Constantly creating new values makes functional programming safer and logically clearer, as discussed previously. However, occasionally we don't want everything to be newly created; instead, we need a mutable *object* that stays put in memory but can be updated. In this way, we can remember values in a single place.

Write a function that computes `x + y * z`. It outputs the correct result and, in addition, prints how many times it has been called so far.

The `x + y * z` part is trivial, but recording the number of calls is tricky. Let's try to implement such a function in pure functional style first.

```
(* pure functional solution *)
let f x y z last_called_count =
  print_int (last_called_count + 1);
  x + y * z, last_called_count + 1
```

The code above can roughly achieve the goal. However, it is not great:

- It needs an additional argument which is meant to be the most recent count of calls. This could work, but it is very fragile, because the function accepts whatever integer is passed and there is no way to constrain it to be the real last count.
- It has to return the new call count along with the real result.

Users of this function will feel awkward. `last_called_count` needs to be handled very carefully in the code flow to avoid miscounting. The result returned is a tuple, so pattern matching is necessary to obtain the real value of `x + y * z`. And again, one needs to somehow remember the new call count to supply it to the next call.

This is where imperative programming comes to help:

```
(* with help of imperative programming.
   note that the function g can be anonymous. *)
let f =
  let countable_helper () =
    let called_count = ref 0 in
    let g x y z =
      called_count := !called_count + 1;
      print_int !called_count;
      x + y * z
    in
    g
  in
  countable_helper ()
```

`countable_helper` is a helper function. When called, it first creates a mutable `called_count` and then passes it to `g`. Because `called_count` is mutable, `g` can freely increase its value by `1`. Eventually `x + y * z` is computed after printing `called_count`. `g` is returned to `f` as the result of `countable_helper ()`, i.e., `f` is `g`.
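A quick usage sketch (assuming the `f` defined above is in scope): the counter lives in `f`'s closure, so it persists across calls:

```ocaml
let () =
  ignore (f 1 2 3);  (* prints 1; the result is 1 + 2 * 3 = 7 *)
  ignore (f 4 5 6);  (* prints 2 *)
  ignore (f 7 8 9)   (* prints 3 *)
```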

I found this interesting case in the imperative-programming chapter of Real World OCaml, and it is about *memoize*. In this section we borrow content directly from the book, but will emphasise *the substitution* part.

Python developers should be quite familiar with the `memoize` decorator. Essentially, it makes a function remember its *(argument, result)* pairs, so that if the same argument is supplied again, the stored result is returned immediately without repeating the computation.

We can write `memoize` in OCaml too:

```
(* directly copied from Real World OCaml, does not use anonymous function *)
let memoize f =
  let table = Hashtbl.Poly.create () in
  let g x =
    match Hashtbl.find table x with
    | Some y -> y
    | None ->
      let y = f x in
      Hashtbl.add_exn table ~key:x ~data:y;
      y
  in
  g
```
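As a usage sketch (my own illustration, using the stdlib `Hashtbl` instead of Core's `Hashtbl.Poly` so the snippet is self-contained), we can observe that the underlying function runs only once per distinct argument:

```ocaml
(* stdlib flavour of the same idea: find_opt plays the role of
   Core's Hashtbl.find returning an option *)
let memoize f =
  let table = Hashtbl.create 16 in
  fun x ->
    match Hashtbl.find_opt table x with
    | Some y -> y
    | None ->
      let y = f x in
      Hashtbl.add table x y;
      y

(* count how many times the raw function actually runs *)
let calls = ref 0
let square x = incr calls; x * x
let fast_square = memoize square

let () =
  ignore (fast_square 5);
  ignore (fast_square 5);
  Printf.printf "square ran %d time(s)\n" !calls  (* prints: square ran 1 time(s) *)
```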

It uses a `Hashtbl` as the imperative storage box, and the mechanism is similar to our previous *call count* example.

The fascinating bit comes from the fact that this `memoize` cannot handle recursive functions. If you don't believe it, let's try with the *fibonacci* function:

```
let rec fib n =
  if n <= 1 then 1 else fib (n-1) + fib (n-2)

let fib_mem = memoize fib
```

Hmmm... it compiles and seems to work. Yes, `fib_mem` will return correct results, but we should take a closer look to see whether it really remembers the *(argument, result)* pairs. So what is `fib_mem` exactly? By replacing `f` with `fib` inside `memoize`, we get:

```
(* not proper OCaml code, just for demonstration purposes *)
fib_mem is actually
  match Hashtbl.find table x with
  | Some y -> y
  | None ->
    let y = fib x in (* here is fib *)
    Hashtbl.add_exn table ~key:x ~data:y;
    y
```

So inside `fib_mem`, if a new argument comes, it will call `fib`, and `fib` itself is not memoized at all. What does this mean? It means `fib_mem` may eventually remember the new argument and its result, but will never remember all the arguments and results along the recursive way.

For example, let's do `fib_mem 5`.

1. `5` is not in the Hashtbl, so `fib 5` is called.
2. According to the definition of `fib`, `fib 5 = fib 4 + fib 3`, so now `fib 4`.
3. `fib 4 = fib 3 + fib 2`, no problem, but look: `fib 3` will be done again after `fib 4` finishes.
4. Assuming `fib 4` is done, we go back to the right-hand side of the `+` in point 2, which means we need to do `fib 3`.
5. Will we really do `fib 3` now? Yes, unfortunately. Even though it has been computed during `fib 4`, `fib` is not memoized.

To solve this problem, we need *the substitution* technique with the help of imperative programming.

When I read the book for the first time, before continuing to the solution of memoized fib, I tried something on my own.

So the problem is that the `f` inside `memoize` is not memoized, right? How about we make that `f` mutable, and after we define `g`, we assign `g` to `f`? Since `g` is memoized, and `g` would be calling `g` inside, the whole thing would be memoized, right?

```
let mem_rec_bad f =
  let table = Hashtbl.Poly.create () in
  let sub_f = ref f in
  let g x =
    match Hashtbl.find table x with
    | Some y -> y
    | None ->
      let y = !sub_f x in
      Hashtbl.add_exn table ~key:x ~data:y;
      y
  in
  sub_f := g;
  g
```

It passes the compiler but will stack-overflow if you do `let fib_mem = mem_rec_bad fib;; fib_mem 5`. After we define `g` and replace the original value of `sub_f` with `g`, it seems fine, but what is `g` now?

```
(* g is now actually like: *)
let g x =
  match Hashtbl.find table x with
  | Some y -> y
  | None ->
    let y = g x in
    Hashtbl.add_exn table ~key:x ~data:y;
    y
```

`g` will check the Hashtbl forever! We totally lose the functionality of the original `f`, and `g` becomes meaningless.

So we can't directly replace the `f` inside. Is there any way to preserve the functionality of `f` but somehow substitute some part of it with the memoization?

**Updated on 2015-01-25:** Please go to Recursive Memoize & Untying the Recursive Knot for a better explanation of recursive memoize. The content below is not that good.

The answer is yes, and we have to build a non-recursive function out of the recursive `fib`.

```
(* directly copied from the book *)
let fib_norec f n =
  if n <= 1 then 1
  else f (n-2) + f (n-1)

let rec fib_rec n = fib_norec fib_rec n
```

(I never thought before that a recursive function could be split like this, honestly. I don't know how to derive such a transformation in general and can't explain more. I guess we just learn it as it is and continue. More description of it is in the book.) (*[1]*)

Basically, for a recursive function `f_rec`, we can

- change `f_rec` to `f_norec` with an additional parameter `f`
- replace the `f_rec` in the original body with `f`
- then `let rec f_rec parameters = f_norec f_rec parameters`

In this way, `f_rec` will bring up `f_norec` whenever it is called, so the recursive logic is actually still controlled by `f_norec`.

Here the important part is `f_norec`, whose parameter `f` gives us the chance for the correct substitution.

```
let memo_rec f_norec =
  let fref = ref (fun _ -> assert false) in
  (* !fref below will be the new function defined next,
     which is the memoized one *)
  let h x = f_norec !fref x in
  let f_mem = memoize h in
  fref := f_mem;
  f_mem
```

Now `fref`'s value becomes the memoized version of `f_norec`. Since `f_norec` controls the logic and does the real work, we lose no functionality but remember every *(argument, result)* pair along the recursive way.
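Putting the pieces together (a usage sketch, assuming `memoize`, `fib_norec`, and `memo_rec` above are in scope):

```ocaml
(* every intermediate (argument, result) pair is now stored
   in the single shared table inside memoize *)
let fib_mem = memo_rec fib_norec

let () = Printf.printf "%d\n" (fib_mem 5)  (* prints 8 *)
```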

In this post, we just list out three quite typical cases where imperative programming can be helpful. There are many others, such as lazy, etc.

Moreover, one more suggestion on using imperative style: *don't expose imperative interfaces publicly if you can avoid it*. That means we should try to embed imperative code inside pure functions as much as possible, so that the users of our functions cannot access the mutable parts. This lets our functional code stay pure while enjoying the juice of imperative programming internally.

**[1].** glacialthinker has commented here. This technique is called *untying the recursive knot*.

Mystique in the head image is another major character in the X-Men world. *She is a shapeshifter who can mimic the appearance and voice of any person with exquisite precision.* Like a *mutable* in OCaml that always sticks to the same type once created, humans are the only *thing* she can mutate into.

In **pure functional programming**, everything is immutable. Strings, lists, values of custom types, etc., cannot be changed once created. For example, a list such as `[1;2;3]` does not have any operation that can alter its elements. If you want the stuff inside to be changed, you need to create a new list based on it. Suppose you want to change the first element of `[1;2;3]` to `4`: you have to `let new_list = 4 :: List.tl [1;2;3]` or directly `let new_list = [4;2;3]`; there is no way to do something like `old_list.first_element = 4`.

If you are used to popular imperative programming languages such as C, Java, Python, etc., you may feel that things being immutable sounds quite inconvenient. Yes, sometimes being immutable forces us to do more work. In the example above, being able to do `old_list.first_element = 4` seems more natural and straightforward. However, please do not overlook the power of being immutable. It provides us extra safety and, **more importantly, it leads to a different paradigm: the functional programming style**.

Before we start to dive into immutables, there are two things that need to be clarified first.

People who start to learn OCaml may see that something like the trivial code below is allowed:

```
let f x =
  let y = 1 in
  let y = 2 in
  x * y
```

There are two `y`s and it seems that `y` has been changed from `1` to `2`. One might conclude that things in OCaml are actually mutable because `y` is mutated. This conclusion is wrong. In OCaml, this is called *shadowing*.

Basically, the two `y`s are different. After the binding of the second `y` is defined, the first `y` is still there for the time being. Just because it happens to have the same name as the second `y`, it is in the shadow and can never be accessed later within that scope. Thus we should not say `y` is mutated; in OCaml a *variable* / *binding* is indeed immutable.

The concept of being *immutable* or *mutable* was introduced because we need it to predict whether something's possible change would affect other related things. In this section, we mainly talk about *mutable* in imperative programming, as a clear understanding of *what being mutable means* can help us appreciate *immutable* a little more.

It is easy to see that a *variable* is mutable in imperative programming. For example, we can do `int i = 0;`, then `i = 1;`, and the same `i` is mutated. Altering a variable can affect other parts of the program, which can be demonstrated with a *for loop* example:

```
/* The *for* scope repeatedly accesses `i`,
   and `i`'s change affects each iteration. */
for (int i = 0; i < 10; i++)
    System.out.println(i);
```

Sometimes, even if a variable is mutated, it might not affect anything, like the example (in Java) below:

```
int[] a = {1, 2, 3};
int[] b = a;
a = new int[] {5, 6, 7};
/* What is b now? */
```

`a` was initially `{1,2,3}`. `b` was declared to be equal to `a`, so `b` has the same value `{1,2,3}`. `a` was later changed to `{5,6,7}`. Should `b` be changed to `{5,6,7}` as well? Of course not; `b` still remains `{1,2,3}`. Thus, even though `a` is mutated, `b` is not affected. This is because, in this case, the *value* underneath has not been mutated. Let's modify the code a little bit:

```
int[] a = {1,2,3};
int[] b = a;
a[0] = 0;
/* What is b now? */
```

`a[0] = 0;` alters the first element of the underlying array, so `a` becomes `{0,2,3}`. Since an *array* in Java is *mutable*, `b` now becomes `{0,2,3}` too, as `b` was assigned the identical array that `a` had initially. So a mutable value can lead to the situation where, if it is changed somewhere, all places that access it are affected.

A *mutable* value must provide a way for developers to alter it through bindings. For arrays, `a[0] = 0;` is that way: via `a[index]` developers can access the array `a` by `index`, and `=` directly assigns the new element at that index.

An *immutable* value does not provide such a way. We can check the immutable type *String* in Java to confirm this.

To summarise, when dealing with *mutable*, especially when we test our imperative code, we should not forget possible changes of both *variable* and *value*.

**Note** that OCaml does support *mutable*; it is a kind of hybrid of *functional* and *imperative*. However, we really should care more about the *functional* side of OCaml. The *imperative* style of OCaml, and when to use it, will be covered in detail in the next post.

As described before, if a mutable variable / value is altered in one place, we need to be cautious about all the other places that may have access to it. The array-altering code in the previous section is one example. Another example is supplying a mutable value to a method as an argument.

```
public static int getMedian(int[] a) {
    Arrays.sort(a);
    return a[a.length / 2];
}

public static void main(String[] args) {
    int[] x = {3, 4, 2, 1, 5, 9, 6};
    int median = getMedian(x);
    /* x is not {3,4,2,1,5,9,6} any more */
    ...
}
```

`getMedian` sorts the supplied argument, so the order of the elements in `x` is changed after the call. If we need `x` to always maintain its original order, the above code has a serious flaw. It is not that bad if we wrote `getMedian` ourselves, because we can just fix the code directly. However, if `getMedian` is in a third-party library, even though we can still override this method, how can we know whether other methods in the same library have this kind of flaw? A library is useless if we cannot trust it completely.

Allowing mutables also introduces trouble for threading. When multiple threads try to alter a mutable value at the same time, we sometimes can just put a lock there to restrain access to the value, at the sacrifice of making it single-threaded at that point. However, if we have to precisely control the order of alterations, things get much more complicated. This is why we have hundreds of books and articles teaching how to do great multithreading in imperative programming languages.

If everything is immutable, then we do not need to worry about any of those any more. When we need to alter an immutable value, we create a new value that implements the change based on the old value and the old value is kept intact. All places that need access to the original value are very safe. As an example in OCaml:

```
(* trivial way to get the median in OCaml,
   simply for comparison with the imperative case *)
let get_median l =
  (* we can't sort in place, so we return a new sorted list *)
  let sorted_l = List.sort compare l in
  let len = List.length l in
  List.nth sorted_l (len / 2) (* here we use our new sorted list *)

let median_mul_hd =
  let l = [3; 4; 2; 1; 5; 9; 6] in
  let median = get_median l in
  median * List.hd l (* the hd of l will not be 1, but still 3 - the original hd *)
```

With immutables, the burden of guarding against unexpected modification of values is lifted from us. Hence, in OCaml we do not need to spend time on materials such as *Item 39: Make defensive copies when needed* in Effective Java.

Being immutable brings extra safety to our code and helps us avoid many trivial bugs like the ones described above. However, this is not free of charge. We may feel more restricted and lose some flexibility (*[1]*). Moreover, because we cannot alter anything, pure functional programming forces us to adapt to a new style, which will be illustrated next.

As mentioned in Recursion Reloaded, we *cannot* use loops in pure functional programming, because *mutable* is not part of the functional world. Actually, there was a little misinterpretation there: it is not that *we choose not to*, but that *we are not able to*.

A typical *while loop* in Java looks like this:

```
int i = 0;
while (i < 10) {
    ...
    i++;
}
/* note that the above is the same as
   for (int i = 0; i < 10; i++) ... */
```

The while loop works in Java because:

- `i` is defined before the `while` scope.
- We go into the `while` scope and `i` continues to be valid inside.
- Since `i` is mutable, `i`'s change is also valid inside the `while`.
- So we can check the same `i` again and again, and also use it repeatedly inside the `while`.

Are we able to do the same thing in OCaml / functional programming? The answer is **no**. If you are not convinced, let's assume we were able to and **fake** some OCaml syntax.

```
(* fake OCaml code *)
let i = 0 in
while i < 10 do
  ...
  let i = i + 1 ...
done
```

- `i` is defined before the *while*; no problem.
- Initially, `while` gets `i` for the first time and checks it against `10`; no problem.
- `i` can be used for the first time inside `...`; still no problem.
- Then we need to increase `i`, and the problem starts.
- Recall that `i` is immutable in OCaml. The `first i` after `let` in `let i = i + 1` is actually new, and the `original i` is shadowed.
- The new `i` is created within the `while` scope, so after one iteration it is no longer valid.
- Thus, the second time at `while i < 10 do`, the original `i` is still 0 and the loop is stuck there forever.

The above attempt shows that we can't do looping in pure functional programming. This is why we replace it with recursion, as described in Recursion Reloaded.
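The recursive replacement of that Java loop can be sketched as follows (my own illustration): instead of mutating one `i`, each call receives a brand-new `i`:

```ocaml
(* the "loop" runs by creating a new i and delivering it
   to the next call, never mutating anything *)
let rec loop i =
  if i < 10 then begin
    print_int i;   (* the body of the loop *)
    loop (i + 1)   (* deliver the new state to the next call *)
  end

let () = loop 0
```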

Furthermore, we can see that execution in imperative programming is normally driven by mutating states (the *while loop* keeps running by changing `i`). In functional programming, however, execution continues by creating new things and handing them to the next call / expression / function. Just as in recursion: when a state needs to change, we give the same function the new state as an argument. This is indeed *the functional programming style:* **we create and then deliver**.

One may still think *imperative* is easier and more straightforward than *functional*. Our brains like to *change and forget*, so as for *imperative*'s easiness, maybe. However, if you say *imperative* is more straightforward, I doubt it.

Let's temporarily forget the programming / coding part and just have a look at something at a high level.

Suppose there is a process with two states: `A` and `B`. Most importantly, `State A` can change to `State B`. Now please close your eyes and draw a diagram of this simple process in your mind.

Done?

I bet you imagined this diagram as below:

But NOT

When we were really thinking of this trivial process, there must have been two separate states in our mind, with an explicit arrow clearly indicating the transition. This is much more natural than imagining just one state with the two situations inside it. After all, this is the big-picture view, and we often need it when designing something.

When we start to implement it, we might think differently. We sometimes intend to use the easy way we are used to. For example, in many cases we can just use one state variable and forget `State A` after transitioning to `State B`, because always remembering `State A` may be unnecessary. This difference between design and implementation is always there; we are just not aware of it, or it feels smoother due to our imperative habit.

Functional programming style does not have that difference. In the above example, if we implement it in OCaml, we do exactly what the upper diagram shows: we create `State A`, and when the state needs to change to `B`, we create a new `State B`. This is why I say *functional* is more straightforward: it directly reflects our intuitive design. If you are still not convinced, let's design and implement *quicksort* in both imperative and functional styles as a demonstration.

Quicksort is a classic sorting algorithm.

- If only one or no element is there, then we are done.
- Each time, we select a pivot element, e.g. the first element, from the elements we need to sort.
- We compare elements with the pivot, putting the *smaller* elements on its left-hand side and the *larger* ones on its right-hand side. Note that neither the *smaller*s nor the *larger*s need to be in order.
- The pivot then gets fixed.
- Quicksort all elements on its left.
- Quicksort all elements on its right.

The core idea of quicksort is that in every round we partition all elements into three parts: *smaller*, *pivot* and *larger*. The *smaller* elements never need to be compared with the *larger* elements again, so time is saved. In addition, although we split the elements into three parts, the *pivot* is fixed after one round, and only *smaller* and *larger* go into the next round, so the number of effective parts that cost time is actually 2. How many levels of such splitting can we have? About *O(logn)* (assuming reasonably balanced partitions), and each level costs *O(n)* comparisons in total. Hence the average-case time complexity of quicksort is *O(nlogn)*.

The key part of quicksort is *partitioning*, i.e., steps 2 and 3. How can we do it? Again, let's design the process first without involving any coding details.

Say we have the elements `4, 1, 6, 3, 7`. You don't have to write any code; just manually partition them once as in steps 2 and 3. How would you do it in your imagination or on paper? Could you please try it now?

Done?

I bet you imagined or wrote down the flow similar as below:

The mechanism of *partition* is not complicated. We take the *pivot* out first. Then we take the remaining elements out one by one. We compare each element with the *pivot*: if it is smaller, it goes to the left; otherwise, it goes to the right. The whole process ends when no element remains. Simple enough? I believe even non-developers would manually do the *partition* as shown in the diagram.

Our design of *partition* is pretty much finished. It is time to implement it.

Let's first try Java. Can we follow the design in the diagram directly? I bet not. The design assumed that the pivot `4` has two bags (initially empty): one on its left and the other on its right. This single point already defeats Java. In Java and other imperative languages, when we sort some elements, they normally all sit in an array, and we prefer doing things in place if we can. That is, for example, if we want `1` to be on the left-hand side of `4`, we would use *swap*. It is efficient and the correct way to adjust the positions of elements in an array.

There are a number of ways to implement *partition* in Java; here the one presented in *Algorithms* by Robert Sedgewick and Kevin Wayne is briefly shown as a diagram below:

We no longer have a *smaller bag* or *larger bag*, nor do we *throw an element into the according bag*. Instead, we have to maintain two additional pointers `i` and `j`: `i` scans from the left for an element not smaller than the pivot, `j` scans from the right for an element not larger, and the two elements found are swapped. The whole process stops when the pointers cross (`i >= j`), and finally the pivot is swapped into its fixed position.

As we can see, the implementation is quite different from the high-level design and much more complicated, isn't it? The design just tells us: *we take a new element out, compare it with the pivot, and put it to the left or right depending on the comparison.* We cannot give such a simple summary for the implementation above. More than that, we need to pay extra attention to `i` and `j`, as **manipulating indices of an array is always error-prone**.

The Java code, directly copied from the book:

```
private static int partition(Comparable[] a, int lo, int hi) {
    int i = lo;
    int j = hi + 1;
    Comparable v = a[lo];
    while (true) {
        // find item on lo to swap
        while (less(a[++i], v))
            if (i == hi) break;
        // find item on hi to swap
        while (less(v, a[--j]))
            if (j == lo) break; // redundant since a[lo] acts as sentinel
        // check if pointers cross
        if (i >= j) break;
        exch(a, i, j);
    }
    // put partitioning item v at a[j]
    exch(a, lo, j);
    // now, a[lo .. j-1] <= a[j] <= a[j+1 .. hi]
    return j;
}
```

The functional implementation fits the design perfectly, and we have to thank the *create new and deliver* nature of the functional programming style for that.

- The design asks for two bags around the pivot, and we can directly do that: `(left, pivot, right)`.
- The design says if an element is *smaller*, put it to the left. We can again directly do that: `if x < pivot then (x::left, pivot, right)`.
- We can also directly handle *larger*: `else (left, pivot, x::right)`.

For iterations, we can just use *recursion*. The complete OCaml code for *partition* is like this:

```
let partition pivot l =
  (* we could actually remove pivot from aux completely;
     we keep it here to make it more meaningful *)
  let rec aux (left, pivot, right) = function
    | [] -> (left, pivot, right)
    | x :: rest ->
      if x < pivot then aux (x :: left, pivot, right) rest
      else aux (left, pivot, x :: right) rest
  in
  aux ([], pivot, []) l (* the two bags are initially empty *)
```
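For instance (a quick check, assuming the `partition` above is in scope), partitioning the earlier example elements around the pivot `4` — note the bags collect elements in reverse arrival order, which is fine because they need not be sorted:

```ocaml
let () =
  let left, pivot, right = partition 4 [1; 6; 3; 7] in
  assert (left = [3; 1] && pivot = 4 && right = [7; 6])
```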

If you know `fold` and remove the pivot during the process, it can be even simpler:

```
(* using the List.fold from Jane Street's Core lib *)
let partition_fold pivot l =
  List.fold ~f:(fun (left, right) x ->
      if x < pivot then (x :: left, right)
      else (left, x :: right))
    ~init:([], []) l
```

Once we have the functional *partition*, we can write *quicksort* easily:

```
let rec quicksort = function
  | [] | _ :: [] as l -> l
  | pivot :: rest ->
    let left, right = partition_fold pivot rest in
    quicksort left @ (pivot :: quicksort right)
```
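As a quick check, here is a self-contained sketch using the stdlib `List.fold_left` in place of Core's `List.fold`, so it runs on a bare toplevel:

```ocaml
(* stdlib variant of partition_fold: fold over the elements,
   throwing each into the left or right bag *)
let partition_fold pivot l =
  List.fold_left
    (fun (left, right) x ->
       if x < pivot then (x :: left, right) else (left, x :: right))
    ([], []) l

let rec quicksort = function
  | [] | _ :: [] as l -> l
  | pivot :: rest ->
    let left, right = partition_fold pivot rest in
    quicksort left @ (pivot :: quicksort right)

let () =
  quicksort [4; 1; 6; 3; 7]
  |> List.iter (Printf.printf "%d ")  (* prints: 1 3 4 6 7 *)
```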

Now we have two implementations for *quicksort*: imperative and functional. Which one is more straightforward and simpler?

In fact, functional style is more than just being consistent with the design: **it handles data directly, instead of going through any intermediary layer**. In imperative programming, we often use arrays and, as we said before, we have to manipulate the indices. The *indices* stand between our minds and the data. In functional programming, it is different. We use immutable lists, and using indices no longer makes much sense (as it would be slow). So each time, we just care about the head element, and often that is more than enough. This is one of the reasons why I love OCaml. After all, we want to deal with data, so why do we need unnecessary gates blocking our way while pretending to help us?

One may say imperative programming is faster, as we do not need to constantly allocate memory for new things. Yes, normally an array is faster than a list, and doing things in place is more efficient. However, OCaml has a very good type system and a great GC, so the performance is not bad at all even if we keep creating and creating.

If you are interested, you can do the design and imperative implementation for *mergesort*, and you will find out that we have to create new auxiliary spaces just like we would do in functional programming.

So far in this post, we used quite a lot of comparisons between imperative and functional to prove *immutable is good* and *functional programming style is great*. Yes, I am a fan of OCaml and believe functional programming is a decent way to do coding. However, I am not saying imperative programming is bad (I do Java and Python for a living in daytime and become an OCamler at deep night).

*Imperative* has its glories and OCaml actually never forgets that. That's why OCaml itself is a hybrid, i.e., you can just write OCaml code in imperative style. One of the reasons OCaml reserves imperative part is that sometimes we need to remember a state in one single place. Another reason can be that we need arrays for a potential performance boost.

Anyhow, OCaml is still a functional programming language, and we should not forget that. Especially for beginners: please do pure functional programming in OCaml as much as possible. Do not bow to your imperative habit. The question of *when should I bring up imperative style in OCaml* will be answered in the next post, Mutable.

**[1].**

People enjoy flexibility and convenience, and that's one of the reasons why many people fancy Java and especially Python. It feels free there: we can learn them in 7 days and implement our ideas with them quickly. However, be aware of the two sides of being flexible and convenient.

When we really want to make a proper product using Java or Python, we may need to read books on *what not to do even though it is allowed* (yeah, this is a point: if we should not use something in most cases, why is it allowed at all?). We have to be very careful in many corners, and 30% - 50% of the time will be spent on testing. Eventually, *how to write beautiful code* might become *how to test beautifully*.

OCaml and other functional programming languages are on the opposite side: they are very restrictive, but the restrictions are for the greater good. For example, we cannot freely use the best excuse of *sorry, I can't return you a valid answer*: **return null;**. Instead, we have *None*, which is part of *option*, to explicitly tell you that nothing can be returned. The elegant reward of using *option* is that when we try to access it, **we have to deal with the None** (*[2]*).

For *if* and *pattern matching*, OCaml forces us to cover all possible cases. It sounds troublesome during coding, as we sometimes know that we don't care about some cases or that some cases will never happen. Yes, we may ignore those cases in our working code in Java or Python, but don't we have to consider them later in the testing code? Often we do, right?

OCaml cuts off some flexibility. As an outcome, to achieve a goal we have a quite limited number of ways, occasionally even just one way, and that way might be very tough (we will see this in later posts on functional pearls). But trust me: after mastering OCaml to a certain level and looking at the very robust OCaml code you wrote, you will appreciate OCaml and recall this slogan:

"Less is more" - Andrea del Sarto, Robert Browning.

**[2].** "Options are important because they are the standard way in OCaml to encode a value that might not be there; there's no such thing as a NullPointerException in OCaml. " - Chapter 1 in Real World Ocaml

**Ps.**

Wolverine in the head image of this post is a major character in the X-Men world. Since the regeneration of his body cells is super fast, he will not be permanently hurt, nor will he age. Just like Wolverine, *immutables* in OCaml will not change once created. However, unlike Wolverine, they cannot be immortal, because many of them will be recycled by the garbage collector at some point during runtime.

Binary Search Tree (*BST*) is one of the most classic data structures. The definition for its structure is shown as below:

- It consists of *Nodes* and *Leaves*.
- A *Node* has a child bst on the *left* side, a *key* (and possibly a *data*), and a child bst on the *right* side. Note that **a node's left or right child is not a node; it is indeed another binary search tree**.
- A *Leaf* has nothing and acts only as a STOP sign.

```
type 'a bst_t =
  | Leaf
  | Node of 'a bst_t * 'a * 'a bst_t (* Node (left, key, right) *)
(* a node may also carry a data associated with the key.
   It is omitted here for neater demonstration *)
```

The important feature that makes BST unique is that, for any node:

- all keys in its **left child are smaller** than its own key
- all keys in its **right child are bigger** than its own key (assuming keys are distinct)

And here is an example of BST:

Instead of continuing to present the basics of BST, this post will now focus on how to attack BST related problems with the most powerful weapon: **Recursion**.

From Recursion Reloaded, we know that one way to model recursion is:

1. Assume we already have a problem solver: `solve`.
2. Don't think about what would happen in each iteration. **Split** the problem into smaller sizes (*N = N1 + N2 + ...*).
3. Solve those smaller problems: `solve N1`, `solve N2`, ... Here is the tricky part: we still do not know or care what is inside `solve` or how it does the job; we just believe it will return the correct results.
4. Now that we have the results for the smaller problems, how do we **wire** them up to return the result for *N*? This is the ultimate question we need to answer.
5. Of course, we do not forget the **STOP sign** for the edge cases.
6. Together with points 4 and 5, we get our `solve`.

A good thing about BST is that the *split* step has been done for us already, i.e., a BST problem can always be divided into *left child*, *root*, and *right child*.

Thus, if we assume we already have `solve`, we just need to solve *left* and / or *right*, then do something with *root*, and finally wire all the outcomes to obtain the final result. Again, we should never forget the STOP sign; in a BST it is usually the *Leaf*, i.e., we need to ask ourselves what happens if the BST is a *Leaf*.

The thinking flow is illustrated as the diagram below.

Before we start to look at some problems, note that in the diagram above (and in Recursion Reloaded), we seem to always solve *both left and right*, or say, *all sub-problems*. It is actually not necessary. For BST, sometimes *either left or right* is enough. Let's have a look at this case first.

Our starter for this section is the simplest yet very essential operation: **insert** a key to a BST.

From the definition of BST, we know the basic rule is that if the new key is smaller than a root, then it belongs to the root's left child; otherwise, to the root's right child. Let's follow the modelling in the previous diagram to achieve this.

- We assume we already got `insert key bst`.
- We know the problem can be split into *left*, *root*, and *right*.
- Because a new key can go either left or right, we need to *deal_with root* first, i.e., compare the new key with the key of the root.
- Directly from the rule of BST, if the new key is smaller, then we need to *solve left*, thus `insert key left`; otherwise `insert key right`.
- What if we get to a *Leaf*? It means we can finally place our new key there as a new *Node* and, of course, at this moment both children of the new *Node* are *Leaves*.

Note that the BST type in OCaml we defined early on is purely functional, which means every time we need to update something, we have to create a new one. That's why in the diagram, even if we just insert x to left or right, we need to construct a new *Node*, because we are updating the left child or the right one. The code is shown below.

```
let rec insert x = function
  | Leaf -> Node (Leaf, x, Leaf) (* Leaf is a STOP *)
  | Node (left, k, right) ->
    if x < k then Node (insert x left, k, right) (* if x is smaller, go left *)
    else Node (left, k, insert x right) (* otherwise, go right *)
```
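To see `insert` in action, here is a small self-contained sketch; the `of_list` helper is my own addition, not from the post. Since each `insert` returns a new tree, folding it over a list of keys builds a whole BST.

```
type 'a bst_t =
  | Leaf
  | Node of 'a bst_t * 'a * 'a bst_t

let rec insert x = function
  | Leaf -> Node (Leaf, x, Leaf)
  | Node (left, k, right) ->
    if x < k then Node (insert x left, k, right)
    else Node (left, k, insert x right)

(* fold insert over a list of keys; the first key becomes the root *)
let of_list l = List.fold_left (fun bst x -> insert x bst) Leaf l

let t = of_list [5; 2; 8; 1]
(* t = Node (Node (Node (Leaf, 1, Leaf), 2, Leaf), 5, Node (Leaf, 8, Leaf)) *)
```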

`member` is to check whether a given key exists in the BST or not. It is very similar to `insert`. There are three differences:

- When *dealing with root*, we need to see whether the given key equals the root's key. If yes, then we hit it.
- `member` is a read-only function, so we directly solve *left* or *right* without constructing a new *Node*.
- If arriving at a *Leaf*, then we have nowhere to go any more, and it means the given key is not in the BST.

```
let rec mem x = function
  | Leaf -> false
  | Node (left, k, right) ->
    if x < k then mem x left
    else if x = k then true
    else mem x right
```

Usually when we need to retrieve some properties of a BST or possibly go through all nodes to get an answer, we have to solve both children. However, the modelling technique does not change.

Recall from Some properties of a tree, the height of a tree is the number of edges that the longest path (from the root to a leaf) has. From this definition, it seems easy to get the height. We simply try to find all possible paths from root and for each path we record its number of edges. The answer will be the max of them. It sounds straightforward, but if you really try to write the code in this way, I bet the code would be a bit messy. Honestly, I never wrote in this way and I will never do that.

Another way is to think recursively. First, let's analyse the longest-path matter a little.

As we can see from the above diagram, *Root* has two edges: one to *Left* and the other to *Right*. So whatever the longest path from *Root* might be, it must pass either *Left* or *Right*. If we somehow could obtain the longest path from the root of *Left* and the longest path from the root of *Right*, the longest path from *Root* should be the bigger one of the two paths, right?

Let's now assume we already got `height` and it will return the height of a BST. Then we can obtain `h_left` and `h_right`. The question is: what is `h` (the height of *Root*)? Note that the height already implies the longest path (that's the definition). So from the paragraph above, what we need to do is get `max h_left h_right`. Since *Root* has an edge to either child, `h = 1 + max h_left h_right`.

Don't forget the *STOP* sign: the height of a Leaf is 0.

```
let rec height = function
  | Leaf -> 0
  | Node (left, _, right) -> 1 + max (height left) (height right)
```
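A tiny sanity check of `height` (a self-contained sketch repeating the `bst_t` type; the example tree is my own):

```
type 'a bst_t =
  | Leaf
  | Node of 'a bst_t * 'a * 'a bst_t

let rec height = function
  | Leaf -> 0
  | Node (left, _, right) -> 1 + max (height left) (height right)

(* a small tree: root 2 with children 1 and 3 *)
let t = Node (Node (Leaf, 1, Leaf), 2, Node (Leaf, 3, Leaf))
(* height t = 2, and height Leaf = 0, matching the STOP sign *)
```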

Simple, isn't it?

So far, it seems our hypothetical `solve` function takes only the sub-problem as a parameter. In many cases this is not enough. Sometimes we need to supply **more arguments** to help `solve`. For example, the problem of retrieving all keys at a certain depth definitely needs the *current depth* information.

Only with the help of `current_depth` can the *Root* know whether it belongs to the final results.

- If `current_depth = target_depth`, then *Root* should be collected. Also, we do not continue to solve *Left* or *Right*, as we know their depths will never equal *target_depth*.
- Otherwise, we need to solve both *Left* and *Right* with an argument of `1 + current_depth`.
- Assume our `solve` is working. Then `solve left (1 + current_depth)` would return a list of target nodes, and so does `solve right (1 + current_depth)`. We simply concatenate the two target lists.
- STOP sign: a *Leaf* is not even a *Node*, so the result will be the empty list.

The code is like this:

```
let rec from_depth d cur_d = function
  | Leaf -> []
  | Node (left, k, right) ->
    if cur_d = d then [k]
    else from_depth d (cur_d + 1) left @ from_depth d (cur_d + 1) right

let all_keys_at depth bst = from_depth depth 0 bst
```
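As a quick check of `all_keys_at`, here is a self-contained sketch with a tiny example tree of my own:

```
type 'a bst_t =
  | Leaf
  | Node of 'a bst_t * 'a * 'a bst_t

let rec from_depth d cur_d = function
  | Leaf -> []
  | Node (left, k, right) ->
    if cur_d = d then [k]
    else from_depth d (cur_d + 1) left @ from_depth d (cur_d + 1) right

let all_keys_at depth bst = from_depth depth 0 bst

(* root 2 at depth 0; keys 1 and 3 at depth 1 *)
let t = Node (Node (Leaf, 1, Leaf), 2, Node (Leaf, 3, Leaf))
let root_keys = all_keys_at 0 t  (* [2] *)
let next_level = all_keys_at 1 t (* [1; 3] *)
```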

From Ninja Wiki

A system of rank existed. A jōnin ("upper man") was the highest rank, representing the group and hiring out mercenaries. This is followed by the chūnin ("middle man"), assistants to the jōnin. At the bottom was the genin ("lower man"), field agents drawn from the lower class and assigned to carry out actual missions.

**Ps.**

Some readers contacted me. They hope that maybe I can use more advanced knowledge or harder examples in my posts and the current ones might seem a little boring. I think I need to explain a bit here.

The general idea behind *Many things about OCaml* is not to write a cookbook for certain problems related to OCaml, or to be a place where a quick solution is there and copy / pasting sample code would do the trick. Instead, *Many things* means some important aspects in OCaml that might be overlooked, or some particular problems that can show the greatness of OCaml, or the elegant OCaml solutions for some typical data structures and algorithms, etc. As long as something is valuable and that value shows only in OCaml or Functional Programming, I would like to add it here, one piece at a time.

Fortunately or unfortunately, even though I have only limited experience in OCaml, I found that the *many* is actually quite big. And due to this *many*, I had to make a plan to present them all in a progressive way. Topics can interleave with each other in terms of time order, as we do not want to have the same food for a month. More importantly, however, everything should go from simple / easy to advanced / hard. In order to present an advanced topic, we need to make sure we have a solid foundation. This is why, for example, I even produced a post on the properties of a tree although they are so basic. This is also why I reloaded recursion, since recursion is everywhere in OCaml. They are a kind of preparation.

Moreover, I believe in fundamentals. Fundamentals are normally concise and contain the true essence. But sometimes they can be easily overlooked or ignored and we may need to experience a certain number of difficult problems afterwards, then start to look back and appreciate the fundamentals.

The reason for using simple examples is that it makes my life easier for demonstrations. I love visualisations, and one graph can be better than a thousand words. For complicated problems and solutions, it is a bit more difficult to draw a nice and clean diagram to show the true idea behind them. I will try to do so later on, but even if I could achieve it at this early stage, some readers might easily get lost or confused because of the unavoidable complication of the graph. As a result, the point of grasping fundamentals might be missed.

Anyway, please don't worry too much. Attractive problems in OCaml are always there. For example, in my plan, I will later start to present a number (maybe 15 ~ 17) of my beloved Functional Pearls in OCaml and if you are really chasing for some awesomeness, I hope they would satisfy you.

One essential of computer programming is *repeating some operations*. This repetition has two forms: *for / while loop* and *recursion*.

**Loop**

- We directly set up a boundary via *for* or *while*.
- Certain conditions change inside the boundary.
- Operations are executed based on the changing conditions.
- It ends when hitting the boundary.

If asked to write the `sum` method in Java, it might be more intuitive to have this code:

```
/* Java */
public int sum(int[] a) {
  int s = 0;
  for (int i = 0; i < a.length; i++)
    s += a[i];
  return s;
}
```

**Recursion**

- We write a function for the operations that need to be repeated.
- The repetition is achieved by invoking the function inside itself.
- We may not specify an explicit boundary, yet we still need to set a STOP sign inside the function to tell it when to stop invoking itself.
- It ends when hitting the STOP sign.

The recursive `sum` in OCaml can look like this:

```
(* OCaml *)
let rec sum = function
  | [] -> 0 (* STOP sign *)
  | hd :: tl -> hd + sum tl
```

**loop or recursion?**

In imperative programming, developers may use *loop* more often than recursion as it seems to be more straightforward.

However, in functional programming, we almost (*[1]*) cannot do *loop*, and *recursion* is our only option. The major reason is that *loop* always needs to maintain mutable states inside the clause to change the conditions or pass intermediate results through iterations. Unfortunately, in functional programming, mutables are not allowed (*[2]*), and intermediates must be passed via function calls, not state changes.

So, if we really wish to get into the OCaml world, we have to get used to recursion. If you find it a bit hard, then this post is supposed to help you a little and show you the lovely side of recursion.

Honestly, I hated recursion in the beginning. In order to **write a recursive function** or **validate one's correctness**, I thought I had to always somehow simulate the execution flow in my head and imagine diving into one level after another; every time I did that, my brain felt a bit lost after some levels, and sometimes I had to use pen & paper.

This pain is natural. The human brain feels much more comfortable with linear tasks than recursive ones. For example, if we are asked to follow the flow *task-1* -> *task-2* -> *task-3* -> *task-4* -> *done*, it seems concise and easy. On the other hand, if we are asked to finish *task-1*, inside which we need to finish *task-2* and *task-3*, and inside *task-2* we need to finish *task-4* and *task-5*, and inside *task-3* we need...(*[3]*), our brain may not be that happy, right?

What is the difference between these two cases?

In the first case, when we do *task-1*, we do not need to care about *task-2*; and after we finish *task-1*, we just move on and forget *task-2*. Thus, we just focus and remember one thing at one time.

In the second case, when we do *task-1*, we not only need to foresee a bit of *task-2* and *task-3*, but also can't forget *task-1*; and we still need to keep *task-2* in mind after we begin *task-4*...

So the difference is very obvious: we remember / track more things in case 2 than in case 1.

Let's investigate this difference further in the paradigm of the two different `sum` functions written earlier.

Summing [9; 14; 12; 6], what will we do?

- We will do `9 + 14` first; we get `23` and we put the `23` somewhere, say, place `S`.
- The next number is `12`; we check `S`, oh, it is `23`, so we do `23 + 12` and get `35`.
- Now we do not need the previous result `23` any more and update `S` with `35`.
- The next one is `6`, ...

If we look closer into the diagram above, we will find out that at any single step, we need to care only `S`

and can forget anything happened before. For example, in step 1, we get 2 numbers from the list, calculate, store it to `S`

and forget the 2 numbers; then go to step 2, we need to retrieve from `S`

, get the 3rd number, then update `S`

with new result and forget anything else...

Along the flow, we need to retrieve one number from the list at one time and forget it after done with it. We also need to remember / track `S`

, but it is just a single place; even if its value changes, we do not care about its old value. In total, we just have to bear **2 things** in mind at all time, no matter the how big the list can be.
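Incidentally, this loop-like bookkeeping can itself be expressed with recursion: the single place `S` becomes an accumulator argument. The sketch below is my own addition, not from the original post; it is a tail-recursive `sum` that mirrors the loop.

```
(* acc plays the role of place S: it is updated at each step,
   and everything before it can be forgotten *)
let sum l =
  let rec go acc = function
    | [] -> acc (* STOP sign: acc holds the answer *)
    | hd :: tl -> go (acc + hd) tl
  in
  go 0 l

let total = sum [9; 14; 12; 6]
(* total = 41 *)
```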

For a contrast, let's take a look at the recursive way.

If we try to analyse the recursive `sum`, we get

After we retrieve the first number, 9, from the list, we cannot do anything yet and have to continue to retrieve 14 without forgetting 9, and so on and so forth. It is easy to see that we need to remember **4 things** (marked as red squares) if we use our brain to track the flow. It might be fine if there are only 4 numbers. Yet if the list has 100 random numbers, we need to remember **100 things**, and our brains would be fried.

Basically, as long as we try to simulate the complete recursive / iterative flow, we cannot avoid remembering a potentially big number of things. So what is the solution to escape this pain? The answer is simple: **we do not simulate**.

Actually, for many problems that need recursion, we do not need to look into the details of recursion and we do not need to iterate the function in our heads. **The trick is the way of modeling the recursion**.

Let's revisit the recursive `sum` function:

```
let rec sum = function
  | [] -> 0 (* STOP sign *)
  | hd :: tl -> hd + sum tl
```

One **bad way** to interpret its repetition process is:

- We get an element out.
- On the left-hand side of `+`, we hold it.
- On the right-hand side of `+`, we do point 1 again.
- After STOP, we do the plus on the previously held numbers, one by one.

This way is bad because it involves simulating the flow.

The **good way** to interpret is:

- Someone tells me `sum` can do the job of summing a list.
- We don't care what exactly `sum` is doing.
- If you give me a list and ask me to sum it, then I just get the head (`hd`) out and add it to the `sum` of the rest.
- Of course, if the list is empty, then `sum` should return 0.

Unlike the bad way, this modelling actually removes the iteration part from recursion and is purely based on the logic level: **sum of list = hd + sum of rest**.

To summarise this process of modelling:

- I need to write a recursive function `f` to solve a problem.
- I won't think about the exact steps or details inside `f`.
- I just assume `f` is already there and correct.
- If the problem size is *N*, then I divide it into *X* and *Y* where *X + Y = N*.
- If `f` is correct, then `f X` and `f Y` are both correct, right?
- So to solve `N`, we just need to do `f X` and `f Y`, then think out **some specific operations** to wire the results together.
- Of course, `f` should recognise the STOP signs.

So what we really need to do for recursion is:

- Think about how to divide `N` into `X` and `Y`.
- Think about how to deal with the results from `f X` and `f Y` in order to get the result of `f N`.
- Set the STOP sign(s).

Let's rewrite `sum` to fully demonstrate this:

- We need to sum a list.
- I don't know how to write `sum`, but I don't care and just assume `sum` is there.
- A list `l` can be divided into `[hd]` + `tl`.
- So `sum l = sum [hd] + sum tl` should make sense.
- Of course, there can be two STOP signs here: `sum [] = 0` and `sum [x] = x`.

```
let rec sum = function
  | [] -> 0 (* STOP sign *)
  | [x] -> x (* STOP sign *)
  | hd :: tl -> sum [hd] + sum tl (* Logic *)
(* note sum [hd] can directly be hd; I wrote it this way to be consistent with the modelling *)
```

I am not sure whether I have expressed this high-level, logic-based modelling clearly, but anyway, we need more practice.

We need a `merge` function to merge two sorted lists so that the output is a single combined sorted list.

Because the head of the final combined list should always be the min, we care about the first elements of the two lists, as each is already the min of its own list: `l1 = hd1 + tl1` and `l2 = hd2 + tl2`.

We will have the following cases.

If `hd1 < hd2`, then X = tl1 and Y = l2, and the wiring operation is just to add hd1 as the head of `merge X Y`.

If `hd1 >= hd2`, then X = l1 and Y = tl2, and the wiring operation is just to add hd2 as the head of `merge X Y`.

If one of the lists is empty, then just return the other (no matter whether it is empty or not).

Then it should be easy to write the code from the above modelling.

```
let rec merge = function
  | [], l | l, [] -> l (* STOP sign, always written first *)
  | (hd1 :: tl1 as l1), (hd2 :: tl2 as l2) ->
    if hd1 < hd2 then hd1 :: merge (tl1, l2)
    else hd2 :: merge (l1, tl2)
```
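A quick check of `merge` on a couple of inputs (a self-contained sketch; note that OCaml needs the parentheses around the `as` patterns here, and the function must be declared `rec`):

```
let rec merge = function
  | [], l | l, [] -> l (* STOP sign *)
  | (hd1 :: tl1 as l1), (hd2 :: tl2 as l2) ->
    if hd1 < hd2 then hd1 :: merge (tl1, l2)
    else hd2 :: merge (l1, tl2)

let merged = merge ([1; 3; 5], [2; 4])
(* merged = [1; 2; 3; 4; 5] *)
```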

The idea can be demonstrated by the diagram below:

Mergesort is a way to sort a list.

We can split N evenly so X = N / 2 and Y = N - N / 2.

What if we have already mergesorted X and mergesorted Y? We simply `merge` the two results, right?

If the list has just 0 or 1 element, we simply don't do anything but return.

```
(* split_evenly is actually recursive too;
   we ignore the details here and use List.fold_left *)
let split_evenly l =
  List.fold_left (fun (l1, l2) x -> (l2, x :: l1)) ([], []) l

let rec mergesort l =
  match l with
  | [] | [_] -> l
  | _ ->
    let l1, l2 = split_evenly l in
    merge (mergesort l1, mergesort l2)
```
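Putting the pieces together as one runnable sketch (using the plain stdlib `List.fold_left`, which takes no labeled arguments, and calling `merge` with a pair, to match its definition above):

```
let rec merge = function
  | [], l | l, [] -> l
  | (hd1 :: tl1 as l1), (hd2 :: tl2 as l2) ->
    if hd1 < hd2 then hd1 :: merge (tl1, l2)
    else hd2 :: merge (l1, tl2)

(* deal elements alternately into two piles of (almost) equal size *)
let split_evenly l =
  List.fold_left (fun (l1, l2) x -> (l2, x :: l1)) ([], []) l

let rec mergesort l =
  match l with
  | [] | [_] -> l
  | _ ->
    let l1, l2 = split_evenly l in
    merge (mergesort l1, mergesort l2)

let sorted = mergesort [9; 14; 12; 6; 1]
(* sorted = [1; 6; 9; 12; 14] *)
```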

So far, all examples used in this post are quite trivial and easy. I chose these easy ones because I wish to present the idea in the simplest way. Many other recursion problems are more complicated, such as permutations, combinations, etc. I will write more posts for those.

One of the excellent use cases of recursion is Binary Search Tree. In BST, we will see that a problem is naturally split into 3 parts. Depending on the goal, we deal with **all or some** of the 3 sub-problems. I will demonstrate all of this in the next post.

Again, remember: when dealing with recursion, don't try to follow the execution flow. Instead, focus on splitting the problem into smaller ones and on combining the results. This is also called *divide and conquer*, and it is actually the true nature of recursion. After all, no matter what the size of the problem is, it is the same function that solves the cases of `size = 0`, `size = 1`, and `size = n >= 2`, right?

**[1]** OCaml allows imperative style, but it is not encouraged. We should use imperative programming in OCaml only if we have to.

**[2]** OCaml allows mutables, and sometimes they are useful, in cases like *lazy* or *memoization*. However, in most cases, we are not encouraged to use mutables.

**[3]** I already got lost even while trying to design this case

This is a post on the three important properties of trees: *height*, *depth* and *level*, together with *edge* and *path*. I bet that most people already know what they are and tree (data structure) on wiki also explains them briefly.

The reason why I still decided to produce such a trivial page is that I will later on write a series of articles focusing on binary search tree in OCaml. The starters among them will be quite basic and related to these three properties.

In order to be less boring, the properties are presented in a visual way, and I try to fill in details (some of which might be easily overlooked) as much as possible.

Edge – connection between one node and another.

An example of edge is shown above between *A* and *B*. Basically, an edge is a line between two nodes, **or a node and a leaf**.

Path – a sequence of nodes and edges connecting a node with a descendant.

A path starts from a node and ends at another node or a leaf. Please don't overlook the following points:

- When we talk about a path, it includes all nodes and all edges along the path, *not just the edges*.
- The direction of a path is strictly from top to bottom and cannot change in the middle. In the diagram, we can't really talk about a path from *B* to *F*, although *B* is above *F*. Also, there is no path starting from a leaf, or from a child node to a parent node. (*[1]*)

Height of node – The height of a node is the number of edges on the longest downward path between that node and a leaf.

At first, we can see the above wiki definition has a redundant term - *downward* - inside. As we already know from the previous section, a path can only be downward.

When looking at height:

- Every node has a height. So *B* can have a height, and so do *A*, *C* and *D*.
- A leaf cannot have a height, as there is no path starting from a leaf.
- It is the longest path from the node **to a leaf**. So *A*'s height is the number of edges of the path to *E*, NOT to *G*. And its height is 3.

**The height of the root is 1.**

Height of tree – The height of a tree is the number of edges on the longest downward path between the root and a leaf.

So **the height of a tree is the height of its root**.

Frequently, we may be asked the question: *what is the max number of nodes a tree can have if the height of the tree is h?* Of course, the answer is $ 2^h-1 $ . When $ h = 1 $ , the number of nodes inside is 1, which is just the root; also, when a tree has just a root, the height of the root is 1. Hence, the two inductions match.

How about giving a height of 0? Then it means we don't have any *node* in the tree; but still we may have *leaf* inside. This is why in most languages, the type of a tree can be a leaf alone.

```
type 'a bst =
| Leaf
| Node of 'a bst * 'a * 'a bst
```

Moreover, when we use $2^h-1$ to calculate the max number of nodes, leaves are not taken into account. *Leaf* is not *Node*. It carries no key or data, and acts only like a STOP sign. We need to remember this when we deal with properties of trees.
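The $2^h-1$ bound is easy to sanity-check in code; here is a tiny sketch (the helper `max_nodes` is my own name, not a standard function):

```
(* max number of nodes for a tree of height h: 2^h - 1 (leaves not counted) *)
let max_nodes h = (1 lsl h) - 1

let one = max_nodes 1   (* just the root: 1 *)
let seven = max_nodes 3 (* a full tree of height 3: 7 *)
```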

Depth – The depth of a node is the number of edges from the node to the tree's root node.

We don't care about paths any more when depth pops in. We just count how many edges there are between the target node and the root, ignoring directions. For example, *D*'s depth is 2.

Recall that when talking about height, we actually imply a baseline located at the bottom. For depth, the baseline is at the top, which is the root level. That's why we call it depth.

Note that **the depth of the root is 0**.

Level – The level of a node is defined by 1 + the number of connections between the node and the root.

Simply, **level is depth plus 1.**

The important thing to remember is when talking about level, it **starts from 1** and **the level of the root is 1**. We need to be careful about this when solving problems related to level.

**[1]** Although point 2 stands, sometimes some problems may talk about paths in an arbitrary way, like *the path between B and F*. We have to live with that while deep in our hearts remembering the precise definition.

**ps.** yEd - Graph Editor has been used to create all diagrams here.

Currently in post-production.

Sorry, I still haven't finished this post.

As this will be the last post in the *magic of thunk* series, I really don't want to just introduce the *Async* library and its API, as janestreet@github and realworldocaml have already done a pretty good job.

What I am looking for is the design of *Async* and the magic of *thunk* that hides inside its implementations. Since *Async* is sophisticated, it is a non-trivial task for me. I have to read the source code and it will take a while.

As discussed previously, a thunk is used to encapsulate a computation for later use. Although we may not evaluate a thunk yet, we know its value is determined. However, it has a weakness: when we invoke a thunk repeatedly, the same computation inside it gets carried out again and again, despite the fact that the results returned are always identical. For example,

```
let fibonacci n =
  let rec fib a b i =
    if i = n then a
    else fib b (a+b) (i+1)
  in
  fib 0 1 0;;

let millionth_fib = fun () -> fibonacci 1000000;;

let show_the_fib fib = function
  | true -> Some (fib ())
  | false -> None;;

let print_the_fib fib = function
  | true -> print_int (fib ())
  | false -> ();;
```

`show_the_fib` and `print_the_fib` both use thunks, just in case the arguments are always false (so the heavy computation would never be carried out unnecessarily). That is good. But what about `show_the_fib millionth_fib true` and `print_the_fib millionth_fib true`? Then `millionth_fib` will actually be evaluated twice. What if we use `billionth_fib`? What if `show_the_fib millionth_fib true` and/or `print_the_fib millionth_fib true` get called lots of times? Then the heavy computation will be done again and again. You may say that in those worst cases, we can just let `millionth_fib()` be done once and bind the result to a variable, i.e., not use a thunk. Yes, we can do that; however, it is not perfect, as it is still a waste of CPU when the arguments are always false. A thunk puts one gate there to make sure that the computation is not touched when unnecessary; however, once it becomes necessary, the number of executions cannot be controlled.

lazy is there to make all those bad cases go away. It is essentially a thunk (delaying the computation, the first gate) with the added feature of making sure that the computation is executed at most once (controlling the times, the second gate).

`lazy` is actually a built-in keyword of OCaml for you to create a `lazy_t`, and there is a module `Lazy` for you to fully utilise lazy. However, in this section, we will pretend lazy never exists and try to invent it. In this way, our understanding of lazy will hopefully become deeper. Let's get started.

Assume we have `let millionth_fib = fun() -> fibonacci 1000000;;` as before. Now we want to have `millionth_fib_lazy`. How can we add the second gate to stop the repeated computations, i.e., once it has been computed, how do we return the existing result directly? Let's rephrase it more clearly:

- `millionth_fib_lazy` needs to be a thunk when never evaluated.
- `millionth_fib_lazy` needs to become an int immediately after the first evaluation.
- Future attempts to access it will get the int back directly.

The key part is from step 1 to step 2: the transition between two different types. So we have two fundamental problems to solve now:

- `millionth_fib_lazy` of course must belong to a type, and we can call it `lazy_node_t`. Ideally, `lazy_node_t` must include two other types: thunk and int (int is particular to this case; we can have `'a` for general purposes). We do not name the type `lazy_t` yet because of the other problem below.
- One type (thunk) must be able to transit to the other (int), but not the other way around.

Variants can do the first job. A variant combines multiple types into one and still stands as one single type to OCaml's type system. If you know the **variant type**, then it is straightforward to write the type `lazy_node_t`:

```
type 'a lazy_node_t =
| Delayed of (unit -> 'a)
| Value of 'a;;
```

So a lazy can initially be `Delayed` of a thunk. Then, after the first evaluation, it becomes a `Value` of `'a`. In the case of `millionth_fib_lazy`, it is initially `Delayed (fun() -> fibonacci 1000000)`; then, later on at some point, it becomes `Value 1884755131`. Are we done? No! Although `millionth_fib_lazy` can now be two things, we still need to handle the transition.

Remember the fact that in the OCaml world, being immutable is the default. So even if we somehow manage to evaluate the `Delayed` inside `millionth_fib_lazy` and return the result as a `Value`, it will be given as a new binding and will not change `millionth_fib_lazy`. This is not what we want; instead, we want to change the binding / variable `millionth_fib_lazy` itself, so that wherever it is used, it has been changed already. Thus we have to use **ref**, and here comes our final `lazy_t`:

```
type 'a lazy_node_t =
| Delayed of (unit -> 'a)
| Value of 'a
| Exn of exn;; (* we add exception type here just in case the thunk might produce exception *)
type 'a lazy_t = 'a lazy_node_t ref;;
let lazy_from_fun f = ref (Delayed f);; (* create a lazy_t from a thunk *)
```

Now we are ready to make the transitions more concrete.

```
let force x_lazy =
  match !x_lazy with
  | Value x -> x
  | Exn e -> raise e
  | Delayed f ->
    try
      let r = f () in
      x_lazy := Value r;
      r
    with e ->
      x_lazy := Exn e;
      raise e;;
```

`force` is used to evaluate a `lazy_t`. The logic is simple:

- If the lazy is a Value already, just return the result.
- If the lazy is an Exn, then raise it.
- Otherwise, we first change the lazy to either Value or Exn accordingly, then return or raise.

Now the two fundamental problems mentioned before are solved. Let's try it:

```
let millionth_fib_lazy =
  lazy_from_fun (fun () -> print_endline "calculate millionth fib"; fibonacci 1000000);;

let show_the_fib fib = function
  | true -> Some (force fib)
  | false -> None;;

let print_the_fib fib = function
  | true -> print_int (force fib)
  | false -> ();;

show_the_fib millionth_fib_lazy true;;
print_the_fib millionth_fib_lazy true;;
show_the_fib millionth_fib_lazy true;;
show_the_fib millionth_fib_lazy true;;

(* output:
calculate millionth fib
int option = Some 1884755131
1884755131
int option = Some 1884755131
int option = Some 1884755131
*)
```

We can see `calculate millionth fib` is printed only once. It seems to be working! So are we done? No, still no!

OCaml is thread safe as long as you keep everything in the functional way, obviously. However, whenever you have to use the imperative way, thread safety must be kept in mind.

The lazy above uses `ref`, the typical imperative type, and it is thread-dangerous. Let's say we have a lazy whose thunk takes 1 hour to finish, and two threads are using it. When the first thread tries to force it, it sees `Delayed f`, then carries on the computation slowly. Some time later, when the second thread is activated and tries to force it, it also sees `Delayed f`, right? So eventually, two evaluations of the same thunk of the same lazy have been done. The purpose of lazy collapses, doesn't it?

Quiz 1: what is the simplest way to provide thread safety?

A lock

Mutex provides essential thread safety in OCaml. Let's quickly try it.

First of all, each lazy needs a mutex dedicated to itself, and with this, all accesses to the lazy are guarded.

```
type 'a lazy_t = 'a lazy_node_t ref * Mutex.t;;
let lazy_from_fun f = ref (Delayed f), Mutex.create();; (* create a lazy_t from a thunk with a brand new mutex *)
```

Then when we force it, we need to lock / unlock.

```
let force x_lazy =
  let x_lazy_part, m = x_lazy in
  Mutex.lock m;
  match !x_lazy_part with
  | Value x -> Mutex.unlock m; x
  | Exn e -> Mutex.unlock m; raise e
  | Delayed f ->
    try
      let r = f () in
      x_lazy_part := Value r;
      Mutex.unlock m;
      r
    with e ->
      x_lazy_part := Exn e;
      Mutex.unlock m;
      raise e;;
(* A bit ugly, as Mutex.unlock m is everywhere. *)
```

That's it. Our lazy is now thread safe.
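The repeated `Mutex.unlock` calls can be factored out with a small wrapper that always releases the lock, even when an exception escapes. This is a sketch of my own (`with_lock` is not a stdlib function), using the `match ... with exception` form available since OCaml 4.02:

```
(* run f while holding m; guarantee the unlock even if f raises *)
let with_lock m f =
  Mutex.lock m;
  match f () with
  | r -> Mutex.unlock m; r
  | exception e -> Mutex.unlock m; raise e

type 'a lazy_node_t =
  | Delayed of (unit -> 'a)
  | Value of 'a
  | Exn of exn

type 'a lazy_t = 'a lazy_node_t ref * Mutex.t

(* force rewritten on top of with_lock: each branch can simply
   return or raise; the wrapper takes care of the unlock *)
let force (cell, m) =
  with_lock m (fun () ->
    match !cell with
    | Value x -> x
    | Exn e -> raise e
    | Delayed f ->
      (try
         let r = f () in
         cell := Value r;
         r
       with e ->
         cell := Exn e;
         raise e))
```

With this shape, the lock / unlock pairing lives in exactly one place instead of being repeated in every branch.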

Let's ask again:

Are we done?

Yes, we are done. Now beer time!

*To be continued*