[clean-list] strong root normal form

Marco Kesseler m.wittebrood@mailbox.kun.nl
Wed, 09 Oct 2002 23:36:10 +0200


Hi Erik,

>I am at a point where I want to understand the concept of 'strong root
>normal form'. There are too many programs in which I turn lets into strict
>lets and stumble upon a version that appears to work as I had expected. But
>I don't understand what's happening. 

I may be able to give an alternative informal explanation. If this is 
not entirely accurate, anyone, please correct me. It has been a 
while. That being said, I doubt that you need all this to understand 
how Clean works. See the end of this mail.

>The Clean documentation says:
>
>"A pattern partially matches a graph if firstly the symbol of the root of
>the pattern equals the symbol of the root of the graph and secondly in
>positions where symbols in the pattern are not syntactically equal to
>symbols in the graph, the corresponding sub-graph is a redex or the
>sub-graph itself is partially matching a rule.

A partial match means: we know that the root of an expression matches 
the pattern, and for the rest of the graph we do not know yet whether 
it matches the rest of the pattern, because it has not yet been 
evaluated far enough to decide.

>A graph is in strong root
>normal form if the graph does not partially match any rule.

As long as a graph partially matches a rule, it may still be 
reducible by that rule (it becomes reducible as soon as we are able 
to establish a total match). Conversely, if a graph does not 
partially match ANY rule, there is NO rule that will EVER be able 
to rewrite the root node to anything else.

>It is decidable
>whether or not a graph is in strong root normal form. A graph in strong root
>normal form does not partially match any rule, so it is also in root normal
>form.

The exact differences between strong root normal form and root normal 
form elude me right now. Root normal form basically means that the 
root will never reduce to anything else. This means that the root is 
known to be part of the final result.

In terms of Clean you could see it as follows: if you have a rule

f 1 = Cons 2 (f 2)
f 2 = Nil

then the rule pattern 'f 1' (partially) matches the expression 
'f (0 + 1)', because f matches f, and for the rest of the pattern 
we do not know yet. After reducing (0 + 1) it totally matches 
'f 1', turning it into a redex that can be rewritten to 
'Cons 2 (f 2)'. 'Cons' is not the root of any rule pattern, so it 
is part of the final result and can be delivered.
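
If you want to try this out, a complete little module could look 
something like the sketch below (the List type, the type 
annotations and the Start rule are just my additions to make it 
compile, they were not part of the original example):

module srnf_example

import StdEnv

:: List = Cons Int List | Nil

f :: Int -> List
f 1 = Cons 2 (f 2)
f 2 = Nil

// 'f (0 + 1)' partially matches 'f 1'; after 0 + 1 is reduced it
// becomes a redex and rewrites to Cons 2 (f 2). If I remember
// correctly the run-time system then prints the result graph,
// which should come out as Cons 2 Nil.
Start = f (0 + 1)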

Looking at things this way seems a bit overkill though. I doubt that 
many programmers find comfort in the statement that 'f (0 + 1)' 
partially matches the patterns 'f 1' and 'f 2'.

>The default reduction strategy used in CLEAN is the functional reduction
>strategy. Reducing graphs according to this strategy resembles very much
>the way execution proceeds in other lazy functional languages: in the
>standard lambda calculus semantics the functional strategy corresponds to
>normal order reduction.

In the discussion above, we have seen that graphs can partially match 
a pattern. But before we can apply a rewrite rule, we have to know 
whether there is a rule that matches completely. In order to 
establish this, we may have to reduce some subgraph. This was the 
case with the '0 + 1' expression in the example above.

The point is now: we cannot just start reducing any subgraph we 
like and repeat this until we happen to have a match. If we choose 
the wrong subgraphs, we might end up in an infinite reduction, 
probably reducing graphs that are not even needed for the final 
result.

So, we need a sensible order for choosing which subgraphs to 
reduce. The normal order reduction strategy has been proven to be 
such an order: it will never cause a needless infinite computation.
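
A small made-up illustration (assuming the same module header and 
'import StdEnv' as in the sketch above): normal order reduction 
starts at the root, sees that 'pickFirst' does not need its second 
argument, and simply delivers 42. A strategy that insisted on 
reducing the argument 'loop' first would never terminate.

pickFirst :: a b -> a
pickFirst x y = x

loop :: Int
loop = loop                 // reducing this never terminates

Start = pickFirst 42 loop   // normal order never touches 'loop'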

>On graph rewrite rules the functional strategy
>proceeds as follows: if there are several rewrite rules for a particular
>function, the rules are tried in textual order; patterns are tested from
>left to right; evaluation to strong root normal form of arguments is forced
>when an actual argument is matched against a corresponding non-variable part
>of the pattern. A formal definition of this strategy can be found in (Toyama
>et al., 1991)."

Basically, this strategy means that you only evaluate a graph as 
far as is needed to establish whether it completely matches a 
pattern. And as you cannot write down infinite patterns, this 
matching process cannot CAUSE an infinite computation either.
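
For example, with made-up rules like these (again assuming the 
header and import from the first sketch), matching proceeds exactly 
as the quoted text describes:

g :: Int Int -> Int
g 0 y = y
g x 0 = x
g x y = x + y

// To evaluate 'g (1 - 1) (2 - 2)' the first rule is tried first.
// Its pattern 0 is not a variable, so the first argument is forced
// to strong root normal form (here: 0) and the first rule matches.
// The match itself never touches the second argument.
Start = g (1 - 1) (2 - 2)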

Others may call it common sense, but note that strict programming 
languages do not use this strategy. They always evaluate function 
arguments before applying the function itself, even when those 
arguments are not needed to establish the result.
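
In Clean you can see the difference directly. In the sketch below 
(my own example; 'abort' comes from StdMisc, which StdEnv imports), 
the second argument is only matched against a variable, so it is 
never evaluated and the program simply delivers 1. The equivalent 
program in a strict language would abort.

constFirst :: a b -> a
constFirst x y = x

Start = constFirst 1 (abort "this argument is never evaluated")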

But wait: what if the functional strategy reduces a subgraph that 
leads to an infinite computation while matching a pattern? Well, in 
that case the function would have been undefined anyway. Programmer 
mistake.
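
For instance (same header and import assumed): matching 'loop' 
against the non-variable pattern 1 forces its evaluation, so this 
program never terminates, but then 'f loop' had no defined value to 
begin with.

loop :: Int
loop = loop

f :: Int -> Int
f 1 = 1

Start = f loop   // forcing 'loop' to match the pattern 1 loops forever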

>Well, to be honest, I understand not a word of this. Does anybody have some
>examples of expressions which are or are not in SRNF and why?

I hope this clears things up somewhat. If not, you may find comfort 
in the fact that I never think about these issues while programming 
in Clean. I just remember things like:

- Clean will normally evaluate its results top-down. First the root 
constructor, then its parts. Same for each substructure. As soon as 
an expression delivers its root constructor (Cons or Nil in the case of 
lists), it is in root normal form.

- Without strictness annotations Clean will never evaluate more than 
is necessary to establish the part of the result that has to be 
delivered next.

- Derived strictness information may change the evaluation order, but 
this does not change the previous remark. Strictness will only be 
derived for expressions that are needed to compute the root result.

- Explicit strictness annotations will change the evaluation order 
relative to other evaluations. What happens exactly depends a bit 
on the kind of strictness annotation (and I do not know the latest 
details); see the small sketch after this list. I try to avoid them 
as much as possible (same with uniqueness, by the way), but 
sometimes a program is just too inefficient without them.
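
To make that last point a bit more concrete, here is roughly what 
such annotations look like (Clean 2.x syntax, from memory, so do 
check the manual): a '!' in a type forces the corresponding 
argument before the call, and '#!' is the strict variant of the '#' 
let-before notation.

// without the '!' the accumulator would build up a chain of
// suspended additions instead of a plain integer
sumAcc :: !Int [Int] -> Int
sumAcc acc []       = acc
sumAcc acc [x : xs] = sumAcc (acc + x) xs

square2 :: Int -> Int
square2 x
    #! y = x * x     // y is evaluated here, before it is used below
    = y + y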

Now the main question becomes: what is it that you need strictness 
for exactly? Is it needed for efficient execution, to reduce memory 
requirements, or for other reasons? Can you give an example of a 
surprising effect?

regards,
Marco