NN code, further results

Richard A. O'Keefe ok@atlas.otago.ac.nz
Tue, 16 Mar 1999 14:27:25 +1300 (NZDT)


I decided to try another array data structure, called an Urst
(UnRolled Strict Tree).  I also decided it was time to try the
Clean stuff on a PowerMac.

RESULTS for SPARCstation 5 (84MHz, V8 SPARC)

    Compiler       List    Urst    Ursl   Array
    GHC 3.02       5 sec   4 sec   3 sec  25 sec
    Clean 1.3.1    5 sec   5 sec   4 sec   0.8 sec
    C                                      0.2 sec

Ursts require body recursion, and the size I chose unrolled by 3
instead of the Ursl code's 4, so the relative times in GHC were
not a surprise.
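
As a rough illustration of the difference (not the benchmarked code, which
this post does not show), here is a Haskell sketch of what an unrolled strict
list and an unrolled strict tree might look like.  The constructor names, the
in-order layout of the tree, and the helper functions `ursl` and `urst` are
all assumptions made up for this example.  It shows why a fold over the list
can be a pure tail-recursive loop, while a fold over the tree needs a
non-tail ("body") call for one of the subtrees.

```haskell
-- Hypothetical shapes; the real definitions are not given in the post.

-- Unrolled strict list: a full cell holds four strict elements.
data Ursl a
  = LEnd
  | L1 !a
  | L2 !a !a
  | L3 !a !a !a
  | L4 !a !a !a !a !(Ursl a)

-- Unrolled strict tree: a full node holds three strict elements
-- between two strict subtrees.
data Urst a
  = TEnd
  | T1 !a
  | T2 !a !a
  | T3 !(Urst a) !a !a !a !(Urst a)

-- Left fold over the list: the only recursive call is a tail call.
foldUrsl :: (b -> a -> b) -> b -> Ursl a -> b
foldUrsl f = go
  where
    go acc LEnd              = acc
    go acc (L1 a)            = f acc a
    go acc (L2 a b)          = f (f acc a) b
    go acc (L3 a b c)        = f (f (f acc a) b) c
    go acc (L4 a b c d rest) =
      let acc' = f (f (f (f acc a) b) c) d
      in acc' `seq` go acc' rest

-- In-order left fold over the tree: the call on the left subtree is
-- not a tail call, so the traversal needs body recursion.
foldUrst :: (b -> a -> b) -> b -> Urst a -> b
foldUrst f = go
  where
    go acc TEnd           = acc
    go acc (T1 a)         = f acc a
    go acc (T2 a b)       = f (f acc a) b
    go acc (T3 l a b c r) =
      let acc' = f (f (f (go acc l) a) b) c
      in acc' `seq` go acc' r

-- Build an unrolled list from an ordinary list (illustrative helper).
ursl :: [a] -> Ursl a
ursl (a:b:c:d:rest) = L4 a b c d (ursl rest)
ursl [a,b,c]        = L3 a b c
ursl [a,b]          = L2 a b
ursl [a]            = L1 a
ursl []             = LEnd

-- Build a roughly balanced unrolled tree, keeping list order in-order
-- (illustrative helper).
urst :: [a] -> Urst a
urst xs = case xs of
  []    -> TEnd
  [a]   -> T1 a
  [a,b] -> T2 a b
  _     -> let (ls, a:b:c:rs) = splitAt ((length xs - 3) `div` 2) xs
           in T3 (urst ls) a b c (urst rs)
```

Loaded into GHCi, `foldUrsl (+) 0 (ursl [1..10])` and
`foldUrst (+) 0 (urst [1..10])` should both give 55.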

RESULTS for PowerMac 7600/120 (120MHz, PowerPC 604)

    Compiler      List    Urst    Urst-F  Urst-M  Array
    Clean 1.3     2.54    2.76    2.02    2.31    0.38
    Metrowerks 2.1                                0.114 (C)

The Metrowerks C compiler had its optimisation settings as high as
they would go; that includes automatic inlining and instruction
scheduling.  It's not clear to me whether it includes unrolling, but
it might.  The Clean/C ratio here was about 3.3 (0.38 / 0.114).

The really interesting thing is what happened when I used macros.
What I did was to take code for mapping and folding like

    mapx :: (a -> b) (X a) -> (X b)
    mapx f xs = loop xs
     where loop ... = ... 

and turn it into

    mapx :: (a -> b) (X a) -> (X b)
    mapx f xs :== loop xs
     where loop ... = ... 

This crashed the SPARC-Solaris version of Clean 1.3.1, and it made
the PowerMac version of Clean 1.3 give me nonsensical error messages
about trying to redefine Int (in definitions which didn't involve
Int anywhere at all).  Fortunately, the PowerMac version *did* manage
to tell me which lines it was unhappy about, and I suddenly
remembered that you aren't allowed to give type specifications for
macros in Clean.  That's an important semantic difference between
functions and macros.  I suggest that detecting this situation and
reporting it more clearly is an important bug fix for the next release,
because it is such an amazingly easy trap to fall into.

So I commented out the declarations, getting

    //mapx :: (a -> b) (X a) -> (X b)
    mapx f xs :== loop xs
     where loop ... = ... 

and was able to compile the program.

This was the big surprise.  Turning these mapping and folding functions
into macros made the code *SLOWER*.  Clean reports (computation time +
garbage collection time + transput time = total time) on a PowerMac,
and I reported the sum of computation time and gc time above.  Here are
the results in a little more detail just for Ursls:
                 Work  GC    IO
    Ursl-Macro   1.88  0.43  0.03
    Ursl-Plain   1.71  0.31  0.03
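
The arithmetic behind these figures can be checked with a few lines of
Haskell.  The numbers are copied from the table above; the names `macro`,
`plain`, `total` and `pctOver` are just illustrative.

```haskell
-- (computation time, gc time) in seconds, copied from the table above.
macro, plain :: (Double, Double)
macro = (1.88, 0.43)
plain = (1.71, 0.31)

-- Computation time plus gc time, as reported in the PowerMac table.
total :: (Double, Double) -> Double
total (work, gc) = work + gc

-- Percentage by which x exceeds y.
pctOver :: Double -> Double -> Double
pctOver x y = 100 * (x - y) / y
```

The totals come to 2.31 and 2.02 seconds, matching the corresponding entries
in the PowerMac table.  By these numbers the macro version is roughly 14%
slower overall: about 10% more computation time and about 39% more garbage
collection time.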

At the very least, this tells us that turning mapping functions into 
macros is no substitute for automatic inlining by the compiler.

The source code for this test is only 233 lines; I'm tempted to post it.