NN code, further results
Richard A. O'Keefe
ok@atlas.otago.ac.nz
Tue, 16 Mar 1999 14:27:25 +1300 (NZDT)
I decided to try another array data structure, called an Urst
(UnRolled Strict Tree). I also decided it was time to try the
Clean stuff on a PowerMac.

RESULTS for SPARCstation 5 (84MHz, V8 SPARC)

    Compiler      List    Urst    Ursl    Array
    GHC 3.02      5 sec   4 sec   3 sec   25 sec
    Clean 1.3.1   5 sec   5 sec   4 sec   0.8 sec
    C                                     0.2 sec

Ursts require body recursion, and the size I chose unrolls by 3
rather than the 4 used in the Ursl code, so the relative times in
GHC were not a surprise.
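A minimal sketch of what an unrolled strict list might look like, assuming "Ursl" stands for an unrolled strict list with four elements per cell. The definitions used in the benchmark are not shown in this post, so every name here (Ursl, fromList, toList, mapU, foldlU) is hypothetical, and the sketch is written in Haskell rather than Clean.

```haskell
-- Hypothetical unrolled strict list: a full cell holds four strict
-- elements; the smaller constructors hold whatever is left at the end.
data Ursl a
  = Nil
  | One   !a
  | Two   !a !a
  | Three !a !a !a
  | Four  !a !a !a !a !(Ursl a)

-- Pack an ordinary list into cells of four.
fromList :: [a] -> Ursl a
fromList (a:b:c:d:rest) = Four a b c d (fromList rest)
fromList [a, b, c]      = Three a b c
fromList [a, b]         = Two a b
fromList [a]            = One a
fromList []             = Nil

-- Unpack back into an ordinary list.
toList :: Ursl a -> [a]
toList Nil                 = []
toList (One a)             = [a]
toList (Two a b)           = [a, b]
toList (Three a b c)       = [a, b, c]
toList (Four a b c d rest) = a : b : c : d : toList rest

-- Map: one recursive call handles four elements.
mapU :: (a -> b) -> Ursl a -> Ursl b
mapU f = loop
  where
    loop Nil                 = Nil
    loop (One a)             = One (f a)
    loop (Two a b)           = Two (f a) (f b)
    loop (Three a b c)       = Three (f a) (f b) (f c)
    loop (Four a b c d rest) = Four (f a) (f b) (f c) (f d) (loop rest)

-- Strict left fold, again four elements per step.
foldlU :: (b -> a -> b) -> b -> Ursl a -> b
foldlU f = loop
  where
    loop acc Nil                 = acc
    loop acc (One a)             = f acc a
    loop acc (Two a b)           = f (f acc a) b
    loop acc (Three a b c)       = f (f (f acc a) b) c
    loop acc (Four a b c d rest) =
      let acc' = f (f (f (f acc a) b) c) d
      in acc' `seq` loop acc' rest
```

Loaded into GHCi, `toList (mapU (* 2) (fromList [1 .. 10]))` should agree with `map (* 2) [1 .. 10]`. The point of the layout is simply that each pattern match and each recursive call is shared among four elements instead of one.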

RESULTS for PowerMac 7600/120 (120MHz, PowerPC 604)

    Compiler         List    Urst    Ursl-F   Ursl-M   Array
    Clean 1.3        2.54    2.76    2.02     2.31     0.38
    Metrowerks 2.1                                     0.114 (C)

The Metrowerks C compiler had its optimisation settings as high as
they would go; that includes automatic inlining and instruction
scheduling. It's not clear to me whether it includes unrolling, but
it might. The Clean/C ratio here was about 3.3.
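As a quick check of that ratio, here is the arithmetic as a small Haskell sketch. It assumes the C figures are the array-based baseline on each machine; the names `sparc`, `powerMac` and `cleanOverC` are made up for the illustration.

```haskell
-- (Clean array time, C time) taken from the two results tables.
-- The units cancel in the ratio.
sparc, powerMac :: (Double, Double)
sparc    = (0.8, 0.2)      -- Clean 1.3.1 vs C on the SPARCstation 5
powerMac = (0.38, 0.114)   -- Clean 1.3 vs Metrowerks C on the PowerMac

-- How many times longer the Clean array code takes than the C code.
cleanOverC :: (Double, Double) -> Double
cleanOverC (clean, c) = clean / c
```

`cleanOverC powerMac` comes out at roughly 3.33, which is the "about 3.3" quoted above, and `cleanOverC sparc` gives a ratio of 4 for the SPARC.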

The really interesting thing is what happened when I used macros.
What I did was to take code for mapping and folding like

    mapx :: (a -> b) (X a) -> (X b)
    mapx f xs = loop xs
    where loop ... = ...

and turn it into

    mapx :: (a -> b) (X a) -> (X b)
    mapx f xs :== loop xs
    where loop ... = ...

This crashed the SPARC-Solaris version of Clean 1.3.1, and it made
the PowerMac version of Clean 1.3 give me nonsensical error messages
about trying to redefine Int (in definitions which didn't involve
Int anywhere at all). Fortunately, the PowerMac version *did* manage
to tell me which lines it was unhappy about, and I suddenly
remembered that you aren't allowed to give type specifications for
macros in Clean. That's an important semantic difference between
functions and macros. I suggest that detecting this situation and
reporting it more clearly is an important bug fix for the next release,
because it is such an amazingly easy trap to fall into.

So I commented out the declarations, getting

    //mapx :: (a -> b) (X a) -> (X b)
    mapx f xs :== loop xs
    where loop ... = ...

and was able to compile the program.

This was the big surprise. Turning these mapping and folding functions
into macros made the code *SLOWER*. Clean reports (computation time +
garbage collection time + transput time = total time) on a PowerMac,
and I reported the sum of computation time and gc time above. Here are
the results in a little more detail just for Ursls:

                  Work    GC      IO
    Ursl-Macro    1.88    0.43    0.03
    Ursl-Plain    1.71    0.31    0.03
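To make the comparison concrete, the breakdown can be recombined as below. This is only a sketch in Haskell with made-up names (`Timing`, `reported`, `slowdown`); the numbers are copied from the breakdown, and the sums match the 2.31 and 2.02 entries in the PowerMac results table.

```haskell
-- One row of the breakdown, in the units Clean reports.
data Timing = Timing
  { work :: Double   -- computation time
  , gc   :: Double   -- garbage collection time
  , io   :: Double   -- transput (I/O) time
  }

urslMacro, urslPlain :: Timing
urslMacro = Timing { work = 1.88, gc = 0.43, io = 0.03 }
urslPlain = Timing { work = 1.71, gc = 0.31, io = 0.03 }

-- The figure used in the results table: work plus GC, leaving out I/O.
reported :: Timing -> Double
reported t = work t + gc t

-- Macro time relative to plain-function time (above 1 means slower).
slowdown :: Double
slowdown = reported urslMacro / reported urslPlain
```

`slowdown` works out to roughly 1.14, so the macro version took about 14% longer overall. Looking at the two columns separately, computation time rose by roughly 10% and garbage collection time by roughly 39%.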

At the very least, this tells us that turning mapping functions into
macros is no substitute for automatic inlining by the compiler.

The source code for this test is only 233 lines; I'm tempted to post it.