Operations on arrays

Fri, 5 Mar 1999 15:29:10 +1300 (NZDT)

I wrote:

>	I suppose I can do
>	
>	   vappend xs ys = createArray (nxs+nys) (\i ->
>	                      if (i >= nxs) ys.[i-nxs] xs.[i])
>	                   where nxs = size xs; nys = size ys

WHOOPS!  My mistake.
I hadn't remembered the interface of createArray correctly,
and for a simple reason:  I still can't _believe_ it.

Here's exactly what I want to do:  I want to have a vector just
like an existing one, except that a new "border" element has been
pushed on the front.  With lists, it's easy:

    push_border :: !Real ![!Real] -> [Real]
    push_border x xs = [x:xs]

With arrays, it doesn't have to be _easy_ because this is not a terribly
array-like thing to do, but it doesn't have to be _hard_ either.  The
obvious thing is

    push_border :: !Real !{#Real} -> *{#Real}
    push_border x xs = {x} +++ xs

but that doesn't work.  The next thing to try is

    push_border x xs = newArray (1+size xs) (\i -> if (i==0) x xs.[i-1])

but _that_ doesn't work either because there is no function

    newArray :: Int (Int -> a) -> {a}

and I can't think why not.  The simplest I've been able to come up with
is

    push_border x xs = {createArray (1+size xs) x &
                        [i] = xs.[i-1} \\ i <- [1..size xs]}

which is not pretty.  I've even wondered about

    push_border x xs = {y \\ y <- [x:[z \\ z <-: xs]]}

which at least I'm confident will be linear time.

As for making a random matrix, what I _really_ want is a mkArray with a
threaded state:

    mkArray :: (Int s -> (a,s)) Int s -> ({a},s)
    mkArray f n s0 = ({x1, x2, ..., xn}, sn)
    where (x1,s1) = f s0
          (x2,s2) = f s1
          ...
          (xn,sn) = f sn_1

but I've been reduced to making a list of lists and then converting that
to an array of arrays.  Of course I can program this function myself:

    mkArray :: (!Int !*s -> *(u:a,*s)) !Int !*s -> *(*{u:a},*s)
    mkArray f n s0 = ({x \\ x <- xs}, s`)
      where (xs, s`) = loop 0 s0
            loop :: !Int !*s -> *(*[u:a],*s)
            loop i s = if (i == n) s ([x`:xs],s``)
                       where (x`,s`) = f i s
                             (xs,s``) = loop (i+1) s`

assuming I've got it right (warning: NOT tested), but making this work
with the several different kinds of array in Clean is beyond my current
abilities.

Oh yes, here's the good news.  I've rewritten the NN program to use

:: Number :== !Real
:: Vector :== !{#Real}
:: Matrix :== !{Vector}

instead of

:: Number :== !Real
:: Vector :== !{Number}
:: Matrix :== !{Vector}

and the program now takes 452 seconds, down from 2248 last time.  So
that's nearly a factor of five going from lists of lists of doubles
to arrays of arrays of unboxed doubles; I was expecting a factor of 4.

This is now only a factor of 4.45 slower than optimised C code on the
same machine.  Loop unrolling and scheduling (which the Clean compiler,
we are told, does not do) accounts for about a factor of 2 in that.

Since I haven't got many * annotations on the vectors, storage management
probably accounts for the rest.  I strongly suspect that someone skilled
in the placement of these things could get the Clean version within a
factor of 3 of optimised C.  Well done!

There's an interesting lesson here.
I started with straightforward code where I had concentrated on clarity.
The kind of code that one hopes one's students will learn to write.
I've ended up with something that is A FACTOR OF TWENTY-THREE TIMES
FASTER by
 - adding ! and * annotations everywhere I was sure I could
 - stuffing everything into one file to work around the fact that
   there are "not all strictness information exported" warnings
   despite everything I could think of
 - eliminating higher-order functions completely (factor of 4)
 - switching from lists of numbers to unboxed arrays (factor of 5)

This is not good news.  What it really means in practice is that
clear straightforward code can _easily_ be a factor of twenty or
better _slower_ than hacked code.  It's not the absolute numbers that
matter so much; given that Clean is not SISAL or NESL, it's quite
reasonable that it hasn't been tuned for matrix crunching code.  What
_is_ a bit of a worry is the size of the gap.

,