[clean-list] Fastest shuffle with Clean?

Fabien Todescato f.todescato@larisys.fr
Wed, 12 Sep 2001 09:41:14 +0200


John wrote

>Greetings all,
>I am new to FP and Clean, and I need some help here. A couple of questions:

Hi John,

> My first question is how to implement the fastest shuffle possible in
Clean.

>shuffleArray :: *{#Int} !Int -> *{#Int}
>shuffleArray ar top
>    | top < 0   = ar 
>    | otherwise
>        #! r = randomInt mod (top+1)
>        #! rnd_elem = ar.[r]
>        #! top_elem = ar.[top]
>        = shuffleArray {ar & [r]=top_elem, [top]=rnd_elem} (top-1)


>Is this the most efficient (read: fastest) way to shuffle an array in
Clean?

I can't make out what you mean, you should maybe state clearly what are your
requirements about the suffling you're talking about.

> Does this return the *same* array (now shuffled), or does it return a new
array consisting of the shuffled elements?

The '*' in the type *{#Int} means that the array being manipulated is never
duplicated during the course of the computation - it is 'unique' - , so that
it can be efficiently updated in-place with a safe semantics. Also, note
that the update operation provided on arrays are efficiently implemented by
means of constant-time destructive updates, and therefore require the arrays
to be unique. The compiler would have rejected the use of array updates on a
non-unique array.

Now, I am not sure the compiler does agree with your way of swapping the
elements of the array. When you write :

>        #! rnd_elem = ar.[r]
>        #! top_elem = ar.[top]

The array ar is no more single-threaded, there is a duplicated reference to
it. The following swapping function does however perform the swapping of two
array items while keeping the array single-threaded :

  swap a ( i , j )
    # ( x , a ) = uselect a i
    # ( y , a ) = uselect a j
    # a         = update a i y
    # a         = update a j x
    = a  

Admitedlly, uniqueness typing is somehow confusing at the beginning. You
should have a look at the relevant section in the 'Functional Programmiong
with Clean' book available on the Clean Home Pages.

> The Clean version of the MersenneTwister returns an infinite list instead
of a new random number at each call.
> Given that billions of calls are made to the generator, wouldn't this
result in overflow ?

Yes, but Clean list are 'lazy' lists. If at any given time you need only a
finite part of the list, however infinite, all should go well. Assuming
infiniteRandomNumberList is the function that generates your infinite random
number list, you may write your swapping code more or less as follows,
assuming also that 'top' is the size of your array.

shuffleArray ar top
  = foldl swap ar indexPairList
where
  infiniteRandomNumberList // Define your infinite random number list
LOCALLY...
    = ... // ...using your Mersenne Twister generator there.
  indexPairList
    = zip2 [ 0.. ] ( map ( \ n -> n mod ( top + 1 ) ) ( take top
infiniteRandomNumberList ) )

You see that the infiniteRandomNumberList is manipulated as is. You should
take care, however, not to define the infiniteRandomNumberList as a global
variable, for it would then be kept in memory while the whole computation is
running, creating an unbearable space leak.

Also, remember that functional programs written in Clean do exhibit the
so-called 'referential transparency' property, which means that the value to
which an identifier gets bound to cannot be mutated during the course of
execution of the program. So you cannot expect a 'randomInt' function to
generate a different number at each call. Instead, however, you may have a
randomIntList function returning an infinite series of random numbers.

>Any guidance would be greatly appreciated!

Hope it helps,

Best regards, Fabien