[clean-list] 2 dimensional FFT code
    John van Groningen 
    johnvg@cs.kun.nl
    Thu, 8 Nov 2001 14:51:17 +0100
    
    
  
Siegfried Gonzi wrote:
>...
>The Clean version could be 30% or even more faster, but due to the insane array types this will not happen. After the first transform I have to transpose the array (in order to access the columns effectively):
>
> TransposeFFT:: !Int !*{#*{#Real}} -> !*{#*{#Real}}
>
>For transposing I have to pass (or better: Clean expects) at least the innermost dimension as strict. BUT the first transform (before transposing in place):
>
> fft_rows:: !Int !Int !Int !*{#{#Real}} !Real  -> !*{#*{#Real}}
>
>delivers the array ONLY in that format: !*{#*{#Real}}
>
>when I use as exit strategy (in fft_rows):
>
>  | i == ( sizeA ) = { { el \\ el <-: array.[ij] } \\ ij <- [0..(dec sizeA )] }
>
>BUT I WANT to use:
>
>| i == (sizeA ) = array
>
>...
>
>I tried to work-around the problem, but IT IS IMPOSSIBLE.
It is possible by using the replace function. Instead of
  #! row = array.[i]
in  fft_rows use:
  # (row,array) = replace array i {}
and change the type:
fft_rows:: !Int !Int !Int !*{#*{#Real}} !Real  -> !*{#*{#Real}}
If you want to improve the performance:
- If you are using the copying garbage collector: use the marking garbage collector instead.
- Use shifts instead of divisions. Replace 
    m1 = ( LOOP2 ( n/2 ) j )
  by:
    m1 = ( LOOP2 ( n>>1 ) j )
  and
      = LOOP2 ( m/2 ) ( j-m )
  by:
      = LOOP2 ( m>>1 ) ( j-m )
  or even better, don't recompute the bit reversed value each time, but
  increment the previous bit reversed value, for example with:
BitRev n marray
	= BitShuffling 0 0 ((log2 n)-1) marray
where
  log2 1 =  0;
  log2 n =  inc (log2 (n >> 1));
  BitShuffling :: !Int !Int !Int !*{#Real} -> !*{#Real}
  BitShuffling i j log_size marray
   | i>=( n-1 ) = marray
   # m1=inc_reversed j log_size;
   | j>i
     #! xj = marray.[j]
     #! xi = marray.[i]
     #! yi = marray.[i+1]
     #! yj = marray.[j+1]
     = BitShuffling ( i+2 ) m1 log_size { marray & [i] = xj, [j] = xi, [j+1] = yi, [i+1] = yj }
     = BitShuffling ( i+2 ) m1 log_size marray
	where
		inc_reversed i m
			| i bitand m==0
				= i bitor m
				= inc_reversed (i bitxor m) (m>>1)
- LOOP3 is not strict in mmax and istep, add these as arguments and add !Int's in the type of LOOP3.
After these modifications the program executes in 2.65 seconds on my 450
Mhz G4.
Regards,
John van Groningen