[clean-list] Re: Matrix timings

shiv@mac.com shiv@mac.com
Tue, 30 Oct 2001 13:50:30 -0800


S. Gonzi wrote:

[Only to be sure:

A X = Y

A is a 300x300 array, and Y and the solution array X are also 300x300
arrays]

In this case the C code (written on top of the Boehm garbage collector)
takes 2.25 seconds on a PowerBook G4 (400MHz), while Scilab with ATLAS
BLAS takes 1.14 seconds. At this stage the factor of 2 could be because
my C code uses QR factorization (which costs about twice as many
floating-point operations as LU with partial pivoting), in which case
the ATLAS BLAS is contributing little here (or my Scilab is not actually
linking to ATLAS BLAS, as I thought it was!).
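
As a rough sanity check on that factor of 2, here is a small Python
sketch (not the language of the timed code) using the standard
leading-order operation counts for factoring an n x n matrix: about
2n^3/3 for LU with partial pivoting and about 4n^3/3 for Householder QR.
The function names are made up for illustration, and the triangular
solves for the 300 right-hand sides are deliberately left out, so this
only compares the factorization step.

```python
# Leading-order floating-point operation counts for factoring an
# n x n matrix. Lower-order terms are ignored, and the cost of the
# solves for the right-hand sides is NOT included.

def lu_flops(n):
    """Approximate flops for LU with partial pivoting: 2n^3/3."""
    return 2 * n**3 / 3

def householder_qr_flops(n):
    """Approximate flops for Householder QR: 4n^3/3."""
    return 4 * n**3 / 3

n = 300
ratio = householder_qr_flops(n) / lu_flops(n)
print(f"LU: {lu_flops(n):.3g} flops")
print(f"QR: {householder_qr_flops(n):.3g} flops")
print(f"QR/LU factorization cost ratio: {ratio:.2f}")
```

The measured ratio 2.25/1.14 is close to this, which is consistent with
(but does not prove) the algorithmic explanation.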

I don't have the Clean timings for the linear system solution handy,
but here are the timings for matrix-matrix multiplication instead:

Clean's best timing for 300x300 was 1.98 seconds vs. 0.16 seconds for
Scilab (so maybe that ATLAS BLAS got linked in after all). The C code
(on top of the Boehm gc) took 0.85 seconds. So the Clean code was about
2.3 times slower than C compiled with "gcc -O3 -funroll-all-loops" in
this case (this measurement was not quite fair to C; a factor of 2.5 is
more likely).
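
To put the three multiplication timings on one scale, the following
Python sketch converts them to approximate Mflop/s and slowdown ratios.
It assumes a plain n x n matrix product costs about 2n^3 floating-point
operations (n^3 multiplies plus roughly n^3 adds); the dictionary keys
and helper names are only illustrative labels.

```python
# Timings (seconds) for the 300x300 matrix-matrix multiplication,
# taken from the figures quoted above.
timings = {
    "Clean": 1.98,
    "C (Boehm GC)": 0.85,
    "Scilab": 0.16,
}

n = 300
flops = 2 * n**3  # assumed operation count for a dense n x n product

def mflops(seconds):
    """Approximate achieved rate in Mflop/s for one multiplication."""
    return flops / seconds / 1e6

def slowdown(slow, fast):
    """How many times slower `slow` is than `fast`."""
    return timings[slow] / timings[fast]

for name, seconds in timings.items():
    print(f"{name:14s} {seconds:5.2f} s  {mflops(seconds):7.1f} Mflop/s")

print(f"Clean vs C:      {slowdown('Clean', 'C (Boehm GC)'):.2f}x")
print(f"Clean vs Scilab: {slowdown('Clean', 'Scilab'):.2f}x")
print(f"C vs Scilab:     {slowdown('C (Boehm GC)', 'Scilab'):.2f}x")
```

The rates are only as good as the 2n^3 assumption and the single timings
quoted here, so treat them as order-of-magnitude figures.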

When I get a chance I will time S.Gonzi's code too.

--shiv--

PS: I believe that LAPACK is better tuned for a small number of
right-hand sides, but I cannot vouch for that right now.