[Re: Open Clean Source]

Ana Maria Abrao ana@ufu.br
Wed, 03 Feb 1999 15:10:00 -0200


John van Groningen wrote:
> Yesterday I downloaded SML (version 110) for windows to see how fast it is. It was slower than I expected. 
> To find out why, I disassembled some code generated by the compiler. I think I can now explain why it is 
> so much slower than C:

Dear John.
Since we have a strong interest in a faster SML/NJ, we contacted
Lucent and told them about your mail on the Clean list.
We received feedback from many members of the SML/NJ team.
Dave MacQueen (the head of the SML/NJ team, I guess) has posted your
observations to their mailing list. The reaction:

(1) The idea of using 'jo' instead of 'into' is very good indeed
and will be implemented soon. They say that they already use a
similar idea on the PPC...

(2) They already have a prototype that generates code identical
to the code you suggested as a replacement for the 'stosd' instructions.

(3) Consider the code below:
> 
>         call    next_instruction
> next_instruction:
>         pop     eax
>         add     eax,continuation_address-next_instruction
> 
> and stores to address in the heap with:
> 
>         stosd

The SML/NJ team is going to replace this kind of code with a base
pointer that is dynamically maintained on entry to the function.


(4) Lal George is in charge of MLRISC, so he cannot do anything
about the use of heap closures instead of stack closures.
However, Andrew Appel is studying cache behavior on different
processors. Possibly, his studies will improve the use of the
cache for heap allocation.

Items (2) and (3) produced code that is at least 50% faster
than before. In some cases (Mandelbrot), it ran 5 times faster.
Item (1) has not been implemented yet, but Lal George thinks that
it will improve integer processing.
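
For readers who have not seen the original thread, item (1) is about
how integer overflow gets detected: 'into' traps when the overflow
flag is set, while 'jo' is an ordinary conditional jump on the same
flag. The C sketch below only illustrates the branch-based style.
The name checked_add is hypothetical, and this is not code that
SML/NJ generates; C has no portable access to the overflow flag, so
the operands are tested before the addition instead.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper, not SML/NJ code: adds two ints and reports
 * overflow through the return value instead of trapping. */
static bool checked_add(int a, int b, int *sum)
{
    /* Test the operands first, because signed overflow is undefined
     * behavior in C. A native code generator would instead perform
     * the add and then branch on the overflow flag (the 'jo' style). */
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b)) {
        return false;   /* the rarely taken overflow branch */
    }
    *sum = a + b;
    return true;
}
```

A caller would check the result and jump to its own overflow
handler when checked_add returns false, which is roughly the role
the 'jo' target plays in generated code.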

Thank you for helping the SML community with your
valuable suggestions.

Eduardo Costa
Alex