[clean-list] Re:Clean Wish List: My old major wish...
Marco Kesseler
m.wittebrood@mailbox.kun.nl
Wed, 16 Oct 2002 22:59:15 +0200
Erik wrote:
>Valery wrote:
>> However, I think there should be clear understanding that both Pentium
>> and PCs based on it (as an excellent representative of nowadays
>> architectures) are far from being a natural solution concerning
>> hardware architecture for the graph rewriting.
>> ...especially parallel one.
>
>Functional programming developments once concentrated also on devising
>totally new hardware configurations. That stopped when important
>breakthroughs took place in compiler design, was it in the 1980's? They
>opened up the perspective of acceptable performance for functional language
>programs on ordinary hardware.
It is true that somewhere towards the end of the 1980's, research
groups started working on serious compilers for functional languages
for common processor architectures. It turned out that research
groups building specialised hardware could not compete with large
companies like Intel, building ever faster general-purpose
processors. It was kind of a pragmatic decision.
This does not mean that it is technically impossible to build
hardware that would perform faster graph rewriting than any Pentium
or PowerPC today. The academic research groups simply do not have the
resources.
>What we now need is practical solutions, so they should be based on available
>and affordable hardware. I think the Clean team are doing a good job. When the
>need for implicit parallelism becomes ever more clear, and one bright
>researcher finds a way to implement it, we'll get it. Until then,
>we'll dream on...
I have done a little bit of searching on the Pentium atomic XCHG
(exchange) instruction and found no definite answer on its costs, let
alone its costs relative to graph-rewriting. I have seen game writers
avoid it, and multi-processing groups use it. I have seen remarks
that it does not lock the bus on newer processors for cached memory
locations.
Apart from the costs, it seems rather trivial to lock nodes with this
instruction. So what if it slows down Clean by 10 percent?
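As a rough sketch of what locking a node with an atomic exchange could
look like, here is some C11. This is not Clean's actual runtime: the node
layout and the names node_try_lock, node_unlock and node_reduce are made
up for illustration. On x86, compilers typically translate
atomic_exchange into XCHG with a memory operand.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical graph node: a lock word plus a payload.
   This is an assumption for illustration, not Clean's real node layout. */
typedef struct node {
    atomic_int lock;   /* 0 = free, 1 = claimed by some thread */
    int        value;  /* stand-in for the node contents */
} node;

/* Try to claim a node. atomic_exchange stores 1 and returns the previous
   value in one indivisible step, so only one thread can see 0 come back. */
static bool node_try_lock(node *n)
{
    return atomic_exchange(&n->lock, 1) == 0;
}

/* Release a node that the caller currently holds. */
static void node_unlock(node *n)
{
    int was = atomic_exchange(&n->lock, 0);
    assert(was == 1);  /* unlocking a free node would be a bug */
    (void)was;         /* silence unused warning when NDEBUG is set */
}

/* Toy "rewrite": double the node's value, but only if no other thread
   is already reducing the node. Returns false if the node was busy. */
static bool node_reduce(node *n)
{
    if (!node_try_lock(n))
        return false;
    n->value *= 2;
    node_unlock(n);
    return true;
}
```

A thread that gets false back from node_reduce could wait, retry, or go
and reduce some other node instead. The sketch says nothing about what
the exchange costs on a given processor.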
Valery wrote:
>I'd postpone discussing hardware issue up to the moment when
>we could see (invisible) forking implemented in Clean. Before this
>there's not much sense discussing hardware.
The reality of today is that the Intel platform has become the main
platform for Clean. I do not see how one can see forking implemented
in Clean without discussing the Intel hardware.
Apart from that, I use Macintoshes, so this reality does bug me a
bit. Apparently I do not have {P} and {I} on my PowerPC because of
bad Pentium design.
>What's the problem?
>
>1. Just deliver *every* graph reduction step to a new CPU and/or thread
> and/or process.
>
>2. Then you'll see your performance degrade vastly on classical machines,
> then you could ask for new hardware
You could, but who is going to give this new machine to you? It is
more likely that the rest of the world is going to say "see, these
functional languages will never be efficient enough".
Apart from that, people probably want a machine that can run
different kinds of languages well.
>3. And meanwhile you could make some optimizers which could enable
> delivering not every graph reduction step to another CPU but only needed
> (like those compile time optimizers, trying to use registers instead of
> memory as much as possible)
Or you could start without parallelism and let the programmer
introduce some. This is what {P} and {I} were about.
regards,
Marco