NN code: some progress
John van Groningen
johnvg@cs.kun.nl
Fri, 5 Mar 1999 14:18:03 +0100
>Is that enough to make me use macros? No. If I have to write the code
>in a somewhat contorted way, e.g.
>
> vred2 :: !(Number Number Number -> Number) Number Vector Vector -> Numb=
er
> vred2 f a xs ys =3D
> loop a xs ys
> where loop a [] [] =3D a
> loop a [x:xs] [y:ys] =3D loop (f a x y) xs ys
>
>then I'll do it. If I have in addition to write a pragma
>
> :: pragma :: f inline
> :: pragma :: f macro
>
>or something weird, I'll do that too. But if I have to use a special
>macro syntax in order to get good performance, I might as well use C.
There is no special macro syntax in Clean, except that instead of '=3D' you=
write ':=3D=3D' to indicate that this is a macro definition. There are some=
restrictions: the macro can not be recursive, can have only one alternative=
and may not use pattern matching (before :=3D=3D), and no types may be=
specified for the macro and for local functions defined inside the macro=
definition. The local functions may be recursive, use pattern matching and=
have more alternatives.
Macros have the same semantics as functions.
So all you have to do is write:
vred2 f a xs ys :=3D=3D
loop a xs ys
where loop a [] [] =3D a
loop a [x:xs] [y:ys] =3D loop (f a x y) xs ys
>#define VRED2(nm,fn) \
> double nm(int n, double const *x, double const *y, double a) { \
> while (n-- !=3D 0) a =3D fn(a,*x++,*y++); \
> return a; \
> }
>#define DOTMAC(a,x,y) ((a)+(x)*(y))
>VRED2(vdot, DOTMAC)
>
>The C code is even the same size as the Clean code.
But the macro in C does not have the same syntax and semantics as a function=
definition. A macro definition in Clean does (except for :=3D=3D).
>I'm especially reluctant to use macros since the manual does not describe
>the semantics of macros clearly and explicitly warns that error messages
>may be cryptic if macros are used.
There is no need to, macros have the same semantics as functions in Clean.=
Except of course, when a macro is used in a pattern.
Errors are more cryptic because the line number of the macro is often=
reported, instead of the line number of the macro application. Changing the=
macro back into a function usually helps in this situation. And type errors=
are more difficult to locate.
>I've been telling students that one of the reasons functional languages
>are good is that it's easier for the compiler to make inlining decisions
>and to do inlining than in languages with all sorts of side effects and
>order dependencies. In the future, do I have to tell them "provided the
>compiler writer is Jeff Siskind"? I really want to *honestly* tell
>these students that functional programming is elegant *and* practical.
>If that means switching to Mercury instead of Clean, I'll do that too.
It is easier to determine whether inlining is allowed. I am not sure whether=
it is easier to determine whether inlining is usefull.
>I do appreciate that the Clean team has limited time, limited manpower,
>and limited money, and that there are all _sorts_ of things they need
>to do yesterday. But my C compilers (UNIX, MacOS) make inlining decisions
>without any help ('inline' has been added to C9x at about the time that
>it stopped being useful!), the Eiffel compiler I have makes inlining
>decisions without any help, and the Ada compiler I use just needs a
>pragma (and can probably manage without it). If it only made a few percent
>difference, who'd need it. But a factor of better than four?
This improvement is not simply because of inlining, but because after=
inlining a few other optimizations are possible. These optimizations are=
already implemented:
- After inlining the compiler can perform constant folding of f. Instead of=
adding the free variable f in loop as an argument, the compiler sees that f=
is a constant and replaces all occurences of f by this constant instead.=
This happens during lamba lifting.
- After inlining this function, the strictness analyzer can determine that=
loop is strict in its first argument (if f is strict in its first=
argument). Therefore no closure has to be build for f a x y, but f can be=
called immediately.
- The value of the first argument can be unboxed and passed in a register.
Without those optimizations, inlining would not have improved the=
performance. (except for the first call may be, probably just a few=
instructions less)
To optimize functions like this, we need to implement specialization for=
functions with function arguments, not inlining. Clean 2.0 can probably do=
this, within the same module. With the current compiler, macros have to be=
used for those higer order functions to perform this optimization. The=
foldr and foldl functions have been defined in this way. map not, because=
it is not much faster, and sometimes slower.
John van Groningen