world as value

Adrian Hey ahey@iee.org
Tue, 13 Jul 1999 20:46:01 +0100 (BST)


On Mon 12 Jul, Peter Achten wrote:

> I hope this reply provides more answers than that it raises questions.

Alas not so (in my case at least:-)

I have a few questions (in fact I have many, but I'll only ask a few).

Q1: Is 'Concurrent Clean' concurrent or not?
--------------------------------------------
When I think about concurrency I'm thinking about concurrent IO performing
processes. I think this is what most other people mean by concurrency also.
But the current Clean manual documentation on concurrency annotations actually
appears to be talking about concurrent graph reduction (what I would call
parallelism). In particular, it makes the point that concurrency annotations
_do_not_ introduce non-determinism.

> The scheme is deterministic: given a Clean program and a particular World
> value, one can in principle completely trace and predict the outcome of
> the program.

How so? Does this forbid IO functions which get the current time? If these
are allowed then I presume that programs running on machines with different
speeds might well behave differently, even if they started with the same
'world state' (although since the world state is a thoroughly opaque black
box, I can't think of any way we could ever be sure the two programs did
start with the same world state). This seems like a situation which physical
scientists would deplore. Arguments which cannot (in principle) be resolved
by experiment are usually regarded as futile.

The paper about Concurrent Haskell (which Simon Peyton Jones mentioned)
makes the point that all concurrent languages are non-deterministic (third
page).
So we have 3 contradictory 'facts'..
 1-  Clean is a concurrent programming language.
 2-  All Clean programs are deterministic.
 3-  Every concurrent programming language is non-deterministic.

So who's right? I guess the real problem here is that I am (we are) using
terms and concepts without properly defining them first. E.G. We're not
distinguishing between a Clean (source) program and a compiled one which is
actually executing and interacting with a (non-deterministic?) *World.

Actually, it is possible to implement deterministic concurrent programs for
some systems. I do a lot of DSP work where all events are strictly periodic
and synchronous and the time taken to compute the response to an event is
fixed. But, for systems in general, we don't know when events will occur
or how long it will take to compute the response. So it's usually
necessary to accept non-determinism in order to prevent all IO activity
being blocked by lengthy computation. 


Q2: Concurrency via multiple environments?
------------------------------------------
Consider somethimg like..

doit :: *World -> *World
doit world =
 let (world1,file1)= openafile world
     file2         = writesomedata file1
     world2        = dosomethingelse world1
     world3        = closeafile file2 world2
 in  world3

If my understanding of Clean semantics is correct, then we could say that
because world1 and file1 are independent environments then writesomedata and
dosomethingelse can be performed concurrently (perhaps with additional
concurrency annotations).

I think this could be regarded as a form of deterministic concurrency, but it
seems to suffer from the same problem as the 'symFork' construct (mentioned
in the Concurrent Haskell paper). The problem is that all IO activity after
doit will be blocked (unnecessarily) until both writesomedata and
dosomethingelse have completed.

One problem I can see with this is that the collection of independent
environments is effectively determined by the Clean run time system, and
there is no way I can introduce new (IO) environments of my own using Clean.
This seems a little inflexible to me. The best I could do would be to dust
down my trusty C compiler and hack it. Correct?

Another (minor) problem I can see is that this whole approach only seems
viable if the world can be broken down into a 'tree like' data structure
of independent environments. What if this isn't the case? I suspect a data
structure which reflected real world environment dependencies would be
more 'graph like', with shared environments and cycles. I'm not sure if
this is a real problem or not, but I fear that admitting this possibility
will make the whole problem of deciding what can and can't be done
concurrently much more complex.


Q3: What is a value?
--------------------

> The major point of the World, and any other, environment is that it
> is a _value_. 

But unlike any other value, in that it changes all by itself, not just in
response to what the Clean program does (as Martin Wierich observed).
Are values which do this really values? I suppose there's no law against
calling such things values, if you like. But I could call the time on my watch
a constant if I like. It's different every time I look at it, but I say it's
still a constant (or at least it would be if I didn't keep looking at it). :-)

But I don't think a philosophical debate about what is or isn't a value will
get us very far. In pragmatic terms, it is my humble opinion that the world
(or environment) value is simply a trick to fool the Clean (or Haskell) graph
reducer to invoke side effect causing operations in some specified order.

I have no problem with this. In a lazy functional language we need some way
to fool the graph reducer into performing operations in the order we want them
performed, and world as value is as good as any other way I've seen so far.
(I count IO monads and continuation passing as 'world as value' IO also).

My problem with this is: What if we don't care in what order the operations
are performed? We still have to pick an order and pass the environment
value between operations, thereby making the operations strictly sequential.
For an environment like a file, this is probably exactly what we want, but
is this true for environments in general?


Q4: What does a model prove?
----------------------------
> Now it's not difficult to explain what happens when a functional Clean
> program tries to open a file: the fopen function gets the current World
> value as argument and first evaluates an 'arbitrary' number of other
> programs. This models context switching. Note that 'arbitrary' can indeed
> be easily modeled by an infinite sequence of random numbers, also to be
> placed in the World. Then the file is being opened. This either succeeds or
> fails, depending on the fact whether any other program already closed or
> opened that particular file. The (dummy) file and result boolean are passed
> as result of fopen, as well as the new World value. [In his e-mail d.d.
> 6-15-1999, Fergus Henderson mentions a similar scheme, as well as its 
> antiquity]

There's that word again. I can understand that a (deterministic) model
(a.k.a. simulation?) of a non-deterministic system might well be a useful
way of answering some questions about that systems behaviour, if the
apparent non-determinism arises from (pseudo) random numbers.

But what does the existance of such a model prove? If it's supposed to prove
that the system being modeled really is deterministic (and can therefore be
implemented as a purely functional program), I don't find this argument very
persuasive :-(

I suppose part of the problem is I'm unclear about what 'level' we're talking
about when we say a system is deterministic or not. It could be.....

1- A high level 'non-deterministic' concurrent programming language.
2- A logic gate level 'deterministic' finite state machine.
3- A 'non-deterministic' quantum mechanical electron and semiconductor
   physics level.

Certainly, a deterministic computer program could accurately model option 2,
and could legitimately be regarded as an implementation of the
'non-deterministic' option 1, but in reality, who would want to go to
such lengths? It would be easier to write the program in assembler.

If concurrency (and implied non-determinism) is required then people are
much more likely to chose a concurrent (not purely functional) programming
language, I think.

                        ---------------------

BTW, after reading about Concurrent Haskell I had a look for info about
a language called 'Pict' (mentioned in the Concurrent Haskell paper).
If anybody's interested you can read about it here..

     http://www.cis.upenn.edu/~bcpierce/papers/pict/Html/Pict.html

This is apparently based on the pi-calculus. I don't think I'll trade in
Clean for Pict just yet, but Pict reminded me of an informal graphical
notation I'd been playing with to represent actions and dependencies.
In fact I'm pretty sure my graphical notation could be translated into
Pict without too much difficulty. Alternatively, I think one could define a
data type which represented Pict programs and calculate values of this
type using pure functions. Maybe some additional syntactic constructs which
did for Pict structures what 'do notation' does for monadic IO would help.
(I guess that's what the Pict language really is.)

So we could make a Clean program something like..

  Start :: PictProg
or
  execute :: PictProg -> *World -> *World
  Start :: *World -> *World
  Start = execute pictprog

.. and the Clean runtime environment would contain a PictProg interpreter
and scheduler. Just an idea, but maybe implementing forkIO and MVar would
be simpler.

Regards
-- 
Adrian Hey