TCP library

Richard A. O'Keefe ok@atlas.otago.ac.nz
Mon, 8 Feb 1999 12:35:23 +1300 (NZDT)


	From: Martin Wierich <martinw@cs.kun.nl>

	But StdFile's collection of functions is not enough.
	Sending and receiving data via a network can block the whole
	Clean program (I make a distinction between "Clean" and
	"Concurrent Clean").  Blocking makes File IO and network IO so
	different, and this has implications for a lazy functional
	language.  Imagine a client program, that sends data to a
	server, which will send it back on another channel.  Finally the
	client receives that data again.  When we write
	
	  # channel1               = fwrites "hallo wereld\n" channel1
	    (halloWereld,channel2) = freadline channel2
	  = (halloWereld,channel2,channel1)
	
	the order of evaluation is not specified.  In this example the
	Clean compiler would choose for evaluating freadline before
	fwrites, and the program would block forever, because it's never
	sending anything.

There is a sense in which he is obviously right about the distinction
between "File" and "Network" IO.  However, there is also an important
sense in which he is just as obviously wrong, and I tried to explain that
in my previous message.  He is right that passive local data containers
("files") *may* behave differently from communications endpoints
("network").  The point that I was trying to make, and on which I
believe we are agreed, is that

    Almost[%] all of the operating systems where Clean is used
    allow at least one kind of communications endpoint to
    masquerade as and be accessed as a plain file.
	[%: I listed UNIX, OS/2, and Windows.  I am not sure about MacOS.]

Worse still,
    All the operating systems where Clean is used allow remote file
    systems to be accessed *through* a network *as if* they were
    local, and thus plain file access to something which *is* a plain
    file may incur arbitrary network delays.

Martin Wierich's point about two-way communication is a valid
observation about the trickiness of trying to co-ordinate two separate
end-points in a lazy language; it is a compelling argument for using a
*single* bidirectional channel, with 'receive' rightly seen as a mutating
operator, rather than having separate endpoints.  Putting these
observations together:

1.  Even when you _think_ you are dealing with plain files, you really
    do want to have timeout available (I've had a UNIX box stall for
    10 minutes trying to search a remote directory).

2.  Just because you open it like a file doesn't mean that read
    really _is_ a pure operation; {UNIX,OS/2,Windows} named pipes live
    in the file space and may be treated like files *but* reading from
    them changes them just like truncating a file, so a Clean program
    should not be allowed to open a pipe for shared input.  Even
    without networks, think of reading from /dev/tty in UNIX or COM1:
    in DOS or .BIn in MacOS.

3.  But a program should be allowed to find out what it will be allowed
    to do, so Clean programs need a way to _tell_ whether a name in the
    file system name space stands for a data container or a communications
    end point (such as a named pipe, socket, or serial port).

4.  Interacting with another program (whether remotely through a
    network or locally through message queues or whatever) requires
    destructive reads and writes to be synchronised, and that appears
    to be easiest when a _single_ Clean value is used for both
    directions of communication.

5.  Some remote programs use Java protocols, some remote programs
    use Erlang protocols, some remote programs use CORBA protocols,
    some will use Clean protocols, but some use text.  The transitions
        Clean program writes file, file read by GNU plot
    =>  Clean program writes to named pipe read by co-running local GNU plot
    =>  Clean program writes to socket read by co-running remote GNU plot
    should be reasonably simple and really _should_ not involve changing
    most of the code of the Clean program.
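
To make point 3 concrete, here is a minimal sketch of the kind of query
meant, written in Python on a POSIX system rather than in Clean, since
the facts involved belong to the operating system and not to the language.
The names endpoint_kind and may_share_for_input are hypothetical, invented
for this illustration.  Lumping every character device in with the
communications end points is also an assumption: it is right for a tty or
a serial port, but arguable for something like /dev/null.

```python
import os
import stat


def endpoint_kind(path):
    """Classify a name in the file system name space.

    Returns "data container", "communications end point",
    "directory", or "other".
    """
    mode = os.stat(path).st_mode  # os.stat follows symbolic links
    if stat.S_ISREG(mode):
        # An ordinary file: reading it does not change it.
        return "data container"
    if stat.S_ISFIFO(mode) or stat.S_ISSOCK(mode) or stat.S_ISCHR(mode):
        # Named pipe, UNIX-domain socket, or character device
        # (assumption: character devices are treated as end points).
        return "communications end point"
    if stat.S_ISDIR(mode):
        return "directory"
    return "other"  # block devices and anything else


def may_share_for_input(path):
    # Hypothetical policy following point 2: only a plain data
    # container may be opened for shared (non-destructive) input.
    return endpoint_kind(path) == "data container"
```

A program could call may_share_for_input before deciding how to open a
name, so that a named pipe sitting where a plain file was expected is
caught up front instead of being silently drained by the first reader.
Note that this says nothing about point 1: a name that classifies as a
data container may still live on a remote file system.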