[clean-list] UNIX (or POSIX) I/O

Maks Verver m.verver@student.utwente.nl
Fri, 21 May 2004 18:23:52 +0200


Hi again,

In my test case, the memory leak went away when I did not return the *World
object to the Start rule. For example, the following code works correctly
(and returns the number of lines in a text file):

------ 8< ---------------

import StdEnv

Start world 
# (success, file, world) = fopen "D:\\Temp\\Track.sql" FReadText world
| success = snd (process file world)

where
	process file world
	# (file, state) = readlines file 0
	= ( fclose file world, state )

	readlines file state
	# (eof, file) = fend file
	| eof = ( file, state )
	# (_, file) = freadline file
	= readlines file (1+state)

---------------- >8 -----

Now, if I remove the application of 'snd', so that the application prints
not just the line count but also the *World object, the Start rule becomes:

    Start world 
    # (success, file, world) = fopen "D:\\Temp\\Track.sql" FReadText world
    | success = process file world

For small files this works: the program returns something like
((True,65536), 3) for a three-line input file. For large files, however, it
runs out of heap space, unless I set the heap size fairly high and the stack
size fairly low, in which case a stack overflow occurs before the heap is
full. My guess is that the compiler does not generate correct code for the
tail-recursive call in 'readlines' for some reason, but I do not know how
this works internally, so I cannot say why it happens.
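
One experiment that might help narrow this down is sketched below. It
assumes (and this is only an assumption; I have not verified it) that the
growth comes from the counter being accumulated lazily as a chain of
unevaluated '1+state' applications, rather than from the tail call itself.
The strictness annotations, the module name 'countlines' and the 'abort' on
a failed open are my own additions for illustration; the file path is the
one from the test case above.

------ 8< ---------------

module countlines	// hypothetical name; must match the file name

import StdEnv

// Returns the world together with the line count, so the *World object
// still reaches the Start rule as in the failing variant.
Start :: *World -> (*World, Int)
Start world
	# (success, file, world) = fopen "D:\\Temp\\Track.sql" FReadText world
	| not success = abort "could not open input file"
	# (file, count) = readlines file 0
	# (_, world) = fclose file world
	= (world, count)

// The '!' annotations ask for the counter to be evaluated on every call
// instead of being passed along as a growing unevaluated expression.
readlines :: !*File !Int -> (!*File, !Int)
readlines file count
	# (eof, file) = fend file
	| eof = (file, count)
	# (_, file) = freadline file
	= readlines file (count + 1)

---------------- >8 -----

If this variant runs in constant space on a large file, that would point at
laziness in the accumulator; if it still overflows, the tail call remains
the more likely suspect.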

By the way, the 'working' version is pretty fast. It takes 7.5 seconds to
count the 2,864,468 lines in a 271,822,936 byte file (roughly 36 MB/s);
that's slightly less time than Perl needs on the same machine, so I guess
it's pretty competitive. No need to re-implement it for performance reasons.

Kind regards,
Maks Verver.