Experiences with Directory

Zuurbier, E. - AMSXE Erik.Zuurbier@klm.nl
Mon, 27 Dec 1999 12:20:34 +0100


Hi,

I have tried the Directory module and I would like to share my findings.

I have written a program to recursively scan a directory (folder if you
like) to create a list of all paths/names of .icl files in it. For this, I
use the function
getDirectoryContents :: !Path !*env -> (!(!DirError, [DirEntry]), !*env)
	| FileSystem env

This function is strict in its output !*env, which means that it will not
start delivering DirEntry-s until it has internally built up the complete
list of DirEntry-s.
A simple change of the implementation would make this function lazy. (Thank
you, John van Groningen for pointing this out to me.)
But the fact that this function always delivers the complete list (whether
strict or lazy) means you cannot write a function that simply stops
searching once you have found what you need. Also, as in my program, you
cannot recursively search the first sub-directory in a directory before the
whole directory has been scanned. It seems to me that this calls for a
function that delivers directory entries one by one, not in a list. The
current implementation already does this internally, using a combination of
lower-level functions:

findFirstFileC	::	!String !*env	-> (!ErrCode, !*env)
findNextFileC	::	!Int !*env	-> (!ErrCode, !*env)
getDirEntry	::	!Bool !Int !*env -> (DirEntry, !*env)	| FileSystem env

But they are not exported.
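
To make the wish for one-by-one delivery concrete, here is a minimal sketch
in Python rather than Clean, because Python's os.scandir already hands out
directory entries one at a time. It is only an analogue of the behaviour
asked for above, not a proposal for the Directory module; the names
iter_icl_files and first_icl_file are made up for this illustration.

```python
import os


def iter_icl_files(root):
    """Lazily yield the paths of all .icl files below root."""
    # os.scandir produces entries one at a time instead of building a list.
    with os.scandir(root) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                # Descend immediately, before the rest of root is scanned.
                yield from iter_icl_files(entry.path)
            elif entry.name.endswith(".icl"):
                yield entry.path


def first_icl_file(root):
    """Return the first .icl file found, or None; scanning stops there."""
    files = iter_icl_files(root)
    try:
        return next(files, None)
    finally:
        files.close()  # releases the directory handles still open
```

A caller that only needs one hit uses first_icl_file and never pays for the
rest of the tree; a caller that wants everything simply loops over
iter_icl_files.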

If they WERE exported, there would still be another problem. As you can
see in the type signatures above, findFirstFileC and findNextFileC encode
their results in the !*env output. That means (I think) that you can only
have one directory scanning session going on at a time. If I wanted to
scan two directories and compare their contents, I would still have no
choice but to first scan one directory completely and hold the list of its
contents in core, and then start the second scan and compare them. For large
directories this would consume more memory than I would like to spend on it.

A solution (if the operating systems support it) would be to change the
signatures as follows:
findFirstFileC	::	!String !*env	-> (!ErrCode, !*env, *dir)
findNextFileC	::	!Int !*dir	-> (!ErrCode, *dir)
and of course change the implementations accordingly.
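
The point of a separate *dir value is that each scan carries its own state.
As a rough analogue (Python again, not Clean), each call to os.scandir
returns an independent handle, so two scans can be advanced in turn. The
function name interleaved_names is made up for this sketch. Note that
os.scandir yields entries in arbitrary order, so this only shows that the
two sessions do not interfere; a real comparison would need more than this.

```python
import os
from itertools import zip_longest


def interleaved_names(dir_a, dir_b):
    """Yield (name_a, name_b) pairs, advancing both scans in lock-step.

    A side that has run out of entries contributes None from then on.
    """
    # Two handles are open at the same time, each with its own position.
    with os.scandir(dir_a) as scan_a, os.scandir(dir_b) as scan_b:
        for entry_a, entry_b in zip_longest(scan_a, scan_b):
            yield (entry_a.name if entry_a is not None else None,
                   entry_b.name if entry_b is not None else None)
```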

This change would effectively introduce the same mechanism that already
exists in StdFile.dcl for files, so we could have:
class FileSystem f where
	dopen :: !{#Char} !Int !*f -> (!Bool,!*Directory,!*f)
	dclose :: !*Directory !*f -> (!Bool,!*f)
	sdopen :: !{#Char} !Int !*f -> (!Bool,!Directory,!*f)
dreade		:: !*Directory -> (!Bool,!DirEntry,!*Directory)
sdreade	:: !Directory -> (!Bool,!DirEntry,!Directory)

etc. This would give the application writer the extra benefit that the
operating system would guarantee that directory contents will not be changed
by others, during the period (s)he has opened it.
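
The open/read/close protocol above can be mimicked in Python as a rough
sketch. The names dopen, dreade and dclose are borrowed from the proposed
signatures and are hypothetical here too; the Bool results become the first
element of a tuple. This sketch makes no attempt at the locking guarantee
mentioned above, it only shows the shape of the calls.

```python
import os


def dopen(path):
    """Open a directory for reading; returns (ok, handle)."""
    try:
        return True, os.scandir(path)
    except OSError:
        # Missing directory, no permission, not a directory, ...
        return False, None


def dreade(handle):
    """Read one entry; returns (ok, entry). ok is False at the end."""
    entry = next(handle, None)
    return entry is not None, entry


def dclose(handle):
    """Close the handle; returns True on success."""
    handle.close()
    return True
```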

However, unlike with files, you would still like to be able to see a
directory's contents without dopen-ing it (which may fail if the directory
has already been dopen-ed or sdopen-ed by another function/program) and
without sdopen-ing it (which may fail if the directory has already been
dopen-ed by another function/program). So functions like the ones presently
available in Directory.dcl would still be necessary.

Regards
Erik Zuurbier