world as value

Sun, 18 Jul 1999 23:18:35 +0100 (BST)

On Sun 18 Jul, Erik Zuurbier wrote: 
> Now imagine I would like to implement a personnel-database system in Clean.
> The whole data-part of the database could be in one file, but then, if one
> user was busy updating data of Mr. X in this database, the whole file would
> be blocked for the rest of the company, even from access by a user who
> would only update data of an unrelated Mr. Y. Commercial database systems
> don't (b)lock on the file level, because this is totally unacceptable. Some
> lock on a page level or on a record level. So effectively I cannot
> implement an acceptable database system in Clean. Correct? 

I am no expert on data base design, but I think there is a (partial) solution
to your problem, with Clean as it stands today. If it was me, I would design
my system so that only 1 process controlled the file itself, and other
process accessed and/or modified the database contents using some suitably
defined messages/protocol. This process could then ensure that if one
process had requested control of a record that no other process could access
it until control had been surrendered.

At the moment, you couldn't implement this idea in quite this way
because Clean doesn't support independent processes communicating
non-deterministicly via a messaging system (I think this is what Pascal
us working on). But what you could do is implement a 'file system within a
file system' using Clean as it is at present (without any C hacking, because
no new hardware devices or OS routines are involved). By this I mean that
you could treat a record as a 'mini file' and define a record type to
represent it (which need not necessarily be abstract or unique).

So you'd have functions like..

open_database  :: String *World -> (Bool,*Database,*World)
close_database :: *Database *World -> *World

read_record  :: RecordID *Database -> (Bool,Record,*Database)
write_record :: Record *Database -> *Database    // Record contains RecordID
newRecordID  :: *Database -> (RecordID,*Database)

Here a Database is a unique data structure which contains the database file
and any other information you want. I think Clean provides all you need to
implement something like this already, just so long as it's all deterministic.
The only caveat I would place on this is that I can't remember how good
existing Clean file functions are for random access files, but let's assume
that isn't a problem.

> (BTW, I have read about
> the approach in Peter Achten's PhD where he treats the file system as a
> server.)

Oh, that seems to be the way I would do it. I don't think it would be possible
to do it any other way. What your asking for would need mutable (non-unique)
variables and some kind of locking mechanism. I don't think the Clean
designers would be to keen on this (because we lose referential transparency). 

I'm not sure what the limits of uniqueness and Concurrent Clean (rather than
ordinary Clean) are, but I don't think it should be a problem. I haven't
tried it, but just so long as separate threads get different parts of a unique
structure I can't see any problem, because that's the only form of
Concurrency Clean allows. If other forms of concurrency (like independant
processes sharing 'unique' values) are introduced then I think we have a
problem, but I don't think that's going to be legal even when Pascal's
system is included (because any such values won't be unique, obviously).
So if Pascal's system only allows inter-process communication via messaging,
were back to the server approach to managing unique mutable data structures.

Regarding arrays, I have often thought that it would be useful to split
the array data type into 2 components, a 'physical' 1D array (where data is
actually stored, and a 'logical' array, which is just a formula which
converts (multi-dimensional) logical indexes to (1D) physical indexes.
You could then perform a lot of useful operations, like taking slices,
transposing, splitting into even and odd halves, just by changing the
formula. This seems so obvious I suppose it's already been done. I
think something like this could be useful for the problem you mentioned,
and it seems to fit quite nicely with uniqueness typing, provided
each operation produced logical sub-arrays which did not overlap in the
physical array.

Regards
-- 
Adrian Hey