[clean-list] Re: Garbage collect databases

Betsy Pepels betsy@cs.kun.nl
Wed, 26 Nov 2003 14:02:10 +0100


Hello Erik,

I had to think for a while about your problem. I'm interested in data
modelling and FP (and I'm actually trying to develop a CASE tool as an
addition to Clean).

Although your problem looks like a garbage collection problem (or, more
generally speaking, a run-time or processing problem), I think it is more of a
data modelling problem. Or, if you wish, a data modelling error that is being
solved in the wrong way.

First, as so often, little or no distinction is made between the data model
and its implementation. For example:
> Therefore a boolean is added to each company record.

Second, deleting records from a table is far from trivial anyhow. You've got
to check dependencies (not only dependencies like you describe; one can
imagine zillions of situations, like: you can't remove a debtor if he still
has to pay debts). Here a good data model is also very useful, because it
helps to identify such dependencies.

To answer your question (very partially), I think that solving this 'garbage
collection' problem at the lowest level (the implementation level) is the
wrong way. It is better to identify the dependencies at the data model level
and to derive from these the correct processes (for instance for record
deletion). In the far, far future, my tool will hopefully generate
automatically the checks that have to be made before (the FP equivalent of)
a record can be deleted.
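
A minimal sketch of what such derived checks could look like, written in
Haskell rather than Clean. Every name here (Order, Blocker, deletionBlockers,
and so on) is hypothetical and chosen only for illustration; the two
dependencies are the ones mentioned in this thread (outstanding orders and
unpaid debts), and debts are assumed to be kept as a simple list of
(company code, amount) pairs.

```haskell
-- Hypothetical model; none of these names come from the real system.
type CompanyCode = String

data Order = Order
  { orderCompany :: CompanyCode
  , orderSent    :: Bool   -- True once the result has been sent
  }

-- A reason why a company cannot be deleted right now.
data Blocker
  = OutstandingOrders Int   -- number of orders not yet sent
  | UnpaidDebt Int          -- open amount (assumed to be in cents)
  deriving (Show, Eq)

-- Dependency 1: orders that still refer to the company.
outstandingOrders :: [Order] -> CompanyCode -> [Blocker]
outstandingOrders orders code
  | n > 0     = [OutstandingOrders n]
  | otherwise = []
  where
    n = length [ () | o <- orders, orderCompany o == code, not (orderSent o) ]

-- Dependency 2: debts that still have to be paid.
unpaidDebt :: [(CompanyCode, Int)] -> CompanyCode -> [Blocker]
unpaidDebt debts code
  | d > 0     = [UnpaidDebt d]
  | otherwise = []
  where
    d = maybe 0 id (lookup code debts)

-- All blockers for a company: one check per dependency in the data model.
deletionBlockers :: [Order] -> [(CompanyCode, Int)] -> CompanyCode -> [Blocker]
deletionBlockers orders debts code =
  outstandingOrders orders code ++ unpaidDebt debts code

-- A company may be deleted only when no dependency blocks it.
canDelete :: [Order] -> [(CompanyCode, Int)] -> CompanyCode -> Bool
canDelete orders debts code = null (deletionBlockers orders debts code)
```

The point of the sketch is only that each dependency identified in the data
model turns into one small check, and the deletion process is the combination
of those checks.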

Look at what would have happened if, in your case, a correct data model had
been developed. (That may look very expensive, but it prevents such
awful repairs afterwards.)
Then you would (hopefully) detect that companies exist in, say, flavours.
Companies which are allowed to order, and companies which aren't. Most
probably there will be more flavours, like companies which are allowed to
order but have to pay beforehand. And to make things even more complicated:
companies which are allowed to order as long as their debt doesn't exceed a
certain limit.
After you have determined that there are such company flavours, you think
about an implementation of this. One or more Booleans could be a solution,
but for instance maintaining tables for each flavour is also a possibility.
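
As a third possibility, an FP language can express the flavours directly as
an algebraic data type. The following is a rough sketch in Haskell (not
Clean), assuming exactly the four flavours mentioned above; the type,
constructor and field names are hypothetical, and amounts are assumed to be
whole numbers of cents.

```haskell
-- Hypothetical flavours, taken from the examples in this message.
data Flavour
  = MayNotOrder
  | MayOrder
  | MustPrepay
  | CreditLimited Int   -- may order while debt does not exceed this limit
  deriving (Show, Eq)

data Company = Company
  { companyCode :: String
  , flavour     :: Flavour
  , debt        :: Int    -- current debt
  } deriving (Show, Eq)

-- Decide whether ordering data may be entered for a company.
-- The 'prepaid' argument says whether payment has already been received.
mayEnterOrder :: Company -> Bool -> Bool
mayEnterOrder company prepaid =
  case flavour company of
    MayNotOrder         -> False
    MayOrder            -> True
    MustPrepay          -> prepaid
    CreditLimited limit -> debt company <= limit
```

With this representation, adding a new flavour means adding a constructor,
and a compiler that warns about incomplete patterns can point at every
function that still has to handle it.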

GreetZ,
Betsy (guest at Software Technology)

______________________________________________________________
> Reply-To: fzuurbie@inter.nl.net
> Subject: [clean-list] Garbage collect databases
.....
> Dear Cleaners,
>
> Back down to earth. Where I work, a system is being developed along the
following lines. First there is an ordering data-entry program that checks
the entered company code with a relational master data table. Then someone
else processes the ordering data and yet another program sends the result to
the company mentioned earlier, based on the address in the master data
table.
>
> Nothing exciting, so what do you think? Of course there is a program to
maintain the master data, which introduces the possibility that sending
fails: enter ordering data, delete the company and try to send the ordering
data.
>
> (Please hang on for another minute, I am getting to FP.)
>
> First of all, the initial development budget is tight, so this is just how
it is being programmed. The failure will simply be dumped on an operator. In
a second version someone programs that a company cannot be deleted if there
are outstanding orders. Then someone realizes the burden on the operator and
the possible starvation: if enough orders are being entered, the company can
never be deleted. Therefore a boolean is added to each company record. If
the system manager wants to delete a company, s/he sets the boolean to False
and the company will no longer be allowed when entering ordering data.
Periodically a batch program removes the company records with the boolean
set to False, which have no outstanding orders.
>
> Things like these happen a thousand times, while it is nothing more than a
garbage collection problem. A thousand times operators, analysts and
programmers are loaded with work to overcome it. Also system documentation
covers the particular garbage collection solution, which is essentially
non-information.
>
> Now while FP's have adequately solved this garbage collection problem for
single executables and their RAM-heap, I am not aware of any practical
solution that also garbage collects database records or files.
>
> Does anybody know about work in this direction?
>
> Regards Erik Zuurbier