[clean-list] Wish list, part 3
Richard A. O'Keefe
ok@cs.otago.ac.nz
Fri, 12 Apr 2002 12:54:24 +1200 (NZST)
Someone wrote:
> > - The clean rules for identifiers are such that they can either
> > consist of ordinary characters, or of some special characters, but
> > not both. Otherwise the compiler has trouble interpreting things like
> > 'n+1'. In my opinion this is a pity, because:
> > a) I have no trouble writing 'n + 1' (i.e. with spaces, which I often
> > do anyway)
> > b) I prefer a simple rule that all symbols are separated by
> > whitespace (and brackets)
> > c) I would very much like to markup my symbols like >add<, >sub< for
> > dyadic operators, or symbol+, symbol*, symbol? for parsing functions,
> > or @node for labels, or <elem> and </elem> for constructors.
> > d) I don't believe that things like (<?@) make your code any more
> > readable (taken from parser combinator stuff).
And someone else wrote:
> There may be a solution where n+1 would mean what it means today
> in Clean - OR - it would refer to an object named n+1, depending on
> whether such an object is in the name space, and depending on which of
> these alternative meanings typecheck.
YOW! Are we in never-never land here, or in a world where programming
is done by fallible human beings? Clean is supposed to be useful for
writing real programs; a notation where you can't tell where the words
begin and end without a map would be disastrous.
There's only one language family I use regularly where "all symbols are
separated by whitespace (and brackets)", and that's Lisp. Lisp can get
away with it because it has *no* infix operators. x+y might as well be
a symbol because there isn't anything else it could be.
At the last count I had used over 120 different programming languages,
including some without operators, some with a fixed set of operators,
and some with user-defined operators.
There is precisely one language that I've come across that lets you have
+ operators, including user-defined operators,
AND
+ tokens containing both letters and operator symbols.
That's Pop-11. If I recall correctly, the syntax is like this:
<word>       ::= <syllable 1> {'_' <syllable n>}*
<syllable 1> ::= <letter> {<letter> | <digit>}*
               | <operator character>+
<syllable n> ::= {<letter> | <digit>}+
               | <operator character>+
That made tokens like symbol_+, symbol_*, and symbol_? available,
while in symbol+..., symbol*..., and symbol?... the letters and the
operator characters would have to be in separate tokens, making x+1
unambiguously three tokens and x_+ 1 unambiguously two tokens.
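
The grammar above can be tried out with a small tokenizer. This is only
a sketch in Python of the rules as recalled here, not of Pop-11's actual
itemiser: the set of operator characters (OP_CHARS) and the separate
handling of plain integers are my assumptions, and tokenize is a
hypothetical name.

```python
import re

# Assumed set of operator characters; the grammar above does not list them.
OP_CHARS = "+-*/?<>=@&|!^~:"
OP_RUN = "[" + re.escape(OP_CHARS) + "]+"

# <syllable 1> ::= <letter> {<letter> | <digit>}* | <operator character>+
SYLLABLE_1 = r"(?:[A-Za-z][A-Za-z0-9]*|" + OP_RUN + ")"
# <syllable n> ::= {<letter> | <digit>}+ | <operator character>+
SYLLABLE_N = r"(?:[A-Za-z0-9]+|" + OP_RUN + ")"
# <word> ::= <syllable 1> {'_' <syllable n>}*
WORD = SYLLABLE_1 + r"(?:_" + SYLLABLE_N + ")*"

# A token is a word or (assumption) a run of digits; leading blanks are skipped.
TOKEN = re.compile(r"\s*(" + WORD + r"|[0-9]+)")


def tokenize(text):
    """Split text into tokens according to the sketched grammar."""
    text = text.rstrip()
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError("cannot tokenize at position %d" % pos)
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


print(tokenize("x+1"))    # ['x', '+', '1']
print(tokenize("x_+ 1"))  # ['x_+', '1']
```

The underscore is what joins a letter syllable to an operator syllable,
so a reader (or a lexer) never has to consult the name space to find
the token boundaries.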
The only thing I personally would change in Clean's syntax is to
change 'if e1 e2 e3' to 'if e1 then e2 else e3'. This would actually
reduce the number of tokens I have to write, because I usually have
to wrap parentheses around the expressions so they are parsed correctly.
Clean's list syntax is _perfect_, and I wish Haskell would adopt it.
I suggest that the remedy for anyone who doesn't like Clean's syntax,
especially someone who is happy writing combinator-based parsers, is
to write a preprocessor to convert the syntax they do like to Clean.
(And of course share it to build up the number of 'converts'.)
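
To give an idea of how small such a preprocessor can start out, here is
a rough sketch in Python that rewrites a single non-nested
'if e1 then e2 else e3' per line into Clean's 'if (e1) (e2) (e3)'.
The function name rewrite_line is hypothetical, and the sketch assumes
the whole conditional fits on one line and that 'then' and 'else'
appear only as keywords; a real tool would need a proper parser.

```python
import re

# Matches one non-nested "if COND then A else B" running to the end of the line.
IF_THEN_ELSE = re.compile(r"\bif\s+(.+?)\s+then\s+(.+?)\s+else\s+(.+)$")


def rewrite_line(line):
    """Rewrite 'if c then a else b' to 'if (c) (a) (b)'; leave other lines alone."""
    return IF_THEN_ELSE.sub(r"if (\1) (\2) (\3)", line)


print(rewrite_line("y = if x > 0 then f x else g x"))
# y = if (x > 0) (f x) (g x)
```

Parenthesising every branch is redundant for simple operands but keeps
the sketch from having to know Clean's precedence rules.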