Bug in Parser Combinators?

Pieter Koopman pieter@wi.leidenuniv.nl
Mon, 24 Nov 1997 10:54:22 +0100


Nick Kallen wrote:

>The following program is taken (almost exactly) from the Parser Combinators
>chapter in the Clean book:
>
>module test
>import StdEnv, Parser
>natural :: Parser Char Int
>natural = <*> digit <@ foldl nextDigit 0
>where nextDigit a b = a*10+b
>digit :: Parser Char Int
>digit = satisfy isDigit <@ digtoInt
>Start = natural ['100']
>
>When I try to run it, I get the error:
>    Run time error, rule 'p' in module 'Parser' does not match
>
>Parser is a module I made by almost directly copying the Parser Combinators
>document (a copy is appended to this message). The error stems from <*>, but
>I cannot see why my implementations
>of <*> or any of the functions it depends upon are different (in meaning)
>from those in the document. So I don't know what the error is.
>    My guess is that there's an error in <&> in that the empty list xs1 as a
>result of p1 cannot be correctly parsed by p2. This is corresponded by the
>fact that (Start = natural ['100x']) works! This little kludge of inserting
>something that wont let (satisfy isDigit) evaluate as true is unsatisfactory
>in my oppinion.
>    Now, I -assume- that the epsilon in the definition of <*> is supposed to
>prevent this error. Unfortunately, it wont. Despite all the laziness, the
><&> is evaluated before the epsilon.
>
>Does anyone have any clue as to how to fix this?
>---
You should extend your function satisfy by an additional alternative to handle
empty inputs:

satisfy :: (s -> Bool) -> Parser s s
satisfy f = p
where
  p [x:xs]
   | f x = [(xs, x)]
   | otherwise = []      // Can be removed
  p _ = []               // Missing in your definition !!

This also explains why the original program runs correct on the input
['100x']. At the end of parsing the input satisfy isDigit is applied to
['x']. This implies that the function

p [x:xs]
   | isDigit x = [(xs, x)]
   | otherwise = []

is applied to ['x']. No problem here. isDigit 'x' yields False and the
parser yields fail (the empty parse result: []).

When you try to parse ['100'] the funtion p is applied to [] (after parsing
the '1', '0' and '0'). This function has no altrenative for empty lists and
the appropriate error message is generated.

Pieter Koopman