next up previous contents
Next: Parsing Up: Grammar Rules Previous: Procedural Attachments   Contents

The cats> Operator

The cats> operator is used to describe a list of daughters, whose length cannot be determined until run-time. Daughters are not parsed or generated as quickly as part of a cats> specification. Note also the interpretation of cats> requires that its argument is subsumed by the type list, which must be defined, along with ne_list, e_list, etc., and the features HD, and TL, which we defined above. This check is not made using unification, so that an underspecified list argument will not work either. If the argument of cats> is not subsumed by list, then the rule in which that argument occurs will never match any string, and a run-time error message will be given. This operator is useful for so-called ``flat'' rules, such as HPSG's Schema 2, part of which is given (in simplified form) below:


  schema2 rule
  (cat:(head:Head,
        subcat:[Subj]))
  ===>
  cat>
   (cat:(head:Head,
         subcat:[Subj|Comps])),
  cats> Comps.
Since various lexical items have SUBCAT lists of various lengths, e.g., zero for proper nouns, one for intransitive verbs, two for transitive verbs, cats> is required in order to match the actual list of complements at run-time.

It is common to require a goal to produce an output for the argument of cats>. If this is done, the goal must be placed before the cats>. Our use of cats> is problematic in that we require the argument of cats> to evaluate to a list of fixed length. Thus parsing with the following head-final version of HPSG's Schema 2 would not work:


  schema2 rule
  (cat:(head:Head,
        subcat:[SubjSyn]))
  ===>
  cats> Comps,
  cat>
   (cat:(head:Head,
         subcat:[Subj|Comps])).
One way to work around this is to place some finite upper bound on the size of the Comps list by means of a constraint.

  schema2 rule
  (cat:(head:Head,
        subcat:[SubjSyn]))
  goal>  three_or_less(Comps),
  cats> Comps,
  cat>
   (cat:(head:Head,
         subcat:[Subj|Comps])).

  three_or_less([]) if true.
  three_or_less([_]) if true.
  three_or_less([_,_]) if true.
  three_or_less([_,_,_]) if true.
The problem with this strategy from an efficiency standpoint is that arbitrary sequences of three categories will be checked at every point in the grammar; in the English case, the search is directed by the types instantiated in Comps as well as that list's length. From a theoretical standpoint, it is impossible to get truly unbounded length arguments in this way.


next up previous contents
Next: Parsing Up: Grammar Rules Previous: Procedural Attachments   Contents
Detmar Meurers
2001-03-03