On regular dependency trees

Leonid Mitjushin

   This paper considers dependency trees that can be constructed by 
the bottom-up process. The results presented were partly published in 
(Mitjushin 1985). Theorems 2 and 4 are new; for Theorem 3 a simpler 
proof has been found. This text does not contain proofs, except for a 
sketch of the proof of Theorem 4.


   1  Regular dependency trees

   By a 'sentence' we understand a finite sequence of nodes which 
represent words. A 'segment' is a subsequence of a sentence containing 
one or more consecutive nodes.
   A 'dependency tree' is a directed and rooted tree whose set of 
nodes is a sentence. A 'fragment' (of a dependency tree) is a directed 
and rooted tree on a segment. A dependency tree and a one-node segment 
are "extreme" examples of fragments.
   'Neighbouring segments' are segments going after one another. 
Precisely speaking, two segments are 'neighbouring' if they do not 
intersect, and their union is a segment. 'Neighbouring fragments' are 
fragments whose sets of nodes are neighbouring segments.
   If we have two neighbouring fragments, then drawing a directed arc 
from any node of one of the fragments to the root of the other creates 
a new fragment on the union of segments occupied by the original 
fragments. 
   This operation is called 'linking' and forms the basis for bottom-
up construction of dependency trees (Leykina and Tseytin 1975). At the 
initial moment we have only one-node fragments. Then we construct 
larger fragments by repeated application of the linking operation. At 
last we arrive at complete dependency trees, i.e. fragments on the 
whole sentence. 
   Not every dependency tree can be constructed by the bottom-up 
process. The simplest example is as follows: 


                   .-------------------------.
                   |                         |
                   | .-----------.           |
                   | |           |           |
                   | |     .---------------. |
                   | |     |     |         | |
                   | |     |     |         | |
                    o      o     o          o


(with any direction of arcs allowed in a rooted tree). In this 
situation we cannot make any linking, as no two neighbouring nodes are 
connected with an arc. 
   Trees that can be constructed from one-node fragments by (repeated) 
linking will be called 'regular'. As direction of arcs is not 
important for regularity (all directed trees with the same underlying 
non-directed tree are either regular or non-regular simultaneously), 
non-directed trees will be considered till the end of this section. 
   
   DEFINITION 1. Consider a tree with ordered nodes. A 'tangle' is a 
quadruple of nodes  (a, b, c, d),  where  a < b < c < d,  and there is 
a path in the tree of the form

                     c , a , . . . , d , b .


   Note that nodes  a, b, c, d  are not necessarily neighbours with 
respect to the order considered: there may be other nodes between 
them.

   THEOREM 1 (criterion for regularity) (Mitjushin 1985). A tree with 
ordered nodes is regular if and only if it does not contain tangles. 

   DEFINITION 2. Consider a tree with ordered nodes. A 'generalized 
tangle' is a quadruple of nodes  (a, b, c, d),  where  a < b < c < d,  
and there is a path in the tree of the form

               c , . . . , a , . . . , d , . . . , b .


   The simple tangle is a special case of the generalized tangle. 

   THEOREM 2 (another criterion for regularity). A tree with ordered 
nodes is regular if and only if it does not contain generalized 
tangles. 

   Cutting an edge (a,b) of a tree produces two trees not connected with
each other. Assuming a < b, denote their sets of nodes as V(a) and V(b),
and denote { x | a <= x <= b } as S(a,b) (the segment covered by the edge
(a,b)).

   COROLLARY to Theorem 2. If the tree is regular, then, for any edge 
(a,b), V(a) and V(b) intersected with S(a,b) are segments.


   2  Length of arcs

   For a tree with ordered nodes, the length of an arc (or edge) 
between nodes  a  and  b  is the number of nodes lying between them, 
plus 1. (Hence, the length of an arc between neighbouring nodes is 
equal to 1.)
   There exists a tendency to make arcs of dependency trees shorter. 
Some cases of so-called 'postponement' are probably due to this 
tendency, for example: 

   The time had come to decorate the house for Christmas.

   The problem arose of what to do with money.

Other examples can be found in (Mitjushin 1985) and (Lin 1996).  
   In real English and Russian sentences the average length of arcs in 
dependency trees usually does not exceed 3. Sentences with greater 
values of this parameter often seem "awkward". (Perhaps, for long 
sentences the limit should be slightly increased.) To be more precise, 
this is true for the specific method of assigning dependency trees to 
sentences used by Mel'cuk (1974) and Apresjan et al. (1989, 1992). 
However, the statement seems to remain true also for other "natural" 
methods. 
   The tendency to make arcs shorter turns out to be connected with 
regularity. In a sense, it may be regarded as "the reason" for 
regularity. 
   Imagine that we have a dependency tree which is not yet linearized. 
That is, its nodes are not arranged in any particular linear order, 
and our task is to find a "good" arrangenent. 
   Usually there exist certain linguistically motivated constraints on 
admissible order of nodes. These constraints are often imposed on 
pairs of nodes connected with arcs, but they may as well be of more 
general form. 
   Consider a directed tree with an unordered set of nodes. Let each 
arc z of the tree be assigned a positive 'weight' p[z], and let D be 
the 'weighted' total length of arcs: D is the sum of p[z]*d[z], where 
d[z] is the length of arc z, and the sum includes all arcs of the 
tree.

   THEOREM 3 (Mitjushin 1985). Let all constraints on admissible order 
of nodes be of the form "a < b", where a and b are connected with an 
arc (in any direction). Then any admissible order of nodes at which D 
reaches its minimum over such orders produces a regular tree.


   3  Parsing with an 'oracle'

   In this section we consider dependency trees in the form proposed 
by Mel'cuk (1974) and used in the ETAP system developed by Apresjan 
et al. (1989, 1992). 
   For each word of the sentence, all its lexico-morphological 
interpretations (called 'homonyms') are considered. A homonym consists 
of a lexeme name, a part-of-speech marker, and a list of values of 
morphological features. 
   In this new framework, a fragment is a set of homonyms occupying 
one or more successive positions in the sentence (one homonym in each 
position), plus a set of directed arcs connecting those homonyms. Each 
arc is labelled with a name of syntactic relation. The homonyms and 
arcs of a fragment must make up a rooted tree. 
   As in Section 1, 'linking' will mean drawing an arc between two 
neighbouring fragments, but now the arc is labelled with a name of 
syntactic relation.
   Let the words of the sentence be numbered  1, 2, . . . , n  from 
left to right. A dependency tree of the sentence is a fragment on 
segment [1, n] (i.e. the homonyms which constitute the set of nodes of 
the tree occupy positions 1, 2, . . . , n). Regular trees are defined 
in the same way as in Section 1.
   A sentence can have many different grammatically correct dependency 
trees. By an 'intended' dependency tree we understand a dependency 
tree which is grammatically correct and agrees with the semantic and 
pragmatic content of the sentence. As a rule, such a tree is unique. 
   A fragment is called 'true' if it is included in the intended 
dependency tree of the sentence. An 'oracle' is a device which can say 
about any fragment containing more than one homonym whether that 
fragment is or is not true. It seems that, in a sense, a human has 
such an 'oracle' (working at a subliminal level), though its degree of 
certainty, is, say, 99% instead of 100%. 
   A dependency tree is called 'weakly projective' if its arcs do not 
intersect, that is, no quadruples of nodes (a, b, c, d) exist for 
which  a < b < c < d,  a and c are connected with an arc (in any 
direction), and b and d are connected with an arc (in any direction).
Intended dependency trees are weakly projective for the vast majority 
of real English and Russian sentences.
   By Theorem 1, every weakly projective tree is regular. A weakly 
projective tree can be constructed by the bottom-up process in order
of increasing length of arcs drawn between fragments. 

   THEOREM 4. If the intended dependency tree of a sentence is unique 
and weakly projective, it can be found in no more than C * H * D 
steps, where each step includes one call of the 'oracle'. Here C is a 
constant proportional to the number of names of syntactic relations, 
H is the squared maximum number of homonyms for one word in the 
sentence, and D is the total length of arcs in the intended tree. 

   A SKETCH OF THE PROOF. The intended tree will be assembled in order 
of increasing length of arcs. Let A be the growing set of fragments 
constructed from homonyms by the linking operation. At the beginning 
of the process, A contains all homonyms of the sentence (and no other 
fragments).
   Define parameter k which successively takes values  1, 2, . . . , 
n - 1 (k is the current length of arcs to be established). For each k 
consider words of the sentence from left to right. If for the current 
word i no arcs have been drawn into its homonyms at previous steps, 
try to draw arcs of the form  (i-k) --> i  and  i <-- (i+k),  provided 
these arcs connect neighbouring fragments from A. After drawing each 
arc, present the new fragment to the 'oracle', and if it is true, 
add it to A. 
   A more "left-to-right" version of this procedure is also possible.


   References

Jurij Apresjan, Igor Boguslavsky, Leonid Iomdin, Aleksandr Lazurskij,  
   Nikolaj Pertsov, Vladimir Sannikov, and Leonid Tsinman. 1989. 
   Lingvisticheskoe Obespechenie Sistemy ETAP-2. Nauka, Moscow. ('The 
   linguistics of the ETAP-2 system', in Russian) 

Jurij Apresjan, Igor Boguslavsky, Leonid Iomdin, Aleksandr Lazurskij,  
   Leonid Mitjushin, Vladimir Sannikov, and Leonid Tsinman. 1992. 
   Lingvisticheskij Protsessor dlja Slozhnykh Informatsionnykh Sistem. 
   Nauka, Moscow. ('A linguistic processor for complex information 
   systems', in Russian) 

Berta Leykina and Gregory Tseytin. 1975. Sintaksicheskaja model' s 
   dopushcheniem ogranichennoj neproektivnosti. In "Mezhdunarodnyj 
   Seminar po Mashinnomu Perevodu", Moscow, pp. 72 - 74. (`A syntactic 
   model allowing limited non-projectivity', in Russian) 
 
Dekang Lin. 1996. On the structural complexity of natural language 
   sentences. In "Proceedings of COLING-96", Vol. 2, Copenhagen, 
   pp. 729 - 733.

Igor Mel'cuk. 1974. Opyt Teorii Lingvisticheskikh Modelej "Smysl - 
   Tekst". Nauka, Moscow. (`Toward a theory of Meaning - Text 
   linguistic models', in Russian) 

Leonid Mitjushin. 1985. Dlina sintaksicheskikh svjazej i induktivnye 
   struktury. "Semiotika i informatika", No. 26, pp. 34 - 51. ('Length 
   of syntactic links and inductive structures', in Russian)