University of Southern Denmark
World of VISL  Visual Interactive Syntax Learning  
 
 

VISL teaching treebanks (VTB)

Linguistic design considerations

  1. Linguistic theory: The default VTB is a constituent treebank. Although it can easily be transformed into a classical bracketing structure like the one used in the Penn Treebank, bracketing (“syntactic form”) is not the main point: the formalism retains a strong emphasis on function and dependency structure, both of which it shares with VISL's other important linguistic format, Constraint Grammar. In ordinary VTB's, dependency is implicitly marked through head/dependent labels, and export into TIGER-dependency format is possible.

  2. Branching: Multiple (non-binary) branching is allowed. For clarity, single-daughter nodes (e.g. rewriting single nouns as np's) are discouraged. In VTB source, branching is expressed as '=' indentation, with each '=' adding one layer of depth. By convention, the top node's daughters are not indented:

      STA:fcl
      S:np
      =DN:art The
      =H:n teacher
      P:v-fin laughed
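
The '=' indentation can be read mechanically. The following is a minimal Python sketch, not VISL's own tooling: it assumes one node per line, with a terminal's word on the same line as its label, and the names Node, parse_vtb and to_brackets are hypothetical. It builds a tree from the depth markers and prints it as a simple bracketing, the kind of transformation mentioned in point 1.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    label: str                      # complete node label, e.g. "S:np"
    word: Optional[str] = None      # surface token (terminals only)
    children: List["Node"] = field(default_factory=list)


def parse_vtb(source: str) -> Node:
    """Build a tree from '='-indented lines (simplified, hypothetical reader)."""
    lines = [ln.strip() for ln in source.splitlines() if ln.strip()]
    root = Node(lines[0].split()[0])
    # Stack of (depth, node). The top node gets depth -1 because its
    # daughters are, by convention, written without any '='.
    stack = [(-1, root)]
    for line in lines[1:]:
        depth = len(line) - len(line.lstrip("="))
        parts = line[depth:].split(None, 1)
        node = Node(parts[0], parts[1] if len(parts) > 1 else None)
        # Close every open node that is not shallower than this line.
        while stack[-1][0] >= depth:
            stack.pop()
        stack[-1][1].children.append(node)
        stack.append((depth, node))
    return root


def to_brackets(node: Node) -> str:
    """Render the tree as a flat bracketing string."""
    if not node.children:
        return f"[{node.label} {node.word}]" if node.word else f"[{node.label}]"
    inner = " ".join(to_brackets(child) for child in node.children)
    return f"[{node.label} {inner}]"


EXAMPLE = """\
STA:fcl
S:np
=DN:art The
=H:n teacher
P:v-fin laughed
"""

tree = parse_vtb(EXAMPLE)
print(to_brackets(tree))
# [STA:fcl [S:np [DN:art The] [H:n teacher]] [P:v-fin laughed]]
```

The stack keeps only the currently open ancestors, so any number of daughters per node (multiple branching) is handled without special cases.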

  3. Form and function: Each node, terminal or non-terminal, is marked for form and function. The core of a function label is in upper case, while form labels are in lower case. Subcategories can be added in lower case to function labels (e.g. Od for direct object) and, for form categories, after a hyphen (e.g. pron-pers). Form and function are combined into a complete node label with a colon, function first, e.g. Od:np.

  4. Valency: Clause level constituents may be marked for ± valency by prefixing a lower case 'f' (free) or 'b' (bound). fC, for instance, is a free predicative, as opposed to the default Cs and Co (subject and object predicatives, respectively).
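
The label conventions in points 3 and 4 are regular enough to be split with a single pattern. The sketch below is an illustration only: the names Label, LABEL_RE and parse_label are hypothetical, and the pattern covers just the conventions stated here (valency prefix, upper-case function core, lower-case function subcategory, colon, form core, hyphenated form subcategory). It does not handle the hyphen-marked parts of discontinuous nodes.

```python
import re
from typing import NamedTuple, Optional


class Label(NamedTuple):
    valency: Optional[str]   # 'f' (free), 'b' (bound), or None if unmarked
    function: str            # upper-case function core, e.g. "O"
    function_sub: str        # lower-case subcategory, e.g. "d" (may be empty)
    form: str                # form core, e.g. "pron"
    form_sub: str            # hyphenated subcategory, e.g. "pers" (may be empty)


LABEL_RE = re.compile(
    r"""
    ^(?P<valency>[fb])?          # optional valency prefix
    (?P<function>[A-Z]+)         # upper-case function core
    (?P<function_sub>[a-z]*)     # optional lower-case subcategory
    :
    (?P<form>[a-z]+)             # lower-case form core
    (?:-(?P<form_sub>[a-z]+))?   # optional hyphenated form subcategory
    $
    """,
    re.VERBOSE,
)


def parse_label(text: str) -> Label:
    """Split a complete node label such as 'Od:np' into its parts."""
    match = LABEL_RE.match(text)
    if match is None:
        raise ValueError(f"not a function:form label: {text!r}")
    return Label(
        match["valency"],
        match["function"],
        match["function_sub"],
        match["form"],
        match["form_sub"] or "",
    )


print(parse_label("Od:np"))
print(parse_label("S:pron-pers"))
print(parse_label("fC:adj"))
```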

  5. Non-terminals: Three types of non-terminal form are distinguished: clause, group/phrase and paratagma. Each allows a different set of daughter functions, which normally would not be mixed across node types. Clauses allow clause functions (S, P, O, A, C and subcategories); groups have heads (H) and dependents (D), possibly specified according to group type as DN (adnominal), DA (adverbial modifier in group) or DP (argument of preposition). A paratagma consists of conjuncts (CJT) and optional coordinators (CO).
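
The division of daughter functions by node type can be checked mechanically. The sketch below is a hypothetical consistency check, not part of any VISL tool: the tables NODE_TYPE and ALLOWED and both function names are assumptions, and they list only the forms and functions named in this text. Forms with their own daughter functions, such as vp, are deliberately left out and raise a KeyError.

```python
import re
from typing import Iterable, List

# Node type for each non-terminal form mentioned in this text (illustrative).
NODE_TYPE = {
    "cl": "clause", "fcl": "clause", "icl": "clause", "acl": "clause",
    "np": "group", "pp": "group",
    "par": "paratagma",
}

# Function cores each node type allows among its daughters.
ALLOWED = {
    "clause": {"S", "P", "O", "A", "C"},
    "group": {"H", "D", "DN", "DA", "DP"},
    "paratagma": {"CJT", "CO"},
}


def function_core(function: str) -> str:
    """Strip a valency prefix (f/b) and any lower-case subcategory."""
    match = re.match(r"[fb]?([A-Z]+)", function)
    if match is None:
        raise ValueError(f"no upper-case function core in {function!r}")
    return match.group(1)


def unexpected_daughters(mother_form: str, daughter_functions: Iterable[str]) -> List[str]:
    """Return the daughter functions that do not fit the mother's node type."""
    allowed = ALLOWED[NODE_TYPE[mother_form]]
    return [f for f in daughter_functions if function_core(f) not in allowed]


print(unexpected_daughters("fcl", ["S", "P", "Od", "fA"]))   # []
print(unexpected_daughters("np", ["DN", "H", "Od"]))         # ['Od']
print(unexpected_daughters("par", ["CJT", "CO", "CJT"]))     # []
```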

  6. Heads: VISL extends the “hypotactic” use of group heads to the catatactic pp, opting for the preposition as (functional) head. The head of a clause is its verbal constituent, marked as P (predicator). With this exception, VTB-heads are normally terminals, though complex heads are allowed, especially in connection with shared modifiers. Note that in standard notation, the head function of an elliptic np with a missing noun head can be marked on a word of another class. Thus, “old” in “the old” will be head but retain its adjective form (H:adj), and the group will still be an np.

  7. Dependents: Exploiting the philosophy of multiple branching, modifiers in a group will usually be handled in a flat way. The article, determiner and adjective in “those few old oligarchs” will all be daughters of “oligarchs” on the same level.

  8. Verb phrases: A standard VTB complies with the concept of “little vp”, allowing only verbal material, infinitive markers and auxiliary particles as daughters. Specific functions can be used for main verb (Vm), auxiliary/modal (Vaux), infinitive marker (INFM) and verb-integrated particles (Vpart). The latter can be placed either inside the vp or at clause level, according to linguistic preference. Head-dependent annotation can also be used, opting for either a semantic head (main verb) or a functional head (auxiliary). Clause-level functions have been integrated into the vp in some VTB's (e.g. Spanish enclitic object pronouns, or SUB instead of INFM), but such usage is discouraged and has so far been avoided by most VTB designers.

  9. Clauses: VISL distinguishes between three types of clause form: finite (fcl), non-finite (icl) and averbal (acl), though under-specification as just clause (cl) is common in the teaching treebanks. Participle and infinitive constructions with clause-level constituents (e.g. objects, subjects, adverbials) will normally be regarded as clauses (icl) rather than as groups, which is how certain Romance linguistic traditions would treat them.

  10. Subordinators: The function category SUB is ordinarily used for subordinating conjunctions, while relatives and interrogatives are marked for their specific clause-level SPOAC function rather than their SUB function. Though both the former and the latter may head averbal elliptic clauses in a dependency transformation, they will not be regarded as (functional) heads in ordinary VTB's.

  11. Crossing branches: VTB's may have crossing branches, i.e. non-projective dependencies. These are expressed as discontinuous constituents in standard VTB's, with a directed hyphen to “join” the individual parts of a discontinuous node, e.g. P:vp- fA -P:vp for a predicate-vp bracketing a free adverbial (“has never seen”). This notation will also handle fronted raised constituents (“What (DP) are you afraid of?”, “That (Od) wasn't easy to guess.”). Note that in multi-level branching, not only the immediate mother node but possibly also the grandmother, or even further ancestors, will have to be discontinuous (Danish “Hvem tror du han holder mest af at drille?”, 'Who do you think he likes teasing most?').
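
The directed hyphen can be resolved by a single left-to-right pass over sibling nodes. The sketch below is a simplified, hypothetical illustration (the function names and the flat list-of-pairs input are assumptions, not VISL's file handling): a label ending in '-' opens a part, a label starting with '-' continues the most recent open part with the same label, and each merged constituent records the positions it covers. The free adverbial is written without a form label, as in the pattern quoted in the text.

```python
from typing import Dict, List, Tuple


def join_parts(siblings: List[Tuple[str, str]]) -> List[Tuple[str, List[int]]]:
    """Merge 'X-' ... '-X' parts among sibling (label, word) pairs.

    Returns (label, positions) pairs; positions index into `siblings`.
    """
    merged: List[Tuple[str, List[int]]] = []
    open_parts: Dict[str, int] = {}   # bare label -> index into `merged`
    for position, (label, _word) in enumerate(siblings):
        # Only leading/trailing hyphens are join markers; hyphens inside
        # a form label (e.g. v-fin) are left untouched.
        bare = label.strip("-")
        if label.startswith("-"):
            if bare not in open_parts:
                raise ValueError(f"continuation without an opening part: {label}")
            index = open_parts[bare]
            merged[index][1].append(position)
        else:
            merged.append((bare, [position]))
            index = len(merged) - 1
        if label.endswith("-"):
            open_parts[bare] = index      # more parts are expected
        else:
            open_parts.pop(bare, None)    # constituent is complete
    return merged


def is_discontinuous(positions: List[int]) -> bool:
    """True if the covered positions leave a gap."""
    return positions[-1] - positions[0] + 1 != len(positions)


siblings = [("P:vp-", "has"), ("fA", "never"), ("-P:vp", "seen")]
for label, positions in join_parts(siblings):
    words = " ".join(siblings[i][1] for i in positions)
    print(label, positions, words, is_discontinuous(positions))
# P:vp [0, 2] has seen True
# fA [1] never False
```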

  12. Stacking: There are two non-specified “dummy” categories, 'X' for function and 'x' for form. Introduced by C. Bache, the stacking notation makes use of these symbols in order to avoid ad hoc categories and to delegate labels in elliptic constructions to a level where they can be resolved:

    STA:fcl
    S:pron-pers He
    P:v-fin gave
    X:par
    =CJT:x
    ==Oi:pron-pers her
    ==Od:np
    ===DN:art a
    ===H:n horse
    =CO:conj-c and
    =CJT:x
    ==Oi:pron-pers him
    ==Od:np
    ===DN:art a
    ===H:n car

    This notation will also handle coordinated predicates sharing the same subject, and is an option in certain cases of ellipsis. For verb-elliptic clauses, a special form tag, acl (averbal clause), exists.

 


Copyright 1996-2017