High-order Graph-based Dependency Parser


        Our project is about to improve performance of graph-based parser by introduce high-order features while not worsening the efficiency.


        Parsing is a classical natural language processing(NLP) problem, which is the basis of knowledge understanding and other NLP applications(e.g. Information Extraction, Question Answering etc. ). Dependency trees outperform constituency trees because it enables a more compact representation and order-free grammars.
        Data-driven methods give elegant solution to build dependency parse trees instead of tedious and laborious rule-based work. Among the data-driven methods, graph-based parsers introduce graph theories to turn parsing into building spanning trees and gain a competitive position. Comparing to transition-based methods, which define features over a rich history, graph-based parsers is lacking in making use of the parsing history.
        As a result, the idea of high-order graph-based parser came into being. It do allow parser to consider more (e.g. sibling, grandparent etc. ) while parsing and scoring, however, it also brings a problem of parsing speed. How to make a trade-off between accuracy and speed is worth exploring.

Related Work

1. Popular Dependency Parser Available:

Stanford Parser: Head rule based, transform phrase structure tree into dependency tree.
Malt: Transition-based dependency parser.
MST: Turn parsing into finding the Maximum Spanning Tree, Graph-based parser.

2. High-order graph-based dependency parser papers:

+ McDonald, Ryan T., and Fernando CN Pereira. "Online Learning of Approximate Dependency Parsing Algorithms." EACL. 2006.
+ Koo, Terry, and Michael Collins. "Efficient third-order dependency parsers." ACL. 2010.
+ Carreras, Xavier. "Experiments with a Higher-Order Projective Dependency Parser." EMNLP-CoNLL. 2007.
+ Ma X, Zhao H. Fourth-Order Dependency Parsing[C]//COLING (Posters). 2012
+ Rush, Alexander M., and Slav Petrov. "Vine pruning for efficient multi-pass dependency parsing." ACL, 2012.


1. Enrich features of parser by make use of large-scale dataset. These dataset can provide us with syntactic and semantic information which is essentially important for parsing.

a) Probase
b) Google Syntactic N-gram

2. It's natural for human beings to treat parsing discriminatively in terms of different linguistic, in that different relation with different significance and parsing confidence, even we human beings do not consider all the words at the same time. And different adaptive parse feature templates is needed by different linguistic relations (serve for both speed and accuracy).Followed by this intuition, we can divide parse into two parts:

+ Step 1: Get a parsing order, which means find a word index (dependent) sequence to do the parse. This is a greedy process to speed up the parsing and similar idea to the parsing process of humans.
+ Step 2: According to the parse order, find a head for every dependent with highest score, and the score is computed while parsing, which means we can make use of the high order features.


Wenjing Fang(fangwenjing_sjtu@126.com): Shanghai Jiao Tong University.
Yizhong Wang(disciplinant.wyz@gmail.com): Shanghai Jiao Tong University.
Jia Tan(tanjia@sjtu.edu.cn): Shanghai Jiao Tong University.
Kenny Q. Zhu (kzhu@cs.sjtu.edu.cn): Shanghai Jiao Tong University