Question Answering

Introduction

Automatic question answering is an open research topic in natural language processing. Each question expresses a real-world relation between entities. Our project aims at uncovering the semantic meaning behind a question, finding the correct relation and relation arguments in it, and finally transforming them into a machine-readable representation to query the answer in structured knowledge bases such as Freebase and YAGO.

Spectrum of Question Answering

If we focus on question understanding and representation, there are four methods to represent a question, sorted by granularity from coarse-grained to fine-grained.

  1. Template Based

    This is the most coarse-grained method for understanding a question. It tries to learn question templates from a large number of QA pairs. A question template is a question in which the relation phrases or entity phrases are replaced by placeholders, e.g., "What's the birth date of $person?", where $person is a placeholder for an entity. Every template is mapped to one predicate in the KB; this mapping is learned from a large number of QA pairs by finding the most probable predicates that could induce those pairs, without caring about the specific meaning of each single word in the template. When a new question arrives, the corresponding template is selected; with the predicate bound to that template and the entity extracted from the question, we get the answer immediately (a minimal sketch follows the references below).

    Main work:

    [1] Xiao et al. KBQA: Question Answering over a Billion Scale Knowledge Base. 2013. (draft)

    [2] Fader et al. Paraphrase-Driven Learning for Open Question Answering. ACL 2013.

    [3] Unger et al. Template-based question answering over RDF data. WWW 2012.
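
    Below is a minimal sketch of the template-based idea in Python. The templates, predicates, and KB triples are invented for illustration; a real system would learn the template-to-predicate mapping from QA pairs rather than hard-coding it.

    ```python
    import re

    # Toy KB of (entity, predicate) -> answer pairs -- hypothetical data.
    KB = {
        ("Barack Obama", "birth_date"): "1961-08-04",
        ("Texas", "capital"): "Austin",
    }

    # Question templates with an entity placeholder, each mapped to the KB
    # predicate it most probably expresses (learned from QA pairs in practice).
    TEMPLATES = [
        (re.compile(r"what'?s the birth date of (?P<person>.+)\?", re.I), "birth_date"),
        (re.compile(r"what is the capital of (?P<place>.+)\?", re.I), "capital"),
    ]

    def answer(question):
        """Match the question against the templates and look up the answer."""
        for pattern, predicate in TEMPLATES:
            m = pattern.match(question.strip())
            if m:
                entity = next(iter(m.groupdict().values()))  # the filled placeholder
                return KB.get((entity, predicate))
        return None

    print(answer("What's the birth date of Barack Obama?"))  # -> 1961-08-04
    ```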

  2. Keyword Based

    The idea is to extract keywords from the question, map them to the ontology, and find a connection in the ontology that links these keywords. Usually the connection is a tree, and it is easy to convert the tree into a SPARQL query. This method identifies some keywords and tries to understand the question through them, but it does not explore the semantic structure of the question (see the sketch after the references below).

    Main work:

    [4] Haase et al. Semantic Wiki Search. ESWC 2009.

    [5] Yahya et al. Natural Language Questions for the Web of Data. EMNLP 2012.
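
    Below is a minimal sketch of the keyword-based idea, assuming a hand-made keyword-to-ontology mapping and a fixed connection shape (class -- property --> entity); all URIs are hypothetical. A real system instead searches the ontology graph for the best connection between the mapped keywords.

    ```python
    # Hypothetical keyword-to-ontology mapping.
    KEYWORD_TO_URI = {
        "city": "ont:City",              # a class
        "located in": "ont:locatedIn",   # a property
        "california": "ont:California",  # an entity
    }

    def keywords_to_sparql(keywords):
        """Connect the mapped keywords into a tiny tree and serialise it as SPARQL."""
        cls, prop, entity = (KEYWORD_TO_URI[k] for k in keywords)
        return (
            "SELECT ?x WHERE {\n"
            f"  ?x a {cls} .\n"
            f"  ?x {prop} {entity} .\n"
            "}"
        )

    print(keywords_to_sparql(["city", "located in", "california"]))
    ```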

  3. DCS Based

    Dependency-based Compositional Semantics (DCS) was proposed by Liang (2011). The idea is to understand the question by building a semantic parse tree whose structure is similar to a dependency parse tree. In this representation each word/phrase has a meaning and is mapped to one predicate in the KB, where a predicate can be an entity, a type, or a binary relation. For example, in "city in California", "city" is mapped to city (a type) in the KB and "California" is mapped to California (an entity). Based on that, the semantic parse tree is constructed bottom-up, much like the generation of a dependency tree (imagine each predicate as a single word in the sentence). The tree is projective: neighbouring predicates can be connected to form a subtree, and two neighbouring subtrees can also be connected (by connecting the head predicates of the two subtrees) to form a larger subtree. DCS provides several composition rules for connecting two predicates, such as join, aggregate, and compare. Finally we obtain the semantic parse tree of the question, which the machine understands and can execute against the KB to find answers.

    Because there are several ways to compose subtrees, we may generate many candidate semantic parse trees for one question. However, each tree can be encoded as a list of features, and with the help of QA pairs we can perform supervised learning (a log-linear model trained with beam search) to weight each feature. Then we can use the top-k most probable trees to represent the question (a toy sketch of join composition and log-linear scoring follows the references below).

    The mapping from phrases to predicates is called the "lexicon"; for example, "city" --> city is one lexical item of this lexicon. At first we have no idea which predicate a phrase should map to, so the lexicon may contain many candidate items of the form "city" --> predicate. Fortunately, the selection of lexical items can itself be added to the feature list, so training can hopefully identify the highly weighted lexical items. Liang performed a closed-domain evaluation on a geography KB, where the initial lexicon contained "city" --> city / state / river / population / ...; since answers can only be found by interpreting "city" as city, the system learns a high weight for "city" --> city and low weights for the other lexical items. For open-domain QA, lexicon generation is a challenge because the lexicon cannot contain too many useless lexical items, which would harm efficiency and precision. Liang did not evaluate open-domain QA with the full version of DCS; his student Berant simplified the DCS model (supporting only three composition rules: bridge, join, and aggregate) and evaluated its accuracy on Freebase.

    Main work:

    [6] Liang et al. Learning Dependency-Based Compositional Semantics. ACL 2011 & Computational Linguistics 2013.

    [7] Berant et al. Semantic Parsing on Freebase from Question-Answer pairs. EMNLP 2013.
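
    Below is a toy sketch of the DCS idea on "city in California": a join composition over a tiny geography-style KB, plus log-linear scoring of candidate parses. The KB contents, feature names, and weights are all invented; in the real systems the weights are learned from QA pairs with beam search over candidate trees.

    ```python
    import math

    # Toy KB: types (unary predicates) and a binary relation -- hypothetical data.
    TYPES = {"city": {"Sacramento", "Austin"}, "state": {"California", "Texas"}}
    RELATIONS = {"containedIn": {("Sacramento", "California"), ("Austin", "Texas")}}

    def join(type_pred, relation, entity):
        """DCS join composition: members of the type that `relation` links to `entity`."""
        return {x for x in TYPES[type_pred] if (x, entity) in RELATIONS[relation]}

    # Candidate parses of "city in California": each is a choice of lexical items,
    # encoded here simply by the lexical-item features it uses.
    candidates = [
        {"denotation": join("city", "containedIn", "California"),
         "features": {"lex:city->city": 1, "lex:in->containedIn": 1}},
        {"denotation": join("state", "containedIn", "California"),
         "features": {"lex:city->state": 1, "lex:in->containedIn": 1}},
    ]

    # Log-linear model: these weights would be learned from QA pairs (beam search
    # over parses, upweighting features of trees whose denotation is a correct answer).
    weights = {"lex:city->city": 2.0, "lex:city->state": -1.0, "lex:in->containedIn": 0.5}

    def score(tree):
        return sum(weights.get(f, 0.0) * v for f, v in tree["features"].items())

    scores = [score(t) for t in candidates]
    norm = sum(math.exp(s) for s in scores)
    probs = [math.exp(s) / norm for s in scores]
    best_prob, best = max(zip(probs, candidates), key=lambda pair: pair[0])
    print(best["denotation"], best_prob)  # -> {'Sacramento'} with the highest probability
    ```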

  4. CCG Based

    Combinatory Categorial Grammar (CCG) was first proposed by Steedman (2000) and was used early on by Zettlemoyer & Collins (2005) for semantic parsing. Recently, many papers have used this model for semantic parsing and question answering. It is the most fine-grained model among the four methods. Similar to DCS, we try to understand the meaning of each phrase and compose the phrases to form a semantic parse tree. But a CCG parse tree is closer to a constituency parse tree than to a dependency parse tree.

    Each phrase in the sentence is mapped to a more complex logical form, for example: border --> (S\NP)/NP , lambda x. lambda y. border(y,x).

    The first part, (S\NP)/NP, is the syntactic type, which means "border" becomes a sentence (S) once it receives two noun phrases (NP), one on the left and one on the right. This syntactic type guides the composition. For example, if the word "Texas" follows "border" and its syntactic type is NP, then we can connect the nodes "border" and "Texas", getting a parent node "border Texas" with syntactic type S\NP. Note that this composition creates a new parent node, whereas DCS composition does not (it connects two subtrees, one as parent and the other as child); that is why CCG parsing is similar to constituency parsing rather than dependency parsing. In addition, the symbols S, NP, ... stand for the constituent of a text span, with almost the same meaning as the symbols S, NP, VP, PP in constituency parsing.

    The second part is the semantic logical form of "border": the phrase is mapped to border in the KB, and when composing with other nodes, the logical forms of those nodes (e.g., "Texas") are applied as arguments to the logical form of "border". After composition we obtain the final tree node, which has the syntactic type S, and the final logical form, which is the representation of the question (a toy sketch of this composition follows the references below). For a detailed understanding, please refer to "Zettlemoyer, Collins. Learning to map sentences to logical form: Structured classification with probabilistic categorial grammars".

    However, the lexicon is more complex than in DCS: each entry carries more information, and because of the different interactions with neighbouring phrases, one phrase may need several lexical items even when the phrase has no polysemy. Moreover, most previous papers learn the lexicon from annotated questions, but this kind of training data is very limited. For the open domain, Cai and Yates (2013) made an effort to map phrases to Freebase predicates: they use Freebase entity pairs to extract sentences and then take keywords from those sentences as the phrases of lexical items. But the method does not seem reliable.

    Main Work:

    [8] Steedman. The Syntactic Process. MIT Press, 2000.

    [9] Zettlemoyer and Collins. Learning to map sentences to logical form: Structured classification with probabilistic categorial grammars. 2005.

    [10] Kwiatkowski et al. Inducing probabilistic CCG grammars from logical form with higher-order unification. EMNLP 2010.

    [11] Krishnamurthy et al. Weakly supervised training of semantic parsers. EMNLP 2012.

    [12] Kwiatkowski et al. Scaling Semantic Parsers with On-the-fly Ontology Matching. 2013.

    [13] Cai, Yates. Large-scale Semantic Parsing via Schema Matching and Lexicon Extension. ACL 2013.
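
    Below is a toy sketch of CCG-style composition for "Utah borders Texas", using Python lambdas as logical forms. The lexicon entries and the simple type-string handling are illustrative only, not the learned lexicons or parsers of the papers above.

    ```python
    # Illustrative lexicon: phrase -> (syntactic type, logical form).
    LEXICON = {
        "borders": ("(S\\NP)/NP", lambda x: lambda y: ("border", y, x)),
        "Texas":   ("NP", "texas"),
        "Utah":    ("NP", "utah"),
    }

    def forward_apply(func, arg):
        """X/Y followed by Y gives X; the argument's logical form is fed to the functor's."""
        (f_syn, f_sem), (a_syn, a_sem) = func, arg
        assert f_syn.endswith("/" + a_syn)
        return f_syn[: -len("/" + a_syn)].strip("()"), f_sem(a_sem)

    def backward_apply(arg, func):
        """Y followed by X\\Y gives X, applying the functor to the argument on its left."""
        (a_syn, a_sem), (f_syn, f_sem) = arg, func
        assert f_syn.endswith("\\" + a_syn)
        return f_syn[: -len("\\" + a_syn)].strip("()"), f_sem(a_sem)

    # "borders Texas" -> (S\NP, lambda y. border(y, texas)); a new parent node is created.
    vp = forward_apply(LEXICON["borders"], LEXICON["Texas"])
    # "Utah borders Texas" -> (S, border(utah, texas)), the final logical form.
    sentence = backward_apply(LEXICON["Utah"], vp)
    print(sentence)  # ('S', ('border', 'utah', 'texas'))
    ```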

Members