
Robust Knowledge Acquisition in Answering Information-seeking Questions At Scale

  • Author / Creator
    Kamalloo, Ehsan
  • Answering information-seeking questions involves retrieving relevant documents from a massive haystack of unstructured text corpora. This dissertation aims to build question answering (QA) systems that can be deployed in the wild, where incoming questions may be noisy and their distribution inevitably shifts from that of the training data. At a high level, we tackle three distinct problems arising in real-world scenarios: from the modelling perspective, how to build robust and scalable QA models and how to acquire knowledge from text that is useful for answering questions; and from the evaluation perspective, how to reliably evaluate retrieval-based QA models.
    Towards this goal, we first study the problem of adapting classical IR models for QA tasks. For this purpose, we investigate a basic yet salient linguistic feature of text: the relationship between the ordering of words in an answer passage and that of a question. In particular, we present a sparse retrieval model that treats n-grams as single compound terms to represent local word order. Second, our focus shifts to the generalizability of QA models via data augmentation. To this end, we design a sample-efficient data augmentation framework, inspired by adversarial training methods, that makes QA models robust to distribution shift. Third, we present a novel knowledge acquisition method that helps address ambiguity in questions. In particular, we aim to automatically derive meta-information about the spatial grounding of location mentions in text. Our method requires no supervision and leverages the structural interactions between the mentions in a document. Finally, we focus on the reliability of evaluation benchmarks in information-seeking QA. Specifically, we highlight that existing benchmarks are heavily skewed toward passage-level information. Our analysis paves the way for designing future benchmarks that better reflect the true performance of QA models.
    Overall, in pursuit of achieving genuine human-level QA systems that can be readily used in real-world applications, the present thesis highlights the key requirements of knowledge acquisition, robustness to distribution shift, scalability, and reliable evaluation.
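    The compound n-gram idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the thesis's actual model; the function names and the simple overlap scoring are hypothetical stand-ins for a full sparse retrieval scorer such as BM25. The point it demonstrates is that treating each n-gram as a single indivisible term makes a bag-of-terms representation sensitive to local word order.

```python
from collections import Counter

def compound_ngram_terms(text, n=2):
    """Tokenize and emit each n-gram as one compound term, so that
    local word order survives in a sparse bag-of-terms representation."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def overlap_score(question, passage, n=2):
    """Toy relevance score: count compound n-gram terms shared
    between question and passage (min of the two term frequencies)."""
    q = Counter(compound_ngram_terms(question, n))
    p = Counter(compound_ngram_terms(passage, n))
    return sum(min(count, p[term]) for term, count in q.items())

question = "who wrote the origin of species"
p1 = "charles darwin wrote the origin of species in 1859"
p2 = "species origin the of wrote who"  # same words, scrambled order

print(overlap_score(question, p1))  # 4 shared bigrams
print(overlap_score(question, p2))  # 0 shared bigrams
```

    A unigram bag-of-words model would score the two passages similarly, since they contain the same vocabulary; the compound-term view separates them because only the first passage preserves the question's word order.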

  • Subjects / Keywords
  • Graduation date
    Fall 2022
  • Type of Item
    Thesis
  • Degree
    Doctor of Philosophy
  • DOI
    https://doi.org/10.7939/r3-khy8-s837
  • License
    This thesis is made available by the University of Alberta Library with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.