  • Fall 2014

    Yu, Jiangwei

    We study the problem of geotagging named entities where the goal is to identify the most relevant location of a named entity based on the content of the Web pages where the entity is mentioned. We hypothesize the relationship between the mentions of an entity and its geo-center in web pages, and...

  • Spring 2012

    Chubak, Pirooz

    Natural language text is a prominent source of representing and communicating information and knowledge. It is often desirable to search in granularities of text that are smaller than a document or to query the syntactic roles and relationships within syntactically annotated text sentences, often...

  • Fall 2021

    Hasnat, Md Arif

    We study the problem of set discovery where given a few example tuples of a desired set, we want to find the set in a collection of sets. A challenge is that the example tuples may not uniquely identify a set, and a large number of candidate sets may be returned. Our focus is on interactive...

  • Spring 2016

    Zhou, Kai

    Set expansion aims at expanding a given query seed set into a larger and more complete set by adding elements that are likely to belong to the same grouping as the elements of the query set. This thesis studies the problem of efficient set expansion; in particular, given a collection of data...

  • Fall 2023

    Samadi, Mohammadreza

    In this thesis, we study the problem of performance prediction for open-domain multi-hop Question Answering (QA), where the task is to estimate the difficulty of evaluating a multi-hop question over a corpus. Despite the extensive research on predicting the performance of ad-hoc and QA retrieval...

  • Fall 2023

    Naeim Abadi, Ali

    Product Entity Matching (PEM) is a challenging subfield of record linkage that involves linking records referring to the same real-world product. Despite recent transformer models showing near-perfect performance scores on various datasets, they struggle the most when dealing with PEM datasets....

  • Fall 2022

    Kamalloo, Ehsan

    Answering information-seeking questions involves retrieving relevant documents from a massive haystack of unstructured text corpora. This dissertation aims at building question answering (QA) systems that can be deployed in the wild where incoming questions may be noisy and their distribution...

  • Fall 2021

    Su, Yuchen

    A large portion of quantitative information about entities mentioned in Web pages is expressed as Web tables, and these tables often lack proper schema and annotation, which introduces challenges for the purpose of querying and further analysis. In this thesis, we study the problem of annotating...

  • Fall 2017

    Singh, Sanket Kumar

    Many applications that use geographical databases (a.k.a. gazetteers) rely on the accuracy of the information in the database. However, poor data quality is an issue in gazetteers; often data is integrated from multiple sources with different quality constraints and there may not be much detail...

  • Fall 2020

    Kassenov, Zharkyn

    The Web contains an enormous amount of structured data in the form of web tables, and there is great value in retrieving this data and harnessing it for decision making and for gaining more insights. Finding the right data on the Web and integrating it with the existing data within an organization can...
