- Chubak, Pirooz (1)
- Esteki, Afsaneh (1)
- Hasnat, Md Arif (1)
- Kamalloo, Ehsan (1)
- Kassenov, Zharkyn (1)
- Makazhanov, Aibek (1)
- Databases (3)
- Geotagging (2)
- Information Retrieval (2)
- Natural Language Processing (2)
- Question Answering (2)
- Active Learning (1)
-
Fall 2014
We study the problem of geotagging named entities where the goal is to identify the most relevant location of a named entity based on the content of the Web pages where the entity is mentioned. We hypothesize the relationship between the mentions of an entity and its geo-center in web pages, and...
-
Spring 2012
Natural language text is a prominent means of representing and communicating information and knowledge. It is often desirable to search in granularities of text that are smaller than a document or to query the syntactic roles and relationships within syntactically annotated text sentences, often...
-
Fall 2021
We study the problem of set discovery where given a few example tuples of a desired set, we want to find the set in a collection of sets. A challenge is that the example tuples may not uniquely identify a set, and a large number of candidate sets may be returned. Our focus is on interactive...
-
Spring 2016
Set expansion aims at expanding a given query seed set into a larger and more complete set by adding elements that are likely to belong to the same grouping as the elements of the query set. This thesis studies the problem of efficient set expansion; in particular, given a collection of data...
-
Fall 2023
In this thesis, we study the problem of performance prediction for open-domain multi-hop Question Answering (QA), where the task is to estimate the difficulty of evaluating a multi-hop question over a corpus. Despite the extensive research on predicting the performance of ad-hoc and QA retrieval...
-
Fall 2023
Product Entity Matching (PEM) is a challenging subfield of record linkage that involves linking records referring to the same real-world product. Despite recent transformer models showing near-perfect performance scores on various datasets, they struggle the most when dealing with PEM datasets....
-
Fall 2022
Answering information-seeking questions involves retrieving relevant documents from a massive haystack of unstructured text corpora. This dissertation aims at building question answering (QA) systems that can be deployed in the wild where incoming questions may be noisy and their distribution...
-
Fall 2021
A large portion of quantitative information about entities mentioned in Web pages is expressed as Web tables, and these tables often lack proper schema and annotation, which introduces challenges for the purpose of querying and further analysis. In this thesis, we study the problem of annotating...
-
Fall 2017
Many applications that use geographical databases (a.k.a. gazetteers) rely on the accuracy of the information in the database. However, poor data quality is an issue in gazetteers; often data is integrated from multiple sources with different quality constraints and there may not be much detail...
-
Fall 2020
The Web contains an enormous amount of structured data in the form of web tables, and there is great value in retrieving this data and harnessing it for decision making and gaining more insights. Finding the right data on the Web and integrating it with the existing data within an organization can...