Geotagging Named Entities in Web Pages
- Author / Creator
- Yu, Jiangwei
We study the problem of geotagging named entities, where the goal is to identify
the most relevant location of a named entity based on the content of the web pages
where the entity is mentioned. We hypothesize a relationship between the mentions
of an entity and its geo-center in web pages, and propose a framework that
explores this hypothesis and provides a model that produces a ranked list of locations
at different location granularities for an entity. We further study the problem
of dispersion, and show that the dispersion of a name can be estimated and a geo-center
can be detected at an exact dispersion level.
Two key features of our approach are: (i) minimal assumptions are made about the
structure of the mentions, hence the approach can be applied to a diverse and heterogeneous
set of web pages, and (ii) the approach is unsupervised, leveraging shallow
English linguistic features and large gazetteers.
We evaluate our methods under different settings and with different categories
of named entities. Our evaluation reveals that the geo-center of a name can be
estimated with good accuracy based on some simple statistics of the mentions,
and that the accuracy of the estimation varies with the category of the name.
- Graduation date
- Fall 2014
- Type of Item
- Thesis
- Degree
- Master of Science
- License
- This thesis is made available by the University of Alberta Libraries with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.