This file is in the following communities:

Graduate Studies and Research, Faculty of


This file is in the following collections:

Theses and Dissertations

Analyzing Controversy in Wikipedia (Open Access)


Other title
Structure classifier
Collaboration network
Type of item
Degree grantor: University of Alberta
Author or creator: Sepehri Rad, Hoda
Supervisor and department: Barbosa, Denilson (computing science)
Examining committee member and department:
Inkpen, Diana (computing science)
Zaiane, Osmar (computing science)
Harms, Janelle (computing science)
Schuurmans, Dale (computing science)
Barbosa, Denilson (computing science)
Department of Computing Science

Date accepted
Graduation date
Doctor of Philosophy
Degree level
This thesis describes a novel controversy model that supports the current manual process by automatically identifying controversial Wikipedia articles and warning readers about disputable information they contain. The model identifies collaboration patterns among editors and infers their attitudes towards one another. These are represented as a social network that captures the overall structure of the collaboration history among the editors of each article. A set of features, rooted in sound theories of social behavior, is extracted from each network to train a classifier that distinguishes controversial articles from other articles. To infer attitudes, a novel supervised approach is employed, based on votes cast in Wikipedia admin elections. The combination of structural features extracted from each network and the method for inferring editors' attitudes yields an accurate and efficient controversy model, as demonstrated by several experiments and comparisons with other methods. A systematic evaluation and comparison of previous controversy models is also provided. The results show that most of these models fall short in capturing the complex process by which controversy forms, and demonstrate the power of editor collaboration networks for modeling this process. Finally, to give more insight into controversial topics, a novel framework is proposed to analyze controversy at a finer-grained level. Two approaches are proposed within this framework. The first aims to separate the most controversial parts of each article from the non-controversial, reliable parts. This is shown to be a challenging problem, both in designing a suitable method and in providing a quantitative evaluation. The second approach, on the other hand, produces a ranked list of the revisions that contributed most to the controversy of the article. For this approach, a solution based on the maximum coverage problem is proposed, and its usefulness is shown through quantitative results and several case studies.
This thesis is made available by the University of Alberta Libraries with permission of the copyright owner solely for the purpose of private, scholarly or scientific research. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.
Citation for previous publication
Identifying Controversial Wikipedia Articles Using Editor Collaboration Networks. Hoda Sepehri Rad and Denilson Barbosa. ACM TIST, 6(1):5, 2015.
Leveraging editor collaboration patterns in Wikipedia. Hoda Sepehri-Rad, Aibek Makazhanov, Davood Rafiei and Denilson Barbosa. In Proceedings of the 23rd ACM conference on Hypertext and social media, pp. 13-22. ACM, 2012.
Identifying controversial articles in Wikipedia: A comparative study. Hoda Sepehri-Rad and Denilson Barbosa. In 8th International Symposium on Wikis and Open Collaboration, pp. 7:1-7:10. ACM, 2012.
Towards identifying arguments in Wikipedia pages. Hoda Sepehri-Rad and Denilson Barbosa. In Proceedings of the 20th International Conference on World Wide Web, pp. 117-118. ACM, 2011.

File Details

Date Uploaded
Date Modified
Audit status: Audits have not yet been run on this file.
File format: pdf (Portable Document Format)
Mime type: application/pdf
File size: 2149808
Last modified: 2016:06:16 17:08:59-06:00
Filename: Sepehri Rad_Hoda_201510_PhD.pdf.pdf
Original checksum: 9d7b3a700d7817f668d1d5596c83a965
Well formed: false
Valid: false
Status message: Invalid page tree node offset=1808154
Status message: Unexpected error in findFonts java.lang.ClassCastException: edu.harvard.hul.ois.jhove.module.pdf.PdfSimpleObject cannot be cast to edu.harvard.hul.ois.jhove.module.pdf.PdfDictionary offset=3336
Page count: 72