A Novel Framework for Unique People Count from Monocular Videos

Mukherjee, Satarupa

doi:10.7939/R31M4W

Author / Creator

Mukherjee, Satarupa
Counting the unique number of people in a video (i.e., counting each person only once as the person passes through the field of view (FOV)) is required in many video analytics applications, such as transit passenger and pedestrian volume counts in railway stations, malls and road intersections, security and resource management, urban planning, advertising and many others.

In this PhD thesis I have developed a robust algorithm to generate a unique people count from monocular videos taken from an arbitrary angle. From an application point of view, my algorithm is one of the most economical, because it can work with video cameras that are already mounted. Within a region of interest (ROI) in the FOV of the camera, I compute the influx/outflux rate of people, i.e., the number of people entering or leaving the ROI per unit time. Then, I sum the influx/outflux rate between any two time points to estimate the number of people that entered and/or left the ROI within that time interval. I employ two well-known computer vision techniques for this purpose: Gaussian process regression (GPR) to estimate the number of people present within an ROI and optical flow-based tracking of the boundary of the ROI.
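
The summation step described above can be illustrated with a minimal sketch. It is not the thesis implementation: the function name `people_between`, the fixed sampling interval `dt`, and the rate values are all hypothetical, and the rates are assumed to be already estimated (in the thesis, via GPR counts and ROI boundary tracking, which are not reproduced here).

```python
from typing import Sequence


def people_between(rates: Sequence[float], dt: float, start: int, end: int) -> float:
    """Sum a per-unit-time rate over samples [start, end) to get a people count.

    rates: influx (or outflux) rate at each sample, in people per unit time.
    dt:    time between consecutive samples (assumed constant here).
    """
    if not 0 <= start <= end <= len(rates):
        raise ValueError("start and end must satisfy 0 <= start <= end <= len(rates)")
    # Rate times interval length gives people per sample; add them up.
    return sum(rates[start:end]) * dt


# Hypothetical, made-up rates sampled every 0.5 time units.
influx_rate = [0.0, 2.0, 4.0, 2.0, 0.0, 0.0]
outflux_rate = [0.0, 0.0, 0.0, 2.0, 4.0, 2.0]
dt = 0.5

entered = people_between(influx_rate, dt, 0, len(influx_rate))
left = people_between(outflux_rate, dt, 0, len(outflux_rate))
print(f"entered: {entered}, left: {left}")
```

Because the count comes from summing rates rather than from following individuals, the same call works for any pair of time points by changing `start` and `end`.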

The principal roadblock in most computer vision problems is occlusion. To avoid this bottleneck, we combine (a) the concept of influx and outflux of fluid mass from computational fluidics, (b) GPR to estimate the number of people within an ROI and (c) ROI boundary tracking (as opposed to object or feature tracking) for a short period. Thus, the principal contribution of the thesis is to handle occlusions successfully by computing the average influx and/or outflux of people and avoiding people detection and tracking.

We validate the proposed algorithm on 19 publicly available monocular benchmark videos. Occlusions are abundant in these videos, yet we obtain more than 95% accuracy on most of them. We also extend our proposed framework beyond monocular videos and apply it to multiple views of a publicly available dataset with about 99% accuracy.
Subjects / Keywords
Graduation date

Spring 2014
Type of Item

Thesis
Degree

Doctor of Philosophy
DOI

https://doi.org/10.7939/R31M4W
License

This thesis is made available by the University of Alberta Libraries with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.

Language

English
Institution

University of Alberta
Degree level

Doctoral
Department
- Department of Computing Science
Supervisor / co-supervisor and their department(s)
- Ray, Nilanjan (Computing Science)
Examining committee members and their departments
- Saha, Punam (Electrical & Computer Engineering, The University of Iowa, USA)
- Mandal, Mrinal (Electrical & Computer Engineering)
- Cheng, Irene (Computing Science)
- Boulanger, Pierre (Computing Science)
- Zhang, Hong (Computing Science)