Moving Object Detection Using Unsupervised and Weakly Supervised Neural Networks in Videos with Illumination Changes and Dynamic Background

  • Author / Creator
    Bahri, Fateme
  • Background subtraction is a crucial task in computer vision applications such as video surveillance, traffic monitoring, autonomous navigation, and human-computer interaction. The approach acquires a background model that is used to separate moving objects from the background in an input image. However, challenges such as sudden and gradual illumination changes and dynamic backgrounds can make this task difficult. Among the many methods proposed for background subtraction, supervised deep learning-based techniques are currently considered state-of-the-art, but they require pixel-wise ground-truth labels, which are time-consuming and expensive to produce. The aim of this thesis is to develop unsupervised and weakly supervised background subtraction methods that can handle illumination changes or dynamic backgrounds.

    Most methods handle illumination changes and shadows in batch mode, which makes them unsuitable for long video sequences or real-time applications. To address this, we extend a state-of-the-art batch Moving Object Detection (MOD) method, ILISD, to an online/incremental MOD method built on unsupervised and generative neural networks. Our method operates on illumination-invariant image representations and uses a neural network to obtain a low-dimensional representation of the background image. It then decomposes the foreground image into illumination changes and moving objects.
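
    The following is a minimal PyTorch sketch of this decomposition idea, not the thesis's actual architecture: a small autoencoder supplies the low-dimensional background representation, and the residual between a frame and its reconstructed background is split into a smooth illumination component and a sparse moving-object mask. All layer sizes, the blur-based illumination estimate, and the threshold are illustrative assumptions.

    ```python
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class BackgroundAE(nn.Module):
        """Learns a low-dimensional (latent) representation of the background."""
        def __init__(self, n_pixels: int, latent_dim: int = 16):
            super().__init__()
            self.encoder = nn.Sequential(nn.Linear(n_pixels, 256), nn.ReLU(),
                                         nn.Linear(256, latent_dim))
            self.decoder = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(),
                                         nn.Linear(256, n_pixels))

        def forward(self, frame: torch.Tensor) -> torch.Tensor:
            return self.decoder(self.encoder(frame))

    def decompose(frame, background, blur_kernel=15, fg_threshold=0.1):
        """Split frame - background into illumination change + moving objects."""
        residual = frame - background
        h = w = int(residual.numel() ** 0.5)
        img = residual.view(1, 1, h, w)
        # Illumination changes vary smoothly over the image, so a heavy blur
        # of the residual serves as a crude illumination-change estimate.
        illumination = F.avg_pool2d(img, blur_kernel, stride=1,
                                    padding=blur_kernel // 2).view(-1)
        # Moving objects are the sparse, large-magnitude remainder.
        objects = residual - illumination
        return illumination, objects.abs() > fg_threshold

    # Usage on a synthetic 64x64 grayscale frame, flattened to a vector.
    n = 64 * 64
    model = BackgroundAE(n)
    frame = torch.rand(n)
    with torch.no_grad():
        background = model(frame)
    illumination, fg_mask = decompose(frame, background)
    ```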

    Yet another challenge is a dynamic background, where a background pixel can take different values over time because of periodic or irregular movements, degrading a method's performance; for example, surging water, water fountains, and waving trees all cause dynamic variations in the background. To address this, we propose a new unsupervised method, DBSGen, which estimates a dense dynamic motion map with a generative multi-resolution convolutional network and warps the input images by the estimated motion map. A generative fully connected network then generates background images, using the warped input images in its reconstruction loss term. Finally, a pixel-wise distance threshold based on a dynamic entropy map yields the binary segmentation results.
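
    Here is a minimal PyTorch sketch of this pipeline, under loose assumptions: a small convolutional network stands in for the multi-resolution motion network, grid_sample performs the warping, a simple frame average stands in for the generative background network, and a per-pixel variance map stands in for the dynamic entropy map. The names MotionNet, warp, and segment and all sizes are hypothetical.

    ```python
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class MotionNet(nn.Module):
        """Predicts a dense 2-channel (dx, dy) motion map for each frame."""
        def __init__(self):
            super().__init__()
            self.net = nn.Sequential(
                nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
                nn.Conv2d(16, 2, 3, padding=1), nn.Tanh())  # offsets in [-1, 1]

        def forward(self, x):
            return self.net(x) * 0.05  # keep predicted offsets small

    def warp(frames, motion):
        """Warp frames (N,1,H,W) by motion (N,2,H,W) via a sampling grid."""
        n, _, h, w = frames.shape
        ys, xs = torch.meshgrid(torch.linspace(-1, 1, h),
                                torch.linspace(-1, 1, w), indexing="ij")
        base = torch.stack((xs, ys), dim=-1).unsqueeze(0).expand(n, h, w, 2)
        grid = base + motion.permute(0, 2, 3, 1)
        return F.grid_sample(frames, grid, align_corners=True)

    def segment(frames, background, entropy_map, base_thresh=0.1):
        """Pixel-wise threshold, loosened where the dynamic entropy is high."""
        dist = (frames - background).abs()
        return dist > base_thresh * (1.0 + entropy_map)

    # Usage on synthetic frames.
    frames = torch.rand(4, 1, 32, 32)
    motion = MotionNet()(frames)
    warped = warp(frames, motion)
    background = warped.mean(dim=0, keepdim=True)  # stand-in background generator
    entropy = frames.var(dim=0, keepdim=True)      # stand-in dynamic entropy map
    mask = segment(warped, background, entropy)
    ```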

    Finally, we propose a weakly supervised background subtraction method whose training set is a moving-object-free sequence of images, so no per-pixel ground-truth annotations are required. Our method consists of two neural networks. The first network, an autoencoder, generates dynamic background images that are used to train the second network; these dynamic background images are obtained by applying a threshold to background-subtracted images. The second network is a U-Net that takes the same moving-object-free video as input and is trained with the dynamic background images produced by the autoencoder as its target output. During the testing phase, the autoencoder and the U-Net process input images to generate background and dynamic background images, respectively. The dynamic background image removes dynamic motion from the background-subtracted image, resulting in a foreground image free of dynamic artifacts.
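
    A minimal PyTorch sketch of this training scheme follows, with tiny convolutional networks standing in for the thesis's autoencoder and U-Net; the threshold, loss weighting, and training loop are illustrative assumptions rather than the actual implementation.

    ```python
    import torch
    import torch.nn as nn

    autoencoder = nn.Sequential(                     # background generator
        nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
        nn.Conv2d(8, 1, 3, padding=1))
    unet = nn.Sequential(                            # tiny stand-in for the U-Net
        nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
        nn.Conv2d(8, 1, 3, padding=1))

    def dynamic_background_target(frames, background, thresh=0.05):
        """Keep only large background-subtraction residuals (dynamic motion)."""
        residual = (frames - background).abs()
        return residual * (residual > thresh)

    frames = torch.rand(8, 1, 32, 32)                # object-free training video
    opt = torch.optim.Adam(list(autoencoder.parameters()) +
                           list(unet.parameters()), lr=1e-3)
    for _ in range(10):                              # toy training loop
        background = autoencoder(frames)
        target = dynamic_background_target(frames, background).detach()
        loss = ((background - frames) ** 2).mean() \
             + ((unet(frames) - target) ** 2).mean()
        opt.zero_grad(); loss.backward(); opt.step()

    # Test phase: remove dynamic motion from the background-subtracted image.
    with torch.no_grad():
        test = torch.rand(1, 1, 32, 32)
        foreground = (test - autoencoder(test)).abs() - unet(test)
        mask = foreground > 0.1
    ```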

  • Subjects / Keywords
  • Graduation date
    Fall 2023
  • Type of Item
    Thesis
  • Degree
    Doctor of Philosophy
  • DOI
    https://doi.org/10.7939/r3-2rh2-vc69
  • License
    This thesis is made available by the University of Alberta Libraries with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.