Moving Object Detection Using Unsupervised and Weakly Supervised Neural Networks in Videos with Illumination Changes and Dynamic Background

  • Author / Creator
    Bahri, Fateme
  • Background subtraction is a crucial task in computer vision applications such as video surveillance, traffic monitoring, autonomous navigation, and human-computer interaction. The approach acquires a background model that is used to separate moving objects from the background in an input image. However, challenges such as sudden and gradual illumination changes and dynamic backgrounds can make this task difficult. Among the many methods proposed for background subtraction, supervised deep learning-based techniques are currently considered state-of-the-art, but they require pixel-wise ground-truth labels, which are time-consuming and expensive to produce. The aim of this thesis is to develop unsupervised and weakly supervised background subtraction methods that can handle illumination changes or dynamic backgrounds.

    Most methods handle illumination changes and shadows in batch mode, which makes them unsuitable for long video sequences or real-time applications. To address this, we extend a state-of-the-art batch Moving Object Detection (MOD) method, ILISD, to an online/incremental MOD method built on unsupervised and generative neural networks. Our method operates on illumination-invariant image representations and uses a neural network to obtain a low-dimensional representation of the background image. It then decomposes the foreground image into illumination changes and moving objects.
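
    The following is a minimal PyTorch sketch of this decomposition idea, not the thesis's actual architecture: a small autoencoder supplies the low-dimensional background representation, and the residual between a frame and its reconstructed background is split into a smooth illumination component and a sparse moving-object mask. All layer sizes, the blur-based illumination estimate, and the threshold are illustrative assumptions.

    ```python
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class BackgroundAE(nn.Module):
        """Learns a low-dimensional (latent) representation of the background."""
        def __init__(self, n_pixels: int, latent_dim: int = 16):
            super().__init__()
            self.encoder = nn.Sequential(nn.Linear(n_pixels, 256), nn.ReLU(),
                                         nn.Linear(256, latent_dim))
            self.decoder = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(),
                                         nn.Linear(256, n_pixels))

        def forward(self, frame: torch.Tensor) -> torch.Tensor:
            return self.decoder(self.encoder(frame))

    def decompose(frame, background, blur_kernel=15, fg_threshold=0.1):
        """Split frame - background into illumination change + moving objects."""
        residual = frame - background
        h = w = int(residual.numel() ** 0.5)
        img = residual.view(1, 1, h, w)
        # Illumination changes vary smoothly over the image, so a heavy blur
        # of the residual serves as a crude illumination-change estimate.
        illumination = F.avg_pool2d(img, blur_kernel, stride=1,
                                    padding=blur_kernel // 2).view(-1)
        # Moving objects are the sparse, large-magnitude remainder.
        objects = residual - illumination
        return illumination, objects.abs() > fg_threshold

    # Usage on a synthetic 64x64 grayscale frame, flattened to a vector.
    n = 64 * 64
    model = BackgroundAE(n)
    frame = torch.rand(n)
    with torch.no_grad():
        background = model(frame)
    illumination, fg_mask = decompose(frame, background)
    ```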

    Yet another challenge is a dynamic background, where a background pixel can take different values over time because of periodic or irregular movements, degrading a method's performance; for example, surging water, water fountains, and waving trees all cause dynamic variations in the background. To address this, we propose a new unsupervised method, DBSGen, which estimates a dense dynamic motion map with a generative multi-resolution convolutional network and warps the input images by the estimated motion map. A generative fully connected network then generates background images, using the warped input images in its reconstruction loss term. Finally, a pixel-wise distance threshold based on a dynamic entropy map yields the binary segmentation results.
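
    Here is a minimal PyTorch sketch of this pipeline, under loose assumptions: a small convolutional network stands in for the multi-resolution motion network, grid_sample performs the warping, a simple frame average stands in for the generative background network, and a per-pixel variance map stands in for the dynamic entropy map. The names MotionNet, warp, and segment and all sizes are hypothetical.

    ```python
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class MotionNet(nn.Module):
        """Predicts a dense 2-channel (dx, dy) motion map for each frame."""
        def __init__(self):
            super().__init__()
            self.net = nn.Sequential(
                nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
                nn.Conv2d(16, 2, 3, padding=1), nn.Tanh())  # offsets in [-1, 1]

        def forward(self, x):
            return self.net(x) * 0.05  # keep predicted offsets small

    def warp(frames, motion):
        """Warp frames (N,1,H,W) by motion (N,2,H,W) via a sampling grid."""
        n, _, h, w = frames.shape
        ys, xs = torch.meshgrid(torch.linspace(-1, 1, h),
                                torch.linspace(-1, 1, w), indexing="ij")
        base = torch.stack((xs, ys), dim=-1).unsqueeze(0).expand(n, h, w, 2)
        grid = base + motion.permute(0, 2, 3, 1)
        return F.grid_sample(frames, grid, align_corners=True)

    def segment(frames, background, entropy_map, base_thresh=0.1):
        """Pixel-wise threshold, loosened where the dynamic entropy is high."""
        dist = (frames - background).abs()
        return dist > base_thresh * (1.0 + entropy_map)

    # Usage on synthetic frames.
    frames = torch.rand(4, 1, 32, 32)
    motion = MotionNet()(frames)
    warped = warp(frames, motion)
    background = warped.mean(dim=0, keepdim=True)  # stand-in background generator
    entropy = frames.var(dim=0, keepdim=True)      # stand-in dynamic entropy map
    mask = segment(warped, background, entropy)
    ```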

    Finally, we propose a weakly supervised background subtraction method whose training set is a moving-object-free sequence of images, so no per-pixel ground-truth annotations are required. Our method consists of two neural networks. The first network, an autoencoder, generates dynamic background images that are used to train the second network; these dynamic background images are obtained by applying a threshold to background-subtracted images. The second network is a U-Net that takes the same moving-object-free video as input and is trained with the dynamic background images produced by the autoencoder as its target output. During the testing phase, the autoencoder and the U-Net process input images to generate background and dynamic background images, respectively. The dynamic background image removes dynamic motion from the background-subtracted image, resulting in a foreground image free of dynamic artifacts.
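
    A minimal PyTorch sketch of this training scheme follows, with tiny convolutional networks standing in for the thesis's autoencoder and U-Net; the threshold, loss weighting, and training loop are illustrative assumptions rather than the actual implementation.

    ```python
    import torch
    import torch.nn as nn

    autoencoder = nn.Sequential(                     # background generator
        nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
        nn.Conv2d(8, 1, 3, padding=1))
    unet = nn.Sequential(                            # tiny stand-in for the U-Net
        nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
        nn.Conv2d(8, 1, 3, padding=1))

    def dynamic_background_target(frames, background, thresh=0.05):
        """Keep only large background-subtraction residuals (dynamic motion)."""
        residual = (frames - background).abs()
        return residual * (residual > thresh)

    frames = torch.rand(8, 1, 32, 32)                # object-free training video
    opt = torch.optim.Adam(list(autoencoder.parameters()) +
                           list(unet.parameters()), lr=1e-3)
    for _ in range(10):                              # toy training loop
        background = autoencoder(frames)
        target = dynamic_background_target(frames, background).detach()
        loss = ((background - frames) ** 2).mean() \
             + ((unet(frames) - target) ** 2).mean()
        opt.zero_grad(); loss.backward(); opt.step()

    # Test phase: remove dynamic motion from the background-subtracted image.
    with torch.no_grad():
        test = torch.rand(1, 1, 32, 32)
        foreground = (test - autoencoder(test)).abs() - unet(test)
        mask = foreground > 0.1
    ```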

  • Subjects / Keywords
  • Graduation date
    Fall 2023
  • Type of Item
    Thesis
  • Degree
    Doctor of Philosophy
  • DOI
    https://doi.org/10.7939/r3-2rh2-vc69
  • License
    This thesis is made available by the University of Alberta Libraries with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.