Robust and Accurate Generic Visual Object Tracking Using Deep Neural Networks in Unconstrained Environments

  • Author / Creator
    Khaghani, Javad
  • The availability of affordable cameras and video-sharing platforms has produced a massive amount of low-cost video. Automatically tracking objects of interest in these videos is an essential step for complex visual analyses. As a fundamental computer vision task, Visual Object Tracking aims to accurately (and efficiently) locate a target in an arbitrary video, given an initial bounding box in the first frame. While state-of-the-art deep trackers provide promising results, they still suffer from performance degradation in challenging scenarios, including small targets, occlusion, and viewpoint change. Moreover, an axis-aligned bounding box enclosing the target cannot fully describe its boundaries, and a tracker's performance relies on well-crafted modules, typically manually designed network architectures. In this thesis, first, a context-aware IoU-guided tracker is proposed that exploits a multitask two-stream network and an offline reference proposal generation strategy to improve accuracy when tracking class-agnostic small objects in aerial videos captured from medium to high altitudes. Second, a two-stage segmentation tracker is developed to provide a better semantic interpretation of the target in videos. Finally, a novel cell-level differentiable architecture search with early stopping is introduced into the Siamese tracking framework to automate the network design of the tracking module, aiming to adapt the backbone features to the tracking objective. Extensive experimental evaluations on widely used generic and aerial visual tracking benchmarks demonstrate the effectiveness of the proposed methods.
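    For readers unfamiliar with the IoU-guided tracking mentioned in the abstract: such trackers rank candidate bounding boxes by their intersection over union (IoU) with a reference box. The following minimal Python sketch of the IoU computation for axis-aligned boxes is illustrative only and is not code from the thesis:

        def iou(box_a, box_b):
            """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
            ix1 = max(box_a[0], box_b[0])
            iy1 = max(box_a[1], box_b[1])
            ix2 = min(box_a[2], box_b[2])
            iy2 = min(box_a[3], box_b[3])
            # Overlap is zero when the boxes do not intersect.
            inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
            area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
            area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
            union = area_a + area_b - inter
            return inter / union if union > 0 else 0.0

        # Example: two unit-overlap 2x2 boxes give IoU = 1/7.
        print(iou((0, 0, 2, 2), (1, 1, 3, 3)))  # 0.142857...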

  • Subjects / Keywords
  • Graduation date
    Spring 2022
  • Type of Item
    Thesis
  • Degree
    Master of Science
  • DOI
    https://doi.org/10.7939/r3-ge9g-a368
  • License
    This thesis is made available by the University of Alberta Libraries with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.