Deep Learning-based Framework of Summarizing Construction Videos for Vision-based Monitoring of Construction Sites

Xiao, Bo

doi:10.7939/r3-xa8t-p893

Communities and Collections

Graduate and Postdoctoral Studies (GPS), Faculty of / Theses and Dissertations

In recent years, video monitoring of construction sites has become increasingly popular worldwide, and the captured footage contains important visual information about project progress. Video monitoring also improves security at construction sites by deterring theft of materials and equipment. Furthermore, vision-based analysis of video footage benefits construction management by facilitating crew productivity, reducing safety risks, and optimizing site layouts. Despite these potential benefits, the efficient use of raw jobsite videos by construction professionals remains a challenge. In current practice, construction engineers have to manually browse an entire video to retrieve the desired information from a particular period of footage, a time-consuming and error-prone process. Storage of the footage is also challenging, given the high resolution and long streaming time typical of construction site video. Consequently, project managers have to recycle video footage every one or two weeks to free up digital storage space, discarding construction documentation that would have been invaluable as a long-term point of reference. To address these issues, this research proposes a deep learning-based framework that automatically distills raw video footage from construction sites into video highlights and text descriptions. To achieve this overarching goal, three specific objectives are pursued: (1) dataset development: developing an image dataset of construction machines for deep learning object detection; (2) highlights detection: proposing a deep learning-based method for detecting video highlights from raw construction video footage; and (3) text generation: deploying deep learning image captioning methods to generate text descriptions from construction images.
The outputs of the proposed framework (i.e., video highlights and text descriptions) will help construction engineers efficiently ascertain what is happening on a construction site without manually browsing the original construction videos. Compared with the original raw footage, the video highlights and text descriptions require much less storage space, making it practical to retain them for years rather than weeks. The proposed framework provides the foundation for several advanced applications that will benefit construction management, including: (1) auto-generating reports from daily construction videos; (2) building a querying system that searches for clips of interest based on text descriptions; and (3) quantitatively analyzing construction productivity based on video highlights. The framework proposed in this research focuses on summarizing videos of construction machines captured by stationary cameras, and it can be expanded in the future to process other types of construction videos (e.g., those of workers and materials).
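As an illustration of application (2), a text-based querying system over highlight descriptions could look roughly like the following minimal sketch. It is not the method developed in the thesis: the `Highlight` record layout, the `search_highlights` helper, and the sample captions are all hypothetical, and the matching is plain case-insensitive keyword lookup rather than any learned retrieval model.

```python
from dataclasses import dataclass


@dataclass
class Highlight:
    """One video highlight and its generated caption (hypothetical record layout)."""
    clip_id: str
    start_s: float  # offset into the raw footage, in seconds
    end_s: float
    caption: str


def search_highlights(highlights, query):
    """Return highlights whose caption contains every word of the query.

    Matching is case-insensitive and word-based; this is an assumed,
    simplistic stand-in for whatever retrieval method a real system would use.
    """
    terms = query.lower().split()
    results = []
    for h in highlights:
        words = set(h.caption.lower().split())
        if all(term in words for term in terms):
            results.append(h)
    return results


# Illustrative captions only; they are not taken from the thesis.
highlights = [
    Highlight("cam1-0001", 120.0, 150.0, "an excavator is loading a dump truck"),
    Highlight("cam1-0002", 900.0, 935.0, "a dump truck is leaving the site"),
    Highlight("cam1-0003", 1800.0, 1820.0, "an excavator is digging near the fence"),
]

matches = search_highlights(highlights, "excavator truck")
print([h.clip_id for h in matches])
```

Because only the short clips and their captions need to be stored and indexed, a lookup of this kind could operate on retained summaries long after the raw footage has been recycled.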
Graduation date

Fall 2021
Type of Item

Thesis
Degree

Doctor of Philosophy
DOI

https://doi.org/10.7939/r3-xa8t-p893
License

This thesis is made available by the University of Alberta Libraries with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.

Language

English
Institution

University of Alberta
Degree level

Doctoral
Department
- Department of Civil and Environmental Engineering
Specialization
- Civil (Cross-Disciplinary)
Supervisor / co-supervisor and their department(s)
- Kang, Shih-Chung (Civil and Environmental Engineering)