3D Multi-view Imaging: Object Contour Approximation for Depth Image Coding and Multi-view Image/Video Streaming

Yuan, Yuan

doi:10.7939/R3V11W08Z

Communities and Collections

Graduate and Postdoctoral Studies (GPS), Faculty of / Theses and Dissertations

Author / Creator

Yuan, Yuan
Thanks to the rapidly dropping cost of digital cameras, multi-view imaging, in which a series of cameras simultaneously capture images of the same 3D scene from different viewpoints, opens up a wide variety of research topics and applications. Among them, free viewpoint TV and light field cameras are two of the most important; both enable users to observe a static 3D scene while freely changing their viewpoints. However, the amount of multi-view data that must be stored or transmitted is huge, so efficient image/video coding and streaming are crucial to the success of such applications.

We first investigate object contours in depth image coding. A depth image provides partial geometric information about the captured 3D scene, which is important for synthesizing images at different virtual camera viewpoints via depth-image-based rendering (DIBR). It has been shown that lossy compression of object contours leads to bleeding artifacts in DIBR-synthesized views, while lossless coding of the exact object contours can be expensive at low bit rates. In this thesis, we propose to approximate object contours to save coding bits. Specifically, we first greedily approximate object contours based on an arithmetic edge coding (AEC) model to lower the edge coding cost. To control the synthesized view distortion induced by contour approximation, we then introduce a rate-distortion (RD) optimal scheme. We show that the object contours themselves can be suitably approximated to save coding bits, while the synthesized objects remain sharp and natural for human perception.

We then study the problem of multi-view image/video streaming. In free viewpoint TV, a user can pull color and depth videos captured from two nearby reference viewpoints and synthesize a chosen intermediate virtual view via DIBR. A user may pull the same reference views from the server as other users so that the streaming cost is shared, but a reference view farther from the chosen virtual view may increase the synthesized view distortion. In this thesis, we divide users into groups, where each user simultaneously belongs to two groups and each group shares the streaming cost of a single reference view. We aim to find a Nash Equilibrium (NE) solution of reference view selection for each user, so that the shared streaming cost and the synthesized view distortion are optimally traded off. Specifically, we first derive a lemma based on a known property of synthesized view distortion functions. We then design a search algorithm to find an NE solution, leveraging the derived lemma to reduce search complexity.

Interactive streaming of light field multi-view images is another focus of this thesis. In interactive light field streaming (ILFS), a user periodically requests a viewpoint for observation, and in response the server transmits a pre-synthesized and encoded viewpoint image to the user. The challenge is how to design and pre-encode a storage-constrained frame structure that enables efficient view navigation. In this thesis, using a variant of Lloyd’s algorithm, we recursively insert into a frame structure a set of “landmarks” at locally optimal locations to improve ILFS performance, trading off frame storage cost against expected transmission cost. Experimental results show that our proposed structure has noticeably lower expected transmission cost for the same storage than previous methods.
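
As a rough illustration of the Lloyd-style landmark placement mentioned at the end of the abstract, the sketch below runs a plain weighted Lloyd iteration on a 1-D viewpoint axis. It is not the variant developed in the thesis: the function name `lloyd_landmarks`, the 1-D layout, the request probabilities, and the use of squared distance to the nearest landmark as a stand-in for expected transmission cost are all assumptions made for this example, and the recursive insertion under a storage constraint is not modeled.

```python
from typing import List, Sequence, Tuple


def lloyd_landmarks(
    views: Sequence[float],
    probs: Sequence[float],
    init: Sequence[float],
    max_iter: int = 100,
) -> Tuple[List[float], float]:
    """Plain weighted Lloyd iteration on a 1-D viewpoint axis (hypothetical setup).

    Returns the landmark positions and the expected squared distance from a
    requested view to its nearest landmark (an assumed proxy for cost).
    """
    landmarks = list(init)
    for _ in range(max_iter):
        # Assignment step: each view is attached to its nearest landmark.
        sums = [0.0] * len(landmarks)
        weights = [0.0] * len(landmarks)
        for v, p in zip(views, probs):
            k = min(range(len(landmarks)), key=lambda i: abs(v - landmarks[i]))
            sums[k] += p * v
            weights[k] += p
        # Update step: move each landmark to the weighted centroid of its cell;
        # a landmark with no assigned views stays where it is.
        updated = [
            sums[k] / weights[k] if weights[k] > 0 else landmarks[k]
            for k in range(len(landmarks))
        ]
        if updated == landmarks:
            break  # no landmark moved, so a local optimum has been reached
        landmarks = updated
    cost = sum(
        p * min((v - m) ** 2 for m in landmarks) for v, p in zip(views, probs)
    )
    return landmarks, cost


if __name__ == "__main__":
    # Six equally likely viewpoints in two clusters, two landmarks to place.
    views = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0]
    probs = [1.0 / 6.0] * 6
    print(lloyd_landmarks(views, probs, init=[0.0, 12.0]))
```

Like any Lloyd iteration, this converges only to a local optimum that depends on the initial landmark positions, which is consistent with the abstract's wording of "locally optimal locations".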
Subjects / Keywords
- Multi-view Imaging
Graduation date

Fall 2017
Type of Item

Thesis
Degree

Doctor of Philosophy
DOI

https://doi.org/10.7939/R3V11W08Z
License

This thesis is made available by the University of Alberta Libraries with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.

Language

English
Institution

University of Alberta
Degree level

Doctoral
Department
- Department of Electrical and Computer Engineering
Specialization
- Signal and Image Processing
Supervisor / co-supervisor and their department(s)
- Jiang, Hai (Electrical and Computer Engineering)
- Zhao, H. Vicky
Examining committee members and their departments
- Kong, Linglong (Mathematical and Statistical Sciences)
- Jing, Yindi (Electrical and Computer Engineering)