

This file is in the following communities:

Graduate Studies and Research, Faculty of


This file is in the following collections:

Theses and Dissertations

3D Multi-view Imaging: Object Contour Approximation for Depth Image Coding and Multi-view Image/Video Streaming Open Access


Other title: Multi-view Imaging
Degree grantor: University of Alberta
Author or creator: Yuan, Yuan
Supervisor and department: Zhao, H. Vicky; Jiang, Hai (Electrical and Computer Engineering)
Examining committee member and department: Jing, Yindi (Electrical and Computer Engineering); Kong, Linglong (Mathematical and Statistical Sciences)
Department: Department of Electrical and Computer Engineering
Specialization: Signal and Image Processing
Graduation date: 2017-11 (Fall 2017)
Degree: Doctor of Philosophy
Abstract
Thanks to the rapidly dropping cost of digital cameras, multi-view imaging, in which a series of cameras simultaneously captures images of the same 3D scene from different viewpoints, opens a wide variety of interesting research topics and applications. Among them, free viewpoint TV and light field cameras are two of the most important; both enable users to observe a static 3D scene by freely changing their viewpoints. However, the amount of multi-view data that needs to be stored or transmitted is huge. Efficient image/video coding and streaming are therefore crucial to the success of such applications.

We first investigate object contours in depth image coding. A depth image provides partial geometric information about the captured 3D scene, which is important for synthesizing images corresponding to different virtual camera viewpoints via depth-image-based rendering (DIBR). It has been shown that lossy compression of object contours leads to bleeding artifacts in DIBR-synthesized views, while lossless coding of the exact object contours can be expensive at low rates. In this thesis, we propose to approximate object contours to save coding bits. Specifically, we first greedily approximate object contours based on an arithmetic edge coding (AEC) model to lower the edge coding cost. To control the synthesized view distortion induced by contour approximation, we then introduce a rate-distortion (RD) optimal scheme. We show that the object contours themselves can be suitably approximated to save coding bits, while the synthesized objects remain sharp and natural for human perception.

We then study the problem of multi-view image/video streaming. In free viewpoint TV, a user can pull color and depth videos captured from two nearby reference viewpoints to synthesize his chosen intermediate virtual view for observation via DIBR. A user may pull the same reference views from the server as other users so that the streaming cost can be shared; however, a reference view farther from his chosen virtual view may increase the synthesized view distortion. In this thesis, we divide users into groups, where a user simultaneously belongs to two groups and each group shares the streaming cost of a single reference view. We also aim to find a Nash Equilibrium (NE) solution of reference view selection for each user, so that the shared streaming cost and the synthesized view distortion are optimally traded off. Specifically, we first derive a lemma based on a known property of synthesized view distortion functions. We then design a search algorithm to find an NE solution, leveraging the derived lemma to reduce search complexity.

Interactively streaming light field multi-view images is another focus of this thesis. Interactive light field streaming (ILFS) means that a user periodically requests a viewpoint for observation, and in response the server transmits a pre-synthesized and encoded viewpoint image to the user. The challenge is how to design and pre-encode a storage-constrained frame structure that enables efficient view navigation. In this thesis, using a variant of Lloyd's algorithm, we recursively insert into a frame structure a set of "landmarks" at locally optimal locations to improve ILFS performance, trading off frame storage cost against the expected transmission cost. Experimental results show that our proposed structure has noticeably lower expected transmission cost for the same storage than previous methods.
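To give a feel for the landmark placement step in the ILFS work, the following is a minimal sketch of a generic one-dimensional Lloyd iteration. It is not the thesis's variant: the abstract does not specify the frame structure or the cost model, so the function names (`lloyd_landmarks`, `expected_cost`), the per-view popularity weights, and the use of squared distance to the nearest landmark as a stand-in for transmission cost are all assumptions made for illustration.

```python
from typing import List, Sequence


def lloyd_landmarks(weights: Sequence[float], landmarks: Sequence[float],
                    iterations: int = 50) -> List[float]:
    """Generic 1D Lloyd iteration over viewpoint indices 0..len(weights)-1.

    weights[v] is a hypothetical popularity weight for view v (it need not
    be normalized); landmarks holds the initial landmark positions.
    """
    current = sorted(float(p) for p in landmarks)
    for _ in range(iterations):
        # Assignment step: attach every view to its nearest landmark
        # (ties go to the landmark with the lower index).
        mass = [0.0] * len(current)
        moment = [0.0] * len(current)
        for view, w in enumerate(weights):
            nearest = min(range(len(current)),
                          key=lambda k: abs(view - current[k]))
            mass[nearest] += w
            moment[nearest] += w * view
        # Update step: move each landmark to the weighted centroid of the
        # views assigned to it; a landmark with no views stays in place.
        updated = [moment[k] / mass[k] if mass[k] > 0 else current[k]
                   for k in range(len(current))]
        if updated == current:
            break  # Locally optimal: no landmark moved.
        current = updated
    return current


def expected_cost(weights: Sequence[float], landmarks: Sequence[float]) -> float:
    """Assumed proxy cost: weighted squared distance to the nearest landmark."""
    return sum(w * min((view - p) ** 2 for p in landmarks)
               for view, w in enumerate(weights))
```

With eight equally weighted views and initial landmarks at 0 and 1, `lloyd_landmarks([1.0] * 8, [0.0, 1.0])` settles at landmarks 1.5 and 5.5, and the proxy cost drops from 91 to 10. Like any Lloyd iteration, the result is only locally optimal and depends on the initial positions.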
This thesis is made available by the University of Alberta Libraries with permission of the copyright owner solely for the purpose of private, scholarly or scientific research. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.
Citation for previous publication
Y. Yuan, G. Cheung, P. Frossard, P. L. Callet and V. H. Zhao, “Contour Approximation & Depth Image Coding for Virtual View Synthesis”, IEEE International Workshop on Multimedia Signal Processing, 2015.
Y. Yuan, G. Cheung, P. L. Callet, P. Frossard, and V. H. Zhao, “Object Shape Approximation & Contour Adaptive Depth Image Coding for Virtual View Synthesis”, accepted to IEEE Transactions on Circuits and Systems for Video Technology, August, 2017.
Y. Yuan, B. Hu, G. Cheung and V. H. Zhao, “Optimizing Peer Grouping for Live Free Viewpoint Video Streaming”, IEEE International Conference on Image Processing, 2013.
Y. Yuan, G. Cheung and P. Frossard, “Optimizing Landmark Insertions for Interactive Light Field Streaming”, accepted to IEEE International Conference on Image Processing, 2017.

File Details

File format: pdf (PDF/A)
Mime type: application/pdf
File size: 8457665
Last modified: 2017:11:08 17:30:39-07:00
Filename: Yuan_Yuan_201709_PhD.pdf
Original checksum: fd11c910f6602984a7e501bc29aade9a