3D Multi-view Imaging: Object Contour Approximation for Depth Image Coding and Multi-view Image/Video Streaming

  • Author / Creator
    Yuan, Yuan
  • Thanks to the rapidly dropping cost of digital cameras, multi-view imaging (using an array of cameras to capture the same 3D scene simultaneously from different viewpoints) opens up a wide variety of interesting research topics and applications. Among them, free viewpoint TV and light field cameras are two of the most important applications, which enable users to observe a static 3D scene by freely changing their viewpoints. However, the amount of multi-view data that needs to be stored or transmitted is huge. Therefore, efficient image/video coding and streaming are crucial to the success of such applications. We first investigate object contours in depth image coding. A depth image provides partial geometric information about the captured 3D scene, which is important for synthesizing images corresponding to different virtual camera viewpoints via depth-image-based rendering (DIBR). It has been shown that lossy compression of object contours leads to bleeding artifacts in DIBR-synthesized views, while lossless coding of the exact object contours can be expensive at low bit rates. In this thesis, we propose to approximate object contours to save coding bits. Specifically, we first greedily approximate object contours based on an arithmetic edge coding (AEC) model to lower the edge coding cost. To control the synthesized view distortion induced by contour approximation, we then introduce a rate-distortion (RD) optimal scheme. We show that the object contours themselves can be suitably approximated to save coding bits, while the synthesized objects remain sharp and natural for human perception. We then study the problem of multi-view image/video streaming. In free viewpoint TV, a user can pull color and depth videos captured from two nearby reference viewpoints to synthesize, via DIBR, a chosen intermediate virtual view for observation.
A user may pull the same reference views from the server as other users so that the streaming cost can be shared, but a reference view farther from the chosen virtual view may increase the synthesized view distortion. In this thesis, we divide users into groups, where each user simultaneously belongs to two groups and each group shares the streaming cost of a single reference view. We also aim to find a Nash Equilibrium (NE) solution of reference view selection for each user, so that the shared streaming cost and the synthesized view distortion are optimally traded off. Specifically, we first derive a lemma based on a known property of synthesized view distortion functions. We then design a search algorithm to find an NE solution, leveraging the derived lemma to reduce search complexity. Interactive streaming of light field multi-view images is another focus of this thesis. In interactive light field streaming (ILFS), a user periodically requests a viewpoint for observation, and in response the server transmits a pre-synthesized and encoded viewpoint image to the user. The challenge is how to design and pre-encode a storage-constrained frame structure that enables efficient view navigation. In this thesis, using a variant of Lloyd’s algorithm, we recursively insert into a frame structure a set of “landmarks” at locally optimal locations to improve ILFS performance, trading off frame storage cost against the expected transmission cost. Experimental results show that our proposed structure has noticeably lower expected transmission cost for the same storage than previous methods.
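
As a rough illustration of the Lloyd-style landmark placement mentioned in the abstract, the sketch below runs the standard Lloyd's algorithm on a one-dimensional row of viewpoints. It is not the thesis's variant and it does not model storage cost. The function names `lloyd_landmarks` and `expected_cost`, the uniform request probabilities, and the use of squared view-index distance as a stand-in for transmission cost are all assumptions made for illustration.

```python
def lloyd_landmarks(probs, landmarks, iters=50):
    """Standard 1D Lloyd's algorithm (illustrative; not the thesis's variant).

    probs[i]  -- assumed probability that a user requests viewpoint i
    landmarks -- initial landmark positions on the viewpoint axis
    Returns the landmark positions after convergence or `iters` rounds.
    """
    landmarks = sorted(landmarks)
    for _ in range(iters):
        # Assignment step: attach each viewpoint to its nearest landmark.
        sums = [0.0] * len(landmarks)
        weights = [0.0] * len(landmarks)
        for view, p in enumerate(probs):
            k = min(range(len(landmarks)),
                    key=lambda j: abs(view - landmarks[j]))
            sums[k] += p * view
            weights[k] += p
        # Update step: move each landmark to the weighted mean of its cell.
        # A landmark with an empty cell stays where it is.
        updated = [s / w if w > 0 else c
                   for s, w, c in zip(sums, weights, landmarks)]
        if updated == landmarks:
            break
        landmarks = updated
    return landmarks


def expected_cost(probs, landmarks):
    """Expected squared distance from a requested view to its nearest landmark
    (a hypothetical proxy for transmission cost)."""
    return sum(p * min((view - c) ** 2 for c in landmarks)
               for view, p in enumerate(probs))


# Eight viewpoints requested with equal probability, two landmarks.
probs = [0.125] * 8
initial = [0.0, 1.0]
final = lloyd_landmarks(probs, initial)
print(final, expected_cost(probs, initial), expected_cost(probs, final))
```

In this toy setting the two landmarks move from one end of the row toward the centers of the two halves, and the proxy cost drops accordingly. Each iteration can only lower or maintain the proxy cost, which is why the result is locally rather than globally optimal.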

  • Subjects / Keywords
  • Graduation date
    2017-11: Fall 2017
  • Type of Item
  • Degree
    Doctor of Philosophy
  • DOI
  • License
    This thesis is made available by the University of Alberta Libraries with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.
  • Language
  • Institution
    University of Alberta
  • Degree level
  • Department
    • Department of Electrical and Computer Engineering
  • Specialization
    • Signal and Image Processing
  • Supervisor / co-supervisor and their department(s)
    • Hai Jiang (Electrical and Computer Engineering)
    • H. Vicky Zhao
  • Examining committee members and their departments
    • Yindi Jing (Electrical and Computer Engineering)
    • Linglong Kong (Mathematical and Statistical Sciences)