Perceptually Motivated Algorithms for Multimedia
-
- Author / Creator
- Zhang, Shupei
-
Perceptual factors in vision can facilitate the development of more effective multimedia algorithms. In particular, the wide dynamic range of the human visual system motivates the development of image lighting enhancement algorithms. Image lighting enhancement can be achieved by capturing multiple images with different exposure settings and then reconstructing a final image from them. However, this approach cannot reveal or predict details in already-captured images. Single-image lighting enhancement is desirable in this scenario, but many challenges remain, including over-enhancement, noise, and color artifacts caused by a lack of understanding of the image content. Image and video compression is another area of multimedia that can benefit from perceptual factors, such as the foveation mechanism and perceptual quality. As the resolution and image quality of modern cameras have increased, the amount of data produced by computational photography has surged. This has created a demand for better image/video compression methods that reduce data size without compromising image quality.
In this thesis, four perceptually motivated methods are proposed to address the challenges in single-image lighting enhancement and image/video compression. First, we propose an image lighting enhancement method based on a fusion pyramid, which is a traditional contrast-based fusion approach. Second, we propose a self-attention-based learning strategy to reconstruct a properly exposed image from a single input image. We leverage the self-attention mechanism to model the interdependencies between different locations, and design a generative adversarial network (GAN) with a custom HDR loss function to improve the image quality. Third, we propose a novel video compression method that integrates visual saliency information with foveation to reduce perceptual redundancy. The method subsamples and restores the input image using saliency data, allocating more space to salient regions and less to non-salient ones. Finally, based on the assumption that a group of images can be decomposed into several shared feature matrices, we propose a novel principal component approximation network (PCANet) for image compression. This is the first learning-based method that achieves promising performance while including the size of the network in the bitrate calculation.
-
- Graduation date
- Fall 2024
-
- Type of Item
- Thesis
-
- Degree
- Doctor of Philosophy
-
- License
- This thesis is made available by the University of Alberta Library with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.