eScholarship
Open Access Publications from the University of California

UC San Diego

UC San Diego Electronic Theses and Dissertations

Pattern recognition techniques for image and video post-processing: specific application to image interpolation

Abstract

Video and image enhancement address the age-old problem of constructing new image detail to produce a more pleasing viewing experience in video sequences and images. The image processing procedure described in this thesis is image and video interpolation, but the algorithms discussed apply readily to other areas in a variety of fields. The motivations behind resolution enhancement span the full spectrum from government to commercial to medical applications. For example, on-demand video sites such as YouTube, Dailymotion, and Veoh, where hit counts sometimes reach 4.7 million viewers per day, have traded off quality for bandwidth; travelers who use Google Maps Street View cannot read signs or see landmarks in the provided low-resolution panoramas; and Google Earth® and government contractors in image intelligence demand ever higher resolution in satellite imagery. The list of applications goes on, including web-photo editing, MRI scan feature enhancement, and on-demand creation of HDTV content. Image interpolation is an inherently ill-posed problem, and quality assessments of the resulting images are often driven by human judgment rather than numerical analysis. Through machine learning, the best viewing performance can be achieved by mimicking the human decision-making process. Therefore, this thesis is concerned with fitting machine and statistical learning techniques to the interpolation framework in a way that is easily modified to accommodate other video and image processing problems. Specifically, we learn properties of a training set by using regression to map the relationship between known and unknown information. By our definition, the known information is the low-resolution images and the unknown information is the high-resolution images. Machine learning is a particularly broad topic, and the image interpolation problem can be approached in several different ways.
Our input space is the image patch domain, where we process fixed-size contiguous subsets of the image independently. Consequently, we first discuss properties of the feature space and propose a multivariate probability distribution function to describe the image patch domain. Knowledge of the distribution and properties of the feature space is especially useful for both parametric and nonparametric estimation techniques. Because k-nearest neighbors is a relatively simple and commonly used nonparametric estimation technique, we propose a nearest neighbor algorithm that adaptively finds the appropriate number of neighbors to use and then performs the necessary regression steps. The algorithm also imposes global constraints through a heavily approximated Markov Random Field. Due to runtime considerations, we explore quicker ways to search for the k nearest neighbors of a given input. Next, we investigate kernel-based methods by employing support vector regression. We improve the generalization potential for nonlinear relationships by proposing convex optimization problems focused on kernel learning. Various interpolation frameworks can incorporate the newly proposed regression technique, and we experiment with several. Ultimately, we propose a mixture-of-experts framework to describe the relationships in the training set. Finally, we propose a single, general, zero-phase MMSE interpolation filter to address the computational complexity concerns of all learning algorithms. The idea arises from image processing analysis of machine learning techniques rather than the application of machine learning to image processing. In the development of the final filter, we analyze a general classification-based filtering scheme using polyphase representation.
Because the classes in such an approach share inherent similarities and overlap considerably, a single zero-phase filter for all image content follows logically as an adequate approximation, one that reduces the total number of computations to that of bicubic interpolation. By analyzing the frequency response, we can generate filters on the fly for arbitrary scaling factors.
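
The nearest-neighbor regression idea summarized above can be illustrated with a minimal sketch. It is not the algorithm proposed in the thesis: it assumes a fixed number of neighbors k rather than an adaptively chosen one, uses inverse-distance weighting as a stand-in for the regression step, and omits the Markov Random Field constraints and the fast neighbor search. The function name `knn_patch_regression` and its parameters are hypothetical, chosen only for illustration.

```python
import numpy as np

def knn_patch_regression(train_lr, train_hr, query_lr, k=3, eps=1e-8):
    """Estimate a high-resolution patch for each low-resolution query patch.

    train_lr: (n, d_lr) flattened low-resolution training patches
    train_hr: (n, d_hr) matching flattened high-resolution patches
    query_lr: (m, d_lr) flattened low-resolution patches to upscale
    k:        fixed neighbor count (an assumption; the thesis adapts it)
    """
    train_lr = np.asarray(train_lr, dtype=float)
    train_hr = np.asarray(train_hr, dtype=float)
    query_lr = np.atleast_2d(np.asarray(query_lr, dtype=float))

    # Pairwise squared Euclidean distances, shape (m, n).
    diff = query_lr[:, None, :] - train_lr[None, :, :]
    dist2 = np.sum(diff ** 2, axis=2)

    # Indices of the k closest training patches for each query.
    nearest = np.argsort(dist2, axis=1)[:, :k]
    near_d2 = np.take_along_axis(dist2, nearest, axis=1)

    # Inverse-distance weights, normalized to sum to one per query.
    weights = 1.0 / (near_d2 + eps)
    weights /= weights.sum(axis=1, keepdims=True)

    # Weighted average of the neighbors' high-resolution patches.
    return np.einsum("mk,mkd->md", weights, train_hr[nearest])


# Toy data: 2-pixel low-resolution patches paired with 4-pixel
# high-resolution patches (synthetic values, for illustration only).
toy_lr = np.array([[0.0, 0.0], [1.0, 1.0], [10.0, 10.0]])
toy_hr = np.array([[0.0] * 4, [1.0] * 4, [10.0] * 4])
estimate = knn_patch_regression(toy_lr, toy_hr, [[0.5, 0.5]], k=2)
```

The brute-force distance computation costs O(mn) per batch of queries, which is the runtime concern that motivates faster neighbor search in the text.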
