Data-driven transform optimization for next generation multimedia applications

Sezer, Osman Gokhan
Altunbasak, Yucel
The objective of this thesis is to formulate a generic dictionary learning method guided by the principle that efficient representations lead to efficient estimations. The fundamental idea behind using transforms or dictionaries for signal representation is to exploit the regularity within data samples so that the redundancy of the representation is minimized subject to a given level of fidelity. In the compression literature, this observation translates to rate-distortion cost: the transform with the lowest rate-distortion cost provides the most efficient representation. In our work, rather than serving only as an analysis tool, the rate-distortion cost is used to improve the efficiency of transforms. To this end, an iterative optimization method is proposed that seeks an orthonormal transform reducing the expected rate-distortion cost over an ensemble of data. Because the optimization method is generic, one can design a set of orthonormal transforms either in the original signal domain or on top of a transform-domain representation. To test this claim, several image codecs are designed using block, lapped, and wavelet transform structures. Significant gains in compression performance are observed relative to the original methods. An extension of the proposed optimization method to video coding yields state-of-the-art compression results with separable transforms. In addition, robust statistics are used to explain the superiority of the new design over other learning-based methods such as the Karhunen-Loeve transform.
Finally, the new optimization method is shown to be equivalent to minimizing the "oracle" risk of diagonal estimators in signal estimation. With the design of new diagonal estimators and risk-minimization-based adaptation, a new image denoising algorithm is proposed. While these diagonal estimators denoise local image patches, the new denoising algorithm is scaled to operate on large images by formulating the optimal fusion of overlapping local denoised estimates. In our experiments, it achieves state-of-the-art results for transform-domain denoising.
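As a rough illustration of what an iterative search for an orthonormal transform can look like, the sketch below alternates between two steps: hard-thresholding the transform coefficients, then updating the transform by solving an orthogonal Procrustes problem. It is a minimal sketch under stated assumptions, not necessarily the algorithm developed in the thesis: the rate is approximated by the number of nonzero coefficients, and the function names and the Lagrange multiplier `lam` are hypothetical, introduced only for this example.

```python
import numpy as np


def rd_cost(X, G, lam):
    """Proxy rate-distortion cost of transform G on data X (assumption:
    rate is approximated by the count of nonzero coefficients).
    Keeping a coefficient c costs lam, dropping it costs c**2,
    so the best choice per coefficient costs min(c**2, lam)."""
    C = G.T @ X
    return float(np.minimum(C ** 2, lam).sum())


def learn_orthonormal_transform(X, lam, n_iter=50):
    """Alternating minimization of ||X - G C||_F^2 + lam * ||C||_0
    over an orthonormal G and coefficients C.

    X   : (d, n) array, one training sample per column.
    lam : hypothetical Lagrange multiplier trading distortion for rate.
    Returns the learned transform G and the cost after each iteration
    (the first entry is the cost of the identity initialization)."""
    d = X.shape[0]
    G = np.eye(d)
    thresh = np.sqrt(lam)
    costs = [rd_cost(X, G, lam)]
    for _ in range(n_iter):
        # Coefficient step: for fixed G, hard thresholding is optimal.
        C = G.T @ X
        C[np.abs(C) < thresh] = 0.0
        # Transform step: for fixed C, the best orthonormal G solves an
        # orthogonal Procrustes problem via the SVD of X @ C.T.
        U, _, Vt = np.linalg.svd(X @ C.T)
        G = U @ Vt
        costs.append(rd_cost(X, G, lam))
    return G, costs
```

Each step cannot increase the proxy cost, so the recorded sequence is nonincreasing, although the procedure may stop at a local minimum that depends on the initialization.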