Implementation and comparison of Serial and Parallel algorithms of SAO in HEVC

Nagathihalli Jagadish, Harsha

Date

2017-12-05

Author

Nagathihalli Jagadish, Harsha

0000-0002-5974-6996

Abstract

The High Efficiency Video Coding (HEVC) standard is the latest video coding project developed by the Joint Collaborative Team on Video Coding (JCT-VC), which brings together the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group (MPEG) standardization organizations. HEVC, also known as H.265, supports encoding video over a wide range of resolutions, from low resolution to beyond High Definition, i.e. 4K or 8K. The HEVC standard improves on the previous standard, H.264/AVC (Advanced Video Coding), which is well established and widely used in industry, with applications in broadcast TV and multimedia telephony. HEVC succeeds H.264/AVC and achieves a bit-rate reduction of about 50% at the same visual quality.

The in-loop filters are an important part of the HEVC video coding standard. They attenuate discontinuities at the prediction and transform boundaries, and they also improve quality by attenuating ringing artifacts and changes in sample intensity, depending on the classification algorithm. The main advantage of these filters is that they improve the subjective quality of the reconstructed video. In HEVC, the size of motion-predicted blocks varies from 8x4 and 4x8 to 64x64 luma samples, while the size of block transforms and intra-predicted blocks varies from 4x4 to 32x32 samples. These blocks can be coded independently of the neighboring blocks, which allows scope for parallelism.

Various methods have been implemented serially to reduce the computational complexity of sample adaptive offset (SAO). To improve efficiency further, an extra step is taken to implement the code in parallel, since the blocks can be coded independently of each other. Technology is rapidly moving toward parallelization in order to reduce the time spent on computation, and multi-core and many-core computation and design are the new trends in the market.

As a result, this thesis attempts to map the video coding algorithm onto GPU cores to accelerate execution. This is done for the SAO algorithm using CUDA programming. SAO has several stages, and each of them is implemented in parallel on NVIDIA GPUs. The serial and parallel implementations are compared using the speedup metric, and the quality of the reconstructed video is measured using PSNR (Peak Signal-to-Noise Ratio).
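
The two evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. This is not the thesis code: the function names `speedup` and `psnr`, the flat-list sample representation, and the default 8-bit depth are assumptions made for illustration. Speedup is taken as serial run time divided by parallel run time, and PSNR as 10·log10(peak² / MSE) between reference and reconstructed samples.

```python
import math


def speedup(serial_seconds: float, parallel_seconds: float) -> float:
    """Speedup of the parallel run over the serial run (hypothetical helper)."""
    if serial_seconds < 0 or parallel_seconds <= 0:
        raise ValueError("run times must be positive")
    return serial_seconds / parallel_seconds


def psnr(reference, reconstructed, bit_depth: int = 8) -> float:
    """PSNR in dB between two equally sized sequences of integer samples.

    The 8-bit default is an assumption; pass bit_depth=10 for 10-bit video.
    """
    if len(reference) != len(reconstructed) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    peak = (1 << bit_depth) - 1  # 255 for 8-bit samples
    # Mean squared error over all sample pairs.
    mse = sum((r - d) ** 2 for r, d in zip(reference, reconstructed)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs: PSNR is unbounded
    return 10.0 * math.log10(peak * peak / mse)
```

For example, `speedup(10.0, 2.5)` returns `4.0`, meaning the parallel run finished four times faster than the serial one. Applying `psnr` to the original and the SAO-filtered luma samples of a frame gives the per-frame value that would be reported.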

URI

http://hdl.handle.net/10106/28903