Digital Image and Video Coding: Color Space

Hi!
I'm glad you are back. Last time I gave a short introduction and told you that digital cameras use three type of sensors to store the amount of incident light of the respective spectrum. Virtually any camera uses RGB (red, green, blue) sensors and stores the values individually. But this is not the only way to represent the color of a pixel. Printers for example utilize the CMYK color space because the three base colors cyan, magenta and yellow sum up to black if you where to print them on top of each other. They have an extra ink for black (the K is for black) to save colored ink, though. It is easy to switch between RGB and CMYK. All there is to be done is to multiply with a certain matrix. Simplest math.

In image coding though, other color spaces are of interest. This has to do with how the human visual perception works. On the inner backside of the eye-ball there is a huge amount of rods and cones that sense incoming light and send electrical impulses towards the brain. There are about 120 million rods that enable monochromatic (light and dark) vision and only 6 million cone in total, that are split to three different types of cones for red, green and blue perception. This fact already indicates that humans are more sensitive to changes in brightness, also called luminescence and far less sensitive to color, also referred to as chrominescence

What is most important to us is that this fact can be exploited in terms of image coding. A simple color space conversion turns an RGB space to a YUV space. Here Y denotes the luminescence component and U and V are the color components. There are a lot more color spaces that work on the same principle, but have different distributions in the color components. YIQ, YCbCr just to name a few. A very basic image coder could do the following: Multiply the RGB data with the color space conversion matrix to represent data in YUV space. Then sub-sample the U and V components (usually by a factor of 2 or more). On a 2D image this could mean to simply calculate the mean color of 2 by 2 blocks and then dropping every second row and column.

This picture shows a section of one of an images chroma channels (U or V) with the numbers representing the intensity. The values of a 2-by-2 block are shown in 'a'. In 'b' the average of those 4 values (12) is assigned to the upper left pixel of the block and the other 3 pixels are discarded. To illustrate that this one color value is now used together with 4 luminescence pixels I scaled it up to the size of a regular 2-by-2 block. Imagine a block of 4 luma pixels overlaying this one big color pixel. Notice that this actually reduces the amount of data per color component by 4 and results in an overall reduction to 1/3 [Y] + 2 * 1/4 /3 [UV] = 1/2 the data.

This is a very important fact of the human visual perception and most image coders do make use of it. Often the very first step in a coder is a simple color space transform into a more suitable representation.

Digital Image and Video Coding

Pages

Sunday, April 1, 2012

Color Space

No comments:

Post a Comment