Image Encoding Standards
The sections that follow provide important information about the image encoding standards supported by Color. The image data you’ll be color correcting is typically encoded using either an RGB or a Y′CBCR (sometimes referred to as YUV) format. Color is extremely flexible and capable of working with image data of either type.
The RGB Additive Color Model Explained
In the RGB color model, three color channels are used to store red, green, and blue values in varying amounts to represent each available color that can be reproduced. Adjusting the relative balance of values in these color channels adjusts the color being represented. When all three values are equal, the result is a neutral tone, from black through gray to white.
More typically, you’ll see these ratios expressed as digital percentages in the Color Parade scope or Histogram. For example, if all three color channels are 0%, the pixel is black. If all three color channels are 50%, the pixel is a neutral gray. If all three color channels are 100% (the maximum value), the pixel is white.
Animation (an older, 8-bit codec) and Apple ProRes 4444 (a newer 10-bit codec) are the two most commonly used RGB QuickTime codecs. In digital intermediate workflows, RGB-encoded images are typically stored as uncompressed DPX or Cineon image sequences.
The Y′CBCR Color Model Explained
Video is typically recorded using the Y′CBCR color model. Y′CBCR color coding also employs three channels, or components. A shot’s image is divided into one luma component (luma is image luminance modified by gamma for broadcast) and two color difference components which encode the chroma (chrominance). Together, these three components make up the picture that you see when you play back your video.
- The Y′ component represents the black-and-white portion of an image’s tonal range. Because the eye has different sensitivities to the red, green, and blue portions of the spectrum, the image “lightness” that the Y′ component reproduces is derived from a weighted ratio of the (gamma-corrected) R, G, and B color channels. (Incidentally, the Y′ component is mostly green.) Viewed on its own, the Y′ component is the monochrome image.
- The two color difference components, CB and CR, are used to encode the color information in such a way as to fit three color channels of image data into two. A bit of math is used to take advantage of the fact that the Y′ component also stores green information for the image. The actual math used to derive each color component is CB = B′ - Y′, while CR = R′ - Y′.
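The relationship between the three components can be sketched in a few lines of Python. This is only an illustration: the luma weights below are the Rec. 601 values, which is an assumption (the text above does not name a standard), and real video encodings also scale and offset the color difference values before storing them. The function names are hypothetical.

```python
# Assumed luma weights (Rec. 601). The text above does not specify a standard.
KR, KG, KB = 0.299, 0.587, 0.114

def rgb_to_ycc(r, g, b):
    """Convert gamma-corrected R'G'B' values (0.0-1.0) to unscaled Y', CB, CR."""
    y = KR * r + KG * g + KB * b  # weighted sum; green has the largest weight
    cb = b - y                    # CB = B' - Y'
    cr = r - y                    # CR = R' - Y'
    return y, cb, cr

def ycc_to_rgb(y, cb, cr):
    """Recover R'G'B'. Green is rebuilt from Y' and the two differences."""
    r = cr + y
    b = cb + y
    g = (y - KR * r - KB * b) / KG
    return r, g, b

# A neutral gray has equal R', G', and B', so both differences are (nearly) zero.
print(rgb_to_ycc(0.5, 0.5, 0.5))

# A saturated color survives the round trip, even though green was never stored.
print(ycc_to_rgb(*rgb_to_ycc(0.2, 0.6, 0.9)))
```

Because green contributes the most to Y′, it can be recovered from the luma component once red and blue are known, which is why only two color difference components are needed.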
Note: This scheme was originally created so that older black-and-white televisions would be compatible with the newer color television transmissions.
Chroma Subsampling Explained
In Y′CBCR encoded video, the color channels are typically sampled at a lower rate than the luma channel. Because the human eye is more sensitive to differences in brightness than in color, this has been used as a way of reducing the video bandwidth (or data rate) requirements without perceptible loss to the image.
The sampling ratio between the Y′, CB, and CR channels is notated as a three-value ratio. There are four common chroma subsampling ratios:
- 4:4:4: 4:4:4 chroma subsampled media encodes completely uncompressed color, the highest quality possible, as the color difference channels are sampled at the same rate as the luma channel. 4:4:4 subsampled image data is typically obtained via telecine or datacine to an image sequence or video format capable of containing it, and is generally employed for digital intermediate and film workflows. RGB encoded images such as DPX and Cineon image sequences and TIFF files are always 4:4:4.
The Apple ProRes 4444 codec lets you capture, transcode to, and master media at this high quality. (The fourth 4 refers to the ability of Apple ProRes 4444 to preserve an uncompressed alpha channel in addition to the three color channels; however, Color doesn’t support alpha channels.)
Be aware that simply rendering at 4:4:4 doesn’t guarantee a high-quality result. If media is not acquired at 4:4:4, then rendering at 4:4:4 will preserve the high quality of corrections you make to the video, but it won’t add color information that wasn’t there to begin with.
As of this writing, few digital acquisition formats are capable of recording 4:4:4 video, but those that can include HDCAM SR, as well as certain digital cinema cameras, including the RED, Thomson Viper FilmStream, and Genesis digital camera systems.
- 4:2:2: 4:2:2 is a chroma subsampling ratio typical for many high-quality standard and high definition video acquisition and mastering formats, including Beta SP (an analog format), Digital Betacam, Beta SX, IMX, DVCPRO 50, DVCPRO HD, HDCAM, and D-5 HD.
Although it stores half the color information of 4:4:4, 4:2:2 is standard for video mastering and broadcast. As their names imply, Apple Uncompressed 8-bit 4:2:2, Apple Uncompressed 10-bit 4:2:2, Apple ProRes 422, and Apple ProRes 422 (HQ) all use 4:2:2 chroma subsampling.
- 4:1:1 and 4:2:0: 4:1:1 is typical for consumer and prosumer video formats including DVCPRO 25 (NTSC and PAL), DV, and DVCAM (NTSC).
4:2:0 is another consumer-oriented subsampling rate, used by DV (PAL), DVCAM (PAL), and MPEG-2, as well as the high definition HDV and XDCAM HD formats.
Due to their low cost, producers of all types have flocked to these formats for acquisition, despite the resulting limitations during post-production (discussed below). Regardless, whatever the acquisition format, it is inadvisable to master using either 4:1:1 or 4:2:0 video formats.
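To make the ratios above concrete, the following sketch counts samples in a block four pixels wide and two rows tall. It assumes the common J:a:b reading of the notation, which the text does not spell out: J is the block width in luma samples, a is the number of chroma samples in the first row, and b is the number of additional chroma samples in the second row. The function and dictionary names are hypothetical.

```python
# Ratios discussed above, written as (J, a, b).
RATIOS = {
    "4:4:4": (4, 4, 4),
    "4:2:2": (4, 2, 2),
    "4:1:1": (4, 1, 1),
    "4:2:0": (4, 2, 0),
}

def samples_per_block(j, a, b):
    """Count samples in a block J pixels wide and 2 rows tall."""
    luma = 2 * j                # one luma sample per pixel
    chroma_per_channel = a + b  # samples in each of CB and CR
    total = luma + 2 * chroma_per_channel
    return luma, chroma_per_channel, total

def chroma_fraction(name):
    """Chroma samples per channel, relative to full (4:4:4) sampling."""
    j, a, b = RATIOS[name]
    _, chroma, _ = samples_per_block(j, a, b)
    return chroma / (2 * j)

for name in RATIOS:
    _, _, total = samples_per_block(*RATIOS[name])
    print(name, "chroma fraction:", chroma_fraction(name), "total samples:", total)
```

Under these assumptions, 4:2:2 keeps half of the chroma samples of 4:4:4, while 4:1:1 and 4:2:0 each keep a quarter. The latter two differ only in whether the reduction is purely horizontal (4:1:1) or split between horizontal and vertical (4:2:0).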
It’s important to be aware of the advantages of higher chroma subsampling ratios in the color correction process. Whenever you’re in a position to specify the transfer format with which a project will be finished, make sure you ask for the highest-quality format your system can handle. (For more information about high-quality finishing codecs, see A Tape-Based Workflow.)
As you can probably guess, more color information is better when doing color correction. For example, when you make large contrast adjustments to 4:1:1 or 4:2:0 subsampled video, video noise in the image can become exaggerated; this happens most often with underexposed footage. You’ll find that you can make the same or greater adjustments to 4:2:2 subsampled video, and the resulting image will have much less grain and noise. Greater contrast with less noise provides for a richer image overall. 4:4:4 allows the most latitude, or flexibility, for making contrast adjustments with a minimum of artifacts and noise.
Furthermore, it’s common to use chroma keying operations to isolate specific areas of the picture for correction. This is done using the HSL qualifiers in the Secondaries room. (For more information, see Choosing a Region to Correct Using the HSL Qualifiers.) These keying operations will have smoother and less noisy edges when you’re working with 4:2:2 or 4:4:4 subsampled video. The chroma compression used by 4:1:1 and 4:2:0 subsampled video results in macroblocks around the edges of the resulting matte when you isolate the chroma, which can cause a “choppy” or “blocky” result in the correction you’re trying to create.
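The blocky matte edges described above can be illustrated with a deliberately simplified model. The sketch below is not how Color or any particular codec works: it assumes chroma is subsampled by averaging groups of horizontal samples and restored by repeating each average, and it keys the result with a plain threshold. All function names and parameters are hypothetical.

```python
def subsample_row(chroma, factor):
    """Average each group of `factor` samples, then repeat the average
    so the row keeps its original length."""
    out = []
    for start in range(0, len(chroma), factor):
        group = chroma[start:start + factor]
        out.extend([sum(group) / len(group)] * len(group))
    return out

def matte_edge(chroma, threshold=0.5):
    """Return the index of the first sample the key would select."""
    for i, value in enumerate(chroma):
        if value > threshold:
            return i
    return len(chroma)

def edges_for(factor, width=12, true_edges=(4, 5, 6, 7)):
    """Key several rows whose true chroma edge shifts one pixel per row,
    as it would along a diagonal boundary."""
    result = []
    for edge in true_edges:
        row = [0.0] * edge + [1.0] * (width - edge)
        result.append(matte_edge(subsample_row(row, factor)))
    return result

print("full chroma (4:4:4):     ", edges_for(1))  # [4, 5, 6, 7]
print("half horizontal (4:2:2): ", edges_for(2))  # [4, 6, 6, 8]
print("quarter horizontal (4:1:1):", edges_for(4))  # [4, 4, 8, 8]
```

In this toy model the diagonal edge is followed exactly with full chroma, moves in two-pixel steps at half horizontal resolution, and jumps in four-pixel steps at quarter resolution, which is the kind of stairstepping that makes secondary keys harder to pull from heavily subsampled media.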
Despite these limitations, it is very possible to color correct highly compressed video. By paying attention to image noise as you stretch the contrast of poorly exposed footage, you can focus your corrections on the areas of the picture where noise is minimized. When doing secondary color correction to make targeted corrections to specific parts of the image, you may find it a bit more time consuming to pull smooth secondary keys. However, with care and patience, you can still achieve beautiful results.
Film Versus Video and Chroma Subsampling
In general, film footage is usually transferred with the maximum amount of image data possible, especially when transferred as a completely uncompressed image sequence (4:4:4) as part of a carefully managed digital intermediate workflow. This is one reason for the higher quality of the average film workflow.
Standard and high definition video, on the other hand, is usually recorded with lower chroma subsampling ratios (4:2:2 is typical even with higher-quality video formats, and 4:1:1 and 4:2:0 are common with prosumer formats) and higher compression ratios, depending entirely upon the recording and video capture formats used. Since the selected video format determines compression quality at the time of the shoot, there’s nothing you can do about the lost image data, other than to make the best of what you have.
With a bit of care you can color correct nearly any compressed video or image sequence format with excellent results, and Color gives you the flexibility to use highly compressed source formats including DV, HDV, and DVCPRO HD.
Bit Depth Explained
Another factor that affects the quality of video images, and can have an effect on the quality of your image adjustments, is the bit depth of the source media you’re working with. With both RGB and Y′CBCR encoded media, the higher the bit depth, the more image data is available, and the smoother both the image and your corrections will be. The differences between images at different bit depths are most readily apparent in gradients such as skies, where lower bit depths show banding, and higher bit depths do not.
The bit depth of your source media depends largely on how that media was originally acquired. Most of the media you’ll receive falls into one of the following bit depths, all of which Color supports:
- 8-bit: Most standard and high definition consumer and professional digital video formats capture 8-bit image data, including DV, DVCPRO 25, DVCPRO 50, HDV, DVCPRO HD, and HDCAM.
- 10-bit: Many video capture interfaces allow the uncompressed capture of analog and digital video at 10-bit resolution.
- 10-bit log: By storing data logarithmically, rather than linearly, a wider contrast ratio (such as that of film) can be represented by a 10-bit data space. 10-bit log files are often recorded from datacine scans using the Cineon and DPX image sequence formats.
- 12-bit: Some cameras, such as the RED ONE, capture digital images at 12-bit, providing for even smoother transitions in gradients.
- 16-bit: It has been said that it takes 16 bits of linear data to match the contrast ratio that can be stored in a 10-bit log file. Since linear data is easier for computers to process, this is another data space that’s available in some image formats.
- Floating Point: The highest level of image-processing quality available. Floating point refers to the use of floating-point math to store and calculate fractional data. This means that fractional values, and values higher than 1, can be stored rather than being rounded or clipped, as they would be at the integer-based 8-bit, 10-bit, 12-bit, and 16-bit depths. Floating point is a processor-intensive bit depth to work with.
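The practical difference between integer and floating-point processing can be sketched with a pair of adjustments that cancel each other out. This is a simplified illustration under stated assumptions, not a description of Color’s internal pipeline: the 8-bit path rounds and clips after every operation, while the floating-point path treats 1.0 as white and allows values above it. The function names are hypothetical.

```python
def apply_gain_8bit(value, gain):
    """Scale an 8-bit code value, then round and clip it to 0-255."""
    return max(0, min(255, round(value * gain)))

def apply_gain_float(value, gain):
    """Scale a floating-point value; nothing is rounded or clipped."""
    return value * gain

def round_trip_8bit(code):
    """Double the value, then halve it, entirely in 8-bit integers."""
    return apply_gain_8bit(apply_gain_8bit(code, 2.0), 0.5)

def round_trip_float(code):
    """Do the same in floating point, where 1.0 represents white."""
    value = apply_gain_float(apply_gain_float(code / 255, 2.0), 0.5)
    return value * 255

# Two different bright pixels both clip to 255 in the integer path,
# so halving them afterward cannot tell them apart.
print(round_trip_8bit(200), round_trip_8bit(240))

# The floating-point path temporarily holds values above 1.0,
# so both pixels return to (approximately) where they started.
print(round_trip_float(200), round_trip_float(240))
```

In the integer path the two pixels end up identical because the intermediate result exceeded the available range. In the floating-point path the distinction between them is preserved, at the cost of more processing.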
Higher bit depths accommodate more image data by using a greater range of numbers to represent the tonal range that’s available. This is apparent when looking at the numeric ranges used by the two bit depths most commonly associated with video.
- 8-bit images use a full range of 0–255 to store each color channel. (Y′CBCR video uses a narrower range of 16–235 to accommodate super-black and super-white.) 256 isn’t a lot of values, and the result can be subtly visible “stairstepping” in areas of the picture with narrow gradients (such as skies).
- 10-bit images, on the other hand, use a full range of 0 to 1023 to store each color channel. (Again, Y′CBCR video uses a narrower range of 64–940 to accommodate super-black and super-white.) The additional numeric range allows for smoother gradients and virtually eliminates bit depth–related artifacts.
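The numeric ranges above translate directly into the number of distinct steps available to describe a gradient. The sketch below uses only the ranges quoted in the text. The gradient itself (a ramp covering one tenth of the tonal range across 1920 pixels) is a made-up example, and the function names are hypothetical.

```python
def full_range_levels(bits):
    """Number of code values available at a given integer bit depth."""
    return 2 ** bits

def video_range_levels(low, high):
    """Number of code values between two inclusive limits."""
    return high - low + 1

def distinct_codes(bits, start=0.4, end=0.5, width=1920):
    """Quantize a linear ramp from `start` to `end` (0.0-1.0 scale)
    and count how many different code values it produces."""
    top = full_range_levels(bits) - 1
    codes = set()
    for i in range(width):
        value = start + (end - start) * i / (width - 1)
        codes.add(round(value * top))
    return len(codes)

print("8-bit full range: ", full_range_levels(8))          # 256
print("10-bit full range:", full_range_levels(10))         # 1024
print("8-bit video range (16-235): ", video_range_levels(16, 235))  # 220
print("10-bit video range (64-940):", video_range_levels(64, 940))  # 877

# A subtle gradient gets roughly four times as many steps at 10-bit.
print("steps in ramp, 8-bit: ", distinct_codes(8))
print("steps in ramp, 10-bit:", distinct_codes(10))
```

With only a few dozen code values to spread across nearly two thousand pixels, each 8-bit step in this example covers a wide band of the image, which is why banding shows up first in smooth areas such as skies.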
Fortunately, while you can’t always control the bit depth of your source media, you can control the bit depth at which you work in Color independently. This means that even if the source media is at a lower bit depth, you can work at a higher bit depth to make sure that the quality of your corrections is as high as possible. In particular, many effects and secondary corrections look significantly better when Color is set to render at higher bit depths.