Improve Perceptual Video Quality: Skin-Tone Macroblock Detection

By Paula Carrillo, Akira Osamoto, and Adithya K. Banninthaya
Texas Instruments
Accurate skin-tone reproduction is important in conventional still and video photography applications, but it is also critical in some embedded vision implementations, such as accurate facial detection and recognition. Intermediary lossy compression between the camera and the processing circuitry is common in configurations that network-link the two function blocks, either within a LAN or over a WAN (e.g., the "cloud"). More generally, the technique described in this document uses dilation and other algorithms to find regions of interest, which is relevant to many vision applications. Computationally efficient vision algorithms are also an important concern for embedded vision. This is a reprint of a Texas Instruments-published white paper, which is also available here (800 KB PDF).


In video compression algorithms, the quantization parameter (QP) is usually selected based on the relative complexity of the region in the picture as well as the overall bit usage. However, complexity-based rate-control algorithms do not take into account the fact that compression degradation is perceptually more noticeable in more complex objects, such as human faces. To improve the overall perceived quality of the image, it is important to classify human faces as regions of interest (ROI) and preserve as much detail in those regions as possible. The challenge is developing a reliable algorithm that operates in real time. This white paper details a low-complexity solution that can run on a single-core digital signal processor (DSP) as part of an encoder implementation.

Skin-tone macroblock detection

The proposed solution is a low-complexity, color-based skin-tone detection method that classifies skin-tone macroblocks (MBs) as ROI MBs and non-skin-tone macroblocks as non-ROI MBs. An MB can be defined...
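
As a rough illustration of color-based macroblock classification, the following Python sketch labels each macroblock as ROI or non-ROI by counting how many of its chroma samples fall inside a skin-tone box in the Cb/Cr plane. It is not the white paper's algorithm. The chroma bounds, the 50% decision fraction, the 16x16 macroblock with 4:2:0 chroma sampling, and the names `is_skin` and `classify_macroblocks` are all assumptions chosen for illustration.

```python
MB_SIZE = 16  # luma macroblock size in pixels (assumed, H.264-style)

# Illustrative chroma bounds for skin tone. These are assumptions,
# NOT values taken from the white paper.
CB_RANGE = (77, 127)
CR_RANGE = (133, 173)
SKIN_FRACTION = 0.5  # hypothetical share of samples that must match


def is_skin(cb, cr):
    """Return True if one chroma sample pair lies inside the assumed skin box."""
    return CB_RANGE[0] <= cb <= CB_RANGE[1] and CR_RANGE[0] <= cr <= CR_RANGE[1]


def classify_macroblocks(cb_plane, cr_plane, chroma_mb=MB_SIZE // 2):
    """Label each macroblock as ROI (True) or non-ROI (False).

    cb_plane and cr_plane are 2-D lists of 8-bit chroma samples in an
    assumed 4:2:0 layout, so a 16x16 luma macroblock covers an 8x8
    chroma block. Partial macroblocks at the edges are ignored.
    """
    rows = len(cb_plane) // chroma_mb
    cols = len(cb_plane[0]) // chroma_mb
    roi_map = []
    for mb_y in range(rows):
        row = []
        for mb_x in range(cols):
            hits = 0
            # Count skin-colored chroma samples inside this macroblock.
            for y in range(mb_y * chroma_mb, (mb_y + 1) * chroma_mb):
                for x in range(mb_x * chroma_mb, (mb_x + 1) * chroma_mb):
                    if is_skin(cb_plane[y][x], cr_plane[y][x]):
                        hits += 1
            row.append(hits >= SKIN_FRACTION * chroma_mb * chroma_mb)
        roi_map.append(row)
    return roi_map
```

Working on the subsampled chroma planes keeps the per-macroblock cost to 64 comparisons, which is in the spirit of a low-complexity detector. An encoder could then use the resulting map to favor ROI macroblocks during rate control, for example by assigning them a lower QP.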