March 7th, 2022

Optimizing code to darken a bitmap, part 1

I needed a function to make a 32bpp ARGB image a little darker. Here’s a naïve version, which we will use as our starting point:

union Pixel
{
    uint8_t c[4]; // four channels: red, green, blue, alpha
    uint32_t v;   // full pixel value as a 32-bit integer
};

void darken(Pixel* first, Pixel* last, int darkness)
{
  int lightness = 256 - darkness;
  for (; first < last; ++first) {
    for (int i = 0; i < 3; ++i) {
      first->c[i] = (uint8_t)(first->c[i] * lightness / 256);
    }
  }
}

You call this function with a range of pixels, and an integer representing how much you want to make the image darker, on a scale from 0 (no change at all) to 256 (complete blackness). The function goes through every pixel and applies the darkening factor to each of the first three channels. (The fourth channel is the alpha channel, which should stay unchanged.)

One obvious idea for improvement is to unroll the innermost loop.

void darken(Pixel* first, Pixel* last, int darkness)
{
  int lightness = 256 - darkness;
  for (; first < last; ++first) {
    first->c[0] = (uint8_t)(first->c[0] * lightness / 256);
    first->c[1] = (uint8_t)(first->c[1] * lightness / 256);
    first->c[2] = (uint8_t)(first->c[2] * lightness / 256);
  }
}

This gives a 1.8× improvement over the original plain version.

Next time, we’ll try to improve on this even further by doing the operations in parallel.

Topics
Code

Author

Raymond has been involved in the evolution of Windows for more than 30 years. In 2003, he began a Web site known as The Old New Thing which has grown in popularity far beyond his wildest imagination, a development which still gives him the heebie-jeebies. The Web site spawned a book, coincidentally also titled The Old New Thing (Addison Wesley 2007). He occasionally appears on the Windows Dev Docs Twitter account to tell stories which convey no useful information.

15 comments

Discussion is closed. Login to edit/delete existing comments.

Newest
Newest
Popular
Oldest
  • Mystery Man

    If I’ve understood correctly, the function obtains a set of pointers to the first and last pixels. Assuming that the pixels are stored in a contiguous region of memory, it iterates through the pixels by increasing the pointer address of the first pixel by one:

    for (; first < last; ++first) {

    But is it correct? The pixels occupy four bytes of memory, not one.

    • Jeff Howe · Edited

      [edited for clarity]
      A basic rule of C (and C++) pointers is that incrementing them adds the size of the object that the pointer is pointing to to the address in the pointer.

      first is a pointer to Pixel, which is 4 bytes in size.

      Therefore, ++first will yield an address four bytes beyond first's original value, i.e., it's the address of the next Pixel. From there, the individual RGBA's can be accessed via first's c[] member;

      Read more
      • Mystery Man

        Interesting. I thought if C and C++ supported context-sensitive pointer arithmetics, it would be in the form of Inc() or Dec() macros.

      • Jeff Howe · Edited

        C (and its stepchild C++) is smarter than that. This has been a part of the C language since before I started using it (which was ca. 1983, Lattice C compiler for MS-DOS), as documented in the first K&R book (The C Programming Language). In C, pointers are closely related to arrays (which obviously need to take into account the size of their members), so if p is a pointer to an array of 4...

        Read more
  • Henke37

    Division? I thought darkening was saturating subtraction. Oh well, too late to change the algorithm now.

    • Antonio Rodríguez

      Darkening operates on luminance, so it should maintain the proportions between all three components (which encode hue and saturation). Adding or substracting a constant does not maintain the proportions, and thus is unsuitable to change luminance without affecting hue and saturation. Multiplication and division, on the other hand, work just fine.

      In other words, if you have the color [192, 64, 255] (violet) and divide it by half, you get [96, 32, 127], which is a...

      Read more
  • Rika Ichinose

    Not sure if darkening ideally should be applied in linear colorspace or not, when the original was in perceptually linear sRGB.

    k = 1.0 - darkness / 256.0;
    darkened = linear_to_sRGB(sRGB_to_linear(src) * k);

    Blending two images definitely should, and isn’t darkening the same as blending with black…

    • Raymond ChenMicrosoft employee Author

      This exercise was not meant to be visually perfect. Nonlinear scaling is fine, as long as it ranges from 0 to 256.

  • Gaelan Steele

    Huh, is your compiler not unrolling the inner loop itself?

    Quick testing with Godbolt seems to show that the latest versions of MSVC, Clang and GCC (all at -O2) all unroll the loop themselves.

    • Raymond ChenMicrosoft employee Author

      Windows compiles the OS itself at -O2sx and MSVC does not unroll when you optimize for space.

      • 紅樓鍮

        I thought you could just use an attribute or pragma on the function and it’ll be optimized for speed.

      • Neil Rashbrook

        Speaking of which, Code Golfing refers to optimising for space, not speed. For example I have a program to find out the coordinates of a tetrahedron given its edge lengths and the fact that all of the lengths involved are integers. It's only 90 bytes¹ long, but it's so inefficient it would only run on a machine with terabytes of RAM. For 114 bytes it will run on my PC, but it's been running for...

        Read more
      • Mystery Man

        There is no point in nitpicking on a Microsoft employee's bad English usage. They do it all the time.

        For example, when they say "boot partition," they are referring to a partition that has nothing to do with the boot loader or boot process. (It's the partition on which the Windows folder resides.) When they say "Windows Subsystem for Linux," they really mean a Linux subsystem for Windows. (Compare with Visual Studio for Mac, or Office...

        Read more
  • Peter Cooper Jr.

    It's unclear to me what metric this "1.8× improvement" is referring to. If it's some measurement of execution time per loop, then I would have expected any half-decent optimizing compiler to have made the change for you, though I haven't done any testing of it myself to be sure. But the title is referencing "Code golfing", which usually refers to characters of source code in my experience, but the "improved" version has more characters. What...

    Read more

Feedback