Swapping two blocks of memory that reside inside a larger block, in constant memory

Suppose you have a large block of memory, and you want to swap two blocks that reside inside that large block. For concreteness, say you have a block of memory of the form A1, A2, B1, B2, C1, C2, D1, D2, D3, E1, and you want to swap the B’s and D’s, producing A1, A2, D1, D2, D3, C1, C2, B1, B2, E1.

A1	A2	B1	B2	C1	C2	D1	D2	D3	E1


					⤩


A1	A2	D1	D2	D3	C1	C2	B1	B2	E1

Like, say, you have a double-null-terminated string and you want to swap two strings in it. Now, of course, the easy way would be just to allocate a second buffer equal in size to the original buffer and copy the strings to the second buffer in the desired order. But just as an exercise, can you do this without allocating additional memory?

Here’s a hint: There is an algorithm to swap two adjacent buffers using no additional space. In C++, this algorithm is implemented by the std::rotate method. For example, if you have a block of memory of the form X1, X2, Y1, Y2, Y3, then you can call std::rotate(v.begin(), v.begin() + 2, v.end()) to get Y1, Y2, Y3, X1, X2.

X1	X2	Y1	Y2	Y3


⤩


Y1	Y2	Y3	X1	X2

The function is called rotate because you can view it as taking the combined buffer and doing a circular rotation until Y1 is the new first element.

You can perform the desired exchange of non-adjacent blocks by performing three rotations. Starting with A1, A2, B1, B2, C1, C2, D1, D2, D3, E1, your first rotation exchanges the B’s with the C’s: your second rotation exchanges the B’s with the D’s, and your final rotation exchanges the C’s with the D’s.

A1	A2	B1	B2	C1	C2	D1	D2	D3	E1


			⤩


A1	A2	C1	C2	B1	B2	D1	D2	D3	E1


				⤩


A1	A2	C1	C2	D1	D2	D3	B1	B2	E1


		⤩


A1	A2	D1	D2	D3	C1	C2	B1	B2	E1

(There are other ways to accomplish this with three rotations.)

But you can do better.

The way rotation works is by reversing the two buffers separately, and then reversing the combined buffer. Starting with X1, X2, Y1, Y2, Y3, we reverse the X’s (producing X2, X1) and reverse the Y’s (producing Y3, Y2, Y1), and then reverse the whole thing, giving a final result of Y1, Y2, Y3, X1, X2.

X1	X2	Y1	Y2	Y3

⇆		⇆

X2	X1	Y3	Y2	Y1

⇆

Y1	Y2	Y3	X1	X2

The cost of reversing n elements is n/2 swaps, so the cost of a rotation of n elements is n swaps. (You reverse the first part and the second part, which together add up to n/2 swaps; and then you reverse the whole thing, which is another n/2 swaps, for a total of n.)

Therefore, our three-rotation version costs a total of 2n swaps, where n is the combined size of the two blocks being swapped as well as the block that is sandwiched between them.

We can reduce the cost to n swaps by applying the reversal trick directly: Reverse the two blocks, reverse the block that is sandwiched between them, and then reverse the combo block.

A1	A2	B1	B2	C1	C2	D1	D2	D3	E1

		⇄		⇄		⇄

A1	A2	B2	B1	C2	C1	D3	D2	D1	E1

		⇄

A1	A2	D1	D2	D3	C1	C2	B1	B2	E1

template<typename BiDirIt>
void swap_discontiguous(BiDirIt first1, BiDirIt last1,
                        BiDirIt first2, BiDirIt last2)
{
    std::reverse(first1, last1);
    std::reverse(last1, first2);
    std::reverse(first2, last2);
    std::reverse(first1, last2);
}

But wait. If you look at the specification for std::rotate, you see that it requires only a forward iterator, not a bidirectional iterator. Yet std::reverse requires a bidirectional iterator. So how does std::rotate operate if it is given only forward iterators? We’ll look at that next time.

Bonus chatter: clang’s libcxx and gcc’s libstdc++ contain specializations of std::rotate for random-access iterators which perform only n/2 swaps, moving each element directly to its final destination by breaking the element movements into disjoint cycles and then moving the elements within each cycle. They also have special-cases for rotating zero elements (nop), as well as rotating a single element and swapping two equal-sized blocks (which reduces the number of swaps to n/2). I guess these cases happen often enough for the extra code to be worth it.

Author

Raymond Chen

Raymond has been involved in the evolution of Windows for more than 30 years. In 2003, he began a Web site known as The Old New Thing which has grown in popularity far beyond his wildest imagination, a development which still gives him the heebie-jeebies. The Web site spawned a book, coincidentally also titled The Old New Thing (Addison Wesley 2007). He occasionally appears on the Windows Dev Docs Twitter account to tell stories which convey no useful information.

2 comments

Adam Rosenfield 5 hours ago · Edited

Maybe I’m spoiling tomorrow post, but my first thought was this:

* If len(B) == len(D), directly swap the B’s and D’s, and you’re done
* If len(B) < len(D), swap the B’s with the first len(B) of the D’s, and then reiterate on C and the remaining D’s
* If len(B) > len(D), swap the D’s with the first len(D) of the B’s, and then reiterate on C and the remaining B’s

This runs like the Euclidean GCD algorithm, in terms of how it reduces the lengths of each subarray at each step.

GL 5 hours ago

Bonus bonus chatter: If the disjoint swap is using random-access iterators and the two intervals are of the same length, the algorithm should just do swaps between the corresponding elements, so that the complexity doesn’t grow with how far the two intervals are away from each other. Reductions to rotate/reverse won’t be as efficient.

Stay informed

Get notified when new posts are published.

Join the discussion.

Swapping two blocks of memory that reside inside a larger block, in constant memory

Author

2 comments

Leave a commentCancel reply

Read next

Understanding and mitigating a stack overflow in our task sequencer

How can I detect that the system is running low on memory? Or that my job is running low on memory?

A1	A2	B1	B2	C1	C2	D1	D2	D3	E1


			⤩


A1	A2	C1	C2	B1	B2	D1	D2	D3	E1


				⤩


A1	A2	C1	C2	D1	D2	D3	B1	B2	E1


		⤩


A1	A2	D1	D2	D3	C1	C2	B1	B2	E1

A1	A2	B1	B2	C1	C2	D1	D2	D3	E1

		⇄		⇄		⇄

A1	A2	B2	B1	C2	C1	D3	D2	D1	E1

		⇄

A1	A2	D1	D2	D3	C1	C2	B1	B2	E1

A1	A2	B1	B2	C1	C2	D1	D2	D3	E1


			⤩


A1	A2	C1	C2	B1	B2	D1	D2	D3	E1


				⤩


A1	A2	C1	C2	D1	D2	D3	B1	B2	E1


		⤩


A1	A2	D1	D2	D3	C1	C2	B1	B2	E1

A1	A2	B1	B2	C1	C2	D1	D2	D3	E1

		⇄		⇄		⇄

A1	A2	B2	B1	C2	C1	D3	D2	D1	E1

		⇄

A1	A2	D1	D2	D3	C1	C2	B1	B2	E1

A1	A2	B1	B2	C1	C2	D1	D2	D3	E1


			⤩


A1	A2	C1	C2	B1	B2	D1	D2	D3	E1


				⤩


A1	A2	C1	C2	D1	D2	D3	B1	B2	E1


		⤩


A1	A2	D1	D2	D3	C1	C2	B1	B2	E1

A1	A2	B1	B2	C1	C2	D1	D2	D3	E1

		⇄		⇄		⇄

A1	A2	B2	B1	C2	C1	D3	D2	D1	E1

		⇄

A1	A2	D1	D2	D3	C1	C2	B1	B2	E1