Is WriteProcessMemory faster than shared memory for transferring data between two processes?

Say you need to transfer a large amount of data between two processes. One way is to use shared memory. Is that the fastest way to do it? Can you do any better?

One argument against shared memory is that the sender will have to copy the data into the shared memory block, and the recipient will have to copy it out, resulting in two extra copies. On the other hand, WriteProcessMemory could theoretically do its job with just one copy, so would that be faster?

I mean, sure you could copy the data into and out of the shared memory block, but who says that you do? By the same logic, the sender will have to copy the data from the original source into a buffer that it passes to WriteProcessMemory, and the recipient will have to take the data out of the buffer that WriteProcessMemory copied into and copy it out into its own private location for processing.

I guess the theory behind the WriteProcessMemory design is that you could use WriteProcessMemory to copy directly from the original source, and place it directly in the recipient’s private location.

But you can do that with shared memory, too. Just have the source generate the data directly into the shared buffer, and have the recipient consume the data directly out of it. Now you have no copying at all!
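
Here is a minimal sketch of that idea in portable C++. The `shared_region` array is only a stand-in for a view that each process would obtain by mapping the same shared memory block, and the names `generate_into` and `consume_from` are hypothetical. The producer is handed the destination pointer and writes its output there. The consumer reads through a pointer to those same bytes. No staging buffer and no `memcpy` appear anywhere.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Stand-in for a mapped view of a shared memory block. Assumption: in a real
// program each process would get its own pointer to this memory by mapping
// the same section, rather than using a global array.
constexpr std::size_t kCapacity = 1024;
static std::uint32_t shared_region[kCapacity];

// Producer: generates values directly into the caller-supplied destination.
// Passing the shared view here means the data is born in shared memory.
std::size_t generate_into(std::uint32_t* dest, std::size_t count)
{
    assert(dest != nullptr);
    for (std::size_t i = 0; i < count; ++i) {
        dest[i] = static_cast<std::uint32_t>(i * i);
    }
    return count;
}

// Consumer: processes the values in place, without copying them out first.
std::uint64_t consume_from(const std::uint32_t* src, std::size_t count)
{
    assert(src != nullptr);
    std::uint64_t sum = 0;
    for (std::size_t i = 0; i < count; ++i) {
        sum += src[i];
    }
    return sum;
}
```

Calling `generate_into(shared_region, 4)` and then `consume_from(shared_region, 4)` adds up 0 + 1 + 4 + 9 and returns 14, and at no point did the four values exist anywhere other than the shared region.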

Imagine two processes sharing memory like two people sitting with a piece of paper between them. The first person can write something on the piece of paper, and the second person can see it immediately. Indeed, the second person can see it so fast that they can see the partial message before the first person finishes writing it. This is surely faster than giving each person a separate piece of paper, having the first person write something on their paper, and then asking a messenger to copy the message to the second person’s paper.

The “extra copy” straw man against shared memory would be like having three pieces of paper: one private to the first person, one private to the second person, and one shared. The first person writes their message on their private sheet of paper and then copies it to the shared piece of paper. The recipient sees the message on the shared piece of paper and copies it to their own private piece of paper. Yes, this entails two copies, but that’s because you set it up that way. The shared memory didn’t force you to create separate copies. That was your idea.

Now, maybe the data generated by the first process is not in a form that the second process can consume directly. In that case, you will need to generate the data into a local buffer and then convert it into a consumable form in the shared buffer. But you had that problem with WriteProcessMemory anyway. If the first process’s data is not consumable by the second process, then it will need to convert it into a consumable form and pass that transformed copy to WriteProcessMemory. So WriteProcessMemory has those same extra copies as shared memory.

Furthermore, WriteProcessMemory doesn’t guarantee atomicity. The receiving process can see a partially copied buffer. It’s not like the system is going to freeze all the threads in the receiving process to prevent them from seeing a partially-copied buffer. With shared memory, you can control how the memory becomes visible to the other process, say by using an atomic write with release when setting the flag which indicates “Buffer is ready!” The WriteProcessMemory function doesn’t let you control how the memory is copied. It just copies it however it wants, so you will need some other way to ensure that the second process doesn’t consume a partial buffer.
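
The “Buffer is ready!” flag could be sketched like this in portable C++. The `SharedBlock` layout and the names `publish` and `try_consume` are hypothetical, and the sketch assumes that `std::atomic<std::uint32_t>` is lock-free on the target so that it behaves the same when the block lives in memory shared between processes. The sender fills in the data first and sets the flag last with release semantics. The receiver checks the flag with acquire semantics before touching the data.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

constexpr std::uint32_t kMaxItems = 64;

// Hypothetical layout of the shared block.
struct SharedBlock
{
    std::atomic<std::uint32_t> ready{0}; // 0 = not ready, 1 = "Buffer is ready!"
    std::uint32_t count = 0;             // number of valid entries in items
    std::uint32_t items[kMaxItems] = {};
};

// Sender: write the data directly into the block, then set the flag with
// release semantics so the data writes become visible no later than the flag.
void publish(SharedBlock& block, std::uint32_t count)
{
    assert(count <= kMaxItems);
    for (std::uint32_t i = 0; i < count; ++i) {
        block.items[i] = i + 1;
    }
    block.count = count;
    block.ready.store(1, std::memory_order_release);
}

// Receiver: read the flag with acquire semantics. If it is not set, leave
// the data alone; if it is set, the complete buffer is guaranteed visible.
bool try_consume(const SharedBlock& block, std::uint64_t& sum)
{
    if (block.ready.load(std::memory_order_acquire) == 0) {
        return false; // buffer not ready yet
    }
    sum = 0;
    for (std::uint32_t i = 0; i < block.count; ++i) {
        sum += block.items[i];
    }
    return true;
}
```

Before `publish` runs, `try_consume` returns `false` and leaves the data alone. After `publish(block, 4)`, it returns `true` with a sum of 10. A real receiver would also need a way to wait for the flag and a way to hand the buffer back, which this sketch leaves out.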

Bonus insult: The WriteProcessMemory function internally makes two copies. It allocates a shared buffer, copies the data from the source process to the shared buffer, and then changes memory context to the destination process and copies the data from the shared buffer to the destination process. (It also has a cap on the size of the shared buffer, so if you are writing a lot of memory, it may have to go back and forth multiple times until it copies all of the memory you requested.) So you are guaranteed two copies with WriteProcessMemory.
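
To make the arithmetic concrete, here is a toy model of that behavior in portable C++. It is not the actual implementation: the function `copy_via_staging`, the `CopyStats` structure, and the `staging_cap` parameter are all made up for illustration, and an ordinary `memcpy` stands in for each leg of the transfer. It only counts how many bytes get moved and how many passes are needed when everything has to go through a staging buffer of limited size.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

struct CopyStats
{
    std::size_t bytes_copied = 0; // total bytes moved by all memcpy calls
    std::size_t round_trips = 0;  // source -> staging -> destination passes
};

// Toy model: move `size` bytes from src to dst through a staging buffer of
// at most `staging_cap` bytes. The cap is a hypothetical parameter.
CopyStats copy_via_staging(void* dst, const void* src, std::size_t size,
                           std::size_t staging_cap)
{
    assert(staging_cap > 0);
    std::vector<unsigned char> staging(std::min(size, staging_cap));
    CopyStats stats;
    const unsigned char* from = static_cast<const unsigned char*>(src);
    unsigned char* to = static_cast<unsigned char*>(dst);

    for (std::size_t offset = 0; offset < size; offset += staging.size()) {
        std::size_t chunk = std::min(staging.size(), size - offset);
        std::memcpy(staging.data(), from + offset, chunk); // copy 1: source -> staging
        std::memcpy(to + offset, staging.data(), chunk);   // copy 2: staging -> destination
        stats.bytes_copied += 2 * chunk;
        ++stats.round_trips;
    }
    return stats;
}
```

In this model, transferring 10 bytes with a 4-byte cap takes three passes (4 + 4 + 2) and moves 20 bytes in total, twice the size of the payload.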

Bonus chatter: Another strike against WriteProcessMemory is the security implications. It requires PROCESS_VM_WRITE, which basically gives full control of the process. Shared memory, on the other hand, requires only that you find a way to get the shared memory handle to the other process. The originating process does not need any special access to the second process aside from a way to get the handle to it. It doesn’t gain write access to all of the second process’s memory, only to the part that is shared. This adheres to the principle of least privilege, making it suitable for cases where the two processes are running in different security contexts.

Bonus bonus chatter: The primacy of shared memory is clear once you understand that shared memory is accomplished by memory mapping tricks. It is literally the same memory, just being viewed via two different apertures.

Author

Raymond Chen

Raymond has been involved in the evolution of Windows for more than 30 years. In 2003, he began a Web site known as The Old New Thing which has grown in popularity far beyond his wildest imagination, a development which still gives him the heebie-jeebies. The Web site spawned a book, coincidentally also titled The Old New Thing (Addison Wesley 2007). He occasionally appears on the Windows Dev Docs Twitter account to tell stories which convey no useful information.

2 comments

Jan Ringoš 34 minutes ago

The note about the WriteProcessMemory function internally making two copies took the wind out of my sails. I was experimenting with making a truly zero-copy pipe, where the Read function passes the provided target buffer pointer and size to the other side, and the Write function does just WriteProcessMemory. Now it’s ruined :’-(

Melissa P 3 hours ago

oh, anonymous shared memory is the best thing ever; changes are instant, interlocked/atomic operations done by the CPU just work, there is no syscall when accessing memory, you can have 100 participants, you can have fine-grained control over access to it (at 4K page granularity), you can designate reader/writer or producer/consumer, and with clever pointers you don't need any memory-copy at all

even pointers will work inside shared memory if both shares are mapped to the same base address in all participating processes and all pointer addresses are confined to that memory share area; you can even put a heap manager onto it

combined...
