November 19th, 2025

Is Write­Process­Memory faster than shared memory for transferring data between two processes?

Say you need to transfer a large amount of data between two processes. One way is to use shared memory. Is that the fastest way to do it? Can you do any better?

One argument against shared memory is that the sender will have to copy the data into the shared memory block, and the recipient will have to copy it out, resulting in two extra copies. On the other hand, Write­Process­Memory could theoretically do its job with just one copy, so would that be faster?

I mean, sure you could copy the data into and out of the shared memory block, but who says that you do? By the same logic, the sender will have to copy the data from the original source into a buffer that it passes to Write­Process­Memory, and the recipient will have to take the data out of the buffer that Write­Process­Memory copied into and copy it out into its own private location for processing.

I guess the theory behind the Write­Process­Memory design is that you could use Write­Process­Memory to copy directly from the original source, and place it directly in the recipient’s private location.

But you can do that with shared memory, too. Just have the source generate the data directly into the shared buffer, and have the recipient consume the data directly out of it. Now you have no copying at all!
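
As a rough illustration of that idea, here is a minimal Python sketch using the standard `multiprocessing.shared_memory` module. The record layout, the `produce` and `consume` helpers, and the record count are all hypothetical, invented for this example. To keep it self-contained, both sides run in one process; in real use the second `SharedMemory(name=...)` attach would happen in the receiving process.

```python
import struct
from multiprocessing import shared_memory

# Hypothetical record layout for illustration: one little-endian uint32.
RECORD = struct.Struct("<I")
COUNT = 1000

def produce(buf, count):
    """Generate records straight into the shared buffer (no private staging copy)."""
    for i in range(count):
        RECORD.pack_into(buf, i * RECORD.size, i * i)

def consume(buf, count):
    """Process records straight out of the shared buffer (nothing is copied out first)."""
    total = 0
    for i in range(count):
        (value,) = RECORD.unpack_from(buf, i * RECORD.size)
        total += value
    return total

# The sender creates the block; the receiver attaches to it by name.
sender = shared_memory.SharedMemory(create=True, size=COUNT * RECORD.size)
try:
    receiver = shared_memory.SharedMemory(name=sender.name)
    try:
        produce(sender.buf, COUNT)
        total = consume(receiver.buf, COUNT)
    finally:
        receiver.close()
finally:
    sender.close()
    sender.unlink()
```

The only place the payload ever lives is the shared block. This sketch deliberately ignores synchronization: a real consumer still needs some signal that the buffer is complete before it starts reading.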

Imagine two processes sharing memory like two people sitting with a piece of paper between them. The first person can write something on the piece of paper, and the second person can see it immediately. Indeed, the second person can see it so fast that they can see the partial message before the first person finishes writing it. This is surely faster than giving each person a separate piece of paper, having the first person write something on their paper, and then asking a messenger to copy the message to the second person’s paper.

The “two extra copies” straw man is like having three pieces of paper: one private to the first person, one private to the second person, and one shared. The first person writes their message on their private sheet and then copies it to the shared sheet; the recipient sees the message on the shared sheet and copies it to their own private sheet. Yes, this entails two copies, but that’s because you set it up that way. The shared memory didn’t force you to create separate copies. That was your idea.

Now, maybe the data generated by the first process is not in a form that the second process can consume directly. In that case, you will need to generate the data into a local buffer and then convert it into a consumable form in the shared buffer. But you had that problem with Write­Process­Memory anyway. If the first process’s data is not consumable by the second process, then it will need to convert it into a consumable form and pass that transformed copy to Write­Process­Memory. So Write­Process­Memory has those same extra copies as shared memory.

Furthermore, Write­Process­Memory doesn’t guarantee atomicity. The receiving process can see a partially copied buffer. It’s not like the system is going to freeze all the threads in the receiving process to prevent them from seeing a partially-copied buffer. With shared memory, you can control how the memory becomes visible to the other process, say by using an atomic write with release when setting the flag which indicates “Buffer is ready!” The Write­Process­Memory function doesn’t let you control how the memory is copied. It just copies it however it wants, so you will need some other way to ensure that the second process doesn’t consume a partial buffer.

Bonus insult: The Write­Process­Memory function internally makes two copies. It allocates a shared buffer, copies the data from the source process to the shared buffer, and then changes memory context to the destination process and copies the data from the shared buffer to the destination process. (It also has a cap on the size of the shared buffer, so if you are writing a lot of memory, it may have to go back and forth multiple times until it copies all of the memory you requested.) So you are guaranteed two copies with Write­Process­Memory.
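
To make the arithmetic concrete, here is a toy Python model of the behavior described above. It is not the actual implementation: `bounce_copy` is a hypothetical helper, and the 4096-byte cap is an arbitrary value picked for the example, not the real limit. It only counts the logical copies and round trips that a capped intermediate buffer implies.

```python
def bounce_copy(src, dst, cap):
    """Copy src into dst through an intermediate buffer of at most `cap` bytes.

    Returns (bytes_moved, round_trips), counting both logical copies.
    """
    bounce = bytearray(cap)
    bytes_moved = 0
    round_trips = 0
    for offset in range(0, len(src), cap):
        n = min(cap, len(src) - offset)
        bounce[:n] = src[offset:offset + n]   # copy 1: source -> intermediate buffer
        dst[offset:offset + n] = bounce[:n]   # copy 2: intermediate buffer -> destination
        bytes_moved += 2 * n
        round_trips += 1
    return bytes_moved, round_trips

payload = bytes(range(256)) * 40              # 10240 bytes of sample data
target = bytearray(len(payload))
moved, trips = bounce_copy(payload, target, cap=4096)
```

With a 10240-byte payload and a 4096-byte cap, the model moves 20480 bytes in three round trips: every byte is copied twice, and the cap adds extra trips as the payload grows.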

Bonus chatter: Another strike against Write­Process­Memory is the security implications. It requires PROCESS_VM_WRITE and PROCESS_VM_OPERATION access to the target process, which basically gives full control of that process. Shared memory, on the other hand, requires only that you find a way to get the shared memory handle to the other process. The originating process does not need any special access to the second process aside from a way to get the handle to it. It doesn’t gain write access to all of the second process’s memory; only the part of the memory that is shared. This adheres to the principle of least privilege, making it suitable for cases where the two processes are running in different security contexts.

Bonus bonus chatter: The primacy of shared memory is clear once you understand that shared memory is accomplished by memory mapping tricks. It is literally the same memory, just being viewed via two different apertures.
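
The “two apertures” idea can be seen in a small Python sketch that maps the same backing object twice with the standard `mmap` module. As an assumption made for portability, it uses a temporary file as the backing store and creates both views in one process; a real setup would typically use a pagefile-backed section and map one view in each process.

```python
import mmap
import tempfile

SIZE = 4096  # arbitrary size chosen for the example

with tempfile.TemporaryFile() as backing:
    backing.truncate(SIZE)                      # an empty file cannot be mapped

    # Two independent views ("apertures") onto the same underlying memory.
    view_a = mmap.mmap(backing.fileno(), SIZE)
    view_b = mmap.mmap(backing.fileno(), SIZE)
    try:
        view_a[0:5] = b"hello"                  # write through the first view
        seen_by_b = view_b[0:5]                 # visible through the second view

        view_b[0:5] = b"HELLO"                  # and the reverse direction
        seen_by_a = view_a[0:5]
    finally:
        view_a.close()
        view_b.close()
```

Nothing transfers data between the two views. A store through one is simply a store to memory that the other is also looking at.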


Author

Raymond has been involved in the evolution of Windows for more than 30 years. In 2003, he began a Web site known as The Old New Thing which has grown in popularity far beyond his wildest imagination, a development which still gives him the heebie-jeebies. The Web site spawned a book, coincidentally also titled The Old New Thing (Addison Wesley 2007). He occasionally appears on the Windows Dev Docs Twitter account to tell stories which convey no useful information.

2 comments

  • Jan Ringoš 34 minutes ago

    The note about the Write­Process­Memory function internally making two copies took the wind out of my sails. I was experimenting with making a truly zero-copy pipe, where the Read function passes the provided target buffer pointer and size to the other side, and the Write function just calls WriteProcessMemory. Now it’s ruined :’-(

  • Melissa P 3 hours ago

    oh, anonymous shared memory is the best thing ever; changes are instant, interlocked/atomic operations done by the CPU just work, there is no syscall when accessing memory, you can have 100 participants, you can control access to it at a fine grain (4K page size), you can designate reader/writer or producer/consumer, and with clever pointers you don't need any memory-copy at all

    even pointers will work inside shared memory if both shares are mapped to the same base address in all participating processes and all pointer addresses are confined to that memory share area; you can even put a heap manager onto it

    combined...
