September 1st, 2020

More on trivial functions like CopyRect and EqualRect

Raymond Chen

Some time ago, I discussed trivial functions like CopyRect and EqualRect and why they even exist at all.

Piotr Siódmak wondered if you could just use memcpy to copy a rectangle. “It’s already there and probably already used by other parts of code, so it’s 0 bytes of overhead.”

One of the features of CopyRect and friends is that they all take far pointers to RECT. This means that you can use them to manipulate rectangles received from the operating system, even if your program uses the small or medium memory model.

The memcpy version of CopyRect would actually be _fmemcpy in order to support far pointers. And since it’s a C runtime function, it would follow the __cdecl calling convention, which is caller-clean. And you would have to pass an extra parameter to represent the size of a rectangle.

b8 08 00        mov  ax, 8          ; sizeof(RECT)¹
50              push ax
c4 5e f0        les  bx, [bp-10]    ; es:bx -> source rect
53              push bx
06              push es
c4 5e ec        les  bx, [bp-14]    ; es:bx -> destination rect
53              push bx
06              push es
9a xx xx xx xx  call _fmemcpy
83 c4 0a        add  sp, 10         ; clean the stack

This adds seven bytes to the code size.

But it’s worse than that. You also have to segment-tune your _fmemcpy function: Which functions are you going to put into a single segment for code swapping purposes? Maybe copying rectangles is something your program does constantly, so you want to keep it in your “hot” segment. Or maybe it’s something you do only occasionally, in which case you would put it in one of the more rarely-used segments. It means that when you do get around to copying a rectangle, you may have to pause to load the _fmemcpy function off the floppy drive.

And since the function is linked into your program, it means that if you are running five programs, each of them will have its own copy of _fmemcpy, with various other functions coming along for the ride in the same segments.

Why not just use the copy that the window manager already has?

The rectangle functions are in a “fixed” segment in the window manager, meaning that it is permanently loaded and cannot get swapped out. Calling it is basically free. You will never incur a segment swap by calling CopyRect or EqualRect. No need to create five private copies of a function and swap them in and out. Just use the one copy that the system already provides. That’ll help you fit your program into 256KB of memory.

Bonus chatter: The CopyRect function is also more efficient than a general-purpose _fmemcpy since it knows that it’s copying exactly eight bytes. The _fmemcpy function needs to support arbitrary-sized memory blocks, so it will spend extra time preparing for a block copy operation for maximum speed, even though that speed gain is not realized on such small copies.

¹ Why not use the two-byte push 8 instruction? Because this is 8086 code, and the two-byte “push sign-extended 8-bit immediate” instruction didn’t exist. It was first available in the 80186 processor.

Author

Raymond Chen

Raymond has been involved in the evolution of Windows for more than 30 years. In 2003, he began a Web site known as The Old New Thing which has grown in popularity far beyond his wildest imagination, a development which still gives him the heebie-jeebies. The Web site spawned a book, coincidentally also titled The Old New Thing (Addison Wesley 2007). He occasionally appears on the Windows Dev Docs Twitter account to tell stories which convey no useful information.

7 comments

Discussion is closed. Login to edit/delete existing comments.

Sort by :

Newest

Newest Popular Oldest

Simon Clarkstone September 2, 2020 0

Could you (ab)use the CopyRect function to copy any 8 bytes around or would it fail on some memory contents? Do you know if programmers and compilers used that trick?
Looking through the related functions https://docs.microsoft.com/en-us/windows/win32/gdi/rectangle-functions , I can see all sorts of potentially abusable ones, for programmers or compilers wanting to save space and willing to make a function call with far pointers. 🙂
Jim Nelson September 1, 2020 0

As Neil Rashbrook pointed out in the original post, another advantage to using these Windows-provided functions meant not having to include the CRT with your program. Not only did that translate to a smaller in-memory footprint, but it also meant smaller on-disk footprints and simplified code distribution. In a time when programs were carried around on floppy disk and downloaded from dial-up services, being able to provide a single zero-install .EXE for a...
Read more
As Neil Rashbrook pointed out in the original post, another advantage to using these Windows-provided functions meant not having to include the CRT with your program. Not only did that translate to a smaller in-memory footprint, but it also meant smaller on-disk footprints and simplified code distribution. In a time when programs were carried around on floppy disk and downloaded from dial-up services, being able to provide a single zero-install .EXE for a handy application was a big win.

Read less
Alex Martin September 1, 2020 0

> That’ll help you fit your program into 256MB of memory.

256MB? Who has 256MB on an 8086?
- Raymond Chen Author September 1, 2020 0
  
  Oops. Fixed, thanks.
- Peter Cooper Jr. September 1, 2020 0
  
  Yeah, I’m pretty sure that’s supposed to be 256KB (like in the initial article from February).
  
  But, what’s three orders of magnitude among friends?
  
  I’m, uh, glad I don’t need to count every byte like this in my day job. It’s amazing what old games and programs did in just a few dozen kilobytes.
  - Piotr Siódmak September 1, 2020 0
    
    Sometimes you wish people would go back at least a few steps to those times. Now you have custom script code interpreted by python interpreter writeen in typescript dynamically transpiled into javascript running in a web browser written in java emulated in .net written in javascript on a nodejs server running in an docker image on a linux VM running on a hypervisor written in java in an antivirus sandbox... because we have so many...
    Read more
    Sometimes you wish people would go back at least a few steps to those times. Now you have custom script code interpreted by python interpreter writeen in typescript dynamically transpiled into javascript running in a web browser written in java emulated in .net written in javascript on a nodejs server running in an docker image on a linux VM running on a hypervisor written in java in an antivirus sandbox… because we have so many resources, so who cares
    
    and all the program does is display a cat picture with a whole bunch on ads plastered around
    
    Read less
- Brian Boorman September 1, 2020 0
  
  Typo? He probably meant 256KB. I assume that muscle memory kicked in and type MB.