In practice, when you set a code breakpoint in the debugger, the debugger replaces the instruction at that location with a breakpoint instruction.¹ When execution reaches that location, the processor executes the breakpoint instruction and breaks into the debugger.
When the program has been stopped in the debugger, what happens next can vary from debugger to debugger. Some debuggers remove all their breakpoints when the program stops, and then restore the breakpoints when the program resumes. Other debuggers leave the breakpoints in place even when the program is stopped.
In both cases, if you inspect the memory in the debugger, you will see the original unpatched code. In the first case, it’s because the code really is unpatched; the breakpoint instructions have been removed. In the second case, it’s because the debugger is lying to you and showing you the original bytes even though they aren’t what is in memory right now.
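Here is a minimal sketch of the second kind of debugger, the one that leaves its breakpoints in place and hides them from you. It is a toy model in Python, not how any particular debugger is implemented: the `ToyDebugger` class and its method names are made up for illustration, and "memory" is just a byte array.

```python
BREAKPOINT = 0xCC  # x86 breakpoint opcode, per the footnote


class ToyDebugger:
    """Hypothetical model of a debugger that leaves breakpoints patched in."""

    def __init__(self, memory: bytearray):
        self.memory = memory  # the debuggee's real memory
        self.saved = {}       # address -> original byte that was overwritten

    def set_breakpoint(self, address: int) -> None:
        # Remember the original byte, then patch in the breakpoint opcode.
        if address not in self.saved:
            self.saved[address] = self.memory[address]
            self.memory[address] = BREAKPOINT

    def clear_breakpoint(self, address: int) -> None:
        # Put the original byte back.
        self.memory[address] = self.saved.pop(address)

    def read_memory(self, address: int, length: int) -> bytes:
        # What the debugger shows the user: patched bytes are swapped
        # back to their saved originals before display.
        view = bytearray(self.memory[address:address + length])
        for bp, original in self.saved.items():
            if address <= bp < address + length:
                view[bp - address] = original
        return bytes(view)


# push ebp; mov ebp, esp; pop ebp; ret
code = bytearray([0x55, 0x8B, 0xEC, 0x5D, 0xC3])
dbg = ToyDebugger(code)
dbg.set_breakpoint(1)

shown = dbg.read_memory(0, 5)  # the debugger's view: original bytes
actual = bytes(code)           # what is really in memory: byte 1 is 0xCC
```

The debugger's view (`shown`) and the real contents (`actual`) differ at exactly the patched byte. Anything that reads memory without going through the debugger's filtered view, such as the program itself, sees `actual`.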
Most of the time, this deception is insignificant. Everything looks as if no patching has occurred.
But sometimes you will notice.
One case where you will notice is if the program tries to read from its own code bytes. In that case, it will see the patched instructions.
Another case is where you mistakenly set a code breakpoint on data. The debugger replaces the “instruction” at the data you specified with a breakpoint instruction, and then resumes execution. Your code then tries to read from that data, and instead of reading the original data, it reads the breakpoint instruction. What happens next depends on what the program tries to do with that data, but it’s usually not good.
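To make the damage concrete, here is a small sketch of what a misplaced breakpoint does to a value. It assumes, purely for illustration, that the data is a 32-bit little-endian counter holding 1000, and it simulates the debugger's patch by overwriting the first byte by hand.

```python
import struct

BREAKPOINT = 0xCC  # x86 breakpoint opcode

# Toy memory image: a 32-bit little-endian counter with the value 1000
# (bytes E8 03 00 00).
memory = bytearray(struct.pack("<I", 1000))

before = struct.unpack_from("<I", memory, 0)[0]  # 1000

# A code breakpoint set on this address by mistake: the debugger saves
# the first byte and overwrites it with the breakpoint opcode.
saved_byte = memory[0]
memory[0] = BREAKPOINT

# The program now reads its counter and gets CC 03 00 00 instead.
after = struct.unpack_from("<I", memory, 0)[0]   # 0x03CC == 972
```

The counter silently changes from 1000 to 972. Nothing crashes at the moment of the patch; the program just carries on with the wrong value until something downstream goes wrong.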
So take care when you set your code breakpoints. Make sure they really are on code.
¹ The encoding of the breakpoint instruction on x86 is the single byte 0xCC.
Maybe I don’t use a debugger often enough (guilty!) but how “easy” is it to place a breakpoint in data? Other than when debugging ASM code I don’t see how that could occur.
Still, an interesting point to ponder.
@alan robinson
This comes from the processor, and it also requires kernel-mode support. Debug registers 0-3 can each hold a virtual address, and each can be configured so that the processor raises an exception when that address is written to. I know the AMD references better, and you can find information on this in the AMD64 Architecture Programmer Reference volume 2 chapter 13 (page 389).
@Joe Beans
It shouldn't need to do anything different than native...
How does managed code do it? Does the JIT reserve a NOOP in between every source line?
Fun fact about the “replace the instructions with an interrupt” technique: it was patented (number 3415981) by my dad back in the 60’s when he was building big computers for RCA. And let me just say, old patents are way cooler than current ones: they have fancy ribbons and embossed seals on them.
Really cool, didn’t know this technique goes back to the 60s!
Re Antonio:
RISC-V takes it a step further, and both the “all 0’s” and “all 1’s” patterns are illegal instructions (remember that unprogrammed EEPROMs usually have the “all 1’s” pattern).
In the 70s it was already in wide use: in the 6502 (from 1975) the software interrupt instruction is called BRK. The Apple II "Monitor", the debugger built into the firmware, used it to implement step-by-step execution and traces.
Making BRK's opcode 0x00 was a nice choice: if there was a jump to nowhere (code not loaded, incorrect pointer or offset, return with stack out of sync...), it would not take long to find...