Recall that the original purpose of creating a temporary fiber was to ensure that a minimum amount of stack was available for the function to perform its operations. But it would be nice if we could bypass all the fiber machinery if the existing stack is large enough.
So how can you figure out if the existing stack is large enough?
One way is to use _alloca and catch the stack overflow. You need to put this in a separate function because the goal of the _alloca is not to use the allocated memory directly, but rather to ensure that enough memory is available. You want to free the _alloca-allocated memory immediately, which means returning from the function immediately.
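#include <windows.h> // EXCEPTION_* codes
#include <malloc.h>  // _alloca, _resetstkoflw

__declspec(noinline) bool is_stack_available(size_t amount)
{
  __try {
    _alloca(amount); // probe only; the memory is never used
    return true;
  } __except (
    GetExceptionCode() == EXCEPTION_STACK_OVERFLOW
      ? EXCEPTION_EXECUTE_HANDLER
      : EXCEPTION_CONTINUE_SEARCH) {
    _resetstkoflw(); // re-arm the guard page
    return false;
  }
}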
If there is at least amount of stack remaining,¹ then the _alloca will succeed, and the function returns true. The act of returning frees the memory. This is why it’s important that the function be noinline: We need to make sure the function actually returns.
If there is insufficient stack, then an EXCEPTION_STACK_OVERFLOW structured exception is raised. If that happens, then we handle the exception by calling _resetstkoflw² to re-arm the guard page, and then return false to let the caller know that the allocation failed.
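To tie this back to the goal of skipping the fiber, a caller could branch on the result of the probe. The following is only a sketch: run_with_minimum_stack and its callback parameters are hypothetical names that do not appear in the original code, and the probe is passed in as a parameter so that the snippet does not depend on any particular platform.

```cpp
#include <cstddef>
#include <functional>

// Hypothetical helper (not from the original article): run the work on the
// current stack if the probe reports enough room, else take the fiber path.
inline void run_with_minimum_stack(
    std::size_t amount,
    const std::function<bool(std::size_t)>& probe,
    const std::function<void()>& run_directly,
    const std::function<void()>& run_on_fiber)
{
    if (probe(amount)) {
        run_directly();   // enough stack: bypass the fiber machinery
    } else {
        run_on_fiber();   // not enough stack: fall back to the temporary fiber
    }
}
```

On Windows, you would pass is_stack_available as the probe, and the fiber callback would wrap whatever fiber-creation code you already have.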
This technique has the advantage of relying on the C runtime library itself to do the overflow detection. This defers the work of keeping things in sync with the implementation to the implementation, which is a good thing for maintenance.
On the other hand, the _alloca function actually allocates the memory, converting the guard pages into real committed pages. If your function doesn’t always consume all of the reserved space, the memory is nevertheless committed to your process and considered recently-accessed, which can force other pages out of your process’s working set.
Next time, we’ll look at another way to estimate the amount of available stack space.
¹ Note that the Itanium has two stacks, so this test probes only for remaining space on the data stack. There is no obvious way to probe for remaining space in the register backing store.
² Somebody must have been billing by the character when that function name was chosen.
I am surprised that the Itanium is still a thing in 2020 (although to be fair it's probably only still a thing because HP is spending $$$ to make it still a thing so they can compete with the likes of SPARC and POWER).
I've glanced back across your Itanium processor series and I'm still not 100% sure, but presumably an overflow of the register backing store would also be caught by SEH? Could you do something like count how nested you can get into a recursive function?
I suppose the difficulty would be in knowing how much spill-space your function requires, maybe some inline assembly that forces (all/as many registers as possible) to be banked. Though I'm not quite...
I have a bad feeling that I’m the one who wrote and named _resetstkoflw, about 20 years ago. Sorry, I have no excuse for the name. It was a simpler time, when vowels were too valuable to waste, I guess.
When I was a mainframe programmer, I always wondered why IBM System/360 and 370 JCL had an “ASSGN” keyword to … assign things. The keyword saved ONE letter. The limit for these kinds of keywords was 8 characters… ASSIGN is only 6 characters.
Apparently vowels cost much more than consonants back in the good old days.
Now that the Great Vowel Shortage is over, we’re getting ridiculous identifiers like hardware_constructive_interference_size in C++.
Meh, vowels kshmowels (says someone whose surname has a cluster of Slavic consonants).
At first I wondered if it had to do with some old compiler or linker limit on the number of characters considered in an exported identifier, but that wouldn’t make sense since the first (eg) 6 chars are “resets” or “_reset” if the underscore counts, which isn’t exactly a unique-ish name.
Scarcity of vowels has been an ongoing problem with computers for decades: I blame programmers who never empty their bit buckets.
The code example seems to be missing “if {” before “GetExceptionCode()”.
Nope, it’s good. The expression inside the __except is supposed to return a code that tells the structured exception handler what to do next. If the exception was a stack overflow, we want to run the handler. Otherwise, we want to propagate it.
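If the conditional expression is hard to read in place, the same decision can be moved into a named filter function. This is just a sketch: stack_overflow_filter is a hypothetical name, and the fallback constants in the #else branch are stand-ins that mirror the Windows SDK values so that the snippet also compiles without the Windows headers.

```cpp
#ifdef _WIN32
#include <windows.h>
#else
// Assumption: stand-ins that mirror the Windows SDK values,
// present only so this sketch compiles on other platforms.
#define EXCEPTION_STACK_OVERFLOW  0xC00000FDUL
#define EXCEPTION_EXECUTE_HANDLER 1
#define EXCEPTION_CONTINUE_SEARCH 0
#endif

// Hypothetical helper: decides what the __except clause should do
// with the exception code it is given.
inline int stack_overflow_filter(unsigned long code)
{
    return code == EXCEPTION_STACK_OVERFLOW
        ? EXCEPTION_EXECUTE_HANDLER   // stack overflow: run the handler block
        : EXCEPTION_CONTINUE_SEARCH;  // anything else: keep looking for a handler
}
```

With a helper like this, the clause would read __except (stack_overflow_filter(GetExceptionCode())), which makes it clearer that the parenthesized part is an expression producing a disposition, not a statement.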