The good and the bad of exception filters
Every so often we get asked questions about the CLR’s support for exception filters. Why do some languages support them and others do not? Is it a good idea to add them to my new language? When should I use a filter vs. catch/rethrow? Etc. I’ll try to answer some of these questions for you here, and while I won’t go into all of them hopefully you’ll walk away with enough info to form your own opinion on the rest. Like so many things there’s good things and bad things about exception filters…
So what’s a filter?
The CLR provides a number of exception handling primitives that higher level languages can build upon. Some are fairly obvious, and map readily to language constructs that most of us know and love: try/catch and try/finally, for instance. I’d hazard to guess that everyone knows what those do, but just in case, let’s consider a quick example in C#:
if (P) throw new ArgumentException();
catch (ArgumentException e)
If P is true, then that will print out “1234”, of course. If P is false, then it will print “124”. Groovy.
But the CLR also provides two more EH primitives: fault and filter. A fault clause is much like a finally clause; it runs when an exception escapes its associated try block. The difference is that a finally clause also runs when control leaves the try block normally, whereas the fault clause will only run when control leaves the try block due to an exception. In the case above, if we replaced the “finally” with “fault” (there’s no C# syntax for that, but suspend your disbelief for a moment) then it would print “1234” if P is true, and “14” is P is false. See the difference? Most languages don’t expose this as a first-class language construct, but we do have a few that use fault clauses under the covers for specific scenarios.
So that leaves us with filters. I suppose the simplest definition of a filter is that it is a construct that allows one to build a conditional catch clause. In fact, that’s exactly what VB uses filters for. Let’s consider a more complicated example in VB:
Function Foo() As Boolean
Dim P As Boolean = True
If (P) Then
Throw New ArgumentNullException()
Console.Write(“P was False!”)
Catch ex As ArgumentNullException When Foo()
Catch ex As ArgumentException
Here you’ll note the “Catch ex As ArgumentNullException When Foo()” line is a conditional catch statement. The catch handler will only execute and print “4” when the exception is an ArgumentNullException and when Foo() returns true. If Foo() returns false, then the catch clause doesn’t execute, and the system continues to search for a catch clause that can handle the exception. In this case, the very next clause would handle the exception, and print “5”.
So, what do you think this program prints? Don’t cheat by attempting to compile and run it! Using what you know about exception handling and looking at the program structure and syntax, what would you imagine this program prints? I suspect most people would guess “12346”. I even gave you a clue with the numbering.
I think most of us would look at the example above and conclude that the result should be “12346” because when we look at the syntax above we, quite rightly, see lexically scoped language constructs. We expect that when the code in the inner finally clause starts executing that no more code anywhere in the associated try block will execute. For instance, in the example above, if P is true then when we enter the Finally we know that no more code will execute in the try block, and that we’ll never print “P was False!”. Likewise, when we evaluate one of the catch clauses, we expect that all the code in the associated try block is done executing.
And here comes the bad…
It turns out that program actually prints “13246”. My clue with the numbering was an evil ruse. After the throw, Foo() is executed first as part of evaluating the first catch clause, and then the finally within the associated try block executes. And that’s just freaky… what happened to our lexically scoped language constructs?!
This is a surprising result for most. It breaks our intuitive reasoning about the language based on the lexically scope exception handling constructs provided. Here, when we evaluate the conditional catch clause, all the code in the associated try block has not, in fact, finished executing.
Why does this happen?
The reason we see “3” before “2” is subtle, and founded in the CLR’s implementation of exception handling. The CLR’s exception handling model is actually a “two pass” model. When an exception is thrown, the runtime searches the call stack for a handler for the exception. The goal of the first pass is to simply determine if a handler for the exception is present on the stack. If it sees finally (or fault) clauses, it ignores them for a moment.
The handler may be in the form of a typed catch clause, i.e., “Catch ex as ArgumentException)”. When the runtime sees a typed catch clause, it can determine itself if this clause will handle the exception by performing a simple check to determine if the exception object is of a type that inherits from (or is) the type in the clause.
But when the runtime sees a filter, it must execute the filter in order to determine if the associated handler will handle the exception. If the filter evaluates to true, then a handler has been found. If it evaluates to false, then the runtime will keep searching for a handler.
Once a handler has been found, the first pass is over and the second pass begins. On the second pass, the runtime again runs the call stack from the point of throw, but this time it executes all finally (or fault) clauses it finds on the way to the handler it identified on the first pass. When the handler is reached, it is executed, and the exception has finally been handled.
But why is that bad?
“Okay”, you say, “I get it. I understand that filters run during the first pass. I can deal with that… what’s the big deal?” Well, let’s first consider what a finally clause is for. We typically use finally clauses to ensure that our program state remains consistent when exiting a function even in the face of an exception. We put back temporarily broken invariants. Consider that the C# “using” statement is built using try/finally, and then consider all the things one might do with that.
But when your filter runs, none of those finally clauses have executed. If you called into a library within the associated try block, you may have not actually completed the call when your filter executes. Can you call back into the same library in that case? I don’t know. It might work. Or it might yield an assert, or an exception, or, well, your guess is as good as mine. The point is that you can’t tell.
Using filters wisely (or, “the good”)
But the notion of a conditional catch clause really is quite appealing, and there are ways to use these without getting caught by the problems of when the filter actually executes. The key is to only read information from either the exception object itself, or from immutable global state, and to not change any global state. If you limit your actions in a filter to just those, then it doesn’t matter when the filter runs, and no one will be able to tell that the filter ran out of order.
For instance, if you have a fairly general exception, like COMException, you typically only want to catch that when it represents a certain HRESULT. For instance, you want to let it go unhanded when it represents E_FAIL, but you want to catch it when it represents E_ACCESSDEINED because you have an alternative for that case. Here, this is a perfectly reasonable conditional catch clause:
Catch ex As System.Runtime.InteropServices.COMException When ex.ErrorCode() = &H80070005
The alternative is to place the condition within the catch block, and rethrow the exception if it doesn’t meet your criteria. For instance:
Catch ex As System.Runtime.InteropServices.COMException
If (ex.ErrorCode != &H80070005) Then Throw
Logically, this “catch/rethrow” pattern does the same thing as the filter did, but there is a subtle and important difference. If the exception goes unhandled, then the program state is quite different between the two. In the catch/rethrow case, the unhandled exception will appear to come from the Throw statement within the catch block. There will be no call stack beyond that, and any finally blocks up to the catch clause will have been executed. Both make debugging more difficult. In the filter case, the exception goes unhandled from the point of the original throw, and no program state has been changed by finally clauses.
The problem is that we rely on programmer discipline in order to use filters correctly, but it’s easy to use them incorrectly and end up with infrequently executed code (exceptions, after all, are for exceptional circumstances) that has subtle and hard to diagnose bugs due to inconsistent program state that should have been cleaned up by finally clauses further down the stack.
Why does the CLR use a two-pass exception handling model?
The CLR implements a two-pass exception handling system in order to better interoperate with unmanaged exception handling systems, like Win32 Structured Exception Handling (SEH) or C++ Exception Handling. We must run finally (and fault) clauses on the second pass so they run in order with unmanaged equivalents. Likewise, we must not execute filters later (say, on the second pass) because one of those unmanaged systems may have remembered that it was supposed to be responsible for handling the exception. If we were to run a filter late on the second pass and decide that a managed clause really should catch the exception after previously having not declared that on the first pass, then we would violate our contract with those unmanaged mechanisms with unpredictable results.
So, in short, it’s for interop, and like many things involving interop, we have a compatibility burden that we can’t ignore.
Would a one-pass model be better?
Many have wondered over the years if perhaps the two-pass model in general is bad, and if a one-pass model would be better. Like so many things in the world, it’s just not that clear. A one-pass model would simplify the exception handling implementation, and it would make more sense in the cases shown above. However, there are advantages to the two-pass model that can’t be ignored. Perhaps the most important one is that if the search for a handler fails on the first-pass the exception goes unhandled and in general no program state has changed, even though filters are run since filters tend not to change things. The call stack is still intact, and all values that lead to the exception are still present on the stack and on the heap (assuming no race conditions.) This is frequently essential when debugging an unhandled exception. In a one-pass model, all of the finally clauses would have been run before the exception goes unhandled.
Of the languages that MS ships, only VB and F# support filters, and both do so via conditional catch clauses. In F#, you have to really go out of your way to attempt to inspect mutable global state, or to actually have a side effect, so you’re fairly safe there. In VB, though, you can call a function from the “when” clause of their catch statement, and in there you have free reign to do whatever you please. You can, without a doubt, get yourself into trouble attempting to do too much complicated work within such a filter. To keep your world safe and simple try to limit yourself to expressions that only access the exception object, and don’t modify anything. If you go beyond that, you need to consider carefully all the code executed in the try block, and if your actions in the filter will work if the backout code below has not finished executing yet.