May 26th, 2011

Enforcing Correct Concurrent Access of Class Data

Jim Springfield

Hi, this is Jim Springfield. I’m an architect on the Visual C++ team.

In any concurrent application, protecting data from concurrent access is extremely important. There are many primitives that can be used for this, such as critical sections, mutexes, reader-writer locks, etc. There are also some newer high-level approaches to concurrency such as those provided by the Concurrency Runtime, although this isn’t the focus of what I’m showing here. However, there isn’t a good way in C++ to make sure that you are really protecting data correctly when accessing it from multiple threads. You will often see a comment (likely made by the original author) next to a member that reminds you to take some lock when accessing the data. There may be many data items all using the same lock and there may be more than one lock, with some data protected by one lock and some by another.

When it comes time to access some data from a member function, you have to start asking some questions. Who is going to call this member? What locks will already be held? Could I deadlock here? While I don’t have a solution to all of these, I do have a technique that allows you to be more aggressive with trying things and more comfortable with making changes to existing code, while guaranteeing that you don’t violate the requirement that a particular lock is held.

What I’m going to show is a way to associate a lock with a data member such that whenever that data member is accessed, a check is made that the proper lock is held by the thread. The basis for the technique uses native properties to provide access to data members. With a small set of macros, you can easily retrofit existing code to provide this benefit. I developed this technique years ago and I have used it in several code bases to catch problems with concurrent access.

Here is an example of something you will typically see in code. The developer has written that a critical section should be held when accessing m_rgContextsCache.

  1. // Make sure m_cs is held when accessing m_rgContextsCache
  2. vector<FileConfig> m_rgContextsCache;

Wouldn’t it be great if this information could be specified in code AND enforced? The code below shows how to transform this into just that.

  1. PROTECTED_MEMBER(m_cs, vector<FileConfig>, m_rgContextsCache);

Now, whenever m_rgContextsCache is accessed, a user-defined function will be called if the proper lock is not held. What the macro does is to create the actual data member with a slightly modified name and a property with the name specified. Now, all you have to do is run your code and see if any errors occur. There is one “gotcha”. When members are initialized in the constructor or referenced in a destructor, the lock isn’t going to be held. For those cases, you need to directly access the member. A macro that translates a name into the modified “real” name can be used. It can also be used anywhere that it is specifically safe to access the member outside of the lock. The nice thing is that it is now very clear when you are doing this. Here is the code for this.

  1. // The USN macro is used when you need to access a data member in an “unsafe” way.
  2. // This makes sense when you know no other thread is accessing it, such as in a constructor.
  3. #define USN(name) name##_usn_

The PROTECTED_MEMBER macro is defined below. The first line creates the actual member. The second line creates the property and the remaining lines implement the get and put.

  1. #define PROTECTED_MEMBER(cs, type, name) \
  2.     type USN(name);\
  3.     __declspec(property(get=Get_##name, put=Put_##name)) type name;\
  4.     type & Get_##name()\
  5.     {_PROTECT(verify_lock(cs));return USN(name);}\
  6.     type const & Get_##name() const\
  7.     {_PROTECT(verify_lock(cs));return USN(name);}\
  8.     type & Put_##name(type const & x)\
  9.     {_PROTECT(verify_lock(cs));USN(name) = x;return USN(name);}

There are a couple of things that aren’t defined yet. The verify_lock function will return a boolean indicating whether the lock is held or not. These can be defined for any type of lock you use. There is also the _PROTECT macro. This should be defined to do whatever you want in the case of a failure. This could log, assert, crash, etc.

There are some other variations of the macro to handle some additional cases. One is to handle arrays. It provides a parameterized property which handles the index.

  1. #define PROTECTED_MEMBER_ARRAY(cs, elemtype, name, length) \
  2.     typedef elemtype type_##name[length];\
  3.     elemtype USN(name)[length];\
  4.     __declspec(property(get=Get_##name, put=Put_##name)) elemtype name[length];\
  5.     elemtype& Get_##name(size_t i)\
  6.     {_PROTECT(verify_lock(cs));return USN(name)[i];}\
  7.     type_##name& Get_##name()\
  8.     {_PROTECT(verify_lock(cs));return USN(name);}\
  9.     const elemtype& Put_##name(size_t i, elemtype const& x)\
  10.     {_PROTECT(verify_lock(cs));USN(name)[i] = x;return USN(name)[i];}

To handle a reader-writer lock, a slightly different macro is used. Instead of “verify_lock”, two other functions are used: verify_readlock and verify_writelock. Again, these can be user-defined to handle any type of reader-writer lock. There is one additional wrinkle here, however. There is a function defined called “GetWritable_##name”. The getter returns a const& to the underlying member and verifies that a read lock is held, but this won’t allow you to call methods on it that modify it. To do that, you have to explicitly call GetWritable_##name. This will return a non-const reference and verify the write lock is held.

  1. #define PROTECTED_MEMBER_RW(lock, type, name) \
  2.     type USN(name);\
  3.     __declspec(property(get=Get_##name, put=Put_##name)) type name;\
  4.     const type & Get_##name()\
  5.     {_PROTECT(verify_readlock(lock));return USN(name);}\
  6.     type & Put_##name(type const& x)\
  7.     {_PROTECT(verify_writelock(lock));USN(name) = x;return USN(name);}\
  8.     __declspec(property(get=GetWritable_##name)) type Writable_##name;\
  9.     type & GetWritable_##name()\
  10.     {_PROTECT(verify_writelock(lock));return USN(name);}

There are a couple of other variations to the PROTECTED_MEMBER macro to handle some cases that can occur. If the data member can’t be assigned to (i.e. it is a type without assignment), we need to not provide a “Put” or we will get a compile error. Similarly, we may have a type that can’t be assigned from const data. These cases occur rarely in practice, but they do occur.

  1. #define PROTECTED_MEMBER_NC(cs, type, name) \
  2.     type USN(name);\
  3.     __declspec(property(get=Get_##name, put=Put_##name)) type name;\
  4.     type & Get_##name()\
  5.     {_PROTECT(verify_lock(cs));return USN(name);}\
  6.     type const & Get_##name() const\
  7.     {_PROTECT(verify_lock(cs));return USN(name);}\
  8.     template <typename T>\
  9.     type & Put_##name(T x)\
  10.     {_PROTECT(verify_lock(cs));USN(name) = x;return USN(name);}\
  11.  
  12. #define PROTECTED_MEMBER_GET(cs, type, name) \
  13.     type USN(name);\
  14.     __declspec(property(get=Get_##name, put=Put_##name)) type name;\
  15.     type & Get_##name()\
  16.     {_PROTECT(verify_lock(cs));return USN(name);}\
  17.     type const & Get_##name() const\
  18.     {_PROTECT(verify_lock(cs));return USN(name);}

Finally, here are some examples of verify_lock and verify_unlock that can handle critical sections by pointer or by reference.

  1. inline bool verify_lock(const CRITICAL_SECTION& cs)
  2. {
  3.     return (cs.OwningThread == (HANDLE)(UINT_PTR)GetCurrentThreadId());
  4. }
  5. inline bool verify_unlock(const CRITICAL_SECTION& cs)
  6. {
  7.     return (cs.OwningThread == (HANDLE)(UINT_PTR)0);
  8. }
  9.  
  10. inline bool verify_lock(const CRITICAL_SECTION* cs)
  11. {
  12.     return (cs->OwningThread == (HANDLE)(UINT_PTR)GetCurrentThreadId());
  13. }
  14. inline bool verify_unlock(const CRITICAL_SECTION* cs)
  15. {
  16.     return (cs->OwningThread == (HANDLE)(UINT_PTR)0);
  17. }

What I typically do is put all of these macros in a header file under an #ifdef _PROTECT guard. If _PROTECT is not defined, then I simply let everything collapse to simple data members. For release builds, the code is just as fast as before.

  1. #ifdef _PROTECT
  2. // All of the code from above
  3. #else
  4. #define USN(name) name
  5. #define PROTECTED_MEMBER(cs, type, name) type name;
  6. #define PROTECTED_MEMBER_NC(cs, type, name) type name;
  7. #define PROTECTED_MEMBER_GET(cs, type, name) type name;
  8. #define PROTECTED_MEMBER_RW(cs, type, name) type name;
  9. #define PROTECTED_MEMBER_ARRAY(cs, elemtype, name, length) elemtype name[length];
  10. #endif

Finally, there is no good way that I’ve found to do this for global or static member data. Usually, it isn’t too much work to wrap global or static data into a class (along with the appropriate lock), which is what I’ve done when I need to.

Category
C++

0 comments

Discussion are closed.