{"id":102837,"date":"2019-09-04T07:00:00","date_gmt":"2019-09-04T14:00:00","guid":{"rendered":"http:\/\/devblogs.microsoft.com\/oldnewthing\/?p=102837"},"modified":"2019-09-03T18:51:30","modified_gmt":"2019-09-04T01:51:30","slug":"20190904-00","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/oldnewthing\/20190904-00\/?p=102837","title":{"rendered":"The <CODE>COM_<\/CODE><CODE>INTERFACE_<\/CODE><CODE>ENTRY<\/CODE> must be a COM interface, but nobody actually checks"},"content":{"rendered":"<p>A customer had some code written with the Active Template Library, <a href=\"https:\/\/docs.microsoft.com\/en-us\/cpp\/atl\/active-template-library-atl-concepts\"> more commonly known as ATL<\/a>. Apparently, ATL is still a thing!<\/p>\n<p>Anyway, their problem was that their component that had been working just fine for many years started crashing. Their object went something like this:<\/p>\n<pre>[uuid(\"...\")]\r\nclass ATL_NO_VTABLE CAwesomeWidget :\r\n    public CComObjectRootEx&lt;CComMultiThreadModel&gt;,\r\n    public CComCoClass&lt;CAwesomeWidget, &amp;CLSID_AwesomeWidget&gt;,\r\n    public IWidgetProviderInfo,\r\n    public IWidget\r\n    \/\/\/ ... other interfaces ...\r\n{\r\n    ...\r\n\r\n    BEGIN_COM_MAP(CAwesomeWidget)\r\n      COM_INTERFACE_ENTRY(IWidgetProviderInfo)\r\n      COM_INTERFACE_ENTRY(IWidget)\r\n      \/\/ ... other interfaces ...\r\n    END_COM_MAP()\r\n\r\n    \/\/ IWidgetProviderInfo\r\n    IFACEMETHODIMP GetProviderId(GUID *pguidId);\r\n\r\n    \/\/ IWidget\r\n    IFACEMETHODIMP Frob();\r\n    ...\r\n};\r\n<\/pre>\n<p>Yes, I&#8217;m making you look at ancient ATL code, with all its wacky macros. Product maintenance is like that. <a href=\"https:\/\/blogs.msdn.microsoft.com\/oldnewthing\/20160329-00\/?p=93214\"> Deal with it<\/a>.<\/p>\n<p>Clients can do<\/p>\n<pre>void Sample()\r\n{\r\n  CComPtr&lt;IWidget&gt; widget;\r\n  widget.CoCreateInstance(CLSID_AwesomeWidget);\r\n  widget-&gt;Frob();\r\n}\r\n<\/pre>\n<p>to create the <code>CAwesome\u00adWidget<\/code> as a widget, and then ask the widget to do something.<\/p>\n<p>The <code>CAwesome\u00adWidget<\/code> did its job very well for over five years, and then suddenly it started crashing in <code>Get\u00adProvider\u00adId<\/code>, even though nobody called <code>Get\u00adProvider\u00adId<\/code>.<\/p>\n<p>They traced it back to this call:<\/p>\n<pre>void Sample2()\r\n{\r\n  CComPtr&lt;IUnknown&gt; unk;\r\n  unk.CoCreateInstance(CLSID_AwesomeWidget);\r\n\r\n  <span style=\"color: blue;\">CComQIPtr&lt;IWidget&gt; widget(unk); \/\/ \u2190 here<\/span>\r\n  widget-&gt;Frob();\r\n}\r\n<\/pre>\n<p>Obviously, this wasn&#8217;t literally what their code did, but it boils the problem down to its essence.<\/p>\n<p>The idea was that instead of creating the object for its <code>IWidget<\/code> interface, the object was created with the generic <code>IUnknown<\/code> interface, and then it was converted to an <code>IWidget<\/code>. The reason for this extra step isn&#8217;t important, but you can come up with scenarios where this might happen. (For example, maybe the <code>Co\u00adCreate\u00adInstance<\/code> is coming from a generic &#8220;creation helper&#8221; function.)<\/p>\n<p>The problem was in the definition of the <code>IWidget\u00adProvider\u00adInfo<\/code>:<\/p>\n<pre>\/\/ widget.h\r\n[uuid(\"...\")] \r\ninterface IWidgetProviderInfo\r\n{\r\n    STDMETHOD(GetProviderId)(GUID *pguidId) PURE;\r\n};\r\n<\/pre>\n<p>Do you see something?<\/p>\n<p>Actually, more honestly, I should be asking, &#8220;Do you <i>not<\/i> see something?&#8221;<\/p>\n<p>The <code>IWidget\u00adProvider\u00adInfo<\/code> interface does not derive from <code>IUnknown<\/code>!<\/p>\n<p>Once you realize this, everything unravels.<\/p>\n<p>The <code>COM_<\/code><code>INTERFACE_<\/code><code>ENTRY<\/code> macro assumes that the thing it&#8217;s given is indeed a COM interface. This assumption is made in two places:<\/p>\n<ol>\n<li>A request for <code>IUnknown<\/code> returns the first interface in the list.<\/li>\n<li>Any successful request will be accompanied by a call to <code>IUnknown::<\/code><code>Add\u00adRef<\/code>, per COM rules.<\/li>\n<\/ol>\n<p>They listed <code>IWidget\u00adProvider\u00adInfo<\/code> as the first &#8220;interface&#8221;, so it was returned as in response to a request for <code>IUnknown<\/code>, even though it wasn&#8217;t <code>IUnknown<\/code>.<\/p>\n<p>The attempt to convert the <code>IUnknown<\/code> to an <code>IWidget<\/code> involves a call to <code>IUnknown::<\/code><code>Query\u00adInterface<\/code>, but remember, that thing which the code thinks is an <code>IUnknown<\/code> is really an <code>IWidget\u00adProvider\u00adInfo<\/code>. The call to <code>IUnknown::<\/code><code>Query\u00adInterface<\/code> actually called <code>IWidget\u00adProvider\u00adInfo::<\/code><code>Get\u00adProviderId<\/code>. With the wrong number and types of arguments, of course. The crash occurred when the <code>Get\u00adProvider\u00adId<\/code> method tried to write to what it thought was a <code>pguidId<\/code>, but which was actually a <code>riid<\/code> from the <code>Query\u00adInterface<\/code>.<\/p>\n<p>The customer would have noticed that something was wrong with <code>IWidget\u00adProvider\u00adInfo<\/code> had they ever tried to use it!<\/p>\n<pre>CComPtr&lt;IWidgetProviderInfo&gt; info;\r\n\/\/ ^^ error: class \"IWidgetProviderInfo\" has no \"Release\" member\r\n<\/pre>\n<p>Merely talking about <code>IWidget\u00adProvider\u00adInfo<\/code> causes the compiler to get upset because the <code>CComPtr<\/code> destructor needs to call the <code>Release<\/code> method, which doesn&#8217;t exist. The customer never noticed this because they never used the <code>IWidget\u00adProvider\u00adInfo<\/code> interface at all! It was presumably added in anticipation of a feature that never materialized.<\/p>\n<p>Okay, so now we understand why this crashes and how it eluded compile-time detection, but how did it ever work? After all, the implementation of <code>Create\u00adInstance<\/code> needs to do a <code>Query\u00adInterface<\/code> to return the proper pointer back to the caller.<\/p>\n<p>Here&#8217;s a simplified version of how ATL implements <code>Create\u00adInstance<\/code>:<\/p>\n<pre>template &lt;class T1&gt;\r\nclass CComCreator\r\n{\r\n  static HRESULT WINAPI CreateInstance(void* pv, REFIID riid, LPVOID* ppv)\r\n  {\r\n    HRESULT hRes = E_OUTOFMEMORY;\r\n    T1* p = NULL;\r\n    ATLTRY(p = new T1(pv))\r\n    if (p != NULL)\r\n    {\r\n      p-&gt;SetVoid(pv);\r\n      p-&gt;InternalFinalConstructAddRef();\r\n      hRes = p-&gt;_AtlInitialConstruct();\r\n      if (SUCCEEDED(hRes))\r\n        hRes = p-&gt;FinalConstruct();\r\n      p-&gt;InternalFinalConstructRelease();\r\n      if (hRes == S_OK)\r\n        <span style=\"color: blue;\">hRes = p-&gt;QueryInterface(riid, ppv);<\/span>\r\n      if (hRes != S_OK)\r\n        delete p;\r\n    }\r\n    return hRes;\r\n  }\r\n}\r\n<\/pre>\n<p>Why doesn&#8217;t the call to <code>p-&gt;Query\u00adInterface<\/code> crash?<\/p>\n<p>Because it&#8217;s being called from a pointer to a <code>T1<\/code>, not from a pointer to an <code>IUnknown<\/code>. The compiler therefore knows that the <code>Query\u00adInterface<\/code> method is in fact not at slot 0 in vtable 0, but rather is at slot 0 in vtable 1 (taking it from <code>IWidget<\/code>).<\/p>\n<p>So how about fixing the problem?<\/p>\n<p>One fix is to delete the unused <code>IWidget\u00adProvider\u00adInfo<\/code> interface. However, when working with legacy code, you may be averse to taking such a drastic step, because there might be somebody who is actually using that interface in a way you failed to detect. Or maybe you want to keep the interface around because you really do plan on using it soon. In that case, you can make the interface derive from <code>IUnknown<\/code>, like it should have in the first place:<\/p>\n<pre>[uuid(\"...\")] \r\ninterface IWidgetProviderInfo : <span style=\"color: blue;\">IUnknown<\/span>\r\n{\r\n    STDMETHOD(GetProviderId)(GUID *pguidId) PURE;\r\n};\r\n<\/pre>\n<p>Customer problem solved.<\/p>\n<p>Next time, we&#8217;ll look at how to catch this problem at compile time.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>And if it isn&#8217;t, bad things happen.<\/p>\n","protected":false},"author":1069,"featured_media":111744,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1],"tags":[25],"class_list":["post-102837","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-oldnewthing","tag-code"],"acf":[],"blog_post_summary":"<p>And if it isn&#8217;t, bad things happen.<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/102837","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/users\/1069"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/comments?post=102837"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/102837\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media\/111744"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media?parent=102837"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/categories?post=102837"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/tags?post=102837"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}