{"id":29453,"date":"2006-10-06T10:00:04","date_gmt":"2006-10-06T10:00:04","guid":{"rendered":"https:\/\/blogs.msdn.microsoft.com\/oldnewthing\/2006\/10\/06\/a-very-brief-return-to-part-6-of-loading-the-chineseenglish-dictionary\/"},"modified":"2006-10-06T10:00:04","modified_gmt":"2006-10-06T10:00:04","slug":"a-very-brief-return-to-part-6-of-loading-the-chineseenglish-dictionary","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/oldnewthing\/20061006-04\/?p=29453","title":{"rendered":"A very brief return to part 6 of Loading the Chinese\/English dictionary"},"content":{"rendered":"<p>\nBack in\n<a HREF=\"http:\/\/blogs.msdn.com\/oldnewthing\/archive\/2005\/05\/19\/420038.aspx\">\nPart 6 of the first phase of the\n&#8220;Chinese\/English dictionary&#8221; series<\/a>\n(a series which I intend to get back to someday but somehow that\nday never arrives),\nI left an exercise related to the <code>alignment<\/code> member\nof the <code>HEADER<\/code> union.\n<\/p>\n<p>\nAlignment is one of those issues that\n<a HREF=\"http:\/\/blogs.msdn.com\/oldnewthing\/archive\/2004\/09\/14\/229387.aspx\">\npeople who grew up with a forgiving processor architecture tend to ignore<\/a>.\nIn this case, the <code>WCHAR alignment<\/code> member\nensures that the total size of the <code>HEADER<\/code> union\nis suitably chosen so that a <code>WCHAR<\/code> can appear\nimmediately after it.\nSince we&#8217;re going to put characters immediately after the\n<code>HEADER<\/code>, we&#8217;d better make sure those characters\nare aligned.\nIf not, then processors that are alignment-sensitive will raise\na <code>STATUS_DATATYPE_MISALIGNMENT<\/code> exception,\nand even processors that are alignment-forgiving will suffer\nperformance penalties when accessing unaligned data.\n<\/p>\n<p>\nThere are many variations on the alignment trick, some of them\nmore effective than others.\nA common variation is the\n<a 
HREF=\"http:\/\/blogs.msdn.com\/oldnewthing\/archive\/2004\/08\/26\/220873.aspx\">\none-element-array trick<\/a>:\n<\/p>\n<pre>\nstruct HEADER {\n HEADER* m_phdrPrev;\n SIZE_T  m_cb;\n WCHAR   m_rgwchData[1];\n};\n\/\/ you can also use \"offsetof\" if you included &lt;stddef.h&gt;\n#define HEADER_SIZE FIELD_OFFSET(HEADER, m_rgwchData)\n<\/pre>\n<p>\nWe would then use <code>HEADER_SIZE<\/code> instead of\n<code>sizeof(HEADER)<\/code>.\nThis technique does make it explicit\nthat an array of <code>WCHAR<\/code>s will come after the header,\nbut it means that the code that wants to allocate a <code>HEADER<\/code>\nneeds to be careful to use <code>HEADER_SIZE<\/code> instead of\nthe more natural <code>sizeof(HEADER)<\/code>.\n<\/p>\n<p>\nA common mistake is to use this incorrect definition for\n<code>HEADER_SIZE<\/code>:\n<\/p>\n<pre>\n<i>#define HEADER_SIZE (sizeof(HEADER) - sizeof(WCHAR)) \/\/ wrong<\/i>\n<\/pre>\n<p>\nThis incorrect\nmacro inadvertently commits the mistake it is trying to protect against!\nThere might be (and indeed, will almost certainly be in this instance)\nstructure padding after <code>m_rgwchData<\/code>, which this macro\nfails to take into account.\nOn a 32-bit machine, there will likely be two bytes of padding after\nthe <code>m_rgwchData<\/code> in order to bring the total structure\nsize back to a value that permits another <code>HEADER<\/code> to appear\ndirectly after the previous one.\nWith 4-byte pointers and a 4-byte <code>SIZE_T<\/code>,\nthat makes <code>sizeof(HEADER)<\/code> equal to 12,\nso the macro yields 10 instead of the correct offset of 8.\nIn its excitement over dealing with internal padding, the above\nmacro forgot to deal with trailing padding!\n<\/p>\n<p>\nIt is the &#8220;array of <code>HEADER<\/code>s&#8221; that makes the original\n<code>union<\/code> trick work.\nSince the compiler has to be prepared for the possibility of allocating\nan array of <code>HEADER<\/code>s, it must provide padding at\nthe end of the <code>HEADER<\/code> to ensure that the next <code>HEADER<\/code>\nbegins at a suitably-aligned boundary.\nYes, the <code>union<\/code> trick can result in &#8220;excess 
padding&#8221;,\nsince the type used for alignment may have less stringent alignment\nrequirements than the other members of the aggregate,\nbut better to have too much than too little.\n<\/p>\n<p>\n<a HREF=\"http:\/\/blogs.msdn.com\/oldnewthing\/archive\/2005\/05\/19\/420038.aspx#420053\">\nAnother minor point<\/a>\nwas brought up by commenter Dan McCarty:\n&#8220;Why is <code>MIN_CBCHUNK<\/code> set to 32,000 instead of 32K?&#8221;\nNotice that <code>MIN_CBCHUNK<\/code> is added to <code>sizeof(HEADER)<\/code>\nbefore it is rounded up.\nHad <code>MIN_CBCHUNK<\/code> been 32768 and the allocation granularity\nalso 32768, then rounding up the sum to the nearest multiple would have\ntaken us to 65536.\nNothing wrong with that, but it means that our minimum chunk size is twice as\nbig as the <code>#define<\/code> suggests.\n(Of course, since in practice\n<a HREF=\"http:\/\/blogs.msdn.com\/oldnewthing\/archive\/2003\/10\/08\/55239.aspx\">\nthe allocation granularity is 64KB<\/a>,\nthis distinction is only theoretical right now.)<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Back in Part 6 of the first phase of the &#8220;Chinese\/English dictionary&#8221; series (a series which I intend to get back to someday but somehow that day never arrives), I left an exercise related to the alignment member of the HEADER union. 
Alignment is one of those issues that people who grew up with a [&hellip;]<\/p>\n","protected":false},"author":1069,"featured_media":111744,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1],"tags":[25],"class_list":["post-29453","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-oldnewthing","tag-code"],"acf":[],"blog_post_summary":"<p>Back in Part 6 of the first phase of the &#8220;Chinese\/English dictionary&#8221; series (a series which I intend to get back to someday but somehow that day never arrives), I left an exercise related to the alignment member of the HEADER union. Alignment is one of those issues that people who grew up with a [&hellip;]<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/29453","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/users\/1069"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/comments?post=29453"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/posts\/29453\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media\/111744"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/media?parent=29453"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewthing\/wp-json\/wp\/v2\/categories?post=29453"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/oldnewth
ing\/wp-json\/wp\/v2\/tags?post=29453"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}