July 12th, 2005

Converting from traditional to simplified Chinese, part 2: Using the dictionary

Now that we have our traditional-to-simplified pseudo-dictionary, we can use it to generate simplified Chinese words in our Chinese/English dictionary.

class StringPool
{
public:
 StringPool();
 ~StringPool();
 LPWSTR AllocString(const WCHAR* pszBegin, const WCHAR* pszEnd);
 LPWSTR DupString(const WCHAR* pszBegin)
 {
  return AllocString(pszBegin, pszBegin + lstrlen(pszBegin));
 }
 …
};

The DupString method is a convenience wrapper around AllocString that copies an entire null-terminated string. We will use it below.

Dictionary::Dictionary()
{
 …
    if (de.Parse(buf, buf + cchResult, m_pool)) {
     bool fSimp = false;
     for (int i = 0; de.m_pszTrad[i]; i++) {
      if (pmap->Map(de.m_pszTrad[i])) {
       fSimp = true;
       break;
      }
     }
     if (fSimp) {
      de.m_pszSimp = m_pool.DupString(de.m_pszTrad);
      for (int i = 0; de.m_pszTrad[i]; i++) {
       if (pmap->Map(de.m_pszTrad[i])) {
        de.m_pszSimp[i] = pmap->Map(de.m_pszTrad[i]);
       }
      }
     } else {
      de.m_pszSimp = NULL;
     }
     v.push_back(de);
    }
 …
}

After we parse each entry from the dictionary, we scan the traditional Chinese characters to see if any of them have been simplified. If so, then we copy the traditional Chinese string and use the Trad2Simp object to convert it to simplified Chinese.
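
To see the scan-then-convert idea in isolation, here is a minimal standalone sketch. It is not the program's actual code: ToyTrad2Simp, NeedsSimplifying and ToSimplified are hypothetical names invented for illustration, std::wstring stands in for the string pool, and I'm assuming that Map returns zero for a character that has no simplified form.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for the Trad2Simp object. Assumption: Map returns
// the simplified character, or 0 if the character has no simplified form.
struct ToyTrad2Simp {
    std::unordered_map<wchar_t, wchar_t> table;

    wchar_t Map(wchar_t chTrad) const {
        auto it = table.find(chTrad);
        return it == table.end() ? L'\0' : it->second;
    }
};

// First pass: does any character in the string have a simplified form?
bool NeedsSimplifying(const std::wstring& trad, const ToyTrad2Simp& map) {
    for (wchar_t ch : trad) {
        if (map.Map(ch)) return true;
    }
    return false;
}

// Second pass: copy the traditional string and overwrite only the
// characters that have a mapping. Unmapped characters are left alone.
std::wstring ToSimplified(const std::wstring& trad, const ToyTrad2Simp& map) {
    std::wstring simp = trad;
    for (std::size_t i = 0; i < simp.size(); i++) {
        if (wchar_t ch = map.Map(trad[i])) {
            simp[i] = ch;
        }
    }
    return simp;
}
```

The sketch looks each character up only once in the second pass by holding the result in a local variable. Whether that matters for a one-shot program is debatable.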

If the string is the same in both simplified and traditional Chinese, then we set m_pszSimp to NULL. This may seem a bit odd, but it’ll come in handy later. Yes, it makes the m_pszSimp member difficult to use. I could have created an accessor function for it (so that it falls back to traditional Chinese if the simplified Chinese is NULL), but I’m feeling lazy right now, and this is just a one-shot program.
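
For the curious, the accessor I decided not to write might look something like this. It is only a sketch: EntrySketch and SimpOrTrad are made-up names, and the struct carries just the two members that matter here rather than the real dictionary entry.

```cpp
// Hypothetical, cut-down version of a dictionary entry, for illustration.
struct EntrySketch {
    const wchar_t* m_pszTrad;  // traditional Chinese, always present
    const wchar_t* m_pszSimp;  // null when identical to the traditional form

    // Falls back to the traditional string when there is no separate
    // simplified string, so callers never have to check for null.
    const wchar_t* SimpOrTrad() const {
        return m_pszSimp ? m_pszSimp : m_pszTrad;
    }
};
```

The null would still be there for code that wants to know whether the two forms differ, while display code could call the accessor and not care.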

void RootWindow::OnGetDispInfo(NMLVDISPINFO* pnmv)
{
 …
  switch (pnmv->item.iSubItem) {
   case COL_TRAD:    pszResult = de.m_pszTrad;    break;
   case COL_SIMP:    pszResult =
      de.m_pszSimp ? de.m_pszSimp : de.m_pszTrad; break;
   case COL_PINYIN:  pszResult = de.m_pszPinyin;  break;
   case COL_ENGLISH: pszResult = de.m_pszEnglish; break;
  }
 …
}

Finally, we tell our OnGetDispInfo handler what to return when the listview asks for the text that goes into the simplified Chinese column. With these changes, we can display both the traditional and simplified Chinese for each entry in our dictionary.

Next time, a minor tweak to our display code, which happens to illustrate custom-draw as a nice side-effect.

Author

Raymond has been involved in the evolution of Windows for more than 30 years. In 2003, he began a Web site known as The Old New Thing which has grown in popularity far beyond his wildest imagination, a development which still gives him the heebie-jeebies. The Web site spawned a book, coincidentally also titled The Old New Thing (Addison Wesley 2007). He occasionally appears on the Windows Dev Docs Twitter account to tell stories which convey no useful information.
