Developing a Chinese/English dictionary: Introduction

Raymond Chen

The other day, one of my colleagues mentioned that his English name “Ben” means “stupid” in Chinese: 笨/bèn/ㄅㄣˋ. (His wife is Chinese; that’s why he knows this in the first place.) Knowing that the Chinese language is rich in homophones, I fired up my Chinese/English dictionary program to see if we could find anything better. (Unfortunately, the best I could come up with was 賁/贲/bēn/ㄅㄣ, which means “energetic”.) Ben seemed to take his appellative fate in stride; he seemed much more interested in the little dictionary program I had written.

So, as an experiment, instead of developing tiny samples that illustrate a very focused topic, I’ll develop a somewhat larger-scale program (though still small by modern standards) so you can see how multiple techniques come together. The task will take many stages, some of which may take only a day or two, others of which can take much longer. If a particular stage is more than two or three days long, I’ll break it up with other articles, and I’ll try to leave some breathing room between stages.

Along the way, we’ll learn about owner-data (also known as “virtual”) listviews, listview custom-draw, designing for accessibility, window subclassing, laying out child windows, code pages, hotkeys, and optimization.

If you’re going to play along at home, beware that you’re going to have to install Chinese fonts to see the program as it evolves, and when you’re done, you’ll have a Chinese/English dictionary program, which probably won’t be very useful unless you’re actually studying Chinese…

If you’re not into Win32 programming at all, then, well, my first comment to you is, “So what are you doing here?” And my second comment is, “I guess you’re going to be bored for a while.” You may want to go read another blog during those boring stretches, or just turn off the computer and go outside for some fresh air and exercise.

Those who have decided to play along at home will need the following: a copy of the CEDICT Chinese-English dictionary in Big5 format (note: Big5 format) and the Chinese Encoding Converter source code (all we need is the file hcutf8.txt). We’ll start digging in next time.


