Math in Office
User, developer, and accessibility info on math editing/display in Microsoft Office and Windows. New features and specifications of the RichEdit editor. Getting OfficeMath onto web apps.
Latest posts
It’s been so fine!
TL;DR I’m 51₁₆ long past the usual retirement age. So, after more than 30 years working with super talented, fun people at the best software company in the world, I’m retiring from Microsoft 😊. Stephen Chua now owns Math in Office. It all started in 1988 when I had a summer job getting Microsoft’s CodeView debugger running in protected mode up out of the MS-DOS 640 KB address space. My SST debugger could load and run real-mode DOS programs in protected mode with access to all the memory on Intel 286 and 386 CPUs. It could also execute backwards, quite handy for debugging. CodeView was cool, but Windows 2.1 see...
Math Dictation
You can dictate math faster than you can write or type it, so math dictation can be handy for anyone working with math, notably on mobile devices. It can also make math more accessible. Math speech is similar to UnicodeMath, which you can use to enter equations into Word, PowerPoint, and other apps. Accordingly, we translate English math speech as recorded via Office dictation into UnicodeMath and build it up into OfficeMath. Currently we can dictate equations in English from algebra, trigonometry, and calculus into OneNote and PowerPoint. Examples of math dictation and the resulting OfficeMath are If you know ...
ITextDocument2 SetProperty and GetProperty
These methods allow programmers to set and get document properties for RichEdit controls. There is some documentation on the web, but more detail can be helpful and new properties have been added. This post describes the current set of properties except for the math properties, which are described in the post Default Document Math Properties.
Displaying Math in WordPress
The Microsoft devblogs are hosted by WordPress. Until recently, we haven’t been able to display equations using traditional math typography in our blogs except in images. Now we can embed LaTeX math, which is a lot more accessible. One syntax for this is [ℒ]…[/ℒ], where the “…” has the desired LaTeX and the ℒ is the literal string “latex” (I can’t use the string “latex” itself since it would be interpreted as part of a LaTeX delimiter). Another syntax is "$ latex ...$", but leave out the space I added following the $ to prevent it from turning into LaTeX. On web pages viewed in Firefox and Safari as well as in t...
Math Speech Strings and Localization
This post describes how OfficeMath speech is localized into multiple (~18) languages. The facility doesn’t handle all Unicode math symbols or all math notations. But it handles the most common symbols and notations. There are two tables that need to be translated into the supported languages: a Unicode math symbol speech table and a math function table. Unicode Math Symbol Speech Table To localize the speech for Unicode characters, perform a binary search by character code in the math symbol speech table for the desired language. The table for English follows. Notably missing are speech strings for U+002D ‘–‘, ...
Using RichEdit for Text Processing
Suppose you’re writing a program that needs to process rich text. You could write your own functions. Alternatively, you could have RichEdit do the processing. For example, you might want to search for mathematical expressions in an RTF or HTML file or convert text in one math format to another format. Or change the kind of list numbering. Or recognize URLs, telephone numbers, etc. These manipulations don’t need a display. This post describes how to create a RichEdit instance to do the processing. Load the RichEdit dll The first thing is to load the RichEdit dll. You can use the system \windows\system32\msftedi...
Default Math Properties
Quite a few math properties have document default values. These default values are used if you don’t override them, which you can usually do by invoking a math context-menu option or programmatically by calling the ITextDocument2::SetProperty or ITextDocument2::SetMathProperties methods. Most properties pertain to “displayed” math zones, that is, math zones that begin either at the start of the document or a hard/shift Enter (CR/VT) and end at the following hard/shift Enter. The options determine math indents and characteristics such as whether integral limits are positioned below and above the integral or as sub...
RichEdit Hyperlinks
RichEdit has two kinds of hyperlinks: automatic hyperlinks (autoURLs) and friendly-name hyperlinks. As its name suggests, the autoURL is automatically recognized by RichEdit as a hyperlink and is displayed as a URL. A friendly name hyperlink has a name, which is displayed, and a hidden instruction part that contains the URL. This post describes these hyperlinks and explains how to manipulate them programmatically. The descriptions include some features that have been added recently. Automatic URLs The first autoURLs appeared in RichEdit 2.0, which shipped with Office 97, and have the usual web form, such as, ht...
Setting and Getting Text in Various Formats
You can get and set text from/into RichEdit in a variety of formats including RTF, HTML, MathML, OMML, UnicodeMath, Nemeth Braille, and speech. This post documents RichEdit options for a general way to access text using ITextRange2::SetText2(options, bstr) and ITextRange2::GetText2(options, pbstr). As such, this post is for programmers. All options work in the current Microsoft Office RichEdit (riched20.dll in an Office subdirectory) and many work in the Windows RichEdit (msftedit.dll). The options are defined in the following table in which s/g stands for SetText2/GetText2, respectively. Mutually exclusive op...
Computers I have known
A friend recommended that since I got into computers a long time ago, I should post about how computers have changed over the years. Well, here goes a trip down memory lane! Analog and button pushing The first computer I ever used was an Electronics Associates analog computer at the Perkin Elmer Corporation where I worked as an intern in the summers of 1961–1963. I wired up the machines to simulate aspects of the response of control systems that guided the balloon-borne Stratoscope II telescope. The telescope could be pointed to an accuracy of 0.1 arc seconds, which is the angle subtended by a dime at two miles...
Microsoft 365 Modern Comments
The new Modern Comments facility aims to give a similar commenting experience on the web and in native apps. On the desktop in Windows and the Mac, the editor component is RichEdit. The facility is a work in progress, and it is getting very nice. This post gives some background on incorporating RichEdit into the Microsoft 365 commenting experience. The most powerful commenting experience to date was in desktop Word, which used Word text boxes for comments. This allowed users to use most Word features in Word comments. Meanwhile, RichEdit lacks many advanced Word features. So, some Word power users have found t...
RichEdit Autoformatting
RichEdit and Word have had the most elaborate autoformatting ever since Office 2007, namely math formula auto-buildup. A UnicodeMath expression builds up into OfficeMath as soon as it is unambiguous. Word and other programs have also had less ambitious autoformatting since late in the last century. Such functionality includes auto conversion of some simple numeric fractions to composed Unicode numeric fractions, e.g., 1/2 to ½; ordinal superscripts, e.g., 1st to 1ˢᵗ; double dash to long dash, e.g., a--b to a—b; smart quotes, e.g., "word" to “word”; and automatic bulleted/numbered lists. Such autoformatting is available i...
Rounded Rectangles and Ellipses
In Word, PowerPoint, OneNote and RichEdit, you can enclose text in a rectangle by putting the text into a math “boxed formula” object. To try this out, type Alt+= to enter a math zone and then \rect(a+b)<space>. This displays as a+b enclosed in a rectangle. The “\rect(a+b)” is the UnicodeMath representation of this boxed formula. In MathML, the boxed formula object is represented by the <menclose notation="box"> element. The <menclose> documentation mentions other notations as well, notably “roundedbox” and “circle” (which is actually an ellipse). Up through Office 2013, only the “box” option, along with hiding sides an...
OfficeMath
Microsoft Word 2007 and RichEdit 6.0 introduced the native Office math facility. PowerPoint, Excel, and OneNote followed suit in 2010, and Mac Word followed in 2011. But ironically the native math facility didn’t have a recognizable name. “Microsoft Equation Editor” (MEE) seemed natural, but that's the name of the Design Science math editor that shipped first in Office on Windows and the Mac in 1992 and was discontinued due to security problems. In fact, the post Converting Microsoft Equation Editor Objects to OfficeMath needed a name for the native math facility since it describes how you can convert MEE OLE mat...
Two Phonetic Scripts: Vietnamese and Korean
A few years ago, I visited two very interesting countries, Vietnam and South Korea. Being actively involved in writing software (mostly RichEdit) for editing the world’s scripts, I was naturally fascinated to see Vietnamese and Korean text displayed in profusion. The Vietnamese and Korean scripts were designed with a common purpose in mind: enable the languages to be read and written easily by all members of their respective countries. Earlier on, people tried to write Vietnamese and Korean by customizing the Chinese script. But while the Chinese script is well suited to Chinese languages, it’s considerably less ...
Windows 11 Notepad
The new Windows 11 Notepad uses RichEdit and runs on up-to-date Windows 11 installations. In addition to a Windows 11 look with rounded corners and a dark-theme option, the new Notepad includes several standard RichEdit editing enhancements, such as Alt+x for entering Unicode characters, Ctrl+} for toggling between matching brackets/parentheses, multilevel undo, drag & drop, color emoji, and autoURL detection. You might guess that using a RichEdit plain-text control in Notepad would be a slam dunk. RichEdit has had plain-text controls ever since Office 97 (last century!) and they’ve been used myriad times. Bu...
Function to get Unicode Fractions
Do you know that Unicode includes the fraction characters ↉ ½ ⅓ ¼ ⅕ ⅙ ⅐ ⅛ ⅑ ⅒ ⅔ ⅖ ¾ ⅗ ⅜ ⅘ ⅚ ⅝ ⅞? Well, thanks to existing character standards, ½ ⅓ ¼ ⅕ ⅙ ⅛ ⅔ ⅖ ¾ ⅗ ⅜ ⅘ ⅚ ⅝ ⅞ were added in Unicode 1.1 in 1993, and ↉ ⅐ ⅑ ⅒ were added in Unicode 5.2 in 2009. Programs like Microsoft Word have an “Autoformat as you type” option to convert the linear fractions 1/2, 1/3, 1/4, and 3/4 into the corresponding Unicode fractions ½, ⅓, ¼, and ¾. This post gives a simple C++ function that converts the linear form of all Unicode fractions into the Unicode fraction characters. The function is relatively easy to read and understand, at lea...
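The function from the post itself isn’t shown in this excerpt. As a rough idea of what such a conversion involves, here is a minimal table-driven sketch. The names `FractionEntry`, `kFractions`, and `GetUnicodeFraction` are made up for this illustration, and the post’s own function may well work differently. The code points are the Unicode vulgar fractions in the Latin-1 Supplement and Number Forms blocks.

```cpp
// Hypothetical table entry: numerator, denominator, and the Unicode
// fraction character that represents numerator/denominator.
struct FractionEntry
{
    unsigned num;
    unsigned den;
    char32_t ch;
};

// All single-character Unicode vulgar fractions.
constexpr FractionEntry kFractions[] = {
    {0, 3, 0x2189},                                   // ↉
    {1, 2, 0x00BD}, {1, 3, 0x2153}, {1, 4, 0x00BC},   // ½ ⅓ ¼
    {1, 5, 0x2155}, {1, 6, 0x2159}, {1, 7, 0x2150},   // ⅕ ⅙ ⅐
    {1, 8, 0x215B}, {1, 9, 0x2151}, {1, 10, 0x2152},  // ⅛ ⅑ ⅒
    {2, 3, 0x2154}, {2, 5, 0x2156},                   // ⅔ ⅖
    {3, 4, 0x00BE}, {3, 5, 0x2157}, {3, 8, 0x215C},   // ¾ ⅗ ⅜
    {4, 5, 0x2158},                                   // ⅘
    {5, 6, 0x215A}, {5, 8, 0x215D},                   // ⅚ ⅝
    {7, 8, 0x215E},                                   // ⅞
};

// Returns the Unicode fraction character for num/den, or 0 if Unicode
// has no single character for that fraction. Fractions are matched
// literally, so 2/4 is not reduced to 1/2.
constexpr char32_t GetUnicodeFraction(unsigned num, unsigned den)
{
    for (const FractionEntry& entry : kFractions)
    {
        if (entry.num == num && entry.den == den)
            return entry.ch;
    }
    return 0;
}
```

For example, `GetUnicodeFraction(3, 4)` yields U+00BE (¾), while `GetUnicodeFraction(2, 4)` yields 0 because the lookup deliberately doesn’t reduce fractions. A linear scan is used because the table has only 19 entries.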
UnicodeMath Color
In slide presentations and elsewhere, it can be handy to have math text color and math background color. In fact, Presentation MathML has the attributes “mathcolor” and “mathbackground”. This post describes the operators Noah Doersing added to his UnicodeMathML implementation to enter text and background color using UnicodeMath. I subsequently added them to the RichEdit math converters. Inputting math using UnicodeMath into Word and other applications typically doesn’t require special text and background color syntax since these applications have user interfaces (UI) to color text. But UnicodeMath is a useful ...
RichEdit Stories
Word and RichEdit have stories, but you won’t find their story’s definition in a dictionary. Their story is an object that stores rich text in computer memory. Rich text consists of Unicode plain text, associated character and paragraph formatting, and embedded objects such as images. Such a story can contain the narrative of a traditional story, but more generally it can contain any arbitrary set of Unicode characters. A RichEdit control is created with a built-in story called the “main story”. Other stories may be created by the client or by RichEdit for special purposes. For example, when you select some text ...
RichEdit Place Holder
Sometimes you need a text box that cues the user to type something in, such as, “Start a conversation”. As soon as the user types something, the cue text vanishes, and the user sees what the user typed. If the user deletes all the text, the cue text reappears. Such a text box is called a place-holder control. The Microsoft 365 RichEdit has such a control. This post explains how to include it in your application. Send two messages to set up a place-holder control: 1) EM_SETTEXTEX to set the place-holder text, and 2) EM_SETEDITSTYLEEX to enable the place-holder functionality. For setting the text, write someth...
LineServices
One of the key technologies behind the high-quality display of mathematical text in OfficeMath applications like Word, PowerPoint, and OneNote is a special component called LineServices along with its sibling Page/TableServices (PTS). In addition to handling math display, various versions of LineServices are responsible for line layout in Word, PowerPoint, Publisher, OneNote, RichEdit, WordPad, and the Windows 10 Calculator. LineServices was developed by one of the most amazing teams at Microsoft. Because LineServices is used by components like RichEdit and the XAML text edit controls, it’s indirectly available t...
Cascadia Code Font
For many years, I’ve wanted to use real mathematical notation in computer programs for code that is mathematical in nature. The document UnicodeMath discusses this in some detail in Section 6, “Using UnicodeMath in Programming Languages”. Over the years, some advances beyond ASCII (invented in the 1960s!) have been made. For example, you can use Unicode math alphabetics as variable names in the C++ compilers for the major platforms. In math documents, the index for a summation is often a math-italic letter such as 𝑖, 𝑗, 𝑘, 𝑙, 𝑚 or 𝑛. You can use these math-italic characters in your C++ programs! In fact, an 𝑛 (U+1...
RichEdit Font Binding
Suppose a user pastes some plain text into a document. In principle, the text can contain any Unicode character. That includes virtually all characters used in the current languages of the world along with many ancient scripts and a plethora of symbols, mathematical and otherwise, that don’t belong to any language. The question arises as to what font(s) to use for the pasted characters. In general, the same font cannot be used for all characters, since TrueType glyph indices are 16-bit numbers thereby limiting fonts to 65536 glyphs. Meanwhile Unicode has over 140,000 named characters. Furthermore, even if a font ...
Switching from LaTeX to UnicodeMath Input Mode
Here’s a bit of a puzzle. When the user enters “a/b” in LaTeX mode and formats it with the Enter key, the ‘/’ is marked as “nobuildup”. If the user then switches to UnicodeMath input mode and enters a space after the linear fraction containing the ‘/’, the fraction won’t build up, by design. If you delete the / and reenter it, it’ll build up as usual in UnicodeMath mode. A problem is that the user cannot easily detect that the ‘/’ has the nobuildup attribute since the math ribbon doesn’t display that attribute. One might conclude that when switching from LaTeX to UnicodeMath input mode, Word should remove the ...
RichEdit HTML Support
RichEdit has had limited HTML support for many years, but it wasn’t general enough to document publicly. A recent RichEdit client (to be described in a future post) needs better support, so we have been improving it. For example, we have added HTML copy/paste, images, and math (of course!) to the Microsoft Office riched20.dll. Ideally RichEdit HTML should be able to represent any property that RichEdit RTF can represent. That still wouldn’t make RichEdit a general HTML editor replete with forms and JavaScript functionality. But it would add good interoperability with Office apps, Teams, and the web, all of which ...
Cool Windows Math Hot Key
The Windows key is used in a bunch of useful hot keys. Probably my favorite is Windows+Shift+s, which lets you copy any rectangular area on your screen(s) to the clipboard. I use this hot key a lot in describing application UI and other objects on the screen such as those in this post. I also use Windows+x y to see system info such as the name of the computer. Another cool Windows hot key is “Windows+.” or “Windows+;”. Using Windows+Shift+s, I get the image You can see related character displays by clicking on the characters in the bottom row. You can scroll these displays to see more characters. Clicking o...
Math Accessibility Trees
This post discusses aspects of making mathematical equations accessible to blind people. Equations that are simple typographically, such as 𝐸 = 𝑚𝑐², are accessible with the use of standard left and right arrow key navigation and with each variable and two-dimensional construct being spoken or felt when the insertion point is moved to them. At an insertion point, the user can edit the equation using the regular keyboard input methods, perhaps based on UnicodeMath or LaTeX, or using a refreshable braille display using Nemeth Braille. But it can be hard to visualize a more typographically complex equation, let alone...
Some UnicodeMath Enhancements
In the years since UnicodeMath 3.1 was published, some improvements have been made. The converter that converts UnicodeMath to OfficeMath also converts LaTeX and Nemeth math braille to OfficeMath. The converter needs ways to provide OfficeMath math-object arguments even when these arguments are not marked as such in the math format. The resulting infrastructure is available for converting all three formats to OfficeMath. n-aryands With all three formats, the n-aryand, e.g., integrand or summand, may not be identified by surrounding delimiters. But OfficeMath and MathType have n-aryand arguments as described in ...
RichEdit Emoticon Shortcuts
It seems that many email messages and IMs include emoji smiley faces like 😊. You just type :-) and you get 😉 whether you want it or not! About a year ago, the Microsoft 365 RichEdit started offering such a facility. This post describes the built-in emoticon shortcut strings and the corresponding emoji characters and the APIs for enabling the conversions. For a substantially larger set of emoticons, see https://en.wikipedia.org/wiki/List_of_emoticons. That list includes both Western and Eastern emoticons. The RichEdit emoticon shortcuts currently include only Western emoticons. The built-in emoticon shortcuts are defi...
RichEdit Hot Keys
In the early microcomputer days, MS-DOS editors like pmate and teco depended on hot keys for navigation and other tasks. With the great support for the mouse, touch, and graphical interface aids like ribbons incorporated into later personal computers, the need for navigation hot keys was greatly diminished. But there are other hot keys that can be incredibly handy, and some navigation keys are still used a lot. This post summarizes the hot keys built into RichEdit. A previous post published a summary of all RichEdit hot keys as of 2013, but that post got truncated and it’s missing some hot keys that were added re...
MathML and OMML User Selection Attributes
Some assistive technology (AT) programs use MathML or OMML as conduits for generating math speech and braille from math-enabled apps. In addition, they would like to use these formats for editing math text as well as speaking it and displaying it on refreshable braille displays. To this end, the formats need selection attributes that identify where the user selection is or should be within the MathML or OMML. These selection attributes are intended for accessibility purposes; MathML/OMML in copy, paste, and files ordinarily wouldn’t contain such attributes. User Selection A user selection can be degenerate, tha...
Displaying Enlarged Images in Popup Window
RichEdit clients may want to zoom images that the user clicks on. To satisfy this need, the Microsoft 365 version of RichEdit supports the EN_IMAGE notification, which notifies the RichEdit client when the mouse moves over an image or the image is clicked on. The client can then display an enlarged image in a new window by sending the RichEdit control an EM_DISPLAYIMAGE message. This approach is efficient since the image is already in the RichEdit control's memory and doesn't have to be streamed out and back into another control. RichEdit calls the Windows Imaging Component (WIC) to create a bitmap scaled to the ...
RichEditD2D Window Controls
This post is for desktop programmers who use RichEdit window controls in their applications and would like to have more display functionality. Examples of such controls include the desktop Outlook To, Cc, and Subject lines as well as WinForms RichTextBox controls, WordPad, and myriad other programs. It’s easy to create a RichEdit window control by calling CreateWindowEx(dwExStyle, lpClassName, …), where lpClassName identifies the class of RichEdit control. Up to January 2020, all RichEdit window controls used the RichEdit GDI/Uniscribe code path. Accordingly, they could not benefit from the many improvements only ava...
MathML mfenced element deprecated on web
The MathML working group is planning to deprecate the <mfenced> element as well as the <mo> fence and separator attributes for use on the web. The justification is to simplify web implementations by deprecating MathML features that are redundant. This post explains how <mfenced> and the fence and separator attributes can be handled in other ways and it discusses the implications for OfficeMath. Emulating <mfenced> with <mrow> The MathML <mfenced> element is handy for representing a variety of delimited expressions, such as parenthesized, braced, and bracketed expressions. The...
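To make the <mfenced>-to-<mrow> idea concrete, here is a small sketch that builds the <mrow> form as a string. It follows the general expansion given in the MathML 3 specification, with one deliberate difference: it emits plain <mo> elements and relies on the operator dictionary to supply fence and separator behavior for common delimiters, rather than writing the attributes explicitly. The function name `EmulateMfenced` and its parameters are hypothetical, and the sketch assumes the arguments are already well-formed MathML and that the delimiters need no XML escaping.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Builds an <mrow> equivalent of
//   <mfenced open="..." close="..." separators="...">args</mfenced>
// using a single separator string between all arguments.
std::string EmulateMfenced(const std::vector<std::string>& args,
                           const std::string& open = "(",
                           const std::string& close = ")",
                           const std::string& separator = ",")
{
    // Join the arguments, inserting a separator <mo> between them.
    std::string inner;
    for (std::size_t i = 0; i < args.size(); ++i)
    {
        if (i > 0 && !separator.empty())
            inner += "<mo>" + separator + "</mo>";
        inner += args[i];
    }

    // Two or more arguments are grouped in their own <mrow>.
    if (args.size() > 1)
        inner = "<mrow>" + inner + "</mrow>";

    // Wrap the content in the opening and closing delimiters.
    std::string result = "<mrow>";
    if (!open.empty())
        result += "<mo>" + open + "</mo>";
    result += inner;
    if (!close.empty())
        result += "<mo>" + close + "</mo>";
    result += "</mrow>";
    return result;
}
```

Calling `EmulateMfenced({"<mi>a</mi>", "<mi>b</mi>"})` produces `<mrow><mo>(</mo><mrow><mi>a</mi><mo>,</mo><mi>b</mi></mrow><mo>)</mo></mrow>`.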
How I got into technical word processing
This post is an update of an early post that doesn't appear to be archived. It tells a bit of how I started in technical word processing back in the middle of the last century. More precisely it was in 1965 that I started using a nifty (for that time) vector plotting program by Grey Freeman at the Yale Computer Center. I was a Yale grad student in theoretical physics working with Nobel Laureate Willis Lamb on laser theory. We needed to give presentations and we even made computer movies, back before anyone else had made them. After all, this was early in the days of computing! Grey’s program offered draftsman qua...