The post Switching from LaTeX to UnicodeMath Input Mode appeared first on Math in Office.

One might conclude that when switching from LaTeX to UnicodeMath input mode, Word should remove the nobuildup effect from all ‘/’ in the current math zone. But the intent in LaTeX is not to build up a fraction with a ‘/’ since built-up fractions are entered in TeX and LaTeX using the special constructs \over and \frac, respectively. The only way the user can build up a nobuildup ‘/’ is to delete the ‘/’ and reenter it. ‘/’ is the only operator that’s marked as nobuildup automatically in LaTeX input mode. A ‘/’ that’s not marked as nobuildup in LaTeX mode is actually used in building up the TeX {<numerator>\over <denominator>} construct. The build-up engine supports TeX as well as LaTeX constructs, since users might use either.

In UnicodeMath input mode, you can mark an operator as “nobuildup” by preceding it with a \. So “a\/b” produces 𝑎/𝑏 and you can try to build it up with a space, but, by design, it won’t build up. It’s fairly common to want to have a simple linear fraction and that’s how it’s done. You can “quote” other operators to prevent them from building up. For example, you might want to quote delimiters, e.g., \{ and \}, which won’t then build up to fit their content.

Perhaps the math ribbons should display the “nobuildup” attribute. Then the user could see a difference. It’d also be handy for the math ribbon to display the bold and italic attributes, since these are commonly used in math zones for math-bold and math-italic characters.


The post RichEdit HTML Support appeared first on Math in Office.


The “HTML format” clipboard format includes header and comment data in addition to the HTML to be copied (see https://docs.microsoft.com/en-us/windows/win32/dataxchg/html-clipboard-format#description). This information must be added to copy HTML between RichEdit, Word, PowerPoint, OneNote, Teams, and other apps. Frankly, having to add it seems like overkill, since RTF can be copied and pasted without such overhead. We illustrate the format as written by RichEdit with the HTML for Einstein’s energy equation 𝐸 = 𝑚𝑐². In the HTML, OMML is the math format used by default since that’s what Word and PowerPoint expect. Here’s the HTML:

Version:1.0
StartHTML:0000000105
EndHTML:0000000844
StartFragment:0000000417
EndFragment:0000000811
<html xml:lang="en" lang="en" xmlns="http://www.w3.org/1999/xhtml" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml">
<head><style>body{font-family:Arial,sans-serif;font-size:10pt;}</style>
<style>.cf0{font-style:italic;font-family:Cambria Math;font-size:24pt;}</style></head>
<body><!--StartFragment --><p><m:oMathPara><m:oMath class="cf0">
<span class="cf0"><m:r><i>𝐸</i></m:r></span>
<span class="cf0"><m:r><i>=</i></m:r></span><span class="cf0">
<m:r><i>𝑚</i></m:r></span>
<m:sSup><m:sSupPr><m:ctrlPr></m:ctrlPr></m:sSupPr><m:e><span class="cf0">
<m:r><i>𝑐</i></m:r></span></m:e><m:sup><span class="cf0"><m:r><i>2</i>
</m:r></span></m:sup></m:sSup></m:oMath></m:oMathPara></p>
<!--EndFragment --></body></html>

Here the StartHTML entry in the header gives the offset of the start of the HTML (105, which is the length of the header with its CR LF line endings, so it points at the <html> tag) and EndHTML gives the offset of the end of the HTML. StartFragment gives the offset of the start of the content that the user selected and EndFragment gives the offset of the end of the selection. The format description linked above defines these offsets as byte counts from the start of the clipboard data. In this example, the equation 𝐸 = 𝑚𝑐² is selected and displayed on its own line (display mode rather than inline mode). The start of the displayed equation is given by the OMML <m:oMathPara>. The corresponding MathML including an mml: prefix is

<mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" display="block">
  <mml:mi>E</mml:mi>
  <mml:mo>=</mml:mo>
  <mml:mi>m</mml:mi>
  <mml:msup>
    <mml:mi>c</mml:mi>
    <mml:mn>2</mml:mn></mml:msup></mml:math>

The Programming details section describes how to write HTML with OMML or MathML with and without the mml: prefix. The HTML5 standard includes MathML without a prefix. RichEdit can write and read HTML with all three math formats.
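The header bookkeeping described above is easy to get wrong by hand, so here is a minimal Python sketch of it. It is not RichEdit's code: the function names and the bare <html><body> wrapper are my own assumptions, and the offsets are computed as byte counts over UTF-8 data as the clipboard-format documentation describes.

```python
# Sketch: build and re-read an "HTML format" clipboard header.
# build_html_clipboard and read_offsets are hypothetical helper names.
HEADER = (
    "Version:1.0\r\n"
    "StartHTML:{:010d}\r\n"
    "EndHTML:{:010d}\r\n"
    "StartFragment:{:010d}\r\n"
    "EndFragment:{:010d}\r\n"
)
START_MARK = "<!--StartFragment -->"  # spelled as in the RichEdit output above
END_MARK = "<!--EndFragment -->"


def build_html_clipboard(fragment, head="<html><body>", tail="</body></html>"):
    """Return clipboard bytes (UTF-8) for an HTML fragment."""
    before = (head + START_MARK).encode("utf-8")
    body = fragment.encode("utf-8")
    after = (END_MARK + tail).encode("utf-8")
    # The header is pure ASCII and fixed width, so its length is known up front.
    start_html = len(HEADER.format(0, 0, 0, 0))
    start_frag = start_html + len(before)
    end_frag = start_frag + len(body)
    end_html = end_frag + len(after)
    header = HEADER.format(start_html, end_html, start_frag, end_frag)
    return header.encode("ascii") + before + body + after


def read_offsets(data):
    """Parse the numeric header fields back out of clipboard bytes."""
    fields = {}
    header_text = data.split(b"<", 1)[0].decode("ascii")
    for line in header_text.splitlines():
        key, _, value = line.partition(":")
        if value.isdigit():  # skips the Version:1.0 line
            fields[key] = int(value)
    return fields
```

Slicing the returned bytes with the StartFragment and EndFragment values gives back exactly the fragment that was passed in, which is a handy sanity check when non-ASCII math characters are involved.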

Character formatting includes font and family, height, text and back color, weight, spacing, bold, italic, underline, strikeout, subscript, superscript, small caps, all caps and hyperlinks. Paragraph formatting includes numbered and bulleted lists, left, right, and centered alignments, and paragraph margins.

RichEdit can read and write the HTML <img> element with a src attribute that has a base64 encoding of the binary image data. This is a technique used widely in Microsoft Office for HTML copy/paste. For example, the tag might begin with “<img src="data:image/png;base64,”.
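To make the base64 technique concrete, here is a small Python sketch that wraps image bytes in such a tag and recovers them again. The helper names are hypothetical and the string handling is deliberately naive; a real reader would use an HTML parser.

```python
import base64


def img_tag(image_bytes, mime="image/png"):
    """Wrap raw image bytes in an <img> element with a base64 data URI."""
    payload = base64.b64encode(image_bytes).decode("ascii")
    return f'<img src="data:{mime};base64,{payload}">'


def image_bytes_from_tag(tag):
    """Recover the image bytes from a tag produced by img_tag()."""
    payload = tag.split("base64,", 1)[1].split('"', 1)[0]
    return base64.b64decode(payload)
```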

HTML content can be read in and written out via messages, hot keys (Ctrl+c, Ctrl+v, Ctrl+x), and TOM methods.

A client can get HTML content by sending the EM_STREAMOUT message with wParam = SF_HTML | SF_BINARY. The SF_BINARY (0x0008) is needed to write the data in the RichEdit binary format to temporary memory and then the SF_HTML (0x00100000) writes that data out as HTML. If clipboard HTML is desired, OR the SF_CLIPBOARD (0x80000000) flag into wParam.
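As a quick check on the flag arithmetic, this Python sketch combines the values quoted above into the wParam. It only computes the number; actually sending EM_STREAMOUT needs a Win32 window handle and an EDITSTREAM callback, which are out of scope here. The function name is mine.

```python
# Flag values as quoted in the text above.
SF_BINARY = 0x0008
SF_HTML = 0x00100000
SF_CLIPBOARD = 0x80000000


def stream_out_wparam(for_clipboard=False):
    """Return the EM_STREAMOUT wParam for HTML output."""
    wparam = SF_HTML | SF_BINARY
    if for_clipboard:
        wparam |= SF_CLIPBOARD  # adds the clipboard header and comments
    return wparam
```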

A client can stream in HTML content by sending the EM_ISTREAMIN message (WM_USER + 252), which streams in using the IStream interface pointed to by the lParam instead of using the usual EDITSTREAM struct. This choice is due to use of the Office HTML parser for input and the mso.dll must be loaded for that to work. Set wParam equal to 1, which signifies HTML. Currently only HTML can be streamed in using the EM_ISTREAMIN message.

Other messages that can be used are WM_COPY, WM_PASTE, WM_CUT, and EM_PASTESPECIAL, which are all described on the web.

In addition to the ITextRange::Copy() and ITextRange::Paste() methods, you can input HTML content into a range by calling ITextRange2::SetText2(tomConvertHtml, bstr), where tomConvertHtml is given by 0x00900000. Similarly, you can get the HTML content from a range by calling ITextRange2::GetText2(tomConvertHtml, pbstr).

By default, RichEdit writes equations in HTML in the OMML format since that format is what Office apps like Word and PowerPoint expect. But it can write equations in MathML with or without an mml: prefix. The function to call to set which math format to use is ITextDocument2::SetMathProperties() with tomHtmlOMML, tomHtmlMathML, or tomHtmlMath, which are defined by

tomHtmlMathFormatMask = 0x00300000, // Mask for math-format flags
tomHtmlOMML = 0,                    // m:
tomHtmlMathML = 0x00100000,         // mml:
tomHtmlMath = 0x00200000,           // No prefix MathML (HTML5)
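Since the three formats share a two-bit field, a property value can be decoded or updated with the mask. The following Python sketch mirrors the constants above; the helper functions are hypothetical and simply show the masking, not the SetMathProperties() call itself.

```python
# Constants mirror the tom values listed above.
TOM_HTML_MATH_FORMAT_MASK = 0x00300000
TOM_HTML_OMML = 0
TOM_HTML_MATHML = 0x00100000
TOM_HTML_MATH = 0x00200000

FORMAT_NAMES = {
    TOM_HTML_OMML: "OMML (m: prefix)",
    TOM_HTML_MATHML: "MathML (mml: prefix)",
    TOM_HTML_MATH: "MathML (no prefix, HTML5)",
}


def math_format_name(props):
    """Name the HTML math format encoded in a math-properties value."""
    return FORMAT_NAMES.get(props & TOM_HTML_MATH_FORMAT_MASK, "unknown")


def with_math_format(props, fmt):
    """Return props with only the math-format bits replaced by fmt."""
    return (props & ~TOM_HTML_MATH_FORMAT_MASK) | fmt
```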


The post Cool Windows Math Hot Key appeared first on Math in Office.


Another cool Windows hot key is “Windows+.” or “Windows+;”. Using Windows+Shift+s, I get the image

You can see related character displays by clicking on the characters in the bottom row. You can scroll these displays to see more characters. Clicking on a character not in the top or bottom row inserts that character into your active app.

Now click on the Ω in the top row of characters. You get

Here I’ve selected the ∞ page. This has many fractions, numeric subscripts and superscripts, ∞ of course, and, if you scroll down you can access myriad math symbols such as ∰. For example, you can see

I wish C++ would use some of those relational operators instead of being stuck in the 1960s with ASCII sequences like “!=” instead of ‘≠’! Clicking on the ⨀ in the bottom row reveals still more math symbols.

The Assistive Technology (AT) program Narrator describes some of the features of these displays and knows how to speak the emoji characters. But it doesn’t know how to speak most of the advanced math symbols. In fact, I have a hard time speaking some of those symbols myself, like ⋈ (bowtie?)


The post Math Accessibility Trees appeared first on Math in Office.

More than one kind of tree is possible and this post compares two trees for the equation 1/2π ∫_0^2π ⅆ𝜃/(𝑎+𝑏 sin 𝜃) = 1/√(𝑎²−𝑏²), given here in UnicodeMath.

Each tree node is labeled with its math text in UnicodeMath along with the type of node. UnicodeMath lends itself to being spoken especially if processed a bit to speak things like 𝑎² as “a squared” in the current natural language as described in Speaking of math…. The first kind of tree corresponds to the traditional math layout used in documents, while the second kind corresponds to the mathematical semantics. Accordingly we call the first kind a *display tree* and the second a *semantic tree*.

More specifically, the display tree corresponds to the way TeX and OfficeMath display mathematical text and approximates the way Presentation MathML represents mathematical text. Mathematical layout entities such as fractions, integrals, roots, subscripts and superscripts are represented by nodes in trees. Binary and relational operators that don’t require special typography other than appropriate spacing are included in text nodes. The display tree for the equation above is

└─Math zone
  └─ “1/2π ∫_0^2π ⅆ𝜃/(𝑎+𝑏 sin 𝜃) = 1/√(𝑎²−𝑏²)”
     ├─ “1/2π” fraction
     │  ├─ “1” numerator
     │  └─ “2π” denominator
     ├─ “∫_0^2π ⅆθ/(𝑎+𝑏 sin 𝜃)” integral
     │  ├─ “0” lower limit
     │  ├─ “2π” upper limit
     │  └─ “ⅆθ/(𝑎+𝑏 sin 𝜃)” integrand
     │     └─ “ⅆθ/(𝑎+𝑏 sin 𝜃)” fraction
     │        ├─ “ⅆθ” numerator
     │        └─ “𝑎+𝑏 sinθ” denominator
     │           ├─ “𝑎+𝑏” text
     │           └─ “sin𝜃” function apply
     │              ├─ “sin” function name
     │              └─ “𝜃” argument
     ├─ “=” text
     └─ “1/√(𝑎²−𝑏²)” fraction
        ├─ “1” numerator
        └─ “√(𝑎²−𝑏²)” denominator
           └─ “√(𝑎²−𝑏²)” radical
              ├─ “⬚” degree
              └─ “𝑎²−𝑏²” radicand
                 ├─ “𝑎²” superscript
                 │  ├─ “𝑎” base
                 │  └─ “2” script
                 ├─ “−” text
                 └─ “𝑏²” superscript
                    ├─ “𝑏” base
                    └─ “2” script

Note that the invisible times implicit between the leading fraction and the integral isn’t displayed, and the expression 𝑎 + 𝑏 sin 𝜃 is displayed as a text node 𝑎 + 𝑏 followed by a function-apply node sin 𝜃, without explicit nodes for the + and the implied invisible times.

To navigate through the 𝑎 + 𝑏 and into the fractions and integral, one can use the usual text left and right arrow keys or their braille equivalents. In OfficeMath, one can navigate through the whole equation with these arrow keys, but it’s helpful also to have coarser grained navigation keys to go between sibling nodes and up to parent nodes. For the sake of discussion, let’s suppose the tree navigation hot keys are those defined in the table

Ctrl+→ | Go to next sibling |

Ctrl+← | Go to previous sibling |

Home | Go to parent ahead of current child |

End | Go to parent after current child |

For example starting at the beginning of the equation, Ctrl+→ moves past the leading fraction to the integral, whereas → moves to the start of the numerator of the leading fraction. Starting at the beginning of the upper limit, Home goes to the insertion point between the leading fraction and the integral, while End goes to the insertion point in front of the equal sign. Ctrl+→ and Ctrl+← allow a user to scan an equation rapidly at any level in the hierarchy. After one of these hot keys is pressed, the linear format for the object at the new position can be spoken in a fashion quite similar to ClearSpeak. When the user finds a position of interest, s/he can use the usual input methods to delete and/or insert new math text.
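The sibling and parent moves can be sketched with a tiny tree class in Python. This is only an illustration under my own assumptions: the Node class and its method names are hypothetical, only the top level of the display tree is built, and it moves between nodes rather than between the insertion points that Home and End actually target.

```python
class Node:
    """One display-tree node: its UnicodeMath text, its kind, and its children."""

    def __init__(self, text, kind, children=()):
        self.text = text
        self.kind = kind
        self.parent = None
        self.children = list(children)
        for child in self.children:
            child.parent = self

    def sibling(self, step):
        """Return the sibling step places away (+1 next, -1 previous), or None."""
        if self.parent is None:
            return None
        siblings = self.parent.children
        index = siblings.index(self) + step
        return siblings[index] if 0 <= index < len(siblings) else None


# Top level of the display tree shown earlier in this post.
upper = Node("2π", "upper limit")
integral = Node(
    "∫_0^2π ⅆ𝜃/(𝑎+𝑏 sin 𝜃)",
    "integral",
    [Node("0", "lower limit"), upper, Node("ⅆ𝜃/(𝑎+𝑏 sin 𝜃)", "integrand")],
)
fraction = Node("1/2π", "fraction", [Node("1", "numerator"), Node("2π", "denominator")])
equals = Node("=", "text")
rhs = Node("1/√(𝑎²−𝑏²)", "fraction")
zone = Node("1/2π ∫_0^2π ⅆ𝜃/(𝑎+𝑏 sin 𝜃) = 1/√(𝑎²−𝑏²)", "math zone",
            [fraction, integral, equals, rhs])
```

With this model, Ctrl+→ from the leading fraction corresponds to fraction.sibling(+1), which is the integral, and going up from the upper limit corresponds to upper.parent.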

Now consider the semantic tree, which allocates nodes to all binary and relational operators as well as to fractions, integrals, etc.

└─Math zone
  └─ “1/2𝜋 ∫_0^2𝜋 ⅆ𝜃/(𝑎+𝑏 sin𝜃)=1/√(𝑎²− 𝑏²)”
     └─ “=” text
        ├─ “⊠” implied times
        │  ├─ “1/2𝜋” fraction
        │  │  ├─ “1” numerator
        │  │  └─ “2π” denominator
        │  └─ “∫_0^2𝜋 ⅆ𝜃/(𝑎+𝑏 sin𝜃)” integral
        │     ├─ “0” lower limit
        │     ├─ “2π” upper limit
        │     └─ “ⅆ𝜃/(𝑎+𝑏 sin𝜃)” integrand
        │        └─ “ⅆ𝜃/(𝑎+𝑏 sin𝜃)” fraction
        │           ├─ “ⅆ𝜃” numerator
        │           │  └─ “⊠” implied times
        │           │     ├─ “ⅆ” text
        │           │     └─ “𝜃” text
        │           └─ “𝑎+𝑏 sin𝜃” denominator
        │              └─ “+” text
        │                 ├─ “𝑎” text
        │                 └─ “𝑏 sin𝜃” function apply
        │                    └─ “⊠” implied times
        │                       ├─ “𝑏” text
        │                       └─ “sin𝜃” function
        │                          └─ “” function apply
        │                             ├─ “sin” function name
        │                             └─ “𝜃” argument
        └─ “1/√(𝑎²− 𝑏²)” fraction
           ├─ “1” numerator
           └─ “√(𝑎²− 𝑏²)” denominator
              └─ “√(𝑎²− 𝑏²)” radical
                 ├─ “⬚” degree
                 └─ “𝑎²− 𝑏²” radicand
                    └─ “−” text
                       ├─ “𝑎²” superscript
                       │  ├─ “𝑎” base
                       │  └─ “2” script
                       └─ “𝑏²” superscript
                          ├─ “𝑏” base
                          └─ “2” script

The semantic tree corresponds to Content MathML. It has drawbacks: 1) it’s bigger and requires more key strokes to navigate, 2) it doesn’t correspond to speech order, and 3) it requires a Polish-prefix mentality. Some people have developed such a mentality, perhaps having used HP calculators, and prefer it. But it’s definitely an acquired taste and it doesn’t correspond to the way that mathematics is conventionally displayed, edited, and spoken. Accordingly the first kind of tree seems significantly better for speech and editing, at least for the math encountered in grades K-12.

The choice for higher-level math is complicated by the fact that the usual meanings for superscripts, vertical bars, and other notation may be incorrect. For example, exponents are usually powers and it’s appropriate to speak 𝑎² as “a squared”. But in tensor analysis, exponents can be indices and saying them as powers is incorrect. One way around this is to say 𝑎² as “a superscript 2” or “a sup 2”, but it would be better to know the author’s intent and generate more descriptive speech. Another example is |𝑥|. In math up through calculus, this is the absolute value of 𝑥. However, in higher-level math it could mean the cardinality of the set 𝑥, or something else. In these cases and many others in advanced math, the semantic tree might reveal the author’s intent better than the display tree.

The MathML working group is studying ways to make Presentation MathML support accurate speech for ambiguous mathematical notations.

Both kinds of trees include nodes defined by the OMML entities listed in the following table along with the corresponding MathML entities

Built-up Office Math Object | OMML tag | MathML |

Accent | acc | mover/munder |

Bar | bar | mover/munder |

Box | box | menclose (approx) |

BoxedFormula | borderBox | menclose |

Delimiters | d | mfenced or mrow with mo’s |

EquationArray | eqArr | mtable (with alignment groups) |

Fraction | f | mfrac |

FunctionApply | func | mrow with &FunctionApply; |

LeftSubSup | sPre | mmultiscripts (special case of) |

LowerLimit | limLow | munder |

Matrix | m | mtable |

Nary | nary | mrow followed by msubsup w n-ary mo |

Phantom | phant | mphantom and/or mpadded |

Radical | rad | msqrt/mroot |

GroupChar | groupChr | mover/munder |

Subscript | sSub | msub |

SubSup | sSubSup | msubsup |

Superscript | sSup | msup |

UpperLimit | limUpp | mover |

Ordinary text | r | mtext |

MathML has additional nodes, some of which involve infix parsing to recognize, e.g., integrals. The OMML entities were defined for typographic reasons since they require special display handling. Interestingly the OMML entities also include useful semantics, such as identifying integrals and trigonometric functions without special parsing.

In summary, math zones can be made accessible using display trees for which the node contents are spoken in the localized linear format and navigation is accomplished using simple arrow keys, Ctrl arrow keys, and the Home and End keys, or their Braille equivalents. Arriving at any particular insertion point, the user can hear or feel the math text and can edit the text in standard ways.


The post Some UnicodeMath Enhancements appeared first on Math in Office.

With all three formats, the *n*-aryand, e.g., integrand or summand, may not be identified by surrounding delimiters. But OfficeMath and MathType have *n*-aryand arguments as described in the post Integrands, Summands, and Math Function Arguments. UnicodeMath has the binary operator U+2592 (▒) to treat the expression that follows the ▒ as the *n*-aryand (see Section 3.4 of UnicodeMath 3.1). In generalizing the conversion code for LaTeX and braille, it became clear that a space alone is adequate for starting *n*-aryands and we don’t need the ▒, which doesn’t look like mathematics. So, the converter now makes the first expression that follows the *n*-ary operator and limits into the *n*-aryand. For example, the integral

can be given by the UnicodeMath 1/2π ∫_0^2π ⅆθ/(a+b sin θ)=1/√(a^2-b^2) since the first expression that follows the ∫_0^2π is the fraction ⅆθ/(a+b sin θ). This works for many integrands. More complicated integrands are usually enclosed in brackets, braces, or parentheses.

A “bare” matrix, that is, one with no enclosing brackets can be entered by typing the TeX control word \matrix. In addition, there are five matrix constructs with enclosing brackets that can be entered as summarized in the following table in which … stands for the matrix contents.

LaTeX | Char | Code | Form |

\matrix | ■ | U+25A0 | … |

\bmatrix | ⓢ | U+24E2 | […] |

\pmatrix | ⒨ | U+24A8 | (…) |

\vmatrix | ⒱ | U+24B1 | |…| |

\Bmatrix | Ⓢ | U+24C8 | {…} |

\Vmatrix | ⒩ | U+24A9 | ‖…‖ |

The UnicodeMath syntax for a parenthesized 2×2 matrix is \pmatrix(a&b@c&d), which builds up as

Sometimes you just want to enter a sample matrix quickly. If any of the six matrix control words is followed by a digit *d*, it inserts a *d* × *d* identity matrix. For example, typing \pmatrix 3 enters

This is easier to type than \pmatrix(1&0&0@0&1&0@0&0&1), which displays the same identity matrix. Some of the matrix control words are missing in the default math autocorrect file. You can add them as described in the last section of this post.
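To see what the digit shorthand saves, this Python sketch generates the long-hand UnicodeMath for a *d* × *d* identity matrix, using & between columns and @ between rows as in the examples above. The function name is hypothetical; the converter itself builds the matrix directly rather than producing this string.

```python
def identity_matrix(d, control_word=r"\pmatrix"):
    """Return the long-hand UnicodeMath for a d x d identity matrix."""
    rows = [
        "&".join("1" if row == col else "0" for col in range(d))
        for row in range(d)
    ]
    return f"{control_word}({'@'.join(rows)})"
```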

This trigonometric expression is ambiguous: is it sin(𝑥²) or (sin 𝑥)²? Without the parentheses, the UnicodeMath for the former is “sin x^2” and for the latter is “sin x ^2”. In the latter, the space following the x builds up the sin x into a math function object and then the ^2 squares the object. But the results are very different formulas. The converter avoids the ambiguity by building up “sin x ^2” to be the same math function object as “sin^2 x”, that is, sin² 𝑥.

You can enter the common LaTeX expressions \frac{a}{b} and \binom{n}{m} in UnicodeMath input mode provided you have added math autocorrect entries to convert \frac to ⍁ (U+2341) and \binom to ⒝ (U+249D). To add math autocorrect entries, click on the lower-right box in the Equations/Conversions ribbon option to display the dialog box

Then click on the Math AutoCorrect… button to see and add math autocorrect entries. For example, to add \frac with U+2341, type as in the dialog box

And then enter Alt+x to convert the 2341 to ⍁. Probably when you type LaTeX in UnicodeMath input mode, a dialog ought to appear asking you if you’d like to switch to LaTeX input mode.


The post RichEdit Emoticon Shortcuts appeared first on Math in Office.

The built-in emoticon shortcuts are defined in the table

Type | Get | Unicode |

`%)` | 😕 | U+1F615 |

`0:)` | 😇 | U+1F607 |

`:'(` | 😢 | U+1F622 |

`:')` | 😂 | U+1F602 |

`:'-(` | 😢 | U+1F622 |

`:'-)` | 😂 | U+1F602 |

`:(` | ☹ | U+02639 |

`:)` | ☺ | U+0263A |

`:+1:` | 👍 | U+1F44D |

`:-(` | ☹ | U+02639 |

`:-)` | 😊 | U+1F60A |

`:-D` | 😃 | U+1F603 |

`:-o` | 😲 | U+1F632 |

`:-p` | 😝 | U+1F61D |

`:-|` | 😐 | U+1F610 |

`:D` | 😃 | U+1F603 |

`:fire:` | 🔥 | U+1F525 |

`:grin:` | 😁 | U+1F601 |

`:o` | 😲 | U+1F632 |

`:p` | 😝 | U+1F61D |

`:smile:` | 😄 | U+1F604 |

`:yum:` | 😋 | U+1F60B |

`:|` | 😐 | U+1F610 |

`;)` | 😉 | U+1F609 |

`;-)` | 😉 | U+1F609 |

`</3` | 💔 | U+1F494 |

`<3` | ❤ | U+02764 |

`>:)` | 😈 | U+1F608 |

`B-)` | 😎 | U+1F60E |

The emoticon shortcut facility is incorporated into the RichEdit autocorrect facility. To enable the autocorrect facility, send the message EM_SETAUTOCORRECTPROC with wparam = an AutoCorrectProc callback pointer. If you don’t want to implement an autocorrect callback, set wparam = 1. This activates the built-in math autocorrect facility in math zones. It also activates emoticon shortcuts if they’re enabled. To enable the emoticon shortcuts, get the current language-option flags by sending EM_GETLANGOPTIONS, OR in IMF_EMOTICONSHORTCUTS (0x8000), and send EM_SETLANGOPTIONS with lparam equal to the result. The emoticon-shortcut option is disabled by default. Have fun
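For a feel of what the shortcut table does, here is a Python sketch that applies a handful of the entries above to a string, preferring the longest shortcut at each position, and that shows the flag arithmetic for the language options. It is an assumption-laden illustration: RichEdit converts shortcuts as you type, and its actual trigger and matching rules aren't described in this post. The function names are mine.

```python
IMF_EMOTICONSHORTCUTS = 0x8000  # value quoted in the text above

# A subset of the shortcut table above: shortcut -> Unicode code point.
SHORTCUTS = {
    ":-)": 0x1F60A, ":)": 0x263A, ":-(": 0x2639, ":(": 0x2639,
    ";)": 0x1F609, ";-)": 0x1F609, "<3": 0x2764, "</3": 0x1F494,
    ":+1:": 0x1F44D, ":fire:": 0x1F525, "B-)": 0x1F60E, ">:)": 0x1F608,
}
_LONGEST_FIRST = sorted(SHORTCUTS, key=len, reverse=True)


def enable_emoticons(lang_options):
    """OR the emoticon flag into a value obtained from EM_GETLANGOPTIONS."""
    return lang_options | IMF_EMOTICONSHORTCUTS


def replace_shortcuts(text):
    """Replace every known shortcut in text, longest match first."""
    out = []
    i = 0
    while i < len(text):
        for shortcut in _LONGEST_FIRST:
            if text.startswith(shortcut, i):
                out.append(chr(SHORTCUTS[shortcut]))
                i += len(shortcut)
                break
        else:  # no shortcut starts here; copy the character unchanged
            out.append(text[i])
            i += 1
    return "".join(out)
```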


The post RichEdit Hot Keys appeared first on Math in Office.

This post summarizes the hot keys built into RichEdit. A previous post published a summary of all RichEdit hot keys as of 2013, but that post got truncated, it’s missing some hot keys that were added recently, and the hyperlinks need updating. Note that RichEdit clients, e.g., OneNote, often handle all hot keys with RichEdit never seeing the corresponding keyboard messages. Since the client receives the keyboard input, it can do whatever it wants to with that input. This flexibility is valuable particularly for localizing hot keys. RichEdit is “globalized”, but not localized. A number of the hot keys described in this post are English-centric and should be localized by the client. Other hot keys are global by nature and can be used in any locale.

The post Entering Unicode Characters explains several ways to enter arbitrary Unicode characters into applications. My favorite general-purpose way is via Alt+x, which works in Word, Outlook, OneNote, and RichEdit-based programs like WordPad. It ought to work in *all* editors! (Sadly, it doesn’t work in PowerPoint, Excel or Visual Studio, although it’d be easy for these programs to implement it.) It works by entering the Unicode hex code for the character followed by Alt+x. So, entering 2260 Alt+x enters ≠. Entering 1d44e Alt+x enters 𝑎, which is math-italic a. I use this hot key almost as often as I use Ctrl+c (copy) and Ctrl+v (paste). When I’m writing code in Visual Studio, I keep a program running RichEdit handy for entering Unicode symbols. Programs are easier to read with real Unicode characters instead of workarounds using the \xXXXX notation. You can also copy the symbols from appropriate web pages such as Mathematical operators and symbols in Unicode, which has most math symbols. But if you know the Unicode code point, the Alt+x hot key is faster. It also lets you find out a character’s Unicode hex value from the character since Alt+x is a toggle: convert hex to character; convert character to hex. Try it, you will like it!
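The two directions of the Alt+x toggle amount to a pair of one-line conversions, sketched here in Python. The function names are mine, and the real hot key also has to decide how many characters before the insertion point belong to the hex code, which this sketch ignores.

```python
def hex_to_char(hex_digits):
    """Convert a Unicode hex code such as '2260' to its character."""
    return chr(int(hex_digits, 16))


def char_to_hex(ch):
    """Convert a single character to its upper-case Unicode hex code."""
    return format(ord(ch), "X")
```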

A pair of globalized hot keys set the BiDi directionality of a paragraph. These hot keys depend on knowing the difference between right and left. The WM_KEYDOWN message passes information in the message lparam that specifies right Shift or left Shift. Specifically, byte 2 of the lparam gives the key’s scan code and the value 0x36 is the scan code for the right-shift key ever since IBM shipped its first PC in August, 1981. This information lets RichEdit handle the Ctrl+RightShift hot key to switch the BiDi paragraph directionality to RTL (right to left). Similarly, Ctrl+LeftShift switches to LTR (left to right). RichEdit tracks which Alt, Shift, and Ctrl keys are depressed at any given time. This enables it to differentiate between left Alt for menus and right Alt (AltGr) for keyboard commands. But the most important use is for the Ctrl+RightShift and Ctrl+LeftShift hot keys. Lots of other Word hot keys are implemented in RichEdit.
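Pulling the scan code out of the lparam is a shift and a mask, as this Python sketch shows. The right-shift value 0x36 comes from the text; the left-shift value 0x2A is not stated above and is added here as the standard PC scan code. The function names and the returned strings are my own choices for illustration.

```python
RIGHT_SHIFT_SCAN = 0x36  # from the text above
LEFT_SHIFT_SCAN = 0x2A   # assumption: standard PC scan code, not given in the text


def scan_code(lparam):
    """Extract byte 2 (bits 16-23) of a WM_KEYDOWN lparam."""
    return (lparam >> 16) & 0xFF


def bidi_direction(lparam, ctrl_down):
    """Return 'RTL', 'LTR', or None for a Shift key press with or without Ctrl."""
    if not ctrl_down:
        return None
    return {RIGHT_SHIFT_SCAN: "RTL", LEFT_SHIFT_SCAN: "LTR"}.get(scan_code(lparam))
```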

RichEdit supports Word’s standard subscript and superscript hot keys: Ctrl+= and Ctrl+Shift+=, respectively. These hot keys toggle their respective states. For example, if you type some text, Ctrl+=, and some more text, the latter will be subscripted up until you type Ctrl+= again to go back on the base line. If you type one of these hot keys while some text is selected, that text’s script character will be toggled accordingly. In UnicodeMath, subscripts and superscripts are usually entered with the _ and ^ operators as in [La]TeX, or via the ribbon. But the standard hot keys can be handy provided the scripts are not nested. These hot keys have different meanings in a math zone: Ctrl+= builds up LaTeX or UnicodeMath into OfficeMath and Ctrl+Shift+= builds OfficeMath down into LaTeX or UnicodeMath.

The Ctrl+} hot key is copied from the Visual Studio program editor. Ctrl+} moves the insertion point from one end of a bracketed expression (…), […], {…} to the other end. This is very handy for examining text with nested parentheses or curly braces (RTF, LaTeX, computer programs, JSON, etc.).

Arrow, PgUp/PgDn, and Home/End key behavior is summarized in the following table for ordinary text (behavior in math zones may differ). A depressed state of the Shift, Ctrl, and Alt keys is given by ✓; else the key isn’t depressed.

Key | Shift | Ctrl | Action |

← | | | Move left char |

← | | ✓ | Move left word |

← | ✓ | | Select left char |

← | ✓ | ✓ | Select left word |

↑ | | | Move up line |

↑ | | ✓ | Move to start of paragraph |

↑ | ✓ | | Select up line |

↑ | ✓ | ✓ | Select to start of paragraph |

→ | | | Move right char |

→ | | ✓ | Move right word |

→ | ✓ | | Select right char |

→ | ✓ | ✓ | Select right word |

↓ | | | Move down line |

↓ | | ✓ | Move to end of paragraph |

↓ | ✓ | | Select down line |

↓ | ✓ | ✓ | Select to end of paragraph |

PgUp | | | Move up one screen |

PgUp | | ✓ | Move to start of screen |

PgUp | ✓ | | Select up one screen |

PgUp | ✓ | ✓ | Select to start of screen |

PgDn | | | Move down one screen |

PgDn | | ✓ | Move to end of screen |

PgDn | ✓ | | Select down one screen |

PgDn | ✓ | ✓ | Select to end of screen |

Home | | | Move to start of line |

Home | | ✓ | Move to start of story |

Home | ✓ | | Select to start of line |

Home | ✓ | ✓ | Select to start of story |

End | | | Move to end of line |

End | | ✓ | Move to end of story |

End | ✓ | | Select to end of line |

End | ✓ | ✓ | Select to end of story |

Arrow-key behavior in vertical text corresponds to the different direction. For example, ↓ goes to the next character instead of going to the next line. See Math Selection for a discussion of how the navigation keys work in a math zone. An important point is that if you select a math structure character (start of object, end of object, or end of argument), the whole object is automatically selected.

Typically typing the Tab key inserts a Tab character (U+0009). But depending on context, the Tab key may turn into a navigation key. For example, in a table cell, the Tab key goes to the next cell and Shift+Tab goes to the previous cell (if any). If the selection is in the last cell of a table, the Tab key inserts a new row after the last row with the insertion point in the first cell of the new row.

In math zones, the Tab key goes to the next argument of the current math function and the Shift+Tab key goes to the previous argument. This behavior was originally scheduled for Word as well, but got postponed.

In dialog window controls, Tab characters are ignored. This allows dialogs to use the Tab character to move from control to control.

The Enter key usually inserts an end-of-paragraph character (U+000D—carriage return) and the Shift+Enter key inserts an end-of-line character (U+000B—VT). See Paragraphs and Paragraph Formatting for a discussion of the differences between these kinds of insertions. At the end of a table row, the Enter key inserts a new row after the current row. Inside a math object argument, an Enter key inserts an equation array. This is handy for the lower limit of n-ary objects like summations, which may have more than one subscript range. In a display math zone, Shift+Enter starts a new equation (see The Math Paragraph for details).

If the current selection is nondegenerate (selects one or more characters), the Delete key deletes the selected characters. If the current selection is degenerate, i.e., an insertion point (IP), the Delete key usually deletes the character immediately following the IP. If the character is followed by one or more combining marks, the Delete key deletes the whole combining-mark sequence. Similarly, if the character is followed by a variation selector, the Delete key deletes the whole variation-selector sequence. If the Ctrl key is pressed for an insertion point, the Delete key deletes the word following the IP. See Math Selection for a discussion of how the Delete and Backspace keys work in math zones. In particular, the math object is selected if you type Delete at the start of the object or Backspace at the end of the object. A second Delete or Backspace then deletes the object. This behavior exists so that you don’t delete things by mistake. If you do so anyway, you can always undo your deletion by typing Ctrl+Z.

The Backspace key is similar to the Delete key but has some differences in addition to operating on the character preceding the insertion point. If the current selection is nondegenerate, the Backspace key acts the same as the Delete key and deletes the selected characters. If the current selection is degenerate, i.e., an insertion point, the Backspace key usually deletes the character immediately preceding the insertion point. If that character is a combining mark, the Backspace key deletes that combining mark alone. This differs from the Delete key at the start of a combining-mark sequence, which deletes the whole combining-mark sequence. If the preceding character is a variation selector, the Backspace key deletes the whole variation-selector sequence. If the Ctrl key is pressed for an insertion point, the Backspace key deletes the word preceding the IP. See Math Selection for a discussion of how the Backspace key works in math zones. In particular, the math object is selected if you type Backspace at the end of the object. A second Backspace then deletes the object. Alt+Backspace is an alias for Ctrl+Z (undo).

The following table lists additional hot keys handled by RichEdit

Key | Shift | Ctrl | Alt | Action |

= | ✓ | Toggle subscript mode (not in math zone); build up selected math text (in math zone) | ||

= | ✓ | ✓ | Toggle superscript mode (not in math zone); build down selected math text (in math zone) | |

= | ✓ | Insert math zone | ||

= | ✓ | ✓ | Build down selected math text | |

= | ✓ | ✓ | ✓ | Build up selected math text (doesn’t have to be in math zone) |

– | ✓ | Insert soft hyphen (U+00AD) | ||

– | ✓ | ✓ | Insert nonbreaking hyphen (U+2011) | |

, | ✓ | Cedilla accent dead key (English only) | ||

‘ | ✓ | Acute accent dead key (English only) | ||

“ | ✓ | ✓ | Smart quotes (English only) | |

~ | ✓ | ✓ | Tilde accent dead key (English only) | |

; | ✓ | Dieresis accent dead key (English only) | ||

` | ✓ | Grave accent dead key (English only) | ||

> | ✓ | ✓ | Make font bigger | |

< | ✓ | ✓ | Make font smaller | |

! | ✓ | ✓ | ✓ | Insert ¡ (inverted !, English only) |

? | ✓ | ✓ | ✓ | Insert ¿ (English only) |

} | ✓ | Move to other end of bracketed expression (…), […], {…} | ||

1 | ✓ | Single spacing | ||

2 | ✓ | Double spacing | ||

5 | ✓ | 1.5 spacing | ||

6 | ✓ | Caret accent dead key (English only) | ||

A | ✓ | Select All | ||

A | ✓ | ✓ | Toggle all caps | |

B | ✓ | Toggle bold | ||

C | ✓ | Copy selection (Ctrl+Insert is an alias) | ||

E | ✓ | Center selected paragraph(s) | ||

E | ✓ | ✓ | Insert € (except for languages noted below) | |

I | ✓ | Toggle italic | ||

J | ✓ | Justify selected paragraph(s) | ||

L | ✓ | Left align selected paragraph(s) | ||

L | ✓ | ✓ | Cycle through bullet/numbering types | |

Q | ✓ | Alias for Alt+= | ||

R | ✓ | Right align selected paragraph(s) | ||

U | ✓ | Toggle underline | ||

V | ✓ | Paste (Shift+Insert is an alias) | ||

X | ✓ | Copy selection and delete it | ||

X | ✓ | Convert from hex to Unicode and vice versa | ||

Y | ✓ | Redo | ||

Z | ✓ | Undo | ||

F3 | ✓ | If first selected letter is lower-case, change to title case; else change to lower case | ||

F8 | ✓ | ✓ | ✓ | Turn on table autofit |

F12 | ✓ | ✓ | ✓ | Same as Alt+X |

The Euro (€) isn’t inserted by Ctrl+Alt+E for the following languages: UK English, Eire English, Polish, Portuguese, Hungarian, Vietnamese, New Tai Lue, Ogham, Hawaiian, Gaelic, Sesotho, Tswana, Kyrgyz, Igbo, Latvian, Georgian, Hebrew, Pashto, Latin, Maltese, Cherokee, Myanmar, Sinhalese, Syriac, Inuktitut, Khmer, Tibetan, and Hindi.


The post MathML and OMML User Selection Attributes appeared first on Math in Office.

A user selection can be degenerate, that is, an insertion point, or nondegenerate in which case it selects one or more ranges of characters. Multiple disjoint selections (multiple ranges) can be made by using the Ctrl key and clicking appropriately. For math editing, multiple selections aren’t generally very useful, and this post doesn’t treat them. A nondegenerate selection has an *active* end, the end that moves when you enter Shift + an arrow key, and an *anchor* end. The two ends coincide for an insertion point.

To specify the locations of the selection ends in any MathML/OMML content, we define the attribute names selActiveEnd, selAnchorEnd, and selIP (insertion point). The values for these attributes are given in the table

“before” | Before math zone |

“after” | After math zone |

“n” | At offset n into an element |

The most common attribute is selIP with the value “0”, i.e., an insertion point at the start of the element, such as the MathML <mi selIP="0">a</mi> or the OMML <m:t selIP="0">𝑎</m:t>.

With elements containing more than one character like the MathML <mi>sin</mi>, the insertion point might be between the ‘s’ and the ‘i’, in which case one has the MathML <mi selIP="1">sin</mi> and the OMML <m:t selIP="1">sin</m:t>. If the user then enters Shift+→ to select the ‘i’, the MathML is <mi selAnchorEnd="1" selActiveEnd="2">sin</mi> and the OMML is <m:t selAnchorEnd="1" selActiveEnd="2">sin</m:t>.
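The attribute choice described above (selIP when the two ends coincide, selAnchorEnd and selActiveEnd otherwise) can be sketched in a few lines of C++. The helper names selectionAttributes and characterElement are hypothetical and exist only for this illustration; they are not part of RichEdit or of any MathML library, and the sketch does no escaping of the element text.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: attribute text for a selection that lies entirely
// inside one character element. Offsets count from the element start.
std::string selectionAttributes(int anchor, int active) {
    if (anchor == active)  // Degenerate selection, i.e., an insertion point
        return " selIP=\"" + std::to_string(active) + "\"";
    return " selAnchorEnd=\"" + std::to_string(anchor) + "\"" +
           " selActiveEnd=\"" + std::to_string(active) + "\"";
}

// Hypothetical helper: wrap text in a character element such as <mi> or
// <m:t> and attach the selection attributes.
std::string characterElement(const std::string& tag, const std::string& text,
                             int anchor, int active) {
    return "<" + tag + selectionAttributes(anchor, active) + ">" +
           text + "</" + tag + ">";
}
```

For instance, characterElement("mi", "sin", 1, 1) gives the insertion-point form, and characterElement("m:t", "sin", 1, 2) gives the OMML form with the ‘i’ selected.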

Another case is for an IP at the end of an object argument. For example, in the MathML fraction <mfrac><mn>1</mn><mn>2</mn></mfrac>, if the IP follows the ‘2’ in the denominator, the selIP attribute appears in MathML as <mn selIP="1">2</mn>. This IP is at the end of the denominator, not at the end of the fraction, and entering a character puts the character in the denominator following the ‘2’. The corresponding OMML is <m:t selIP="1">2</m:t>.

The offset *n* is given in code units of the Unicode encoding in use. Microsoft Office apps use UTF-16 for which most math alphanumerics are surrogate pairs, that is, 2 code units. If a fraction denominator is 𝑥, an IP following the 𝑥 is specified for a UTF-16 implementation by the MathML as <mi selIP="2">x</mi> even though MathML uses the single-unit ASCII letter x to represent the surrogate-pair math-alphabetic 𝑥 (U+1D465). This choice keeps the offset in sync with the selection in memory. In OMML, math alphanumerics aren’t translated to ASCII, so this size difference doesn’t occur.
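To make the code-unit counting concrete, here is a minimal C++ sketch that computes the UTF-16 length of a string of code points, which is the offset of an IP placed after that text. The function names utf16Units and utf16Length are hypothetical, chosen for this illustration, and the sketch assumes its input holds valid Unicode scalar values.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Number of UTF-16 code units needed for one code point: code points above
// U+FFFF are encoded as a surrogate pair (2 units), all others as 1 unit.
constexpr std::size_t utf16Units(char32_t cp) {
    return cp > 0xFFFF ? 2 : 1;
}

// UTF-16 length of a string of code points, i.e., the selIP offset of an
// insertion point that follows the whole string.
std::size_t utf16Length(const std::u32string& text) {
    std::size_t n = 0;
    for (char32_t cp : text)
        n += utf16Units(cp);
    return n;
}
```

With this, utf16Length(U"\U0001D465") is 2, matching the selIP value for an IP following 𝑥, while utf16Length(U"sin") is 3.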

An IP or selection end can follow the last child of a parent element, for example after the parenthesized expression in (𝑎 + 𝑏)² but still in the base of the superscript object. For an IP, this is given in MathML by an empty <mrow selIP="0"/>:

<msup>
  <mrow>
    <mrow><mo>(</mo><mi>a</mi><mo>+</mo><mi>b</mi><mo>)</mo></mrow>
    <mrow selIP="0"/>
  </mrow>
  <mn>2</mn>
</msup>

In OMML, it’s given by an empty run <m:r selIP="0"/>. The empty <mrow> or empty <m:r> is also used for an IP at the end of the math zone, but still in the math zone.

If an IP is at the start of a math object, such as a fraction, but before the first argument, the selection attribute goes in the math-object element. For example, for the fraction “1 over 2”, an IP at the start of the fraction is indicated by <mfrac selIP="0"> in MathML and by <m:f selIP="0"> in OMML.

A selection may start before a math zone and select part or all of the math zone. Similarly, it can start inside the math zone and extend beyond it. The attribute values “before” and “after” are used for such cases, respectively. For example, the math in the statement (selection is highlighted) “the Pythagorean Theorem is 𝑎² + 𝑏² = 𝑐²” is represented by the MathML

<math selAnchorEnd="before">
  <msup><mi>a</mi><mn>2</mn></msup>
  <mo selActiveEnd="0">+</mo>
  <msup><mi>b</mi><mn>2</mn></msup>
  <mo>=</mo>
  <msup><mi>c</mi><mn>2</mn></msup>
</math>

where the active end follows the 𝑎². If the whole math zone is embedded in a selection with the active end at the selection end, <math selAnchorEnd="before" selActiveEnd="after"> is used. A selected empty math-zone placeholder can be represented in MathML by <math selIP="0"/>.

Although selection attributes could be allowed on almost any element, processing is simpler, and still general enough, if the elements that take selection attributes are restricted to

- Character elements: MathML <mi>, <mn>, <mo>, <mtext> and OMML <m:t>
- MathML <mrow> and OMML <m:r>
- Math object elements like MathML <mfrac> and OMML <m:f>
- Math zone element: MathML <math> and OMML <m:oMath>


The post Displaying Enlarged Images in Popup Window appeared first on Math in Office.

Enable the EN_IMAGE notification by sending an EM_SETEVENTMASK message with lParam equal to an event mask that includes the ENM_IMAGE flag defined by

#define ENM_IMAGE 0x00000400 // Event mask for mouse over image
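Since EM_SETEVENTMASK replaces the whole event mask, a client that already listens for other notifications would normally OR the new flag into its current mask. The following is a minimal, portable sketch of that step; kEnmImage mirrors the ENM_IMAGE value quoted above, and withImageNotifications is a hypothetical helper name used only here.

```cpp
#include <cassert>
#include <cstdint>

// Mirrors the ENM_IMAGE value quoted above; declared locally so the sketch
// compiles without the RichEdit headers.
constexpr std::uint32_t kEnmImage = 0x00000400;

// Hypothetical helper: add the image flag while keeping every flag that is
// already set in the current event mask.
constexpr std::uint32_t withImageNotifications(std::uint32_t currentMask) {
    return currentMask | kEnmImage;
}

// In a Win32 client the helper might be used roughly like this (assumed
// usage, not taken from the post):
//   LRESULT mask = SendMessage(hwndEdit, EM_GETEVENTMASK, 0, 0);
//   SendMessage(hwndEdit, EM_SETEVENTMASK, 0,
//               withImageNotifications(static_cast<std::uint32_t>(mask)));
```

Reading the mask first with EM_GETEVENTMASK avoids silently turning off notifications, such as ENM_CHANGE, that the client enabled earlier.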

In RichEdit window controls, the notification is sent to the parent window packaged in a WM_NOTIFY message with lParam being a pointer to an ENIMAGE struct defined by

typedef struct _enimage
{
    NMHDR     nmhdr;      // Notification header
    UINT      msg;        // Message causing notification, e.g. WM_LBUTTONDOWN
    WPARAM    wParam;     // Msg wParam
    LPARAM    lParam;     // Msg lParam
    IMAGEDATA ImageData;  // Image data
} ENIMAGE;

where nmhdr.code = EN_IMAGE defined by

#define EN_IMAGE 0x0721 // Notification when mouse is over an image

IMAGEDATA is defined by

typedef struct _imagedata
{
    LONG      cp;      // cp of image in RichEdit control
    IMAGETYPE Type;    // Image type
    LONG      Width;   // Image width in HIMETRIC units
    LONG      Height;  // Image height in HIMETRIC units
} IMAGEDATA;

and IMAGETYPE is defined by

typedef enum _IMAGETYPE
{
    IT_NONE,
    IT_BMP,
    IT_GIF,
    IT_JPEG,
    IT_PNG,
    IT_ICO,
    IT_TIFF,
    IT_WMP,
    IT_UNKNOWN  // User installed WIC codec
} IMAGETYPE;

In windowless RichEdit controls, EN_IMAGE is passed to the host via an ITextHost::TxNotify() call. If the image is singly selected, RichEdit doesn’t send EN_IMAGE notifications so that users can use the mouse to resize the image.

Clients can display an enlarged image whenever desired by sending the EM_DISPLAYIMAGE message defined by

#define EM_DISPLAYIMAGE (WM_USER + 386)

The message wParam is a pointer to an IMAGEDATA structure defined above. The message lParam is an ID2D1RenderTarget* for the target window. The client should supply the desired new IMAGEDATA::Width and Height in HIMETRIC units. For example, on receipt of an EN_IMAGE notification, the client can use the data in the IMAGEDATA struct included in the ENIMAGE notification struct. The Width and Height values determine the image aspect ratio, which should be maintained in the enlarged image.
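One way to choose the enlarged Width and Height while keeping the aspect ratio is sketched below in portable C++. The HimetricSize struct and the scaleToFit and himetricToPixels functions are hypothetical names for this illustration; they stand in for the Width and Height members of IMAGEDATA and are not RichEdit APIs. The only outside fact used is that one inch is 2540 HIMETRIC units (a HIMETRIC unit is 0.01 mm).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the Width/Height members of IMAGEDATA.
struct HimetricSize {
    std::int32_t width;   // HIMETRIC units
    std::int32_t height;  // HIMETRIC units
};

// Scale src to the largest size that fits inside maxWidth x maxHeight
// without changing the aspect ratio. 64-bit products avoid overflow.
HimetricSize scaleToFit(HimetricSize src, std::int32_t maxWidth,
                        std::int32_t maxHeight) {
    if (src.width <= 0 || src.height <= 0)
        return src;  // Nothing sensible to scale
    // Compare maxWidth/src.width with maxHeight/src.height by
    // cross-multiplying, so no floating point is needed.
    const std::int64_t byWidth = std::int64_t(maxWidth) * src.height;
    const std::int64_t byHeight = std::int64_t(maxHeight) * src.width;
    if (byWidth <= byHeight)  // Width is the limiting dimension
        return {maxWidth, std::int32_t(byWidth / src.width)};
    return {std::int32_t(byHeight / src.height), maxHeight};
}

// Convert HIMETRIC units to device pixels, rounding to nearest.
std::int32_t himetricToPixels(std::int32_t himetric, std::int32_t dpi) {
    return std::int32_t((std::int64_t(himetric) * dpi + 1270) / 2540);
}
```

A client could apply scaleToFit to the Width and Height received in the EN_IMAGE notification and pass the result back in the IMAGEDATA sent with EM_DISPLAYIMAGE; the size of the popup area used as the bound is up to the client.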

Here is an example with an image of the Matterhorn in the edit control (upper image) and an enlarged image below it


The post RichEditD2D Window Controls appeared first on Math in Office.

In January 2020, the Microsoft 365 RichEdit introduced a D2D/DirectWrite RichEdit window control with the new window class “RichEditD2D”. It uses D2D/DirectWrite for text and images and the window’s HDC for rendering embedded objects and printing. On my laptop, the Microsoft 365 RichEdit is housed in C:\Program Files\Microsoft Office\root\vfs\ProgramFilesCommonX64\Microsoft Shared\OFFICE16\riched20.dll.

The RichEditD2D window class is implemented using the ID2D1Factory::CreateDCRenderTarget() method to create a highly functional ID2D1RenderTarget for an HDC. Image display doesn’t need an HDC and is rendered correctly on the D2D/DirectWrite path. OLE objects need an HDC and are queued up for rendering after the D2D/DirectWrite rendering finishes. It’s important to support OLE objects partly because the desktop Outlook To and Cc resolved email addresses are OLE objects.

The RichEditD2D window class works well with the Win32 Outlook To, Cc, and subject lines, rendering emoji in color on the subject line. Released versions of Outlook don’t use the RichEditD2D window class, since changing the window class involves many code changes. But there’s a new message EM_SWITCHTOD2D (WM_USER + 389) with wParam = lParam = 0 that switches the current window control to D2D. Send this message as soon as possible after creating the control. In particular, if the control has already received text messages, the EM_SWITCHTOD2D message will fail. Outlook uses this message for the subject line and now displays color emoji. This message effectively changes a RichEdit20WPT window control to a RichEditD2DPT window control. Note that the RichEditD2D implementation hasn’t been ported to the Windows msftedit.dll used by WordPad and other non-Office programs.

RichEdit *windowless* controls have supported the D2D/DirectWrite code path for years now. For such controls, the client has to implement the ITextHost or ITextHost2 interface, which is more complicated than simply calling CreateWindowEx(). Windowless controls are also more flexible, so many programs in Office and in Windows use them. For example, the XAML TextBox and RichTextBox controls use windowless controls running in the D2D/DirectWrite mode and automatically enable color emoji.

For windowless controls, color fonts aren’t enabled by default. To enable them, send the message EM_SETTYPOGRAPHYOPTIONS with wParam = lParam = TO_DEFAULTCOLOREMOJI | TO_DISPLAYFONTCOLOR, where TO_DEFAULTCOLOREMOJI = 0x1000 and TO_DISPLAYFONTCOLOR = 0x2000. In a windowless control, you can do this by calling

ITextServices::TxSendMessage(EM_SETTYPOGRAPHYOPTIONS, TO_DEFAULTCOLOREMOJI | TO_DISPLAYFONTCOLOR, TO_DEFAULTCOLOREMOJI | TO_DISPLAYFONTCOLOR, nullptr);

