The post MathML mfenced element deprecated on web appeared first on Math in Office.

The MathML <mfenced> element is handy for representing a variety of delimited expressions, such as parenthesized, braced, and bracketed expressions. The expressions can contain separators. Examples are (𝑎 + 𝑏), (𝑎 + 𝑏], and the quantum mechanical expectation value ⟨𝜓|ℋ|𝜓⟩, in which the "|" is a separator. The <mfenced> element corresponds quite closely to the OMML delimiters element <d> used in Office app files, which is why the OfficeMath MathML writers use it.

To show how <mfenced> can be emulated by an <mrow>, consider (𝑎 + 𝑏]. Using <mfenced>, it is represented by

`<mfenced close="]">`

`   <mrow><mi>a</mi><mo>+</mo><mi>b</mi></mrow>`

`</mfenced>`

Since the left parenthesis is the default start delimiter, the <mfenced> doesn't need the attribute open="(", although it could have it. The equivalent <mrow> representation is

`<mrow>`

`   <mo>(</mo>`

`      <mi>a</mi><mo>+</mo><mi>b</mi>`

`   <mo>]</mo>`

`</mrow>`

Here the <mo> fence="true" attribute isn't needed since the MathML operator dictionary assigns fence="true" to parentheses, brackets, braces and other Unicode characters that are fences by default. You need the attribute fence="false" or stretchy="false" if you don't want the delimiters to grow to fit their content.
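The rewrite from <mfenced> to <mrow> can be sketched in a few lines of Python. This is a minimal illustration, not the OfficeMath code: the function name `mfenced_to_mrow` is hypothetical, namespaces are ignored, and the separator handling follows the MathML 3 rules as I understand them (default separator ",", whitespace ignored, last separator repeated, empty string meaning no separators). Note that under those rules several direct children of <mfenced> are treated as separate arguments with separators between them, so an expression like 𝑎 + 𝑏 is wrapped in its own <mrow> here.

```python
import xml.etree.ElementTree as ET

def mfenced_to_mrow(mfenced):
    """Build an <mrow> emulating an un-namespaced <mfenced> element."""
    open_ch = mfenced.get("open", "(")      # "(" is the default open fence
    close_ch = mfenced.get("close", ")")    # ")" is the default close fence
    # Whitespace inside the separators attribute is ignored.
    seps = "".join(mfenced.get("separators", ",").split())

    mrow = ET.Element("mrow")

    def add_mo(text, **attrs):
        mo = ET.SubElement(mrow, "mo", attrs)
        mo.text = text

    if open_ch:
        add_mo(open_ch)
    for i, child in enumerate(list(mfenced)):
        if i > 0 and seps:
            # Reuse the last separator when there are more gaps than separators.
            add_mo(seps[min(i - 1, len(seps) - 1)], separator="true")
        mrow.append(child)
    if close_ch:
        add_mo(close_ch)
    return mrow

src = '<mfenced close="]"><mrow><mi>a</mi><mo>+</mo><mi>b</mi></mrow></mfenced>'
print(ET.tostring(mfenced_to_mrow(ET.fromstring(src)), encoding="unicode"))
```

Omitting the fence attribute on the generated <mo> elements relies on the operator dictionary defaults described above.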

Comparing these representations, we see that <mfenced> is more compact. On the other hand, the <mrow> emulation is more general in that you can include attributes like different math colors on the individual delimiters and you can embellish the delimiters with accents. If you want a delimited expression with just the open delimiter, e.g., {𝑎 + 𝑏, you omit the <mo> for the close delimiter. Similarly, a delimited expression with no open delimiter, e.g., 𝑎 + 𝑏}, omits the open delimiter. For more discussion of <mfenced> and <mrow>, see the MathML 3.0 spec.

The <mfenced> element is an example of Polish prefix notation: you know up front what kind of math object is involved. In contrast, you must parse an <mrow> emulation of <mfenced> to figure out what it represents. The parsing is a little tricky, but it's not that hard since the delimiter roles are implied by the order in which the delimiters appear inside the <mrow>.

The basic principle is that the start and end delimiters are fences, and any delimiters in between are separators. The main OfficeMath MathML reader uses a SAX parser, which cannot look ahead. But the reader can store information for looking behind. The algorithm is: the first delimiter <mo> of an <mrow> is marked as a start delimiter and the other delimiters are marked as separators. When the parser comes to the end of the delimited expression (</mrow>), it re-marks the last delimiter as an end delimiter. If there are only two delimiters, there are no separators. If there's only one delimiter, it's a start delimiter unless it comes at the end. This algorithm converts <mrow> delimiter elements into the OfficeMath <d> equivalent. It will be used soon in Office apps since Firefox removed support for <mfenced> (OneNote counted on it in Firefox!) and the Chromium code base won't support it either. Chromium will, however, support "core" Presentation MathML. Many browsers are based on Chromium, e.g., Chrome and Edge.
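The look-behind idea above can be sketched as a single forward pass. This is only an illustration under stated assumptions: the name `classify_delimiters`, the `(tag, text)` pair representation, and the small `DELIMITERS` set are all hypothetical and stand in for whatever the real reader uses.

```python
# Illustrative delimiter set (an assumption, not the OfficeMath list).
DELIMITERS = set("()[]{}|⟨⟩")

def classify_delimiters(children):
    """children: (tag, text) pairs for the direct children of one <mrow>.

    Returns a parallel list of roles: "open", "sep", "close", or None.
    """
    roles = [None] * len(children)
    last = None   # look-behind state: index of the most recent delimiter
    count = 0     # number of delimiters seen so far
    for i, (tag, text) in enumerate(children):   # forward-only, like SAX
        if tag == "mo" and text in DELIMITERS:
            # First delimiter is the start; later ones are provisionally separators.
            roles[i] = "open" if count == 0 else "sep"
            last = i
            count += 1
    # End of the <mrow>: re-mark the last delimiter as the end delimiter.
    # A lone delimiter is an end delimiter only if it is the final child.
    if count >= 2 or (count == 1 and last == len(children) - 1 and last > 0):
        roles[last] = "close"
    return roles

bra_ket = [("mo", "⟨"), ("mi", "ψ"), ("mo", "|"), ("mi", "ℋ"),
           ("mo", "|"), ("mi", "ψ"), ("mo", "⟩")]
print(classify_delimiters(bra_ket))
```

The only state carried across tokens is the index of the most recent delimiter and a count, which is what lets a parser that cannot look ahead fix up the last delimiter when the closing tag arrives.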

Some MathML elements are "inferred mrows" in that they treat multiple children as a single argument and the algorithm works with them as well. Such elements include <math>, <msqrt>, <menclose>, <mphantom>, <mpadded> and <mtd>.

Best practice <mrow> delimiter emulation restricts the contents of the <mrow> to the contents of the delimited expression. But what if there are other things inside an <mrow> such as in (note: <math> is an inferred <mrow>)

`<math> <mo>(</mo> <mi>a</mi> <mo>+</mo> <mi>b</mi> <mo>)</mo> <mo>+</mo> <mo>|</mo> <mi>a</mi> <mo>+</mo> <mi>b</mi> <mo>|</mo> </math>`

Two tricks are useful. First, with no form-disambiguating attribute like form="prefix" on the delimiter <mo> elements (as in this example), use the default form value given in the MathML operator dictionary. This works for all default delimiter pairs, but not for "|", which can be used as a separator (infix), open delimiter (prefix), or close delimiter (postfix). Second, for "|" use the algorithm above with a small twist: when there is an active "|" start delimiter, treat a "|" as an end delimiter. When finished processing any delimiter expression, reset the state to "no delimiters". As such, "|" is alternately a start delimiter and an end delimiter. This algorithm cannot produce nested absolute-value expressions. To nest an absolute value, use appropriate form attributes, or, as best practice, put the absolute value in its own <mrow>.
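The two tricks can be combined in a small state machine, sketched below. Everything named here is an assumption for illustration: `delimiter_roles`, the `(tag, text, form)` triples, and the tiny `DEFAULT_FORM` table, which stands in for the much larger MathML operator dictionary. It tracks a single active start delimiter, so it inherits the limitation that "|" cannot nest unless explicit form attributes are supplied.

```python
# Tiny stand-in for the operator dictionary's default forms (an assumption).
DEFAULT_FORM = {"(": "prefix", "[": "prefix", "{": "prefix", "⟨": "prefix",
                ")": "postfix", "]": "postfix", "}": "postfix", "⟩": "postfix"}
ROLE = {"prefix": "open", "infix": "sep", "postfix": "close"}

def delimiter_roles(tokens):
    """tokens: (tag, text, form) triples; form is None when not given.

    Returns a parallel list of roles: "open", "sep", "close", or None.
    """
    roles = []
    active = None   # the current start delimiter, or None for "no delimiters"
    for tag, text, form in tokens:
        if tag != "mo" or (text != "|" and text not in DEFAULT_FORM):
            roles.append(None)          # not a delimiter
            continue
        if form is None:
            form = DEFAULT_FORM.get(text)
        if form is None:                # an unattributed "|"
            if active == "|":
                form = "postfix"        # closes the active "|"
            elif active is not None:
                form = "infix"          # separator inside other fences
            else:
                form = "prefix"         # starts a new "|" expression
        if form == "prefix":
            active = text
        elif form == "postfix":
            active = None               # reset to "no delimiters"
        roles.append(ROLE[form])
    return roles

example = [("mo", "(", None), ("mi", "a", None), ("mo", "+", None),
           ("mi", "b", None), ("mo", ")", None), ("mo", "+", None),
           ("mo", "|", None), ("mi", "a", None), ("mo", "+", None),
           ("mi", "b", None), ("mo", "|", None)]
print(delimiter_roles(example))
```

With explicit form attributes on each "|", the same function marks nested absolute values correctly, because the attribute bypasses the alternation.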


The post How I got into technical word processing appeared first on Math in Office.

When I finished my PhD in 1967, I went to Bell Labs to continue working on laser physics and after a year got seduced by the idea of labeling graphs with real built-up, i.e., 2D, mathematical expressions. To this end, I created the SCROLL language (**s**tring and **c**haracter **r**ecording **o**riented **l**ogogrammatic **l**anguage), which was the first language capable of "typesetting" mathematical equations on a computer. I published it in AFIPS Conf. Proc. **35**:525-536, AFIPS Press, Montvale, N.J. (1970). Admittedly SCROLL's typography was limited. For example, the user had the responsibility of spacing the math, in contrast with TeX, Word 2007, and other sophisticated systems. But it was the first program capable of displaying built-up math, and it was fine for that time to be able to show nicely labeled results at various conferences.

After my two-year stint at Bell Labs, one of my fellow graduate students at Yale, Marlan Scully, suggested coming to the Optical Sciences Center at The University of Arizona to work on lasers and things and in particular to write *Laser Physics*, a book we had talked about writing some day with Willis Lamb. Well, for a Northeasterner, Tucson, Arizona was a most fabulous and interesting place and certainly one way to start seeing the rest of the world. So instead of going to Bell Labs in Murray Hill to work with the great computer science group there (and maybe later on eqn/troff, a TeX competitor), I went to Tucson. Marlan, Willis, and I (well mostly me, with two excellent consultants!) wrote the book and I personally typed over two-thirds of it using a superb new kind of typewriter called the IBM Selectric. It had handy type balls that you could exchange, so you could have italic, symbols, script, and other typefaces. What a huge improvement over the swapping out of keys which we had to do with the older IBM typewriters. I had to type so much of the book because even with the Selectric our secretaries couldn't type math very well, especially with subscripted superscripts, integrals, and the like common in the laser theories we were writing about.

*Laser Physics* was typeset in South Korea and the drafts confused *α* and *a*, *ν* and *v* (nu and vee, since Times New Roman also confuses them), and other symbols. It took me over a month to straighten things out, even though the original manuscript was correct. Such problems piqued my interest in preparing technical documents on computers. Publishing in physics journals was much easier, but you still had to spend significant time proofreading galley proofs.

Around 1978 I got a Diablo daisy-wheel printer to go with my IMSAI Z80 microcomputer. Not only was it much faster than the Selectric, it had many daisies, some of which were proportionally spaced, and it was designed to work as a computer printer. I had gotten into microcomputing thinking that by computerizing my house I'd learn something about experimental physics, since Willis taught me that a real physicist needs to know something about both experiment and theory. To handle the proportional spacing, I wrote a printer driver. My colleague Rick Shoemaker, another microcomputer addict, and I decided to write a book called *Interfacing Microcomputers to the Real World*, and we "typeset" it using my printer driver and a daisy-wheel printer. Addison-Wesley published the book, just as it had published *Laser Physics*, but this time using our nice proportionally spaced camera-ready proofs.

Well clearly, we needed to be able to typeset math, so I generalized the printer driver to do so using algorithms like those for the SCROLL language. Another physicist, Mike Aronson, who had written the PMATE editor I was using, suggested that the input format should resemble real linearized math as in the C language rather than the Polish prefix format used in SCROLL. So I wrote a translator to accept a simplified linear format, the forerunner of UnicodeMath which we use in Office apps today. The translator was coded so tightly in Z80 assembly language that it along with the rest of the formatter fitted into 16KB of ROM for a controller some friends of mine created for Diablo daisy-wheel printers. When used with a tractor feed, it could print the whole document with one daisy, roll the document back, print with the next daisy, etc. It was positively wild watching the printer type the symbols in place after printing the main text.

As a laser physicist, I was naturally symbiotically attached to the idea of laser printers, so when HP came out with their early laser printers, I converted the program to 8086 code for use on IBM PCs and HP LaserJets. The editor and formatter ran just fine in MS-DOS in the PC's incredibly roomy 640KB. Rick and I updated our microcomputer book to *The IBM-PC from the Inside Out*, once again published by Addison-Wesley from our camera-ready copy. I called the program the PS Technical Word Processor, and my users and I wrote many papers and books using it. Well, many by a typical professor's standards, i.e., not by Knuth's (!), and essentially none by Microsoft's standards. I really wanted to distribute the approach more widely. With myriad improvements, e.g., LineServices, we now have OfficeMath. And yet there's still much to do!


The post Unicode Math Calligraphic Alphabets appeared first on Math in Office.

Note: the January 2021 meeting of the Unicode Technical Committee accepted 52 variation sequences for the upper-case script letters: 26 for roundhand and 26 for chancery. See L2/20-275R for the latest proposal.

1) Here's an example of chancery and roundhand F's being used in the same document:

2) Here are examples featuring P's and C's in which script letters denote infinity categories

3) Still another paper has the following

4) Both script styles are in the OMS encoding for LaTeX as illustrated by

`\documentclass{article}`

`\usepackage{calrsfs}`

`\DeclareMathAlphabet{\pazocal}{OMS}{zplm}{m}{n}`

`\newcommand{\La}{\mathcal{L}}`

`\newcommand{\Lb}{\pazocal{L}}`

`\begin{document}`

`$\La\Lb$`

`\end{document}`

This LaTeX snippet displays a roundhand L followed by a chancery L.

Accordingly, the need for both chancery and roundhand alphabets is attested.

Complicating the addition of new alphabets is the fact that the current math-script alphabets may be chancery in one font and roundhand in another. Cambria Math, the first widely used Unicode math font, has chancery letters at the math-script code points, while the Unicode Standard has roundhand letters at those code points. For example, here's the upper-case math-script H (U+210B) in Cambria Math followed by the one in the Unicode Standard:

The STIX math fonts have also had roundhand letters at the math-script codepoints, but in the STIX Two Math font, they have been changed to chancery. This removes the worst conflict in defining the new alphabets, although other math fonts might have roundhand letters at the current math-script codepoints.

We discuss two unambiguous ways to allow math-chancery and math-roundhand symbols to appear in the same plain-text document:

- Follow a character in the current math-script alphabets with one of two variation selectors much as we use variation selectors (U+FE0E, U+FE0F) for emoji to force text and emoji glyphs, respectively. Specifically, to ensure use of the math-chancery alphabet, follow the current math-script letter with U+FE00. To ensure use of the math roundhand alphabet, follow the current math-script letters with U+FE01.
- Add the missing bold and regular script alphabets
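The variation-selector option in the first bullet can be illustrated with a short sketch that builds the plain-text sequences. The function name `script_capital` and its `style` parameter are hypothetical. The code-point table reflects the fact that eight script capitals (B, E, F, H, I, L, M, R) are encoded in the Letterlike Symbols block, while the rest are in the Mathematical Alphanumeric Symbols block starting at U+1D49C.

```python
# Script capitals encoded in the Letterlike Symbols block.
LETTERLIKE = {"B": 0x212C, "E": 0x2130, "F": 0x2131, "H": 0x210B,
              "I": 0x2110, "L": 0x2112, "M": 0x2133, "R": 0x211B}
SCRIPT_CAPITAL_A = 0x1D49C   # MATHEMATICAL SCRIPT CAPITAL A

# Selectors as described in the text: U+FE00 chancery, U+FE01 roundhand.
SELECTOR = {"chancery": "\uFE00", "roundhand": "\uFE01"}

def script_capital(letter, style=None):
    """Return the math-script capital for an ASCII letter A-Z.

    With style None the bare letter is returned, so the font decides
    which script style to show.
    """
    if len(letter) != 1 or not "A" <= letter <= "Z":
        raise ValueError("expected a single ASCII capital letter")
    cp = LETTERLIKE.get(letter, SCRIPT_CAPITAL_A + ord(letter) - ord("A"))
    return chr(cp) + (SELECTOR[style] if style else "")

# Chancery H and roundhand H side by side in the same plain-text string.
both = script_capital("H", "chancery") + script_capital("H", "roundhand")
print([f"U+{ord(c):04X}" for c in both])
```

Whether the two styles actually render differently depends on the font supporting the sequences; otherwise both fall back to the font's default script glyph.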

The variation selector approach has the advantages

- Contemporary software supports variation selectors for East Asia and emoji, so adding new variation selector usage shouldn't be much of a burden
- The variation selector U+FE00 is already used with a number of math operators
- No new code points need to be allocated
- Typical documents can continue to do what they have been doing: ignore the distinction
- If a math font doesn't support the variation selectors, it falls back naturally to the current script letters instead of displaying the missing-glyph box (but the style difference is lost)

Adding two variation selectors for the math script letters may make people ask why we didn't use variation selectors for the math alphabets in the first place, but we all know the arguments in favor of what we did (see the blog post on Math Font Binding). Adding two variation selectors seems to solve the script quandary quite well, although variation selectors are generally a poor choice for situations where symbol shapes need to be used in a contrastive manner. This case should therefore not serve as a general precedent, but should be seen as an exception, tailored to fit this specific case. One way to implement the variation-selector combinations is to use the OpenType feature tags 'cv01' and 'cv02'.

The second approach adds the missing normal and bold script alphabets. These two new alphabets could go in the 1D380..1D3FF block, which is reserved for math alphabets. Programs continue to display what they currently display by default.

It might be worthwhile for programs like Microsoft Word to have a math document-level property that specifies which script alphabet to use for the whole document. Then a user who wants the fancy script glyphs could get them without making any changes except for choosing the desired document property setting. A similar setting could be used for choosing sans-serif alphabets as the default. Such alphabets are often used in chemical formulas.

The choice of chancery glyphs for the math script letters in Cambria Math is partly my fault. I had expected to see roundhand letters in Cambria Math as in the Unicode code charts. In my physics career I used math-script letters a lot, starting with my PhD thesis on Zeeman laser theory (1967) and followed by many papers published in the Physical Review and elsewhere and in my three books on lasers and quantum optics. Occasionally in a review article, chancery letters were substituted for roundhand letters because the publishers didn't have the latter. And in the early days, the IBM Selectric Script ball and the script daisy wheels only had chancery letters. So I kind of got used to this substitution. Cambria Math was designed partly to look really good on screens, which didn't have the resolution to display the narrow stem widths of Times New Roman and roundhand letters well. ClearType rendering certainly helped, but it seemed like a good idea to use less resolution-demanding chancery letters. (Later Word 2013 disabled ClearType for various reasons and many readers of this blog have complained passionately ever since! With high resolution screens as on my Samsung laptop or the Surface Book, even Times New Roman looks crisp and nice with only gray-scale antialiasing, so hopefully this problem will diminish in time.)

LaTeX has the \mathsf{} and \mathsfit{} control words for math sans-serif upright and italic characters, respectively, and they work with Greek letters. Unlike the chancery/roundhand distinction, which is seldom used contrastively, upright and italic are usually used contrastively in mathematics. The Unicode Standard has upright and italic sans-serif math alphabets corresponding to the ASCII letters, but not for the Greek letters. Accordingly, these two math Greek alphabets should probably be added. The STIX Two Math font has them in the Private Use Area for the time being since users requested them.

Thanks to Asmus Freytag, John Hudson, Rick McGowan and Ken Whistler for enlightening discussions that substantially improved this post.
