Using GELLMU to Write XHTML with MathML

William F. Hammond

30 May 2001

Copyright © 2001 William F. Hammond

Email: hammond@math.albany.edu

About this document

This document is an example of what it describes. The intention is that the user should be able to view its source while also viewing its rendering, and that this should be instructive. The plain text representation of the intermediate XML document (perhaps also available with a browser's "view-source" function) may also be instructive.

There are at least two widely distributed browsers that render XHTML with MathML natively.

  1. The World Wide Web Consortium's Amaya.
  2. The MathML branch of the Mozilla browser.

It is also reported that Microsoft's Internet Explorer with a plugin from Design Science can be used to render the text/html serving of this document. The 0.8.1 version of Mozilla appears to render MathML elements only from the text/xml serving of this document. Amaya appears to handle both versions the same way.

This document has passed validation with Murray Altheim's document type definition for XHTML 1.1 plus MathML 2.0.

How to use GELLMU for XHTML with MathML

Differences between the XML and classical versions of HTML

Most of what is required is explained in Using GELLMU to Write HTML. Somewhat more can be understood by examining the source markup for that document as well as the source for this document.

The most important differences between classical HTML and the version of HTML formalized under XML are these:

Case sensitivity
All element and attribute names in the XML version of HTML are case sensitive and lower case. All entity names are case-sensitive, and some entity names are mixed case.
No omitted tags
Every open tag must be matched by a close tag. In GELLMU this means that the markup "\foo" is never acceptable except sometimes when it is used for invoking a newcommand.
XML declared-empty tag syntax

In classical HTML an empty element such as br is correctly marked up with "<br>". In the XML version of HTML it must be marked up with "<br/>"; more generally, any amount of whitespace may be inserted between "<br" and "/>". Since the markup "<br/>" is, in fact, wrong for classical HTML, it is recommended that "<br />" be used for the XML version of HTML in order to avoid unnecessary incompatibility with old browsing tools.

With GELLMU one simply marks up "\br;". If the syntactic translator is invoked with the Emacs Lisp function gellmu-xhtml, the empty tag form preferred for XHTML is created; for the whitespace-free form use gellmu-xml. Note that gellmu-html is not correct for any XML document type.
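The three ways of writing a declared-empty element can be compared with a short Python sketch. This is not GELLMU code: the style names "html", "xhtml", and "xml" are hypothetical labels chosen only to echo the three translator functions named above, and the well-formedness check relies on Python's standard XML parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical style names that echo gellmu-html, gellmu-xhtml, and gellmu-xml.
# This table is an illustration, not part of GELLMU.
EMPTY_TAG_FORMS = {
    "html": "<{name}>",     # classical HTML: no closing slash
    "xhtml": "<{name} />",  # XML form with a space, kinder to old browsers
    "xml": "<{name}/>",     # XML form without whitespace
}


def empty_tag(name: str, style: str) -> str:
    """Return the serialized form of a declared-empty element."""
    return EMPTY_TAG_FORMS[style].format(name=name)


def is_well_formed(fragment: str) -> bool:
    """Report whether a fragment parses as well-formed XML."""
    try:
        ET.fromstring(fragment)
    except ET.ParseError:
        return False
    return True


if __name__ == "__main__":
    for style in ("html", "xhtml", "xml"):
        fragment = "<p>one" + empty_tag("br", style) + "two</p>"
        print(style, fragment, is_well_formed(fragment))
    # Element names are case sensitive, so this pair of tags does not match.
    print(is_well_formed("<P>text</p>"))
```

Only the two XML forms survive the parser; the classical form and the tag pair with mismatched case are both rejected, which is the practical content of the three points listed above.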

Document Type Handling

An XHTML plus MathML document should begin with the markup \documenttype[mathml-altheim]{html}. It is, after all, an HTML document. For XHTML, the XML version of HTML, the HTML element names, including that of the root element, must all be marked up with lower case characters. XML element names generally are case sensitive but not necessarily lower case. The MathML element names used here are all lower case, but that might not be true forever, and some of the MathML entity names, which are also case sensitive, have mixed case.

The documenttype option string "mathml-altheim" points to an entry in a list of document type information known to the syntactic translator. That entry indicates that the document type is an XML document type with the formal public identifier for Altheim's merged XHTML 1.1 plus MathML 2.0 document type definition, that the content encoding is UTF-8, and that the root tag should have an attribute list consisting of the single "xmlns" attribute with the value "http://www.w3.org/1999/xhtml". It also provides the string "mathml.dtd" as system identifier for the document type definition. This value is special for the development version of Mozilla, which appears for now not to fetch any network-served system identifier. (Currently I have 0.8.1, build id 2001032722.) For validation purposes one may bypass the incomplete copy distributed with Mozilla by using that system identifier as a catalog front.
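To make the role of such an entry concrete, here is a minimal Python sketch of a record holding the four pieces of information just listed, together with the document prolog one might expect to be derived from it. The class name, field names, and the exact layout of the prolog are assumptions made for illustration, not a description of the translator's output. The formal public identifier is not quoted in this document, so a placeholder stands in for it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DoctypeInfo:
    """Hypothetical record for one document type entry (not GELLMU's format)."""
    public_id: str
    system_id: str
    encoding: str
    xmlns: str


# Values taken from the text above, except the public identifier,
# which is a placeholder because the text does not quote it.
MATHML_ALTHEIM = DoctypeInfo(
    public_id="PUBLIC-IDENTIFIER-PLACEHOLDER",
    system_id="mathml.dtd",
    encoding="UTF-8",
    xmlns="http://www.w3.org/1999/xhtml",
)


def prolog(root: str, info: DoctypeInfo) -> str:
    """Build an XML declaration, a DOCTYPE line, and the root open tag."""
    return "\n".join([
        f'<?xml version="1.0" encoding="{info.encoding}"?>',
        f'<!DOCTYPE {root} PUBLIC "{info.public_id}" "{info.system_id}">',
        f'<{root} xmlns="{info.xmlns}">',
    ])


if __name__ == "__main__":
    print(prolog("html", MATHML_ALTHEIM))
```

The root element name is passed separately because it comes from the braced argument of documenttype rather than from the option string.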

The string "mathml-altheim" is more formally a key to the Emacs Lisp variable "gellmu-doctype-info". A user may add new keys either interactively in Emacs once gellmu.el is loaded (as a library) or by adding a few lines of code such as dotemacs-clip to the Emacs configuration file, which often has the filesystem name ".emacs".

Macro Substitution Meta-Commands

The term meta-command refers to a command that does not correspond to an XML element. Meta-commands usually receive substantial intelligent processing from the GELLMU syntactic translator, whereas ordinary commands simply correspond to XML elements and receive only syntactic transliteration from the syntactic translator. The name documenttype is that of a meta-command.

For basic GELLMU there are three types of meta-commands providing classical macro substitution: (1) macro, (2) newcommand, and (3) Macro. (Regular GELLMU, which provides a more comprehensive layer of LaTeX markup emulation, also has a macro substitution meta-command called mathsym for declaring mathematical symbols entirely apart from standard XML namespaces with optional provision for conveying semantic information such as semantic types to XML processors.)

Each of macro and Macro takes two arguments delimited by braces. The first argument is called its name, although it may be any string that has no unbalanced braces and that contains the character "\" only as its first character, if at all. The second argument is called its value string; it may not contain unbalanced braces.

The syntactic translator makes three forward scans of the document: first it expands each macro forward from the point of its location, then it expands each newcommand forward from its location, and finally it expands each Macro forward from its location.
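The effect of the three scans can be imitated in a few lines of Python. The sketch below is a toy model, not the translator: it assumes that names and value strings contain no braces at all, it treats a newcommand as if it took no arguments, and it substitutes by plain string replacement. The function names are hypothetical.

```python
import re

# The order of the three scans, as described above.
PASS_ORDER = ("macro", "newcommand", "Macro")


def expand_pass(text: str, kind: str) -> str:
    """Run one forward scan for definitions of the given kind."""
    # Simplification: neither the name nor the value string contains braces.
    pattern = re.compile(r"\\" + kind + r"\{([^{}]+)\}\{([^{}]*)\}")
    position = 0
    while True:
        match = pattern.search(text, position)
        if match is None:
            return text
        name, value = match.group(1), match.group(2)
        # Text before the definition is left alone; the definition itself
        # is removed; the name is replaced everywhere after it.
        head = text[:match.start()]
        tail = text[match.end():].replace(name, value)
        text = head + tail
        position = len(head)


def expand_all(text: str) -> str:
    """Apply the three scans in order."""
    for kind in PASS_ORDER:
        text = expand_pass(text, kind)
    return text


if __name__ == "__main__":
    source = r"A \half B \macro{\half}{1/2} C \half D"
    print(expand_all(source))
```

In the sample, the use of \half that precedes its definition is left untouched, while the use that follows it is replaced. Because the replacement runs over everything after the definition, it also reaches into the value strings of definitions that come later in the text.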

newcommand provides substitution with arguments, more or less as in LaTeX except that it is macro substitution rather than new element creation, and there may be any number of arguments. If the name argument is followed by an option, then that option is the number of arguments that the new command will take. A second option is the default value, as in LaTeX2e, of the first argument. In the value field one references the first argument, as with LaTeX, using \#1, the 12th argument using \#12, etc. Each argument may have arbitrary content except unbalanced braces. The command name of a newcommand should consist only of word characters or numbers.
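Since there may be more than nine arguments, a reference such as \#12 has to be read as the twelfth argument rather than as the first argument followed by the digit 2. The following Python sketch shows one way to do that with a regular expression. It is an illustration under stated assumptions, not the translator's code: the function name is hypothetical, and default values for the first argument are not modeled.

```python
import re
from typing import Sequence

# A backslash, a number sign, and then the longest possible run of digits.
ARGUMENT_REFERENCE = re.compile(r"\\#(\d+)")


def substitute_arguments(value: str, arguments: Sequence[str]) -> str:
    """Replace each reference \\#n in a value string by the n-th argument."""

    def lookup(match: "re.Match[str]") -> str:
        index = int(match.group(1))
        if not 1 <= index <= len(arguments):
            raise ValueError(f"no argument number {index}")
        return arguments[index - 1]

    return ARGUMENT_REFERENCE.sub(lookup, value)


if __name__ == "__main__":
    # A value string with two references, applied to two sample arguments.
    print(substitute_arguments(r"<b>\#1</b> and <i>\#2</i>", ["bold", "italic"]))
```

Matching the digits greedily is the design choice that keeps \#12 from being misread.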

None of the three macro substitution meta-commands may be invoked before it is defined except for the case where it appears in the value string of another. One must be mindful that each is expanded forward from the point of its location. If one of these is used in the value string of another after it is defined, then the expansion of the first in that value string will take place in that value string, but if it is used in the value string of another before it is defined, then the expansion will take place forward of the location of the other at each of the expansion sites of the other. (This may or may not be of consequence.)

Restrictions for avoiding infinite loops.

  1. None of the macro substitution meta-commands may reference its own name in its value string.
  2. If a meta-command taking arguments is used in the value string of another meta-command taking arguments, then the definition of the latter must precede that of the former.
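The two restrictions can be checked mechanically. The Python sketch below is a hypothetical checker, not part of GELLMU: it takes definitions in document order and reports any that break either rule. The sample command names and the braced invocation shown in the sample data are assumptions made for illustration.

```python
import re
from typing import List, NamedTuple


class Definition(NamedTuple):
    """One macro substitution definition, listed in document order."""
    name: str
    value: str
    takes_arguments: bool


def mentions(value: str, name: str) -> bool:
    """True if name occurs in value and is not the prefix of a longer name."""
    return re.search(re.escape(name) + r"(?![A-Za-z0-9])", value) is not None


def violations(definitions: List[Definition]) -> List[str]:
    """Return one message for each broken restriction."""
    problems = []
    for i, user in enumerate(definitions):
        # Restriction 1: no definition may reference its own name.
        if mentions(user.value, user.name):
            problems.append(f"{user.name} references itself")
        if not user.takes_arguments:
            continue
        # Restriction 2: a command with arguments that uses another command
        # with arguments must be defined before the command it uses.
        for j, used in enumerate(definitions):
            if j < i and used.takes_arguments and mentions(user.value, used.name):
                problems.append(f"{used.name} is defined before {user.name}, which uses it")
    return problems


if __name__ == "__main__":
    sample = [
        Definition(r"\pair", r"(\#1, \#2)", True),
        Definition(r"\swap", r"\pair{\#2}{\#1}", True),
    ]
    for message in violations(sample):
        print(message)
```

In the sample, \swap uses \pair in its value string, so by the second restriction the definition of \swap would have to come first; listing the two in the opposite order clears the report.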

Examples

This is an XHTML document with MathML markup prepared using the basic layer of GELLMU and the XML namespaces regime for extending the basic tagset of XHTML.

The following relation is sometimes called the parallelogram law.

\[ \|a\|^2 + \|b\|^2 = \frac{1}{2} \left( \|a + b\|^2 + \|a - b\|^2 \right) \]

Balancers should be stretched when appropriate. Here the simple fraction 1/2 is multiplied with a complex fraction.

\[ \frac{1}{2} \left( \frac{\frac{a}{b}}{\frac{c}{d}} \right) \]

This is MathML markup of the formula for the roots of the quadratic polynomial $a x^2 + b x + c$.

\[ x = \frac{-b \pm (b^2 - 4 a c)^{1/2}}{2 a} \]

This is MathML markup for a 2 × 2 matrix.

\[ A = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \]

Taylor's Theorem:

\[ f(x) = \sum_{j=0}^{\infty} \frac{f^{(j)}(0)}{j!} \, x^j \]

This is a form of the Weierstrass infinite product expansion of the gamma function.

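\[ \int_0^{\infty} t^x e^{-t} \, \frac{dt}{t} = \frac{1}{x} \prod_{k=1}^{\infty} \frac{\left(1 + \frac{1}{k}\right)^x}{1 + \frac{x}{k}} \]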

Dirac's δ-function, which is actually a distribution in the sense of L. Schwartz rather than a function, is characterized by the property that for every $C^\infty$ function $f$ with compact support one has:

\[ \int_{-\infty}^{\infty} f \, \delta = f(0) . \]

In particular, when $f$ is the characteristic function $I_S$ of a set $S$:

\[ \int_S \delta = \begin{cases} 1 & \text{if $S$ contains 0.} \\ 0 & \text{otherwise.} \end{cases} \]