Rendering text is hard. A typical internationalized text rendering system cumulatively represents hundreds of thousands of lines of code, spread between operating systems, software libraries, and application programs. And yet, despite millions of person-hours and dollars' worth of engineering, much of contemporary software still struggles to handle non-European, non-alphabetic text correctly.
Programmers have heretofore typically assumed that "text", simpliciter, is fundamentally a one-dimensional sequence of symbols that is divided into a series of horizontal lines. In fact, neither of these assumptions is universally true. Unfortunately, modern text rendering stacks are frequently still oriented towards the needs of, and assumptions peculiar to, European alphabetic scripts. Attempting to retrofit non-European scripts onto an essentially European alphabetic model has only exacerbated the inherent complexity of the problem.
The highly specialized domain knowledge required to handle non-European writing systems correctly is a serious impediment to internationalizing software. In the case of new or minority scripts, it might well be impossible at present. The fact that these scripts are excluded from common software and operating systems can contribute to their marginalization. The predominant font technology, OpenType (https://learn.microsoft.com/en-us/typography/opentype/spec/), addresses this problem by providing built-in "shaping models" that attempt to capture the logic of how the world's writing systems work. OpenType is highly effective at handling those classes of scripts within its purview, but it cannot easily account for behavior beyond what its designers have anticipated.
I therefore present Dubsar, a new computer typography system that provides high-quality rendering of complex and minority writing systems and TEI-encoded documents. Dubsar collapses nearly all aspects of text rendering into a single mechanism: a simple programming language, DubsarScript, which allows users to programmatically describe how glyphs are drawn and positioned. Dubsar is not limited in what kinds of writing systems it can express precisely because Dubsar fonts just are arbitrary DubsarScript programs. Moreover, Dubsar embeds an expansive notion of text designed to accommodate writing systems such as Maya, which has so far resisted computerization due to its incredible complexity. Technologists who wish to support internationalized text in their software need only implement an interpreter for the DubsarScript language, which can be readily accomplished without any knowledge of how potentially unfamiliar writing systems work. Due to its relative simplicity, Dubsar furthermore empowers users of uncomputerized minority scripts to create fonts that would otherwise be impossible.
The difficulties of text rendering are only magnified in the context of TEI documents. Assumptions that were already tenuous within the plain text regime, such as unidimensionality, here cease to be true in any capacity. Dubsar is especially conducive to rendering TEI documents by virtue of the same properties that lend it to handling internationalized text.
This paper will first introduce the landscape of contemporary computer typography. It will then describe the Dubsar system: how it works, how it is implemented, and how it can be used to render TEI documents. Finally, the paper will give a brief overview of how Dubsar fonts might be authored.