UEB Fundamentals for Code Maintainers (draft)

International Council on English Braille
Unified English Braille


UEB FUNDAMENTALS FOR CODE MAINTAINERS


February 2010 (draft 2010-02-18)


1. PURPOSE AND SCOPE OF THIS DOCUMENT

As with any braille code, UEB will undoubtedly need to evolve along with the changing nature of written notation and the needs of its users. Symbols will need to be added, and the rules of usage will need to be adjusted. This document presents the basic “designing rules” underlying Unified English Braille, as an aid to those who are charged with carrying out such maintenance and extension of UEB over time. It is not intended for, nor is it needed by, persons who wish to read and write UEB. All that is needed for that purpose is the current version of the UEB Rules {reference}.

Much of what we call “designing rules” comes down to a detailed description of how valid symbols are formed in UEB and, in consequence, what restrictions must be placed upon the usage rules so that such symbols can always be positively identified and so always have an unambiguous meaning. A basic design goal of UEB is to provide a tactile writing system that is fully equal to print in its range and clarity of expression. In other words, braille is no more derived from print than print is derived from braille. Both are writing systems that represent an abstract sequence of symbols, and ideally braille should be able to do that just as well as print. Thus the first rule for code maintainers is to think in terms of what the braille means (less formally, in the braille-to-print direction) rather than in the print-to-braille direction in which, understandably, most people who make braille think most of the time.

Most of the material here has been taken from Appendices B and C of the original design committee’s final (2004) report, also known as “The Reader Rules” {reference}. That document, along with the other committee reports from the “research” phase of UEB development, contains historical information that may provide some further background detail, such as the committee charge, the motivation behind some of the decisions taken and the alternatives considered, etc. However, its symbol lists, usage rules, etc. are no longer current, having been superseded by the definitive document for UEB, namely the current UEB Rules.

2. BRAILLE SYMBOL STRUCTURE: FORMAL DEFINITIONS AND DESIGNING RULES

2.1 THE NEED FOR SYMBOL CONSTRUCTION RULES

Since some braille symbols comprise more than one braille cell (character), it is important that symbols and usage rules be defined in such a way that the extent of any symbol, that is its physical beginning and end, can always be determined no matter what symbols are placed next to it. Clarifying the extent of symbols is also clearly necessary to provide a secure basis for an extendable code. In this section we develop a construction system that provides such clarification and security.

It is important to understand that here we are not, for the most part, concerned with the “meaning” of the symbols discussed; that is the distinct matter of symbol assignments. Rather, we are concerned primarily with the manner in which symbols are formed from braille characters—that is, their “structure”. There are inevitable references to meaning, however, in the discussions on the space symbol and the capitalization and grade 1 (letter sign) indicators, because in those cases the meaning plays a role in the structural rules.

2.2 TERMINOLOGY

In this topic, we define some basic terms that are used in a precise way in later sections.

Print Character: A print character is a single letter, digit, punctuation mark, or other print sign customarily used as an elementary unit of text.

This definition of a print character relies, in effect, on common understanding as to the way text based on western-alphabet languages is formed, and it knowingly avoids, at least for the moment, the surprisingly complex issues that arise when virtually every kind of expressive mark ever made by man must be considered. (See the “General Structure” or equivalent section of the current Unicode Standard at http://www.unicode.org.) A definition like this rather clearly considers the semicolon to be a single character, even though two distinct marks are made in forming the character. On the other hand it is less clear as to whether an accented letter is one character or two; ultimately such a choice is likely to come down to a practical consideration, such as the actual frequency of accented letters, and thus be made differently for, say, English and French.

Print Symbol: The terms “print symbol” and “print character” are used interchangeably.

Braille Character: A braille character is any one of the 64 possible combinations of the 6 dots, including the space.

Braille Character Categories: The 64 braille characters are categorized as follows:

1. 1 space.

2. 8 prefixes, subdivided as follows:

2a. 6 general prefixes:

  • # (dots 3-4-5-6)
  • @ (dot 4)
  • ^ (dots 4-5)
  • _ (dots 4-5-6)
  • " (dot 5)
  • . (dots 4-6)

2b. 2 special prefixes:

  • ; (dots 5-6)
  • , (dot 6)

3. 55 roots, subdivided as follows:

3a. 12 lower roots:

  • 1 (dot 2)
  • 2 (dots 2-3)
  • 3 (dots 2-5)
  • 4 (dots 2-5-6)
  • 5 (dots 2-6)
  • 6 (dots 2-3-5)
  • 7 (dots 2-3-5-6)
  • 8 (dots 2-3-6)
  • 9 (dots 3-5)
  • 0 (dots 3-5-6)
  • ' (dot 3)
  • - (dots 3-6)

3b. 26 alphabetic roots:

  • abcdefghijklmnopqrstuvwxyz

3c. 17 strong roots:

  • & (dots 1-2-3-4-6)
  • = (dots 1-2-3-4-5-6)
  • ( (dots 1-2-3-5-6)
  • ! (dots 2-3-4-6)
  • ) (dots 2-3-4-5-6)
  • * (dots 1-6)
  • < (dots 1-2-6)
  • % (dots 1-4-6)
  • ? (dots 1-4-5-6)
  • : (dots 1-5-6)
  • $ (dots 1-2-4-6)
  • ] (dots 1-2-4-5-6)
  • \ (dots 1-2-5-6)
  • [ (dots 2-4-6)
  • / (dots 3-4)
  • + (dots 3-4-6)
  • > (dots 3-4-5)

Prefix characters are those characters that, in current customary braille, even beyond English, are quite often associated with the characters that follow them. For all the prefixes except the number sign, their use for that purpose follows from the fact that they have only right-hand dots, and thus are most easily read when they are up against something on the right. In the case of the number sign, usage as a precursor to numbers is the universal long-standing custom. The reason for distinguishing “general” and “special” prefixes will become evident as the structural rules are developed.

Root characters are those that more commonly have a direct meaning in themselves, or that complete the symbol when following a prefix. The subdivision of roots into three smaller classes plays no role in the symbol construction rules of this section; these distinctions are descriptive and made in anticipation of possible use for other purposes only. “Strong” roots are so named because they all have dots in the top and bottom row of the braille cell and in the left and right columns, and thus are physically unambiguous — that is, they can be positively identified even when not near other braille cells.
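For maintainers who work with braille in an ASCII transliteration, the categories above can be written down as a small lookup. The following Python sketch is illustrative only. It assumes conventional braille ASCII signs for every cell (for example the quotation mark for dot 5 and the backslash for dots 1-2-5-6) with lowercase letters for the alphabetic roots, and the names `category`, `GENERAL_PREFIXES` and so on are hypothetical, not part of any UEB specification.

```python
# Braille ASCII characters grouped by the categories listed above.
# All names here are illustrative assumptions, not official UEB terms.
SPACE = " "
GENERAL_PREFIXES = '#@^_".'            # dots 3-4-5-6, 4, 4-5, 4-5-6, 5, 4-6
SPECIAL_PREFIXES = ";,"                # dots 5-6, 6
LOWER_ROOTS = "1234567890'-"
ALPHABETIC_ROOTS = "abcdefghijklmnopqrstuvwxyz"
STRONG_ROOTS = "&=(!)*<%?:$]\\[/+>"
ROOTS = LOWER_ROOTS + ALPHABETIC_ROOTS + STRONG_ROOTS


def category(ch: str) -> str:
    """Return the category name of a single braille ASCII character."""
    if len(ch) != 1:
        raise ValueError("expected exactly one character")
    if ch == SPACE:
        return "space"
    if ch in GENERAL_PREFIXES:
        return "general prefix"
    if ch in SPECIAL_PREFIXES:
        return "special prefix"
    if ch in ROOTS:
        return "root"
    raise ValueError(f"not one of the 64 braille characters: {ch!r}")


# The counts agree with the list: 1 space + 8 prefixes + 55 roots = 64.
print(len(GENERAL_PREFIXES + SPECIAL_PREFIXES), len(ROOTS))   # 8 55
```

Keeping the three root subclasses as separate strings mirrors the document's remark that the subdivision is descriptive only; the structural rules need just the combined set of roots.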

Braille Symbol: A braille symbol is one or more consecutive braille characters that, as a unit, either (1) stand for a single print character or (2) indicate how subsequent braille symbol(s) are to be interpreted. Symbols of the first kind are called “graphic symbols”; those of the second kind are called “indicator symbols”. Herein, the unqualified word “symbol” normally means a braille symbol, unless the context makes it clear that a print symbol is intended.

Mode: The term “mode” is used to describe the effect of an indicator symbol on subsequent symbols. If, for example, an indicator causes all the letters of a word to be interpreted as capitals, then “capitals mode” may be said to be in effect over that word.

2.3 THE PREFIX-ROOT CONCEPT IN BASIC FORM

As noted above regarding prefixes and roots, a great many current braille symbols are either simple roots, or combinations of a prefix followed by a root. This suggests the following natural generalization: Let all symbols be either a single root character or a series of prefixes terminated by a root character. Mathematically, one could reduce this to: A symbol consists of zero or more prefixes terminated by a root. The extent of symbols constructed in this way would always be readily perceived when reading, regardless of the order in which the symbols appeared, because the root character would always mark the end of each symbol and therefore the beginning of the next. Moreover, this prefix-root notion would be well grounded in traditional usage.
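As an illustration of this idealized concept only (the actual UEB rules developed in the following sections are more complex), a run of cells could be split into symbols simply by ending a symbol at every root. The sketch below assumes a braille ASCII transliteration; the function name `split_basic` is hypothetical.

```python
PREFIXES = '#@^_".;,'   # all eight prefixes; every other nonspace cell is a root


def split_basic(cells: str) -> list[str]:
    """Split a space-free run of cells using only the basic prefix-root idea."""
    symbols, current = [], ""
    for ch in cells:
        current += ch
        if ch not in PREFIXES:       # a root ends the current symbol
            symbols.append(current)
            current = ""
    if current:
        raise ValueError(f"trailing prefixes without a root: {current!r}")
    return symbols


print(split_basic('x"m.a'))    # ['x', '"m', '.a']
```

The error case shows where the basic idea falls short: a sequence made only of prefixes has no place in it.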

Unfortunately, despite the simplicity and appeal of this concept, there are a few symbols of historic English Braille—such as the double capital sign and the letter sign—that do not fit the mold, and which the original design committee felt should not be changed. In the end, therefore, the structural rules that were derived are necessarily a little more complex than this basic idea, though they are based upon it, as will be evident.

2.4 PURE PREFIXES TO BE USED ONLY AS INDICATORS

Symbols consisting only of prefixes, such as the double capital sign, may be called “pure-prefix symbols”. In view of current usage, and of the fact that prefixes convey a sense of affecting the meaning of subsequent symbols rather than having complete meaning in themselves, the original design committee felt that such symbols should never be used for graphic symbols, but only for indicators. On the other hand, symbols terminated by roots can be used both for graphic symbols and for indicators.

2.5 GENERAL CHARACTERISTICS OF THE STRUCTURAL RULES

These are primarily worded as “reading rules” that permit the reader, starting at a braille character that is known to be the beginning of a symbol, to determine unambiguously where the symbol ends. The rules are based on the form of the symbol alone, not its meaning, since the reader may not know the meaning on first encounter and thus may need to isolate the symbol as a first step to discovering that meaning.

The several rules give rise to corresponding categories of braille symbols, which are considered in order, starting with the space.

2.6 SPACE SYMBOL

The braille space symbol is simply a braille space character. In other words, in the case of braille space, “character” and “symbol” may be considered interchangeable terms, for the braille space character never combines with other braille characters to form a larger symbol.

The space symbol stands for some amount of “white space” in print, including line endings, and may itself be sensed as the end of a line. Thus all white space in print corresponds to white space in braille, and vice versa, so that the braille reader always remains completely and directly aware of this important print device. However, the braille spacing is not intended to provide a measurement of the “amount” of white space in print, which is rarely important in itself. That is, the braille reader cannot determine, from the amount of space encountered in the braille text, just how much space is present in the print text. This is because in some cases any amount of print space is represented by a single space in braille, and in other cases the number of spaces appearing in the braille text is governed by braille formatting rules, such as in the alignment of items in an outline or table.

(Spaces used as separators within print numbers are treated specially. In a formal sense, the separator and following digit are treated as a single symbol in UEB, so as to preserve the integrity of the complete number, i.e. to avoid terminating numeric mode.)

2.7 GENERAL SYMBOLS

Reading Rules:

(1) If a symbol begins with a root character, then that is the entire symbol (called a “simple root”).

(2) If a symbol begins with a general prefix, continue to read succeeding characters, including prefixes of either type, until either a root or a space is encountered. If terminated by a root, that root is part of the symbol; if by a space, it is not part of the symbol.

The following are examples of general symbols usable in any context.

(Note: In some braille editions of the following list, the braille sequence dots 4-6, 1-2-3-4-5-6 may precede each entry. That is not a part of the symbol; it is a “dot locator” whose presence enables precise determination of the following character’s cell position.)

  • x
  • "m
  • .a
  • #b
  • ^"+
  • _;4
  • @,_;4

Of course some of these, particularly the last, would be unlikely candidates for actual assignment in the code; but all obey the rules for general symbols.

The following are examples of general symbols that may be used only before space:

  • ..
  • #
  • ^"
  • _
  • _;

Because these symbols consist of prefixes only, they must be used only for indicators. Moreover, because these symbols may be used only before space, or in other words at the end of a nonspace sequence, they are best assigned only to the role of “closing” indicators for modes established for entire passages. In any case, any assignment of such symbols for use must specify that they can be positioned only before a space, without exception.
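When considering a candidate for assignment, a maintainer might check mechanically which of the two groups above it falls into. The following sketch is one possible way to do that under the rules of this section; it assumes a braille ASCII transliteration, and the function name and the returned labels are hypothetical.

```python
GENERAL_PREFIXES = '#@^_".'
PREFIXES = GENERAL_PREFIXES + ";,"      # general plus special prefixes


def general_symbol_kind(symbol: str) -> str:
    """Classify a candidate general symbol by where it may be used."""
    if not symbol or " " in symbol:
        raise ValueError("a symbol is a nonempty run of nonspace cells")
    *body, last = symbol
    if any(ch not in PREFIXES for ch in body):
        raise ValueError("a root may appear only as the final cell")
    if body and body[0] not in GENERAL_PREFIXES:
        raise ValueError("a multi-cell general symbol begins with a general prefix")
    if last not in PREFIXES:
        return "usable anywhere"         # simple root or root-terminated symbol
    if not body and last not in GENERAL_PREFIXES:
        raise ValueError("a lone special prefix is not a general symbol")
    return "before space only"           # pure-prefix symbol: indicators only


print(general_symbol_kind("@,_;4"))   # usable anywhere
print(general_symbol_kind("_;"))      # before space only
```

A symbol beginning with dot 6, such as the capital letters, is rejected here on purpose, because it belongs to the augmented class described in the next section.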

General symbols terminated by a root are intended to be used for all kinds of symbols in braille, both indicators and graphic symbols; hence the adjective “general” both for the class of symbols and for the kind of prefix that begins them when they aren’t simple roots. The only restriction is that, when an alphabetic print symbol has both a lowercase form and an uppercase form, then the lowercase form should be assigned to this class, and designated as such, with the associated uppercase form assigned to a corresponding augmented general symbol, that is to a symbol consisting of the same characters preceded by a dot 6 character. (The augmented symbols are described next.)

2.8 AUGMENTED GENERAL SYMBOLS

Reading Rules:

(1) If a symbol begins with a dot 6, and the next character is a space, then the dot 6 is the entire symbol.

(2) If the character after the dot 6 is a root or a general prefix, then the remainder of the symbol is read just as for a general symbol, described in the preceding topic.

The following are examples of augmented general symbols usable in any context:

  • ,x
  • ,"m
  • ,.a
  • ,#b
  • ,^"+
  • ,_;4
  • ,@,_;4

The following are examples of augmented general symbols that may be used only before space:

  • ,
  • ,"
  • ,..
  • ,#
  • ,^"
  • ,_
  • ,_;

The same restrictions apply to the use of these before-space-only symbols as were recited for the similar subcategory under “General Symbols” above.

There is an important restriction on code maintainers when assigning meanings to symbols of this class. An augmented general symbol must never be one that could sensibly follow a dot 6 special symbol (see section 2.9 below), for in such a case the dot 6 sequence would appear to include the initial dot 6 character of the augmented symbol. This could lead, for example, to a “word capital” indicator being read as a “passage capital” indicator. Because all the dot 6 special symbols establish capitalization, and differ only as to extent, this restriction will be met if no augmented general symbol can be capitalized. That condition holds in turn if the symbol either is already a capital (as with the ,x example) or is a symbol to which the concept of capitalization does not apply.
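The overlap that this restriction guards against is easy to reproduce. The short sketch below, written in braille ASCII with a hypothetical helper name, places a capitalized word indicator directly before an augmented general symbol and shows what a reader scanning the leading dot 6 characters would encounter.

```python
def leading_dot6_run(cells: str) -> str:
    """Return the run of dot 6 characters (',') at the start of cells."""
    end = 0
    while end < len(cells) and cells[end] == ",":
        end += 1
    return cells[:end]


word_capital = ",,"     # capitalized word indicator
augmented = ",x"        # an augmented general symbol (capital x)

# The reader meets three dot 6 cells in a row, which is the passage indicator.
print(leading_dot6_run(word_capital + augmented))   # ,,,
```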

2.9 SPECIAL SYMBOLS

Reading Rules: If a symbol begins with dot 6 followed by another special prefix, or with dots 5-6 in any case, the extent of the symbol is determined by reading to the right until one of the following events occurs:

(a) A space, root, or general prefix braille character is encountered. In that case the symbol is terminated just before that braille character.

(b) Another special prefix character, with highest dot lower than the highest dot of the preceding character, is encountered. In that case the symbol is terminated just before that “lower” braille character. (Put another way, symbol breaks would occur before dot 6 following dots 5-6, and not otherwise as long as only special prefixes are encountered. This rule could be summarized as “when the dots drop, stop”.)
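A minimal sketch of these reading rules follows, assuming a braille ASCII transliteration in which the semicolon stands for dots 5-6 and the comma for dot 6. The “highest dot” comparison is made explicit through dot numbers and cell rows; the names `DOTS`, `ROW`, `top_row` and `read_special` are illustrative assumptions, and the end of the text is treated like a space.

```python
# Dot numbers of the two special prefixes, in braille ASCII.
DOTS = {";": {5, 6}, ",": {6}}
# Row of each dot position; row 1 is the top of the cell.
ROW = {1: 1, 4: 1, 2: 2, 5: 2, 3: 3, 6: 3}


def top_row(ch: str) -> int:
    """Row of the highest dot in a special prefix."""
    return min(ROW[dot] for dot in DOTS[ch])


def read_special(cells: str, start: int = 0) -> str:
    """Return the special symbol that begins at cells[start]."""
    first = cells[start]
    following = cells[start + 1 : start + 2]
    if not (first == ";" or (first == "," and following in DOTS)):
        raise ValueError("not the start of a special symbol")
    end = start + 1
    while end < len(cells) and cells[end] in DOTS:
        # Event (b): this cell's highest dot is lower than the previous cell's.
        if top_row(cells[end]) > top_row(cells[end - 1]):
            break
        end += 1
    # Event (a): a space, root or general prefix (or the end of text) ends the loop.
    return cells[start:end]


print(read_special(";;;,,abc"))   # ;;;
print(read_special(",;;,x"))      # ,;;
```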

If we consider for the moment only symbols up to three characters in length, the above reading rules imply that there are exactly 8 viable special symbols, listed in the table below in a “lowest-to-highest” order, with rightmost character varying most rapidly, an ordering that is important as will be apparent. Next to each symbol is a symbol number for reference, and in some cases the assignment as of this writing (February 2010):

  • ,, 1. Capitalized word indicator
  • ,,, 2. Capitalized passage indicator
  • ,,; 3.
  • ,; 4.
  • ,;; 5.
  • ; 6. Grade 1 symbol indicator
  • ;; 7. Grade 1 word indicator
  • ;;; 8. Grade 1 passage indicator

Note that the limitation to three characters per symbol is arbitrary and for purposes of the above listing and this discussion only. If a future code maintainer finds it useful for certain rarely-used modes, special symbols of any size may be defined, as long as the rules of formation and usage follow the same pattern as described here.
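To see how the list grows with a different length limit, the special symbols can be enumerated programmatically. The sketch below assumes braille ASCII (comma for dot 6, semicolon for dots 5-6) and relies on the fact that, in ASCII, the comma sorts before the semicolon, so that plain string sorting reproduces the lowest-to-highest order of the list above. The function name is hypothetical.

```python
def special_symbols(max_cells: int) -> list[str]:
    """List the special symbols of up to max_cells cells, lowest to highest."""
    symbols = []
    for sixes in range(max_cells + 1):
        for five_sixes in range(max_cells - sixes + 1):
            symbol = "," * sixes + ";" * five_sixes
            # Skip the empty string and the lone dot 6, which is an
            # augmented general symbol rather than a special symbol.
            if symbol not in ("", ","):
                symbols.append(symbol)
    return sorted(symbols)


for number, symbol in enumerate(special_symbols(3), start=1):
    print(number, symbol)
```

Every special symbol has the shape of some dot 6 cells followed by some dots 5-6 cells, because a dot 6 after dots 5-6 always starts a new symbol.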

Because these symbols all lack terminating roots and are therefore usable only for indicators, that is in connection with modes, it is always possible without loss of generality to impose a rule upon the order of indicator presentation when several modes must commence at once. To avoid conflict, in keeping with our reading rule, transcribers must always put sequences of special symbols in descending numeric order as listed above. Thus to indicate a grade 1 passage beginning with a fully capitalized word, one would use symbol 8, ;;;, followed by symbol 1, ,,, not the other way around.
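The ordering requirement can be expressed as a simple check on the symbol numbers from the list above. This is a sketch only, limited to the eight symbols of up to three cells; the names `SPECIAL_NUMBER` and `in_required_order` are hypothetical.

```python
# Symbol numbers as given in the list of special symbols above.
SPECIAL_NUMBER = {
    ",,": 1, ",,,": 2, ",,;": 3, ",;": 4,
    ",;;": 5, ";": 6, ";;": 7, ";;;": 8,
}


def in_required_order(indicators: list[str]) -> bool:
    """True if consecutive special symbols appear in descending numeric order."""
    numbers = [SPECIAL_NUMBER[symbol] for symbol in indicators]
    return all(a > b for a, b in zip(numbers, numbers[1:]))


print(in_required_order([";;;", ",,"]))   # True
print(in_required_order([",,", ";;;"]))   # False
```

Written the other way around, the cells of symbol 1 followed by those of symbol 8 contain no drop in dot height, so a reader applying the reading rules of this section would take all five cells as a single symbol.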

Code designers must also be careful that, within each of the two groupings defined by repetitions of the same character, they do not define any modes that sensibly may need to be juxtaposed. In the above list, those groups are: symbols 1-2 (“dot 6 special symbols”) and 6-8 (“dots 5-6 special symbols”). (The middle group, symbols 3-5, could be called “dots 6, 5-6 special symbols”.) Moreover no mode may be defined for any of the dot 6 special symbols that may be needed before any augmented general symbol. These restrictions are necessary to avoid obvious overlap problems, and they are easily reduced to two simple rules:

(1) All the dot 6 special symbols must establish (or cancel) capitalization and differ only as to extent, and

(2) All the dots 5-6 special symbols must establish (or cancel) grade 1 mode and differ only as to extent.

The dots 6, 5-6 special symbols remain unassigned as of this writing (February 2010). As they are ineligible for use as graphic symbols, a possible future use would be for indicators establishing special mode(s).

Note that certain contractions of traditional English Braille that are retained in UEB, namely the final-letter contractions “ence”, “ong”, “ful”, “tion”, “ness”, “ment”, and “ity”, must all be regarded as two-symbol sequences under this rule.

2.10 SUMMARY OF SYMBOL CONSTRUCTION RULE CHARACTERISTICS

(1) This system assures that all symbols, regardless of context and regardless of the reader’s familiarity with meaning, are unambiguously recognizable as to their extent (boundaries) (see next section).

(2) The system allows code maintainers henceforth always to think in terms of braille symbols as the fundamental units rather than individual braille characters, and to assign meanings to those symbols in natural mnemonic groups without concern about the possibility that chance juxtapositions could cause recognition problems in actual use.

(3) The system encompasses the symbols used in earlier English Braille codes to a remarkable degree, as well as other codes of interest. In the opinion of the original design committee, this is not really an accident nor is it a new discovery, but implicit in Louis Braille’s original design.

(4) The system provides an orderly, efficient and suitably rich basis for future expansion. On this last point, we first note that the number of potentially available symbols is without limit in principle. If an arbitrary target limit of 3 cells is adopted for practicality’s sake, we may further calculate the number of available root-terminated general and augmented symbols within that limit to be 3,410, certainly an adequate number for a broad range of notation.
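The figure of 3,410 can be reproduced from the character counts in section 2.2 and the reading rules of sections 2.7 and 2.8. The following sketch shows the arithmetic; the function and constant names are illustrative.

```python
ROOTS = 55               # lower, alphabetic and strong roots together
GENERAL_PREFIXES = 6
ALL_PREFIXES = 8         # general and special prefixes together


def root_terminated_general(max_cells: int) -> int:
    """Count general symbols of 1..max_cells cells that end in a root."""
    total = 0
    for cells in range(1, max_cells + 1):
        if cells == 1:
            total += ROOTS                       # simple roots
        else:
            # one general prefix, then (cells - 2) prefixes of either type, then a root
            total += GENERAL_PREFIXES * ALL_PREFIXES ** (cells - 2) * ROOTS
    return total


general = root_terminated_general(3)       # 55 + 330 + 2640
# An augmented symbol is a dot 6 followed by a general symbol, so within
# three cells the general part has at most two cells.
augmented = root_terminated_general(2)     # 55 + 330
print(general, augmented, general + augmented)   # 3025 385 3410
```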

3. PROOF THAT SYMBOL EXTENTS ARE RECOGNIZABLE

This proof merely formalizes what may already be obvious, namely that the various classes of symbols and symbol ordering rules are defined in such a way that it is always possible to discern the boundary between two adjacent symbols. To put it another way: if one knows where the current symbol starts, it is always possible to tell where it ends, and therefore where the next one starts, even if one does not know the “meaning” of either symbol.
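The claim can also be illustrated in executable form. The sketch below combines the reading rules of sections 2.6 through 2.9 into a single function that, given the start of a symbol, returns where it ends, without any table of meanings. It assumes a braille ASCII transliteration, treats the end of the text as a space, assumes the text obeys the usage rules, and ignores the special treatment of spaces inside numbers. The names `symbol_end` and `segment` are hypothetical.

```python
GENERAL_PREFIXES = '#@^_".'
SPECIAL_PREFIXES = ";,"                  # dots 5-6 and dot 6
PREFIXES = GENERAL_PREFIXES + SPECIAL_PREFIXES


def symbol_end(cells: str, start: int) -> int:
    """Return the index just past the symbol that begins at cells[start]."""
    def at(i: int) -> str:               # the end of the text reads as a space
        return cells[i] if i < len(cells) else " "

    first = at(start)
    if first not in PREFIXES:
        return start + 1                 # space symbol or simple root
    end = start + 1
    if first == ";" or (first == "," and at(end) in SPECIAL_PREFIXES):
        # Special symbol: continue over special prefixes until the dots drop.
        while at(end) in SPECIAL_PREFIXES and not (at(end) == "," and at(end - 1) == ";"):
            end += 1
        return end
    # General symbol, or augmented general symbol after its leading dot 6:
    # absorb prefixes of either type, then include a terminating root if any.
    while at(end) in PREFIXES:
        end += 1
    return end if at(end) == " " else end + 1


def segment(cells: str) -> list[str]:
    """Split a line of braille ASCII into consecutive symbols."""
    symbols, pos = [], 0
    while pos < len(cells):
        end = symbol_end(cells, pos)
        symbols.append(cells[pos:end])
        pos = end
    return symbols


print(segment(";;;,,abc"))     # [';;;', ',,', 'a', 'b', 'c']
print(segment('"m x ,.a'))     # ['"m', ' ', 'x', ' ', ',.a']
```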

The proof is easily constructed and followed, although it is a bit tedious, like most proofs of the “enumerate all the cases” variety. Still, we think it worth spelling out, for in the process our symbol construction rules, and the reasons behind them, are likely to be further clarified.

We begin by listing the various classes of symbols defined by the reading rules. To minimize verbosity in the proof itself, we assign a 2-letter name to each class. The list follows, and for information purposes a count of the symbols with three or fewer characters is given in parentheses for each class (but note that the number of symbols in every class except space is actually unlimited, because the symbols can be arbitrarily long):

  • sp space (1)
  • ge general symbols terminated by root, usable anywhere (3025)
  • gw general symbols not terminated, usable only before (white) space (438)
  • au augmented general symbols, terminated by root, usable anywhere (385)
  • aw augmented general symbols not terminated, usable only before (white) space (55)
  • sc special capitals indicators (sequences of two or more dot 6’s) (2)
  • sm special “mixed” indicators (sequences consisting of one or more dot 6’s, followed by one or more dots 5-6’s) (3)
  • sl special grade 1 indicators (sequences of one or more dots 5-6’s) (3)

Since there are 8 classes, there are in principle 64 possible 2-symbol sequences to consider.

However, since the space symbol is complete in itself, and moreover not part of symbols in any other class, all combinations involving space as either the first or second symbol need no further consideration. That leaves the 49 combinations involving the other 7 symbol classes.

But further, since the symbols of class gw and aw can only be followed by space, none of the other combinations commencing with either of those two classes are allowed. Moreover, classes ge and au are both self-delimiting, because by definition they are terminated by the root character, and so the boundary is determined regardless of the class to the right.

The only remaining cases, then, are the 21 combinations formed by one of the classes sc, sm or sl on the left and one of the seven nonspace classes on the right. Even many of those could be grouped, but at this point it is just as easy to enumerate them and do the grouping by reference:

1. sc ge: By definition any special symbol is terminated just before a root or general prefix, and by definition a ge or gw always begins with such a character, defining the boundary.

2. sc gw: The reasoning is the same as case 1.

3. sc au: This case should not occur, because the rules for code designers forbid assigning to class au any symbol that might need capitalizing, and the transcriber rules require any capitalization indicators to immediately precede the first affected symbol.

4. sc aw: The reasoning is similar to case 3.

5. sc sc: This case cannot occur, as the rules for code designers require that 2 symbols both in class sc (or both in class sl) should never need to be juxtaposed. This is fulfilled by having the various symbols within each class indicate different extents of the same modes; it would never be meaningful, much less necessary, to initiate two different extents at the same time.

6. sc sm: This case cannot occur by the transcriber symbol-ordering rules; two indicators in those classes should always be written in the order sm sc.

7. sc sl: The reasoning is similar to case 6.

8. sm ge: The reasoning is the same as case 1.

9. sm gw: The reasoning is the same as case 1.

10. sm au: Because class sm symbols always end with at least one dots 5-6 character, and class au always begin with a dot 6, the “stop when the dots drop” rule would terminate the sm symbol at the proper point.

11. sm aw: The reasoning is similar to case 10.

12. sm sc: The reasoning is similar to case 10.

13. sm sm: The reasoning is similar to case 10.

14. sm sl: The reasoning is similar to case 6.

15. sl ge: The reasoning is the same as case 1.

16. sl gw: The reasoning is the same as case 1.

17. sl au: The reasoning is similar to case 10.

18. sl aw: The reasoning is similar to case 10.

19. sl sc: The reasoning is similar to case 10.

20. sl sm: The reasoning is similar to case 10.

21. sl sl: The reasoning is the same as case 5.
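The case analysis above can be cross-checked by enumeration. The following sketch regenerates the counts of 64, 49 and 21 combinations and groups the 21 remaining cases by the kind of argument that settles each one. The grouping labels paraphrase the reasoning given in the list, and the function and variable names are illustrative assumptions.

```python
from collections import Counter
from itertools import product

CLASSES = ["sp", "ge", "gw", "au", "aw", "sc", "sm", "sl"]
SPECIAL = ["sc", "sm", "sl"]     # in ascending order of symbol number


def argument(left: str, right: str) -> str:
    """Name the argument that settles the boundary for a special class on the left."""
    if right in ("ge", "gw"):
        return "stops before a root or general prefix"
    if left == "sc" and right in ("au", "aw"):
        return "excluded by the designer rule for augmented symbols"
    if left == right and left in ("sc", "sl"):
        return "excluded: same-group indicators are never juxtaposed"
    if right in SPECIAL and SPECIAL.index(left) < SPECIAL.index(right):
        return "excluded by the descending-order rule"
    return "stops when the dots drop"


all_pairs = list(product(CLASSES, repeat=2))
nonspace_pairs = [pair for pair in all_pairs if "sp" not in pair]
cases = [pair for pair in nonspace_pairs if pair[0] in SPECIAL]
print(len(all_pairs), len(nonspace_pairs), len(cases))   # 64 49 21

tally = Counter(argument(left, right) for left, right in cases)
for reason, count in tally.items():
    print(count, reason)
```

Because the classes are listed in the same order as in the proof, `cases[0]` is case 1 (sc ge) and `cases[20]` is case 21 (sl sl).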

4. SUMMARY RULES AND GUIDELINES FOR UEB CODE MAINTAINERS

This section mostly just collects into one place, for convenience, the “rules about rules-making” that are expressed or implied in preceding sections. It also presents some general guidelines that, while not really rules, have been found useful by earlier committees and hence to some degree characterize the “spirit” of UEB. The latter are labeled as customs, good practice or guidelines, to avoid any confusion.

4.1 Any new graphic symbol (one which has an inherent meaning and hence has, or could have, a direct representation in print), even if it also has an indicator effect, should be assigned to an unassigned general or augmented general symbol terminated by root (class ge or au as defined in section 3 above). If the symbol is a letter from an alphabet and has both lowercase and capital forms, the lowercase form should be from class ge and the uppercase form from class au. In other words, the capital should be formed by adding dot 6, as you would expect, and such letter(s) become subject to the extended capitalization indicators. If the new graphic symbol is not alphabetic, then it is generally good practice to avoid class au except for those cases where the symbol is in some sense a “big” version of the symbol without the dot 6 (e.g. the dash vs. the hyphen).

4.2 Symbols that have only an indicator effect may be assigned from class ge without concern about special ordering rules, and so for simplicity’s sake that is generally good practice.

4.3 Symbols that have only an indicator effect may be considered from class au, but in that case special ordering rules will need to be enacted to assure that the ordering of special symbols (see section 2.9) is maintained.

4.4 Assignments to class sc (formed by two or more dot 6’s) must be limited to indicators that establish or cancel capitalization.

4.5 Assignments to class sl (formed by one or more dots 5-6’s) must be limited to indicators that establish or cancel grade 1.

4.6 Assignments to class sm (formed by one or more dot 6’s followed by one or more dots 5-6’s) can be considered for the indicators of special modes, but in that case special ordering rules must also be enacted to assure that special symbols always occur in the proper order (see section 2.9).

4.7 The rule that, when they occur together, any class sl symbol (typically initiating grade 1) must precede any class sc symbol (typically initiating capitalization) cannot be changed.

4.8 The rule that any class sc symbol (typically initiating capitalization) must immediately precede the first letter to which it applies cannot be changed.

4.9 Customarily, the root character for a technical symbol assignment in UEB has not been a letter taken from the English name of the symbol. This is in part because UEB may be used for technical purposes by non-English-speaking populations. But that practice also makes room for two important cases where an alphabetic root does make sense, i.e. where the symbol in print actually incorporates a letter in some sense (as does the at-sign, for example), and where the symbol is a letter in a non-Roman alphabet that corresponds to a Roman letter. (Coincidentally, though not as a deliberate goal, it also leaves room for numerous prefix-root contractions of the “mother” and “world” type to be added at a later date, if a future committee, in consultation as always with the users of UEB, should deem it advisable.)

4.10 The assignment of technical symbols so as to conflict with existing contractions can be considered, but since such symbols would always need a grade 1 symbol indicator in any grade 2 context, it is good practice to avoid such assignments.

5. MAINTENANCE OF THESE FUNDAMENTALS

While we can hope and expect that these “rules about rules” would be less subject to change than the rules themselves over time, they are not necessarily immutable. It is conceivable that some future overhaul or extension of UEB might be considered desirable, and consequently the basic structural rules revisited with a view to providing for more or different kinds of symbols. The structure presented here is not the only one that can work, and among the other possibilities, a more general structure could be devised that would incorporate this one as a subset, thus preserving existing symbols. In such a case, we would only hope that the dual goals of UEB as expressed in section 1, namely both clarity and range of expression, would be preserved, and consequently that the equivalent of the proof in section 3 would be produced.

UEB as it is, both in these fundamentals and in the rules as understood by transcribers and readers in daily use, is the brainchild of user-controlled committees that have worked since 1991 to define a braille code that would be technically sound as well as practical and pleasing to its users. As proud as we are of that brainchild, we are reminded of the words of the poet Khalil Gibran: “Your children are not your children … For their souls dwell in the house of tomorrow, which you cannot visit, not even in your dreams.”

Page content last updated: 2010-02-18