I have been trying to learn literary Mongolian, on and off, for several years now. I also had to learn Manchu (to my mind a much more pleasing language than Mongolian) when, many years ago, I was researching the Manchu translation of the great Chinese historical novel Sanguo Yanyi 三國演義 "Romance of the Three Kingdoms" (Ilan Guran i Bithe ᡳᠯᠠᠨ ᡤᡠᡵᡠᠨ ᡳ ᠪᡳᡨᡥᡝ in Manchu). So it was inevitable that I would eventually get round to discussing the Mongolian and Manchu scripts. In Unicode these are unified (together with the "Todo" reformation of the Mongolian script and the Sibe extensions for Manchu) as a single "Mongolian" script, although their user communities view Mongolian and Manchu as distinct scripts in their own right.
I have nothing against the unification of Mongolian and Manchu at the character encoding level, but the Byzantine complexity of the Mongolian encoding model that was chosen has, in my opinion, severely hindered the development of fonts and software support for the Mongolian and Manchu scripts. Although Mongolian has been encoded in Unicode since version 3.0 (1999), up until now there has been no realistic support for Mongolian from any of the major vendors, mainly because the rules defining Mongolian shaping behaviour have never been fully and openly defined. With Vista, for the first time, there will be support for Mongolian, including an almost-working font ("Mongolian Baiti"). However, the shaping behaviour implemented by Microsoft is largely based upon a private and undocumented interpretation of the shaping rules, and that interpretation does not always accord with the definition of Standardized Variants in the Unicode Standard (i.e. the font is not conformant to the Unicode Standard). To be fair to Microsoft, they had very little choice if they were to provide some sort of support for Mongolian, given that nobody seemed willing to do the work necessary to define the finer details of Mongolian shaping behaviour.
No doubt I will be returning to the problems of Mongolian shaping behaviour at a future date, but in the meantime if you are interested, do buy a copy of the new Unicode 5.0 book, and take a read of the section on Mongolian (13.2), which has been completely rewritten by me, and is hopefully an improvement on the previous text -- you will notice that we are still hoping to eventually get out a Unicode Technical Report documenting exactly how Mongolian shaping behaviour should work, but it may still be a while yet. And if you do buy the book, don't forget to also have a read of the section on Phags-pa (10.3) which was written by me, and the section on Yi (12.6) which has been thoroughly revised by me for the new edition.
Anyway, today I'm going to discuss the first and only addition to have been proposed for the Mongolian block since it was introduced seven years ago, MONGOLIAN LETTER MANCHU ALI GALI LHA, which is a character required for representing Tibetan LH (as in "Lhasa") in the Manchu script ("ali gali" is a Mongolian term used to refer to special letters that are used for representing Sanskrit and other foreign languages). What is interesting is why this letter was missed from the original repertoire of characters included in the Mongolian block. Well, the main source for "ali gali" letters was, I think, Tongwen Yuntong 同文韻統, a work on the Chinese transcription of Sanskrit and Tibetan that was first published by imperial order in 1749, and later reissued in a much expanded edition. The original 1749 edition does not show Tibetan LHA, but the later edition (I don't know the date) has this entry on Tibetan LHA :
This shows the syllable LHA written, from top to bottom, as :
- Tibetan ལྷ་
- Manchu ᠯᡥᠠ
- Mongolian ᠯᠠᠾᠠ᠋ (!)
- Chinese 拉哈
In Mongolian LH is written as a ligature of the letters LA (U+182F ᠯ) and HA (U+183E ᠾ), although here it is a bit weird as it seems to be written with an extra tooth (as LAHA rather than LHA). From an encoding point of view, it may be noted that the Mongolian LH ligature is encoded as a distinct letter (U+1840 ᡀ), which is probably unnecessary, and opens up the possibility of multiple spellings for LHA (either <1840 1820> ᡀᠠ᠋ or <182F 183E 1820> ᠯᠾᠠ᠋).
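To make the multiple-spellings problem concrete, here is a minimal Python sketch using only the standard `unicodedata` module. The constant and function names (`PRECOMPOSED`, `LIGATURE`, `describe`, `same_after_normalization`) are my own, purely for illustration. It checks whether any Unicode normalization form folds the two spellings of LHA together; as far as I am aware U+1840 has no decomposition mapping, so none should.

```python
import unicodedata

# The two candidate spellings of Mongolian LHA discussed above
# (plain sequences, without any variation selectors).
PRECOMPOSED = "\u1840\u1820"       # <1840 1820>: LHA + A
LIGATURE = "\u182F\u183E\u1820"    # <182F 183E 1820>: LA + HAA + A


def describe(text):
    """Return 'U+XXXX NAME' for each character in text."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


def same_after_normalization(a, b):
    """True if any standard normalization form makes a and b identical."""
    forms = ("NFC", "NFD", "NFKC", "NFKD")
    return any(
        unicodedata.normalize(form, a) == unicodedata.normalize(form, b)
        for form in forms
    )


if __name__ == "__main__":
    for spelling in (PRECOMPOSED, LIGATURE):
        print(describe(spelling))
    print("unified by normalization:",
          same_after_normalization(PRECOMPOSED, LIGATURE))
```

If the two spellings are indeed never unified, a plain string comparison or search will treat them as different words, so any folding of one spelling into the other would have to be done by the application itself.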
As with Mongolian, the Manchu LH here is a ligature of the letters LA (U+182F ᠯ) and HA (U+1865 ᡥ). Thus, in Tongwen Yuntong Manchu LHA is not written using a special letter, which I think is the reason why no Manchu letter LHA was encoded originally. However, in other Qing dynasty texts Tibetan LHA is not represented as a ligature of LA and HA, but by means of a special letter created by adding a circle diacritic to the right of the letter LA. For example, in the imperial vocabulary in five scripts (Manchu, Tibetan, Mongolian, Uighur and Chinese), Wuti Qingwen Jian 五體清文鑒, the special letter LHA is used for the Manchu transliteration of Tibetan words, as can be seen in this example showing the Tibetan words lha dril ལྷ་དྲིལ་ "spirit bell" and lha rnga ལྷ་རྔ་ "spirit drum" :
Wuti Qingwen Jian 五體清文鑒 (Beijing: Minzu Chubanshe, 1957) p.662
Notice how in the example from Tongwen Yuntong the syllable LHA is written as the sequence l (the head), h (two teeth and a circle diacritic) and a (the tail), whereas here it is written as the sequence lh (the head, with a circle diacritic on the stem) and a (the tail).
Another example of this special letter can be seen in this Tibetan Buddhist text entitled "Praises to the Green Saviouress [Tara]", which is written in Chinese and Manchu transliteration :
At the top of the last line of the page (i.e. the rightmost line) the Tibetan phrase lha dang lha min ལྷ་དང་ལྷ་མིན་ "gods and demi-gods" is written in Manchu and Chinese transliteration. As in Wuti Qingwen Jian, the syllable LHA is represented by the addition of a circle diacritic next to the letter LA. In fact, the circle diacritic is used productively to generate aspirated letters in Manchu, for example in U+189A MONGOLIAN LETTER MANCHU ALI GALI GHA ᢚ, U+189D MONGOLIAN LETTER MANCHU ALI GALI JHA ᢝ, U+189F MONGOLIAN LETTER MANCHU ALI GALI DDHA ᢟ, U+18A1 MONGOLIAN LETTER MANCHU ALI GALI DHA ᢡ and U+18A8 MONGOLIAN LETTER MANCHU ALI GALI BHA ᢨ. This diacritic circle could have been encoded separately, and these letters represented as combining sequences, but the circle diacritic wasn't encoded and these letters are encoded as precomposed letters, so it is necessary to encode a new character to represent Manchu LHA (see N3041). This new letter will be making its début as U+18AA in Unicode 5.1.
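As a rough illustration of what "precomposed" means in practice, the following Python sketch (standard `unicodedata` module only; the names `ASPIRATED`, `is_atomic`, `MANCHU_LHA_LIGATURE` and `MANCHU_LHA_LETTER` are hypothetical, chosen for this example) checks that the aspirated ali gali letters listed above, plus the new U+18AA, carry no decomposition mapping. It also contrasts the two Manchu spellings of LHA seen in the sources: the LA + HA ligature of Tongwen Yuntong and the special letter of Wuti Qingwen Jian. It assumes a Python build whose character database is Unicode 5.1 or later, so that U+18AA is known.

```python
import unicodedata

# Aspirated Manchu ali gali letters named in the text, plus the new LHA.
ASPIRATED = ["\u189A", "\u189D", "\u189F", "\u18A1", "\u18A8", "\u18AA"]

# Tongwen Yuntong style: LA + (Sibe/Manchu) HA + A
MANCHU_LHA_LIGATURE = "\u182F\u1865\u1820"
# Wuti Qingwen Jian style: special letter LHA + A
MANCHU_LHA_LETTER = "\u18AA\u1820"


def is_atomic(ch):
    """True if the character has no decomposition mapping in the UCD."""
    return unicodedata.decomposition(ch) == ""


if __name__ == "__main__":
    for ch in ASPIRATED:
        print(f"U+{ord(ch):04X}", unicodedata.name(ch),
              "atomic" if is_atomic(ch) else "decomposable")
    # The two spellings are separate code point sequences; normalization
    # is not expected to map one onto the other.
    print(unicodedata.normalize("NFC", MANCHU_LHA_LIGATURE)
          == MANCHU_LHA_LETTER)
```

Because the circle diacritic is not a separate character, there is no combining sequence for normalization to compose or decompose, which is why each aspirated letter needs its own code point.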