Thursday, 22 December 2005

BabelMap Version 1.11.6

BabelMap version 1.11.6 was released today. This release finally updates the French version of BabelMap to reflect Unicode 4.1. The French version of BabelMap not only has a French user interface, but all the Unicode data (character names, block names, property names, etc.) are also in French. The French translation of Unicode is maintained by Patrick Andries, François Yergeau and Alain LaBonté, and is available at the Hapax website. Although Unicode 4.1 was released in March of this year, it obviously takes some time for the French translations to be produced, and so the French version of BabelMap has been out of sync with the English version for almost nine months. Unfortunately, when Unicode 5.0 is released in March next year, the two versions will go out of sync again, until the French translations of the Unicode 5.0 additions are ready.



The French version of BabelMap can be downloaded from the BabelStone or Hapax websites.

There are some interesting points to be made about the French character names.

Firstly, the French character names are the official ISO/IEC 10646 French names, but are not officially endorsed by Unicode, which does not currently publish localized character names. Although there have been some suggestions that applications should use localized character names instead of the official Unicode character names, the difficulty of producing synchronised multilingual versions of hundreds or thousands of new character names for each new release of Unicode means that this would not be practical until the Unicode repertoire has stabilized (which will probably not be for at least another ten years).

Secondly, the French character names are not subject to the same restrictions as the official English character names, which cannot be changed, even when they are wrong or misleading. Not only do the French names not knowingly copy errors in the original English names, but when mistakes in the French names are discovered they can simply be rectified. For example, U+0670 ARABIC LETTER SUPERSCRIPT ALEF is annotated "actually a vowel sign, despite the name" in the Unicode code charts; but the French name for this character is VOYELLE DIACRITIQUE ARABE ALIF EN CHEF, reflecting the fact that the character is a vowel sign. Likewise, U+A015 YI SYLLABLE WU does not actually represent the syllable "wu", but is a special syllable iteration mark; and so the French name for the character is MARQUE D'ITÉRATION YI.
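Because the official English names can never be changed, any Unicode-aware toolkit still reports the misleading names for these two characters. As a small illustration (not part of BabelMap), the Python sketch below looks them up with the standard `unicodedata` module; the helper name `english_name` is made up for this example.

```python
import unicodedata


def english_name(cp: int) -> str:
    """Return the official English Unicode name for a code point."""
    return unicodedata.name(chr(cp))


# The two characters discussed above, whose English names are misleading
# but frozen by the Unicode name stability policy.
for cp in (0x0670, 0xA015):
    print(f"U+{cp:04X} {english_name(cp)}")
```

The output still shows ARABIC LETTER SUPERSCRIPT ALEF and YI SYLLABLE WU, whereas a French names list is free to say what the characters really are.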

Thirdly, whereas the English character names are frequently named using a transliteration of the character's name in the language of usage, the French names provide a translation or description of the character's meaning where possible. This makes it a lot easier to identify the meaning of a character. For example, the names for three Tibetan signs used in Bhutan to mark the relative status of the addressee of a letter or document use Tibetan transliterations in the English character names, which makes their meaning opaque to most users. Indeed, the name for U+0F0A is completely wrong, but nobody noticed this for many years, as the name means nothing unless you can read Tibetan (in ASCII transliteration, which most native Tibetan speakers probably cannot). On the other hand, the French names clearly and unambiguously describe each character's usage:

  • U+0F0A TIBETAN MARK BKA- SHOG YIG MGO = FIORITURE TIBÉTAINE DE PÉTITION HONORIFIQUE
  • U+0FD0 TIBETAN MARK BSKA- SHOG GI MGO RGYAN = FIORITURE TIBÉTAINE POUR DONNER UN ORDRE
  • U+0FD1 TIBETAN MARK MNYAM YIG GI MGO RGYAN = FIORITURE TIBÉTAINE POUR S'ADRESSER À UN ÉGAL
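An application that wants to show localized names has to cope with characters for which no translation exists yet, which is exactly the out-of-sync situation described earlier. The following Python sketch shows one possible approach, assuming a hand-built table: `FRENCH_NAMES` and `display_name` are hypothetical names invented for this example, and the table holds only the three French names from the list above. This is not how BabelMap itself stores its data.

```python
import unicodedata

# Hypothetical lookup table; the French names are copied from the list above.
FRENCH_NAMES = {
    0x0F0A: "FIORITURE TIBÉTAINE DE PÉTITION HONORIFIQUE",
    0x0FD0: "FIORITURE TIBÉTAINE POUR DONNER UN ORDRE",
    0x0FD1: "FIORITURE TIBÉTAINE POUR S'ADRESSER À UN ÉGAL",
}


def display_name(cp: int) -> str:
    """Prefer the French name; fall back to the English one, then to U+XXXX."""
    french = FRENCH_NAMES.get(cp)
    if french is not None:
        return french
    # unicodedata.name() returns the supplied default for unnamed code points.
    return unicodedata.name(chr(cp), f"<U+{cp:04X}>")


# U+0F0B is not in the table, so it falls back to its English name.
for cp in (0x0F0A, 0x0FD0, 0x0FD1, 0x0F0B):
    print(f"U+{cp:04X} {display_name(cp)}")
```

Falling back to the English name keeps every character labelled while the translations catch up with a new Unicode release, at the cost of a mixed-language display.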