Saturday, 15 July 2006

BabelMap Version 5.0.0.1

Unicode 5.0 was finally released yesterday (although it won't be published in book form until later this year), several months after its original anticipated date of release (see What's New in Unicode 5.0 for a summary of what's new). This is a small triumph for me as I am responsible for the introduction of one of the new scripts now covered by Unicode, the historic 'Phags-pa script that was used for writing Chinese, Mongolian and other languages during the 13th and 14th centuries (there is a worthwhile story here about the long and sometimes fraught passage from initial proposal to final encoding of the script, but it will have to wait for another day).

A new version of Unicode inevitably means the release of new versions of my flagship software products, BabelMap and BabelPad, and so I am pleased to announce that BabelMap version 5.0.0.1 is now available for download. Up until a few days ago a new Unicode 5.0 enabled version of BabelPad was also ready for release, but as usual I couldn't leave things well alone, and decided to add in just one more feature; and of course this feature required me to entirely disembowel the code, so that it is now in a wretched and lifeless state (as my friends in the programming fraternity know, I am a keen exponent of the art of eXtreme reFactoring) ... but hopefully BabelPad (with many great new features) will be released before the end of the month.

New BabelMap Features

My number one question from new BabelMap users is "Why is such-and-such a character displayed as a little square box ?" or "Why doesn't BabelMap support such-and-such a script ?" The reason for such questions is almost invariably that the characters they want to see are not available in the default font that BabelMap uses when it is first started (Tahoma), and they do not realise that they have to select an appropriate font to see a particular character. For me it is obvious that any given font only supports a particular subset of the Unicode repertoire (due to the 64K glyph limit for TrueType fonts, it is physically impossible for any font to cover the entire Unicode repertoire of 99,098 characters), and so you may need to select different fonts to display different characters; but for many people this is not at all evident. I have therefore changed BabelMap so that you can either select a single font to display all characters (good for seeing what a particular font covers) or use a user-defined virtual, composite font in which each Unicode block is mapped to a particular font on your system, with the result that different Unicode blocks will be rendered using different fonts (good if you are more interested in characters than fonts). By default BabelMap will use a composite font when run for the first time, so that most characters in the BMP should be displayed OK if you are running Vista, and hopefully I should get fewer questions from new users about little square boxes.
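To make the composite font idea concrete, here is a minimal sketch in Python of a per-block font lookup. It is not BabelMap's actual code: the block ranges are the real Unicode ones, but the font assigned to each block, the fallback font, and the names `BLOCK_FONTS` and `font_for` are assumptions chosen purely for illustration.

```python
from bisect import bisect_right

# (first code point, last code point, block name, font name).
# The block ranges come from the Unicode standard; the font choices are
# illustrative assumptions, not BabelMap's real default mappings.
BLOCK_FONTS = [
    (0x0000, 0x007F, "Basic Latin", "Tahoma"),
    (0x0400, 0x04FF, "Cyrillic", "Tahoma"),
    (0x4E00, 0x9FFF, "CJK Unified Ideographs", "SimSun"),
    (0xA840, 0xA87F, "Phags-pa", "Microsoft PhagsPa"),
]

# Sorted list of block start points, used for the binary search below.
BLOCK_STARTS = [first for first, _last, _name, _font in BLOCK_FONTS]

# Hypothetical font used for any code point outside the mapped blocks.
FALLBACK_FONT = "Tahoma"


def font_for(code_point):
    """Return the font mapped to the block containing code_point."""
    # Find the last block whose start is <= code_point.
    i = bisect_right(BLOCK_STARTS, code_point) - 1
    if i >= 0:
        first, last, _name, font = BLOCK_FONTS[i]
        if first <= code_point <= last:
            return font
    # The code point falls in a gap between mapped blocks.
    return FALLBACK_FONT


print(font_for(ord("A")))   # Basic Latin
print(font_for(0x4E2D))     # CJK Unified Ideographs
print(font_for(0xA840))     # Phags-pa
```

With a table like this, selecting a single font for everything is just the special case in which every block maps to the same font name.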

[Image: Composite Font Mappings Dialog]



Future Enhancements

My OpenType Analysis Tool is still half-finished, and with no time to work on it, it won't be available until sometime next year.

I am also planning to add in the ability to take a picture of a character as rendered using the selected font, which will be made available to the clipboard as a bitmap image ... useful if you want to display a character on a web page in situations where you doubt that the end user will have an appropriate font.

P.S. You may notice that I have done away with the arbitrary version numbering system that I previously employed (which never got beyond version 1 and never would have), and replaced it with a four-digit version number that is linked to the version of Unicode that the particular release of BabelMap/BabelPad supports. The first three digits of the version number now correspond to the Unicode version supported, and the last digit is the version of the BabelMap/BabelPad released for this version of Unicode. Thus, the new release of BabelMap is version 5.0.0.1, as it is the first release supporting Unicode 5.0.0.
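As a small illustration of the numbering scheme, the following Python sketch splits a version string into the Unicode version it supports and the release number for that Unicode version. The names `BabelVersion` and `parse_version` are hypothetical and are not part of BabelMap or BabelPad.

```python
from typing import NamedTuple


class BabelVersion(NamedTuple):
    unicode_version: str  # first three parts, e.g. "5.0.0"
    release: int          # last part, e.g. 1


def parse_version(version):
    """Split a four-part version such as '5.0.0.1' into its two halves."""
    parts = version.split(".")
    if len(parts) != 4 or not all(p.isdigit() for p in parts):
        raise ValueError("expected four dot-separated numbers: %r" % version)
    return BabelVersion(".".join(parts[:3]), int(parts[3]))


v = parse_version("5.0.0.1")
print(v.unicode_version, v.release)
```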



Addendum [2006-07-20]

Following hot on the heels of the announcement of the release of the Unicode 5.0 character database (but not the publication of the actual Unicode 5.0 standard) on 2006-07-14 comes the notice of publication of the corresponding ISO/IEC 10646: 2003 Amendment 2, two weeks earlier (on 2006-07-01).

It's a bit of a chicken and egg relationship between Unicode and ISO/IEC 10646, further confused by the fact that although ISO/IEC 10646 Amd.2 was published before Unicode 5.0, Unicode 5.0 includes four characters (U+097B, U+097C, U+097E and U+097F) from ISO/IEC 10646 Amd.3, which won't be published until next year ... along with Unicode 5.1. And by that time we'll be well into the work of Amd.4 (corresponding to Unicode 5.2 or 6.0), which should finally include Egyptian Hieroglyphs (or at least the Gardiner subset).


6 comments:

John Cowan said...

Excellent! The 5.0 version is more usable than the 4.1 version because of the composite font capability. However, could you make a few teeny enhancements when you have a chance?

1. In the Composite Font Mappings dialogue box, it would be much easier to pick a suitable font if the fonts were sorted in decreasing order of coverage of the block rather than alphabetical order by name. When using a composite font, after all, you want to see as many glyphs as you can. Then you can change to a font with less coverage and more legibility if the high-coverage font is just too horrible.

2. For the same general reasons, I think that there should be a button that has the effect of choosing the highest-coverage font available in each block across all blocks, thus producing the highest-coverage composite font currently possible at a single stroke.

Thanks for all your work.

Anonymous said...

Hi,

There is a problem I encountered in BabelMap with WinXP:

I could not start BabelMap because of some unhandled exception, but this could be solved by deleting the BabelMap keys in the registry.

Selecting "List All Styles of Fonts" does not work any longer (the program crashes). I suspect this is because I have fonts with styles like "light" installed on my machine.

This option worked with older versions without any problems, and I guess this is why BabelMap crashed on startup: I had this style option on.

Andrew West said...

John,

Thanks for the kind comments.

I'm not entirely happy with the font configuration dialog as it is myself; in particular having two font selection mechanisms (dropdown of all fonts on the left and list of fonts for a given block on the right) seems wrong to me, but removing the full list is problematic, so I've left it in for the time being. Feedback such as yours is helpful in deciding where improvements can be made.

"In the Composite Font Mappings dialogue box, it would be much easier to pick a suitable font if the fonts were sorted in decreasing order of coverage of the block rather than alphabetical order by name."

Probably changing the font list control to use columns with sortable headings ("Font Name", "Characters") like the list of blocks on the left would be a good idea. I agree that having the initial sort order by character coverage would be sensible, as this would make the most likely candidate the default selection.

"I think that there should be a button that has the effect of choosing the highest-coverage font available in each block across all blocks, thus producing the highest-coverage composite font currently possible at a single stroke."

Yes, I thought about having something like that, but unfortunately it probably would not give very satisfactory results for most users because for many blocks (particularly the major BMP blocks such as Basic Latin, etc.) there are very many fonts that have the maximum font coverage. Trying to determine a good algorithm for the default mappings on first use was a major headache, and not something that I have resolved satisfactorily -- my basic strategy was to use Vista fonts if available. The initial default mapping is by font name alone, as it can take several seconds to analyse all the fonts on the system, and nobody wants to hang around for several seconds when you run the app. However, once you're in the font configuration dialog all the fonts have been analysed, and I guess I could have a button that would ask the app to automatically determine the "best" mappings based on a combination of character coverage, minimising the total number of fonts used (e.g. best to have all Latin blocks use the same font, and all CJK blocks use the same font if possible) and using recognised fonts where the choices are otherwise equal.
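For the sake of discussion, a heuristic along those lines might be sketched in Python as below. This is only one possible reading of the three criteria, not anything that exists in BabelMap: the function name `best_mappings`, the shape of the `coverage` table, and the greedy one-block-at-a-time strategy are all assumptions.

```python
def best_mappings(coverage, recognised=frozenset()):
    """Pick one font per block.

    coverage   -- {block name: {font name: number of characters covered}}
    recognised -- set of font names to prefer when all else is equal

    Blocks are handled greedily in the order given, so the result can
    depend on that order; this is a sketch, not an optimal solver.
    """
    chosen = {}
    used = {}  # font name -> number of blocks already mapped to it
    for block, fonts in coverage.items():
        if not fonts:
            continue  # no installed font covers this block at all
        ranked = sorted(
            fonts,
            key=lambda f: (
                -fonts[f],            # 1. most characters covered
                -used.get(f, 0),      # 2. reuse fonts already picked
                f not in recognised,  # 3. recognised fonts first
                f,                    # 4. name, for a stable result
            ),
        )
        best = ranked[0]
        chosen[block] = best
        used[best] = used.get(best, 0) + 1
    return chosen


# Made-up coverage figures and font names, for illustration only.
example = {
    "Basic Latin": {"FontA": 95, "FontB": 95, "FontC": 90},
    "Latin-1 Supplement": {"FontB": 96, "FontC": 96},
    "Cyrillic": {"FontA": 200, "FontC": 256},
}
print(best_mappings(example, recognised={"FontB"}))
```

Here the tie between FontA and FontB for Basic Latin is broken in favour of the recognised font, and the tie for Latin-1 Supplement is then broken in favour of the font already in use, which keeps the Latin blocks together.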

Sorry about the verbose reply, but it has given me some food for thought, and I think that I will incorporate your suggestions in the next release.

Andrew West said...

anonymous,

"I could not start BabelMap because of some unhandled exception, but this could be solved by deleting the BabelMap keys in the registry."

I'm sorry about the problems you're having. I have done quite extensive testing, and the age-old programmers' plaint "but it works on my machine!" is probably applicable in this case. If you could send me the original registry entries (which I assume have now been deleted and so you can't) I would probably be able to ascertain what the problem was.

"Selecting "List All Styles of Fonts" does not work any longer (the program crashes). I suspect this is because I have fonts with styles like "light" installed on my machine."

This should work, and does work on my system (XP SP2), where I also have some "light" styles of fonts (the particular style should not matter). If deleting the BabelMap registry keys [HKEY_CURRENT_USER\Software\BabelStone\BabelMap] does not clear the problem, then it must be something weird about one of the fonts on your system. I probably need more information to solve this, but I will check over the code to see if there is anything obviously wrong.

Anonymous said...

Hi,

the problem I reported was apparently some caching bug in Windows. I removed font after font to see if I have some corrupted file on my machine. BabelMap worked perfectly when I removed Arial Unicode MS, but even after re-adding it to the font folder it seems to work. I suspect that the font was not properly installed, not sure why it worked before then.

Sry for the false alarm! :)

Andrew West said...

Glad to hear that your problem is gone now. I too am mystified as to why uninstalling and reinstalling Arial Unicode MS should make things work. If you do have any further problems please feel free to email me.