Sunday, 24 October 2010

BabelPad and BabelMap Version

New versions of BabelPad and BabelMap that support Unicode 6.0 have been released today, and can be downloaded directly:

  • (simply unzip the file BabelPad.exe and run it from wherever you like)
  • (simply unzip the file BabelMap.exe and run it from wherever you like)

Creative Commons License
This screenshot of BabelMap is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License (CC-BY-SA-3.0) by Andrew West.

Important Technical Note

BabelPad and BabelMap were scheduled for release on 11 October, to coincide with the release of Unicode 6.0 on that day, but their release was delayed due to a Blue Screen of Death crash that occurred with the beta versions of both BabelMap and BabelPad when the Windows function ExtTextOutW is called within a path bracket, and the selected font is Symbola font version 6.00, and the ETO_GLYPH_INDEX flag is set, and the glyph index passed to the function corresponds to U+1F5FD STATUE OF LIBERTY (this problem only occurs in BabelPad when in Simple Rendering mode, which bypasses Microsoft's Uniscribe rendering engine). The glyph for U+1F5FD in the Symbola font has a mega-complex glyph outline (which oddly enough is the glyph for an angel, whilst the glyph for the Statue of Liberty is actually at U+FFFED), which probably results in a buffer overrun somewhere within Windows GDI. In order to work around this problem I have had to rewrite, refactor and retest core sections of the source code.

The newly released versions of BabelPad and BabelMap fix the problem described above, and should be safe for use with the Symbola font under normal usage scenarios, but the glyphs for U+1F5FB (MOUNT FUJI) through U+1F5FF (MOYAI) are rendered very slowly because of their extreme complexity (several thousand points for each glyph), resulting in sluggish response in BabelMap when scrolling through the Miscellaneous Symbols And Pictographs block, and potentially extremely sluggish performance in BabelPad.

Moreover, the Symbola font may still cause a Blue Screen of Death crash (reporting an infinite loop) on some systems when rendering U+1F5FD STATUE OF LIBERTY at high point sizes with standard Windows applications such as Notepad (my test case is to set Notepad to use the Symbola font at 72 points, and then paste in a string comprising twelve instances of U+1F5FD — my XP machine then blue screens, although my Vista machine is OK). This general Windows-level vulnerability to Symbola version 6.00 means that BabelMap may still blue screen if you insert multiple instances of U+1F5FD into the BabelMap edit buffer, and BabelPad may still blue screen if you attempt to display a document with multiple instances of U+1F5FD at a large point size in Complex Rendering mode (i.e. using Uniscribe). For this reason, you are strongly advised not to install Symbola version 6.00, but if you do install this font I cannot be responsible for any loss or damage incurred due to a system crash when running either BabelPad or BabelMap.

BabelPad Enhancements

  • BabelPad now emulates the Alt-X functionality found in Microsoft Word and WordPad (position the caret after a hexadecimal code point value and hit Alt-X to convert it to the corresponding Unicode character; and position the caret after a Unicode character and hit Alt-X to convert it to its corresponding hexadecimal code point value)
  • Convert Unicode character names to their corresponding Unicode character (due to the difficulty of disambiguating strings such as "bell symbol for bell with cancellation stroke" where "bell", "bell symbol", "symbol for bell" and "bell with cancellation stroke" are all Unicode character names, the selected text must be an exact Unicode name or formal alias, and not a partial name or a longer text string containing a Unicode name; although you can use the contextual convert utility to convert structured data such as <UnicodeName>Vulgar Fraction Three Quarters</UnicodeName> to <UnicodeName>¾</UnicodeName>) ["Convert : Unicode Name to Character" from the main menu or the right-click menu]
  • Import Shift-JIS encoded documents with emoji extensions defined by DoCoMo, KDDI or SoftBank [select "Shift-JIS plus DoCoMo/KDDI/SoftBank emoji" from the "Encoding" dropdown list of the "Open File" dialog]
  • Convert Han ideographs to their pinyin or jyutping readings (not perfect as characters with multiple readings are converted to a slash-separated list of readings, even when one reading is considerably more common than another, but this feature may be useful for some users in some situations)
  • Title casing options for either Script Neutral title casing (e.g. The Owl And The Pussy-Cat Went To Sea) or English title casing (e.g. The Owl and the Pussy-Cat Went to Sea) ["Options : Title Casing" from the menu]
  • The default script colours when colour coding by script is selected ["Options : Display Colours : Colour Code by Script" from the menu] have been harmonized with the default script colours used for BabelMap, and an option to reset all script colours to their default values has been added to the "Configure Script Colours" dialog (this needs to be selected for the new default colours to be used).
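
As a rough illustration of the Alt-X and name-to-character conversions described in the list above, here is a minimal Python sketch. It is not BabelPad's implementation: the function names alt_x and name_to_char are hypothetical, the caret is assumed to sit at the end of the text, and the rule for spotting a trailing hexadecimal value (two to six hex digits with an optional "U+" prefix) is an assumption made for the example.

```python
import re
import unicodedata

# Assumed rule: a trailing run of 2-6 hex digits, optionally prefixed by "U+".
_HEX_AT_END = re.compile(r"(?:[Uu]\+)?([0-9A-Fa-f]{2,6})\Z")


def alt_x(text: str) -> str:
    """Toggle the end of `text` between a hex code point and its character."""
    match = _HEX_AT_END.search(text)
    if match and int(match.group(1), 16) <= 0x10FFFF:
        # Hex value before the caret: replace it with the character.
        return text[:match.start()] + chr(int(match.group(1), 16))
    if text:
        # Otherwise replace the last character with its hex code point.
        return text[:-1] + format(ord(text[-1]), "04X")
    return text


def name_to_char(name: str):
    """Return the character for an exact Unicode name or alias, else None."""
    try:
        return unicodedata.lookup(name.strip().upper())
    except KeyError:
        # Partial names and longer strings containing a name are rejected.
        return None


print(alt_x("00BE"))                                   # the character U+00BE
print(alt_x("\u00be"))                                 # 00BE
print(name_to_char("Vulgar Fraction Three Quarters"))  # the character U+00BE
```

Requiring an exact name, as name_to_char does, sidesteps the ambiguity mentioned above: a longer string that merely contains one or more Unicode names is rejected rather than guessed at.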

BabelMap Enhancements

  • Script colours when colour coding by script has been selected are now user configurable ["Options : Customize Colours..." from the menu]
  • When colour coding of characters has been selected, the character with focus is no longer highlighted in red
  • The character with focus in the character grid is now indicated by its cell having an inset appearance
  • Option to rotate or not rotate the glyphs for vertical scripts (Mongolian and Phags-pa) where the selected font has rotated glyphs for vertical layout (in previous versions of BabelMap the glyphs are always rotated) ["Options : Other Options : Rotate Vertical Scripts" from the menu]
  • The Export Font Glyphs utility has been improved to ensure glyphs are not accidentally clipped in some cases
  • The Han Radical Lookup utility has been updated to cover CJK-D (now covers all 74,616 CJK unified ideographs)
  • The Advanced Character Search utility now has an option to only give the total number of characters matching the selected criteria, and not list them all (this makes searches which return a large number of results, for example when querying how many characters were introduced in a particular version of Unicode, very fast)


mfarah said...

Once again, thank you very much for your efforts. BabelMap in particular has been a great tool for my needs.

Ronald Kyrmse said...

BabelMap is - allow me to say this - indispensable and unsurpassed! Many thanks.

Steve Hollasch said...

For some reason BabelMap is unable to launch help ("Failed to launch help."). Is there another file that needs to reside side-by-side with babelmap.exe?

Andrew West said...

There used to be a help file, but I am afraid that it is out of date and no longer available; and I do not currently have the time to create a new one. Hopefully you will find that BabelMap is intuitive to use, but if you do have any questions please do not hesitate to ask.

Drabkikker said...

A very useful tool, great!
Okay, total noob question here: How do I get the 'empty' character sets (the ones that are blank but which have values assigned to them; e.g. Egyptian hieroglyphs) to display? Do I need to install fonts for that, or maybe a keyboard? Your advice would be much appreciated.

Andrew West said...

Unicode only assigns characters, but does not provide the fonts. Therefore, in order to be able to see newly assigned scripts and characters you need to install the appropriate fonts. The best place for finding specialist Unicode fonts is Alan Wood's Unicode resources site, but for recently assigned characters or scripts there may not yet be any fonts available.

Drabkikker said...

@Andrew: Thanks! Yup, that's what I thought. That link looks very promising, I'll go and have a look.

3155ffGd said...

The Numeric Value of U+5146 is given as "-727379968". I take it that's not intended?

Andrew West said...

Thanks for finding that -- it is of course incorrect, and should be "1,000,000,000,000". I am not working on BabelMap/BabelPad at present, but will fix it in the next release later this year.
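
The reported value is exactly what 1,000,000,000,000 becomes when it is truncated to a 32-bit signed integer, so a 32-bit overflow looks like a plausible cause. That BabelMap holds numeric values in a 32-bit integer is an assumption; the short Python check below (with a hypothetical helper wrap_int32) only confirms the arithmetic.

```python
def wrap_int32(value: int) -> int:
    """Reduce an integer to the range of a 32-bit signed integer."""
    value &= 0xFFFFFFFF                 # keep the low 32 bits
    if value >= 0x80000000:             # top bit set means negative
        value -= 0x100000000
    return value


print(wrap_int32(1_000_000_000_000))    # -727379968
```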

Jono said...

I was wondering if there is any way to pass clipboard text to Babel map's font coverage utility as a command line, so that one could select text in a document and check what fonts would have the coverage.

Or if you could just add an option to read the clipboard content to the util when Babelmap is started, as a default action.

Final request. Could Babel Map indicate font coverage for default installs of XP,Vista,W7.

Andrew West said...

I was wondering if there is any way to pass clipboard text to Babel map's font coverage utility as a command line, so that one could select text in a document and check what fonts would have the coverage.

You could paste the text into the edit buffer, and check the "all characters in the edit buffer" radio button in the Font Coverage utility. Alternatively, you could open the document in BabelPad, and run BabelPad's font coverage utility ("Font Coverage..." from the "Tools" menu).

Or if you could just add an option to read the clipboard content to the util when Babelmap is started, as a default action.

Could do, but not sure it would be very useful as the user can quite easily manually paste the contents of the clipboard into BabelMap's edit buffer.

Could Babel Map indicate font coverage for default installs of XP,Vista,W7.

I don't think so. Listing coverage for fonts that are not installed on the user's system (or which conflicts with the actual coverage on the user's system) would be very confusing for most users.

fdwr said...

Praise: Great to see emoji and the latest scripts. The char props, font analysis tool, and advanced search are all invaluable. TY
Request: Please modify F2's current behavior to set focus to the Go To Code Point edit prompt, rather than just jumping back to the earlier code point, as I'm very often needing to jump around to new code points by value.

Andrew West said...

Thanks for the praise. I'll consider your F2 request ... it just depends upon whether there are other users who like the current behaviour or not.

Rock said...

An indispensable tool for someone like me who needs to access the new medieval Latin fonts that are included in the Unicode 6.0 spec.

May I make one suggestion for improvement?

Remembering the Unicode values or name for any particular character can be tiresome. It would be great if there was a Favourites toolbar under the main menu, where you can drag the special characters you use the most, and a button is created there. That way, and if the Edit Buffer at the bottom was expanded a little, you would have a complete Font Viewer/Character Map/Editor all in one package. I could type in Latin, and insert any of those unusual characters by clicking a button on my Favourites toolbar, with its symbol on the button.

Keep up the good work

Rock said...

Having now checked out your BabelPad, my previous comment for a Favourites toolbar would be better directed to BabelPad than BabelMap.

Your Manchu and Tibetan buttons are just what I had in mind for my most frequently used Latin characters/marks.

I wish, I wish, I wish.....

Andrew West said...

BabelMap has a "Bookmark" feature which allows you to bookmark up to 32 characters (highlight the required character and press the Insert key), which is not quite what you want, but does allow you to quickly enter a favourite character -- select the required character from the Bookmark menu and then simply hit the Return key to enter it into the edit buffer.
For BabelPad, the Insert menu does already list quite a few frequently used format and punctuation characters, but I agree that a user-definable list of favourite characters would be a very useful feature for BabelPad, and I will add it in the next version (later this year, probably October).

Rock said...

Although not directly connected to BabelMap:

I downloaded a TrueType font, Andron Scriptor Web, which has the Latin Extended-D character set, as defined in the latest Unicode v. 6.0 spec, with the Latin symbols I am interested in starting at U+A750. When I load this font into BabelMap and scroll down to A750 all the characters are there.

However, when I open the Windows XP Character Map, and load the same font, and select Unicode for the character set, I find that Character Map shows characters up to U+2767, then it jumps to U+E004, thus jumping right over the Latin Extended-D characters.

Do you have any thoughts on why this might be happening?

Andrew West said...

The simple answer is that character map is (to put it politely) not very good, and has not been updated to reflect new versions of Unicode since it was created. The version that ships with XP is stuck in the world of Unicode 3.0 from 1999, and will only display characters that were defined in Unicode 3.0 (i.e. only 49,259 out of the current total of 109,449 characters). The versions of character map that ship with Vista and 7 are not much better.

Rock said...

Thanks for that Andrew.

I got into investigating the MS Character Map as I was putting together a Visual Basic 2005 app for a simple Latin Text Editor, and I wanted buttons across the top to provide quick access to the special Latin symbols, but they just would not display (with the Andron font selected, and a code such as U+A702). With help from the VB forum I discovered that if Character Map could not see a character, then neither could the RichTextBox control on my form. All characters that Character Map could see could be displayed with no problem. So I am struggling to see the connection between the MS Character Map and the Unicode version it uses, and why a RichTextBox control behaves in the same way.

Rock said...

Update. I have now discovered that if I use a TextBox or a Label, rather than a RichTextBox, then all Unicode characters are displayed, regardless of whether Character Map sees them or not. WordPad uses a RichTextBox, and Notepad uses a TextBox. This explains why characters copied from BabelMap paste fine into Notepad, but can produce unexpected results when pasted into WordPad. Hope this helps someone.

NoviceNotes.Net said...

Most Honourable and Noble, Mister West, please note my enduring affinity for your unparalleled work; diligence in propagation, and perfection of the thing called "Unicode", and your exemplary benevolence in continued development of priceless, powerful, and rather prerequisite software (yes, i was going for alliteration... doh!) in BabelStone software.

I thought of a recommendation. Typically, as a User, I must-have BabelMap in my "QuickLaunch" items. It occurred to me, momentarily, how convenient to launch BabelPad from a [user conf'd] button, on/off displayed maybe, toolbar-wise, in BabelMap. Yes, I realize BabelPad has something of a babelMap, built-in (in fact, my original, only knowledge of such a map, was of BabelPad's). Rather than I ramble so much, more, I conclude: you dig?

Love you, man, as much as platonic love might be, 'tween two unbeknown-to-each-other individ's, such as ourselves. hehe... :wink:
Rock on! hugs n hugs. n air-kisses, one-cheek, two-cheek (all fancy-like, you know).

KevinCarmody said...

Thank you for BabelMap - a great utility. One problem - UCN outputs the code point in hex but it should be decimal. E.g. U+2070 SUPERSCRIPT ZERO in UCN should be \u8304, not \u2070.

KevinCarmody said...

Please ignore my previous post. BabelMap's UCN is the C/C++ UCN, which takes 4 hex digits. I was trying to use it for RTF \u codes, which look the same but take a decimal sequence. NCR decimal is close enough for RTF \u codes.
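
To make the difference between the three notations concrete, here is a minimal Python sketch. The function names are hypothetical, and the RTF form follows my understanding of the \uN control word (a signed 16-bit decimal value per UTF-16 code unit, followed by a fallback character), so treat that part as an assumption rather than a complete RTF writer.

```python
import struct


def ucn(ch: str) -> str:
    """C/C++ universal character name: \\u + 4 hex digits, or \\U + 8."""
    cp = ord(ch)
    return f"\\u{cp:04X}" if cp <= 0xFFFF else f"\\U{cp:08X}"


def ncr_decimal(ch: str) -> str:
    """HTML/XML decimal numeric character reference."""
    return f"&#{ord(ch)};"


def rtf_u(ch: str, fallback: str = "?") -> str:
    """RTF \\uN escape(s): one signed 16-bit decimal per UTF-16 code unit."""
    data = ch.encode("utf-16-le")
    units = struct.unpack(f"<{len(data) // 2}h", data)  # 'h' = signed 16-bit
    return "".join(f"\\u{unit}{fallback}" for unit in units)


zero = "\u2070"            # U+2070 SUPERSCRIPT ZERO
print(ucn(zero))           # \u2070
print(ncr_decimal(zero))   # &#8304;
print(rtf_u(zero))         # \u8304?
```

The decimal number in the NCR and RTF forms is the same for U+2070, which is why the NCR decimal output can be adapted for RTF by hand.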

izhnannyk said...

BabelMap is absolutely indispensable for anyone who is not satisfied to stay within the narrow confines of ASCII/ANSI. But now I have a problem. In the past I have always been able to put any glyph from any Unicode font installed on my computer into the Edit Buffer, copy it, and paste it into a Word doc. This morning, however, when I paste I get only a box or whatever empty symbol goes with the font in question. What am I doing wrong?

Andrew West said...

I'm sorry, I have no idea what has caused this problem. Perhaps Word has been updated, and now behaves differently.

Fanolian said...

Thanks for BabelMap and BabelPad. I find a little glitch about ⁒ (U+2052 COMMERCIAL MINUS SIGN) in BabelMap with "Character Name Display and Search" set to "Unicode Name plus Aliases".
I think the word in question should be abzüglich (German) instead of abzlich (German), which the glitch is using U+E161 in, I guess, 細明體_HKSCS.
Here is a screenshot of the glitch:

Andrew West said...

Hmm, it displays OK on my system, so BabelMap is not setting the wrong text. It appears your system is reinterpreting the code page and reading it as if it were Chinese. What Operating System are you using, and what are your locale settings?

Fanolian said...

I use Win7 64bit SP1 Ultimate English version with all updates. Here are the settings in Region and Language:
Formats: Chinese (Traditional, Hong Kong S.A.R.)
Location: Hong Kong S.A.R.
Display language: English (I don't have any other languages installed from Windows Update)
Current language for non-Unicode programs: Chinese (Traditional, Hong Kong S.A.R.)

I find that if I set the Formats to English (United States) or some other locales, the problem is gone.
However, setting it to any of the following produces different glitches on U+2052 (restart BabelMap after each change of locale):
Chinese (Simplified, P.R.C): abz黦lich (U+9EE6)
Japanese (Japan): abz・lich (U+30FB)
Korean (Korea): abz?lich (U+003F)
Russian (Russia): abzьglich (U+044C)
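
The substitutions listed above are consistent with the byte 0xFC (ü in Windows-1252) being reinterpreted in each locale's ANSI code page. A minimal Python sketch of that effect follows, assuming the alias text is held as Windows-1252 bytes and decoded with the system code page; how BabelMap actually stores the string is not stated here, and the helper name reinterpret is hypothetical. Python's codecs also handle invalid byte sequences differently from Windows, so only the Western and Russian cases are shown with their results.

```python
ALIAS = "abz\u00fcglich"   # "abzüglich", the intended German alias text


def reinterpret(text: str, codec: str) -> str:
    """Encode as Windows-1252, then decode as if in another code page."""
    raw = text.encode("cp1252")                 # ü becomes the byte 0xFC
    return raw.decode(codec, errors="replace")  # never raises on bad bytes


print(reinterpret(ALIAS, "cp1252"))   # abzüglich (round trip, no damage)
print(reinterpret(ALIAS, "cp1251"))   # abzьglich (0xFC is U+044C in cp1251)
print(reinterpret(ALIAS, "gbk"))      # 0xFC acts as a lead byte and absorbs "g"
```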

Andrew West said...

Thanks for the additional information. I have been able to reproduce the problem by setting the format to Chinese (Simplified, P.R.C). I will investigate, but as I am busy on other projects at present I will not be able to do so for a few weeks.

MaxStirner said...

Could you please include kind of a simple PDF exporter in the future releases? At this time it is very hard to transmit Unicode text to another destination because the Unicode font and BabelPad must also be present on every system to read the text

Andrew West said...

No, I'm afraid that PDF export functionality is not something that I intend to add to BabelPad. There are plenty of "Print to PDF" tools available that you can already use with BabelPad (e.g. CutePDF).

Sevendy said...

When I use the Windows (7 64-bit) utility Character Map, copying and pasting characters into a Windows document brings along the font info for each character. BabelMap doesn't appear to provide any font info at all; even the Edit Buffer doesn't seem to know what font is associated with each character. Am I doing something wrong? If not, what use is the Composite Font feature? It only displays in the last selected Single Font.

Sevendy said...

(I suspect that this may be related to the problem "izhnannyk" saw back in September.)

Andrew West said...

When I use the Windows (7 64-bit) utility Character Map, copying and pasting characters into a Windows document brings along the font info for each character. BabelMap doesn't appear to provide any font info at all;

That is correct; when you copy from BabelMap it only does a plain text copy. I will investigate the possibility of implementing a rich text copy like Character Map.

the Edit Buffer doesn't seem to know what font is associated with each character.

That is correct. The edit buffer is an ordinary Windows edit control, which only allows for a single font. I hope to implement a rich edit control which will use multiple fonts in a future version of BabelMap.

Am I doing something wrong?


If not, what use is the Composite Font feature?

It allows the character grid to display different Unicode blocks using different fonts.

I suspect that this may be related to the problem "izhnannyk" saw back in September.

No, that cannot be the case, as BabelMap has always had this behaviour, and izhnannyk was reporting a change in behaviour.

Andrew West said...

The latest version of BabelMap now supports rendering of multi-script text in the edit buffer using the user-defined composite font, as well as RTF copy.

kemihiiri said...

‘latest version of BabelMap’ is a dead link :(