Saturday, March 14, 2009

Double resourcing

This winter I have been implementing localization support for various mobile platforms such as Android, Silverlight, Qt and Symbian. On all of these platforms I have run into a feature that I don’t like and that is bad for localization. As far as I know there is no name for this “feature”, so I use a name I invented: double resourcing. What does double resourcing mean? In short, it is the situation where you do not put the actual strings into your user interface (UI) resource files; instead you add all strings to one or more flat string resource files and refer to those strings from your UI resource files. Let me show a simple example. Here is a UI resource for a button:

[button id="browseButton" caption="&Browse..." width="100" height="30" color="red" /]

The line above does not represent any actual resource file format; it is used purely for demonstration. The caption attribute contains the button's string. In most cases this is all you localize, although sometimes you also have to localize the width and/or the color. The good thing about this format is that all the data for the button is in a single file, and even in a single XML element. So the context of the &Browse... string is the caption of browseButton. The button itself also has a context, most likely a form or a dialog. If you use double resourcing, you no longer write the string value in the caption attribute; instead you have to maintain two files. First, a flat string file:

browse_button_caption &Browse...

Second, the UI resource file becomes:

[button id="browseButton" caption="link::strings.txt::browse_button_caption" width="100" height="30" color="red" /]

The caption attribute now contains a reference to the actual string resource. In this case the string value is looked up in the strings.txt file under the browse_button_caption id.
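To make the extra indirection concrete, here is a minimal sketch in Python of what a platform has to do at runtime under double resourcing. The link syntax and the "key value" layout of the string file are the made-up ones from my sample, and the function names (load_strings, resolve) are hypothetical, not taken from any real platform.

```python
# Hypothetical flat string file content: one "key value" pair per line.
STRINGS_TXT = "browse_button_caption &Browse...\n"


def load_strings(text):
    """Parse 'key value' lines into a dictionary."""
    table = {}
    for line in text.splitlines():
        if line.strip():
            key, _, value = line.partition(" ")
            table[key] = value
    return table


def resolve(attribute, files):
    """Return the literal string for an attribute value.

    Values of the form link::<file>::<key> are looked up in the named
    flat string file; anything else is treated as a literal string.
    """
    prefix = "link::"
    if not attribute.startswith(prefix):
        return attribute
    file_name, _, key = attribute[len(prefix):].partition("::")
    return files[file_name][key]


files = {"strings.txt": load_strings(STRINGS_TXT)}
caption = resolve("link::strings.txt::browse_button_caption", files)
print(caption)  # prints "&Browse..."
```

Note that the string table returned by load_strings knows nothing about the button; only the UI file knows where the string is used.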

Why is double resourcing a bad idea? First of all, it makes creating and maintaining resource files a lot harder. You no longer put the string into the resource file; you add it to a flat string file and put a link to it, or its id, into the resource file. When you move strings from the original UI resource to a flat string file, you lose the context of each string. Context is a very important part of the localization process. Without it the translator may not be able to work out the full meaning of the string, and this may lead to bad translations. Also, because you do not know which element the string belongs to, it is hard to check whether the translation fits within the size of that element.
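The difference in available context can be sketched in Python as follows. This is only an illustration under my own assumptions: the button sample is written as real XML (angle brackets, with & escaped as &amp;), it is wrapped in a hypothetical form called openDialog, and the shape of the extracted record is invented for this example.

```python
import xml.etree.ElementTree as ET

# The button sample as real XML, inside a hypothetical parent form.
UI_XML = (
    '<form id="openDialog">'
    '<button id="browseButton" caption="&amp;Browse..." '
    'width="100" height="30" color="red" />'
    "</form>"
)


def extract_units(xml_text):
    """Collect every caption together with the context a translator needs."""
    root = ET.fromstring(xml_text)
    units = []
    for element in root.iter():
        caption = element.get("caption")
        if caption is None:
            continue  # elements without a caption carry no string
        units.append({
            "text": caption,
            "context": f"{root.get('id')}.{element.get('id')}.caption",
            "element": element.tag,
            "max_width": int(element.get("width")),
        })
    return units


units = extract_units(UI_XML)
print(units[0]["context"])    # prints "openDialog.browseButton.caption"
print(units[0]["max_width"])  # prints 100
```

From the flat string file the same extraction could return only the id and the text. The element type, the parent form and the available width are simply not there.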

I am not happy to see that many platforms, even brand new ones, use double resourcing. Symbian has always used it. That is no wonder because, frankly speaking, Symbian is a nightmare for developers. How could its resource format be any better? There are more platforms using double resourcing. Even some WPF samples recommend storing strings in flat .resx files and placing only links in the .xaml files. This is odd because XAML is a very good resource format, and a single XAML file contains all the elements of a page or window. You only need to translate the .xaml file. The reason for using .resx with WPF might be that there were no good localization tools supporting XAML at the time. Resx is a well-known format that is familiar to most developers and translators, even though very few translators can edit a .resx file in a text editor without breaking it. However, this is not a good reason. Using only XAML files makes creating and maintaining WPF projects much easier. Nowadays there are good localization tools that can localize XAML files. Microsoft should remove the sample files that use double resourcing as soon as possible and instruct users to localize XAML files instead.

Recently I have been implementing localization support for two very promising platforms: Qt and Android. Guess what? Both use double resourcing! I cannot believe this. Both Qt and Android have a good UI resource format (ui and layout files). The vendors should encourage users to localize these files, but it seems that platform developers do not take localization seriously. Somehow they believe that providing a flat string resource file is all they need to do for localization. Hey, this is wrong. Only the UI files contain strings with full context, and this gives translators all the information they need to create high-quality translations. A UI file format is indeed more difficult than a string file format, but developers and translators should leave the reading and writing of resource files to localization tools. These tools can cope with even complex XML or binary resource files without breaking them.

As far as I can see, platform vendors should concentrate on the following:
  • Implement a structured UI resource file that contains all UI information such as strings, layouts, colors, etc. XML is a good format for this.
  • Implement a mechanism that allows localized UI files.
  • Implement a feature that lets the developer choose which language resource to use.
Microsoft has done this. The only problem is that some of its sample files do not encourage UI file localization. Android is in almost the same situation, except that all of its samples use double resourcing. Qt has not implemented a runtime UI resource format at all. Symbian is hopeless. It does not use XML in its resource file format, it has several different incompatible resource formats, and most of its samples use double resourcing. The platform has so many problems that even with the new Symbian Foundation I don’t expect it to survive more than ten years.
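The last two items in the list above (localized UI files, and a way to choose the language) could work roughly like the Python sketch below. The file naming scheme (main.fi_FI.ui, main.fi.ui, main.ui) and both function names are my own assumptions for illustration; no platform mentioned here is claimed to work exactly this way.

```python
def candidate_files(base_name, locale):
    """List localized file names from most to least specific.

    Assumes base_name has an extension, for example "main.ui".
    """
    stem, dot, ext = base_name.rpartition(".")
    candidates = []
    parts = locale.split("_")
    while parts:
        candidates.append(f"{stem}.{'_'.join(parts)}{dot}{ext}")
        parts.pop()  # drop the country, then the language
    candidates.append(base_name)  # the original, unlocalized UI file
    return candidates


def pick_ui_file(base_name, locale, available):
    """Return the first candidate that actually exists."""
    for name in candidate_files(base_name, locale):
        if name in available:
            return name
    raise FileNotFoundError(base_name)


available = {"main.ui", "main.fi.ui"}
print(candidate_files("main.ui", "fi_FI"))
# prints ['main.fi_FI.ui', 'main.fi.ui', 'main.ui']
print(pick_ui_file("main.ui", "fi_FI", available))  # prints "main.fi.ui"
```

The fallback to the original file means an application still starts even when a translation is missing.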

Localization tool vendors should make sure that their tool can:
  • Read the original UI files.
  • Show the items of the UI files visually.
  • Create localized UI files.
  • Compile the localized resource files to the runtime format, which is most often a binary format (baml, mo, rss).
Of course, every platform needs a flat string resource file, but it should be used only for string resources such as error messages and dynamic strings.
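The reading and writing steps from the tool list above can be sketched in a few lines of Python. Again this is only an illustration under stated assumptions: the button sample is written as real XML inside a hypothetical openDialog form, and the translation table keyed by (element id, attribute) is a structure I made up for this example. The visual editor and the compilation to a binary format are left out.

```python
import xml.etree.ElementTree as ET

ORIGINAL_UI = (
    '<form id="openDialog">'
    '<button id="browseButton" caption="&amp;Browse..." '
    'width="100" height="30" color="red" />'
    "</form>"
)

# Hypothetical translator output: (element id, attribute) -> localized value.
# The width is localized too because the Finnish caption needs more room.
FINNISH = {
    ("browseButton", "caption"): "&Selaa...",
    ("browseButton", "width"): "120",
}


def localize(xml_text, translations):
    """Return a localized copy of the UI file as an XML string."""
    root = ET.fromstring(xml_text)
    for element in root.iter():
        for attribute in list(element.attrib):
            key = (element.get("id"), attribute)
            if key in translations:
                element.set(attribute, translations[key])
    return ET.tostring(root, encoding="unicode")


localized = localize(ORIGINAL_UI, FINNISH)
button = ET.fromstring(localized).find("button")
print(button.get("caption"))  # prints "&Selaa..."
print(button.get("width"))    # prints "120"
```

Neither the developer nor the translator has to touch the XML by hand; everything that is not translated, such as the height and the color, is carried over unchanged.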

Symbian, Google and Qt Software should abandon double resourcing and move to UI file localization. Some platform vendors, such as Borland/CodeGear, have always used properly structured UI files and have never used double resourcing. In my first blog entry I said that Delphi is the best tool if you care about localization. In this blog entry and the ones to come I will every now and then compare bad things to good things, and in most cases that will be the same as comparing other platforms to Delphi.