Sunday, November 12, 2017

Grammatical numbers, grammatical genders and abbreviated numbers

I have been quiet for a very long time. It was not my plan, but raising kids, writing a new localization tool and starting a new job have taken my time.

For the past few years I have been working in the internationalization and localization industry, writing tools and APIs for I18N and L10N. I want to show one of those APIs.

https://github.com/soluling/I18N

It is a set of open source APIs for .NET and Delphi that fills the gap between the native I18N API of Windows and CLDR. I am a big fan of CLDR. It has a great deal of cool stuff. Unfortunately, there are some major limitations. The first is that not all operating systems support the full set of CLDR. This brings the need to use a CLDR library or add-on. There is one, ICU, but I don't like it too much. It is C++/Java only. It is huge. It is old. It is hard to use. It requires CLDR XML data. We need easier APIs for I18N.

My plan is to implement some of them. To start, I implemented APIs for grammatical numbers, grammatical genders and abbreviated numbers. My approach differs from ICU. Instead of writing one cross-platform API that requires CLDR XML data, I write each API in its native language. This means the .NET API is 100% C# and the Delphi API is 100% Delphi. The rules are also in C# or Delphi. I wrote a tool that extracts data from CLDR and generates both .cs and .pas files that are compiled into the library. The result is a 100% native API that is ready to be used without any DLLs or XML files.

Last week I gave a session about these APIs at CodeRage XII. It was a pleasant experience. CodeRage is a Delphi-oriented virtual conference that contains tons of good sessions about how to write better Delphi applications.

Unlike Android, .NET and Delphi do not have a resource format suitable for grammatical numbers and genders. This is why my API uses multiple patterns combined and stored in a single standard resource string. For example, a .NET string for a grammatical number would look something like this:

"{plural, ski, one {I have {0} ski} other {I have {0} skis}}"

Then, instead of Format, you would use the API's MultiPattern.Format. The function uses language-specific rules to decide which pattern applies and then formats the string with it.
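To make the idea concrete, here is a minimal sketch in Python of how such a multi-pattern string could be split and resolved. It is not the library's code: the function names (`parse_patterns`, `format_plural`), the simplified pattern syntax (only the `plural` keyword followed by category blocks) and the hand-written rules for English, Polish and Japanese are assumptions made for illustration, and the rules only cover non-negative integers.

```python
# Illustrative sketch only: names, syntax and rules below are assumptions,
# not the API of the library described in this post.

def plural_en(n):
    # English: singular for exactly 1, plural otherwise.
    return "one" if n == 1 else "other"

def plural_pl(n):
    # Polish (integers only): 1 -> one; 2-4 but not 12-14 -> few; rest -> many.
    if n == 1:
        return "one"
    if n % 10 in (2, 3, 4) and n % 100 not in (12, 13, 14):
        return "few"
    return "many"

def plural_ja(n):
    # Japanese has no grammatical number, so a single pattern is enough.
    return "other"

PLURAL_RULES = {"en": plural_en, "pl": plural_pl, "ja": plural_ja}

def parse_patterns(text):
    """Split '{plural, one {...} other {...}}' into {'one': ..., 'other': ...}."""
    text = text.strip()
    if not (text.startswith("{plural,") and text.endswith("}")):
        raise ValueError("not a plural multi-pattern")
    body = text[len("{plural,"):-1]
    patterns = {}
    i = 0
    while i < len(body):
        start = body.find("{", i)
        if start == -1:
            break
        key = body[i:start].strip()
        # Walk to the matching close brace, skipping nested placeholders like {0}.
        depth = 0
        j = start
        while j < len(body):
            if body[j] == "{":
                depth += 1
            elif body[j] == "}":
                depth -= 1
                if depth == 0:
                    break
            j += 1
        if depth != 0:
            raise ValueError("unbalanced braces")
        patterns[key] = body[start + 1:j]
        i = j + 1
    return patterns

def format_plural(pattern_text, count, language="en"):
    """Pick the pattern for the count's plural category and fill in {0}."""
    patterns = parse_patterns(pattern_text)
    category = PLURAL_RULES[language](count)
    # Fall back to 'other' when the translator did not supply the category.
    template = patterns.get(category, patterns["other"])
    return template.replace("{0}", str(count))

PATTERN = "{plural, one {I have {0} ski} other {I have {0} skis}}"
print(format_plural(PATTERN, 1))  # I have 1 ski
print(format_plural(PATTERN, 2))  # I have 2 skis
```

The point of the design is that the calling code never changes per language: only the resource string and the rule function do. A Polish translation would simply carry `one`, `few` and `many` blocks in the same string.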

The translator can add any number of patterns to the string. If you use a good localization tool (as shown on my GitHub pages), the tool makes sure that you enter the right number of patterns.

If we need to show large numbers to the user, we have traditionally converted the numbers into strings and shown them. For example, a number like 143503000000 would be 143,503,000,000 when shown on an English system but 143 503 000 000 when shown on a Finnish system. In most cases this is enough to show the number properly. However, to make the number occupy less space and to make its magnitude easier to understand, we can use the abbreviation API. The above number becomes 144B or 140B, depending on how many digits we want to use. In Finnish it would be 144 mrd. In Japanese it would be 1440億. The API uses language-specific rules to round and abbreviate the number.
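The rounding and abbreviation steps can be sketched roughly like this in Python. Again, this is not the library's implementation: the `abbreviate` function, the `ABBREVIATIONS` table and its unit strings are hand-written assumptions for three languages (the real library generates its rules from CLDR), and the sketch only handles non-negative integers.

```python
# Illustrative sketch only: the table is hand-written, not generated from CLDR.
ABBREVIATIONS = {
    "en": {"decimal": ".", "units": [(10**12, "{0}T"), (10**9, "{0}B"),
                                     (10**6, "{0}M"), (10**3, "{0}K")]},
    "fi": {"decimal": ",", "units": [(10**12, "{0} bilj."), (10**9, "{0} mrd."),
                                     (10**6, "{0} milj."), (10**3, "{0} t.")]},
    # Japanese groups by powers of 10,000 rather than 1,000.
    "ja": {"decimal": ".", "units": [(10**12, "{0}兆"), (10**8, "{0}億"),
                                     (10**4, "{0}万")]},
}

def round_significant(value, digits):
    """Round a non-negative integer half up to the given significant digits."""
    length = len(str(value))
    if length <= digits:
        return value
    factor = 10 ** (length - digits)
    return (value + factor // 2) // factor * factor

def abbreviate(value, digits=3, language="en"):
    """Round the value, then express it in the largest unit that fits."""
    rules = ABBREVIATIONS[language]
    rounded = round_significant(value, digits)
    for unit, template in rules["units"]:  # units are listed largest first
        if rounded >= unit:
            whole, remainder = divmod(rounded, unit)
            text = str(whole)
            if remainder:
                # Build the fractional part with integer math to avoid float noise.
                fraction = str(remainder).rjust(len(str(unit)) - 1, "0").rstrip("0")
                text += rules["decimal"] + fraction
            return template.format(text)
    return str(rounded)  # too small to abbreviate

print(abbreviate(143503000000, 3, "en"))  # 144B
print(abbreviate(143503000000, 2, "en"))  # 140B
print(abbreviate(143503000000, 3, "fi"))  # 144 mrd.
print(abbreviate(143503000000, 3, "ja"))  # 1440億
```

Rounding happens before the unit is chosen, so a value such as 999,999 with two digits rolls over cleanly to 1M instead of 1000K.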

If you are a .NET or Delphi programmer, I hope you start using these APIs.