sammy34 - 1 month ago
JSON Question

Bug hunt: CLDR 30 JSON data no longer has currencySpacing information

We've been using jquery/globalize in our web application with the CLDR 29 data in JSON format without any problems. Just recently, Unicode released CLDR 30 (and shortly after, version 30.0.1 with some fixes).

When we upgrade to the CLDR 30(.0.1) data, our client-side currency-formatting tests fail because, for many cultures, the "currencySpacing" information in numbers.json is no longer there. For example, for the culture ar-AE, the Globalize library tries to load CLDR data at the path...

/main/ar-AE/numbers/currencyFormats-numberSystem-arab/currencySpacing/beforeCurrency

...which doesn't exist in the latest CLDR 30 numbers.json data for this (and many other) cultures.
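A quick way to confirm which cultures are affected is to probe the parsed numbers.json for that path. The following is only a minimal sketch: `getPath` is a hypothetical helper (not part of Globalize or cldr-json), and `sample` is a trimmed-down stand-in with made-up values, not real CLDR data.

```javascript
// Hypothetical helper: walk a slash-separated path through a parsed JSON object.
// Returns undefined as soon as any segment is missing.
function getPath(data, path) {
  return path
    .split("/")
    .filter(Boolean) // drop the empty segment produced by the leading "/"
    .reduce(
      (node, key) =>
        node !== null && typeof node === "object" ? node[key] : undefined,
      data
    );
}

// Stand-in for a numbers.json file with no currencySpacing under the
// arab number system. The values are placeholders, not real CLDR data.
const sample = {
  main: {
    "ar-AE": {
      numbers: {
        "currencyFormats-numberSystem-arab": { standard: "PLACEHOLDER" },
        "currencyFormats-numberSystem-latn": {
          standard: "PLACEHOLDER",
          currencySpacing: { beforeCurrency: { insertBetween: " " } }
        }
      }
    }
  }
};

const arabPath =
  "/main/ar-AE/numbers/currencyFormats-numberSystem-arab/currencySpacing/beforeCurrency";
const latnPath =
  "/main/ar-AE/numbers/currencyFormats-numberSystem-latn/currencySpacing/beforeCurrency";

const arabMissing = getPath(sample, arabPath) === undefined;
const latnPresent = getPath(sample, latnPath) !== undefined;
```

Running the same check over each culture's real numbers.json would list the cultures where the lookup comes back empty.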

We've been trying to traverse the stack to see what's causing this problem. We started with the DTD. The DTD for CLDR 30 (along with that for CLDR 29) includes the line...

<!ELEMENT currencyFormats ( alias | ( default*, currencySpacing*, currencyFormatLength*, unitPattern*, special* ) ) >


...which implies that the currencySpacing is an optional element. That said, we can't find anything in the CLDR 30 release notes or Delta that suggests that this information was changed for a large number of cultures.

In the XML data (the "ground truth") we see that the currencySpacing element is only used in main/root.xml in both CLDR 29 and CLDR 30, i.e. apparently no significant changes in this respect in the XML.

This makes us wonder if it's a problem in the tool that's used for generating the JSON data from the XML data. The tool is called ldml2json and is also used by the cldr-json project. To rule out a bug in the cldr-json project, we built the tool ourselves and generated the JSON data ourselves. This generated data was then also missing the "currencySpacing" information in the numbers.json files. So it doesn't seem to be an issue with the cldr-json project.

If we understand correctly, this means that the problem is either:

  • The ldml2json tool has a bug
  • jquery/globalize is incorrect to assume that this information always exists

If the latter is true, then I guess this should be raised as a jquery/globalize bug. Investigating the former would presumably require us to debug from source. Before we invest time in either, we wanted to ask: Is anybody else seeing this problem, and is there any known solution? Our hope is that there's someone out there who's a bit more experienced in the CLDR+JSON+Globalize stack who can help us out!

Answer

This was caused by this change: http://unicode.org/cldr/trac/changeset/12636/trunk/common/main/root.xml

Before this change, the root locale's currencySpacing information for the arab number system was inherited by all the other locales. Now it's no longer there.

I'm not sure how the missing currencySpacing should be handled, but the ICU Java and C documentation both state that the data can be null. Both appear to use a hard-coded default in that case: http://bugs.icu-project.org/trac/browser/icu4j/trunk/main/classes/core/src/com/ibm/icu/impl/CurrencyData.java#L86

So this is probably a bug in globalize.
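
For illustration, a formatter could tolerate the missing data by falling back to a hard-coded default, in the spirit of the approach described above. This is only a sketch under assumptions: `currencySpacingFor` and `DEFAULT_SPACING` are hypothetical names, this is not how Globalize is actually implemented or was fixed, and the default values are placeholders rather than values copied from ICU or CLDR.

```javascript
// Placeholder defaults for illustration only. The field names follow the
// CLDR currencySpacing elements; the values are NOT taken from ICU or CLDR.
const DEFAULT_SPACING = {
  beforeCurrency: {
    currencyMatch: "[:^S:]",
    surroundingMatch: "[:digit:]",
    insertBetween: " "
  },
  afterCurrency: {
    currencyMatch: "[:^S:]",
    surroundingMatch: "[:digit:]",
    insertBetween: " "
  }
};

// Hypothetical lookup: use the locale's currencySpacing when the data has it,
// otherwise fall back to the hard-coded default instead of failing.
function currencySpacingFor(numbers, numberSystem) {
  const formats = numbers["currencyFormats-numberSystem-" + numberSystem];
  return (formats && formats.currencySpacing) || DEFAULT_SPACING;
}

// Stand-in "numbers" section with currencySpacing missing for arab only.
const numbers = {
  "currencyFormats-numberSystem-arab": { standard: "PLACEHOLDER" },
  "currencyFormats-numberSystem-latn": {
    standard: "PLACEHOLDER",
    currencySpacing: { beforeCurrency: { insertBetween: "-" } }
  }
};

const arabSpacing = currencySpacingFor(numbers, "arab"); // falls back to DEFAULT_SPACING
const latnSpacing = currencySpacingFor(numbers, "latn"); // uses the data as given
```

Whether a default like this is appropriate, and what its values should be, is a decision for the library maintainers.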

Update: Bug report and pull request.