"zh-Hans" vs "zh-CHS" and "zh-Hant" vs "zh-CHT"

  • Question

  • Hi there,

    I just discovered that to conform with the IETF standard, MSFT changed the language codes "zh-CHS" to "zh-Hans" and "zh-CHT" to "zh-Hant" (for Chinese Simplified and Traditional respectively). This occurred in .NET years ago and both the old and new codes are available when using .NET 4.0. In previous versions of .NET only the old codes are available. This is the simplified story anyway. In any case, the Microsoft Translator itself also recognizes both sets of codes based on my testing, but if you call "GetLanguagesForTranslate()", it only returns the old codes. Is this by design? Are there any plans to ever start returning the new codes instead, or even both codes? I need to update my app to deal with both sets of codes and need to take this into consideration. Thanks.


    Larry (25+ years on MSFT platforms, mostly in C/C++)



    Monday, March 26, 2012 12:50 PM

Answers

  • Hi Larry,

    Indeed both formats are accepted, and only the older naming convention is emitted. This is by design for now, and we are not going to change that for V2 of the API. This would only change in a major version update.

    Chris Wendt
    Microsoft Translator

    Tuesday, March 27, 2012 5:32 AM

All replies

  • Hi Chris,

    OK, thanks for the info. I'll be updating my resx localization tool accordingly. For developers in general, however, it's problematic, since "zh-Hans" and "zh-Hant" are really the de facto standard now. As I'm sure you know (being an expert in this area), MSFT recommends their use over the older names, which are really for legacy use only. When determining whether a language is supported, however, a call to "GetLanguagesForTranslate()" indicates that "zh-Hans" and "zh-Hant" aren't supported, even though they are. Programmers are therefore forced to code special checks to handle these new names, even though they're actually the official standard now. These names should really be added to the collection returned by "GetLanguagesForTranslate()", and "zh-CHS" and "zh-CHT" either removed or, more likely, left there for legacy reasons (so as not to break existing code).

    As a side note (rant warning :), while I know it has nothing to do with the Microsoft Translator team, I'm not sure why MSFT decided to change these codes in the first place. While it might have been a noble goal to follow the IETF standard (though I'm not really familiar with it), the old codes were already out there, and most developers don't benefit from the new codes or care what these codes are called (a code is a code). Just the opposite occurs, in fact, since now everyone, including MSFT itself, has to deal with two codes that represent the same language (and the resulting problems). My own program needs to be fixed to handle this (after a customer contacted me with an issue), others have cited problems on the web (and far more probably haven't publicised theirs), and MSFT itself had to deal with this in their own code. This includes adding both codes to .NET even though they're actually the same language (in 4.0 they distinguished between the two by adding the name "legacy" to the full language name of the older codes), adding special documentation to highlight this situation in MSDN, making "zh-Hans" the parent culture of "zh-CHS" (not sure if it was always this way, but it's a highly questionable relationship), and even adding special automated code to newly created "add-in" projects in Visual Studio 2008 (only to later remove this code in Visual Studio 2010 without explanation, causing confusion for developers; long story). In any case, this is not your doing of course, but I don't see how anyone benefits from this change in practice. Only those developers who really care about following the IETF standard would be impacted, and that number is likely very low. For all others, the new codes are just an expensive headache. Again, not blaming you of course :)

    Anyway, thanks again for your feedback Chris. Appreciated.


    Larry (25+ years on MSFT platforms, mostly in C/C++)


    Tuesday, March 27, 2012 12:38 PM
  • Hi Larry,

    I feel the pain. It is normal for standards to evolve over time, though, and our software needs to cope with that.

    Chris Wendt
    Microsoft Translator

    Tuesday, March 27, 2012 1:18 PM
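
The "special checks" Larry mentions could look roughly like the sketch below. It is only an illustration in Python, not code from the Translator API or from Larry's tool: the alias table covers just the two pairs discussed in this thread, the helper names (to_ietf, is_supported, with_aliases) are made up, and the sample list is a hypothetical stand-in for whatever "GetLanguagesForTranslate()" actually returns.

```python
# Legacy .NET culture names and their IETF replacements, as discussed in this
# thread. Keys are lower-cased so lookups are case-insensitive.
LEGACY_TO_IETF = {"zh-chs": "zh-Hans", "zh-cht": "zh-Hant"}


def to_ietf(code):
    """Return the IETF name for a legacy code; other codes pass through unchanged."""
    return LEGACY_TO_IETF.get(code.lower(), code)


def is_supported(code, returned_codes):
    """True if `code`, or its legacy/IETF alias, appears in `returned_codes`."""
    wanted = to_ietf(code).lower()
    return any(to_ietf(c).lower() == wanted for c in returned_codes)


def with_aliases(returned_codes):
    """Return the codes plus the IETF alias of each legacy code, keeping order."""
    result = []
    for c in returned_codes:
        for name in (c, to_ietf(c)):
            if name not in result:
                result.append(name)
    return result


# Hypothetical sample of a list that contains only the old names.
sample = ["en", "zh-CHS", "zh-CHT"]

print(is_supported("zh-Hans", sample))  # checked via the alias table
print(with_aliases(sample))
```

Comparing everything in the IETF form means a caller can pass either name and get the same answer, while with_aliases() produces the "both sets of codes" list Larry suggests, without dropping the legacy names that existing code may depend on.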