Zend Framework

Properly handle upper case language name in TMX

Details

  • Type: Improvement
  • Status: Resolved
  • Priority: Major
  • Resolution: Fixed
  • Affects Version/s: 1.9.5
  • Fix Version/s: 1.10.0
  • Component/s: Zend_Translate
  • Labels:
    None

Description

At present, the TMX file adapter stores whatever language name the file provides as a key in the array, keeping the case used in the TMX file.

Therefore, if the file contains "EN", the adapter will hold $_data['EN']['test message'] = "test message - english";

However, a proper locale string must be "en". So, if the XML file contains "EN", the adapter does not find the message when queried with locale "en".

It will return "test message" instead of "test message - english". Changing "EN" to "en" in the XML is not practical, as this file is generated by an editor which keeps the language in upper case.
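The mismatch can be reproduced with a plain PHP array, since array keys are case-sensitive. This is only a minimal sketch: the `$data` array and the `lookup()` helper are hypothetical stand-ins, not the adapter's real internals.

```php
<?php
// Hypothetical stand-in for the adapter's internal data; not the real Zend_Translate structure.
$data = array(
    'EN' => array('test message' => 'test message - english'),
);

// Hypothetical lookup helper: returns the message ID itself when no translation exists.
function lookup(array $data, $locale, $messageId)
{
    if (isset($data[$locale][$messageId])) {
        return $data[$locale][$messageId];
    }
    return $messageId;
}

echo lookup($data, 'en', 'test message'), "\n"; // no 'en' key, so the ID comes back
echo lookup($data, 'EN', 'test message'), "\n"; // matches the upper-case key
```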

ZF should do one of the following:

1. Convert locale to lower case for user.
2. Reject the XML file as invalid (since you cannot set the locale to "EN").

Activity

Thomas Weidner added a comment -

1. Cannot be done. It would break en_US, which would be converted to en_us, causing the same problem for the region as we have now for the language.

2. Cannot be done, as invalid files are ignored while processing a directory search.

Ankit Shah added a comment -

For 1, it makes sense that the last two letters are upper-cased. However, would it not be possible to store en_us as en_US by simply splitting the string and making sure that the first part is always lower-cased and the second part is always upper-cased?

Ankit Shah added a comment -

So, to elaborate, how about changing line 114 of Tmx.php from:

$this->_tuv = $attrib['xml:lang'];

to:

$tuv_array = explode("_", $attrib['xml:lang']);
// isset() avoids an "undefined offset" notice when there is no region part
$this->_tuv = strtolower($tuv_array[0]) . (isset($tuv_array[1]) ? "_" . strtoupper($tuv_array[1]) : "");

Ankit Shah added a comment -

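Sorry, here is the properly formatted replacement code:

$tuv_array = explode("_", $attrib['xml:lang']);
// isset() avoids an "undefined offset" notice when there is no region part
$this->_tuv = strtolower($tuv_array[0]) . (isset($tuv_array[1]) ? "_" . strtoupper($tuv_array[1]) : "");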

Thomas Weidner added a comment -

This does not work, as a locale can also include other information.
And we expect the xml:lang attribute to hold the full locale information, not only the language.

xml:lang could for example look like this:
ar_Arab_JE
or
de_DE_Punji
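To illustrate the objection, the two-part normalisation proposed earlier in this thread can be wrapped in a function and run against these values. This is a sketch only; the function name `normalizeTwoPart()` is hypothetical and chosen for illustration.

```php
<?php
// The two-part normalisation proposed earlier in this thread, wrapped in a
// hypothetical function (the name is illustrative only).
function normalizeTwoPart($lang)
{
    $parts = explode('_', $lang);
    return strtolower($parts[0])
         . (isset($parts[1]) ? '_' . strtoupper($parts[1]) : '');
}

echo normalizeTwoPart('EN'), "\n";          // en
echo normalizeTwoPart('en_us'), "\n";       // en_US
echo normalizeTwoPart('ar_Arab_JE'), "\n";  // ar_ARAB
echo normalizeTwoPart('de_DE_Punji'), "\n"; // de_DE
```

The simple cases come out as intended, but the second part of ar_Arab_JE is upper-cased as if it were a region and everything after the second part is dropped.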

Ankit Shah added a comment -

Hi Thomas,

Thanks for your quick reply and attention. I apologize if I am totally missing the point.

In any case, how about just simply changing the language part "EN" to lower case "en":

$this->_tuv = strtolower($tuv_array[0]) . (isset($tuv_array[1]) ? "_" . implode("_", array_slice($tuv_array, 1)) : "");

I think the code definitely needs to change one way or the other. If the case cannot be changed to lower case, then the XML should not be processed. In my opinion, it does not make sense to process and store XML data if it cannot be accessed.

Thomas Weidner added a comment -

I agree that xml:lang is a locale identifier.
That's the reason why I did not close this issue.

But, and this is more important for you, a locale is never uppercased.

This means that "EN" will still not be recognised after the change.
However, en-us, en_us, en-US, en_US, en-us-Latn, en-us-ISO8859 and so on will be recognised and converted to en_US.

In the future, a notice will be raised when an unidentified locale is found, but the data will still be added. You may need this behaviour when you extend the base class, and you can turn off the notice with an already existing option.
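The behaviour described in this comment could look roughly like the sketch below. It is not the code committed to Zend_Translate: the functions `normalizeLocale()` and `addTranslation()`, the `$disableNotices` parameter, and the matching rules are all assumptions made for illustration.

```php
<?php
// Hypothetical helper, not part of Zend_Translate: returns a normalised
// "language_REGION" string, or null when the input is not recognised.
function normalizeLocale($lang)
{
    $parts = preg_split('/[-_]/', $lang);

    // Assumption: the language part must already be lower case,
    // so a bare "EN" is not recognised.
    if (!preg_match('/^[a-z]{2,3}$/', $parts[0])) {
        return null;
    }

    // Assumption: a two-letter second part is a region; anything after it
    // (script, encoding, variant) is ignored.
    if (isset($parts[1]) && preg_match('/^[A-Za-z]{2}$/', $parts[1])) {
        return $parts[0] . '_' . strtoupper($parts[1]);
    }

    return $parts[0];
}

// Hypothetical storage step: unrecognised locales raise a notice unless
// $disableNotices is set, but the data is stored either way.
function addTranslation(array &$data, $lang, $id, $text, $disableNotices = false)
{
    $key = normalizeLocale($lang);
    if ($key === null) {
        if (!$disableNotices) {
            trigger_error("Unrecognised locale '$lang'", E_USER_NOTICE);
        }
        $key = $lang; // keep the data under the original key
    }
    $data[$key][$id] = $text;
}
```

With these assumptions, every variant listed above maps to en_US, while "EN" stays under its original key and triggers a notice unless notices are disabled.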

Ankit Shah added a comment -

Thanks, Thomas. That makes sense. If at all possible, it would be great if the update were written so that a person extending the TMX adapter can change the locale-matching behaviour by overriding a single method.
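The kind of extension point requested here might be sketched as follows. Both classes and the `_normalizeLocale()` method are hypothetical; they are not the real Zend_Translate_Adapter_Tmx API, only an illustration of overriding one method to change how locales are matched.

```php
<?php
// Hypothetical stand-in for an adapter; not the real Zend_Translate_Adapter_Tmx.
class SketchTmxAdapter
{
    protected $_data = array();

    // Extension point: subclasses override this to change locale matching.
    protected function _normalizeLocale($lang)
    {
        return $lang; // default: keep the value from the file as-is
    }

    public function add($lang, $id, $text)
    {
        $this->_data[$this->_normalizeLocale($lang)][$id] = $text;
    }

    public function translate($locale, $id)
    {
        return isset($this->_data[$locale][$id]) ? $this->_data[$locale][$id] : $id;
    }
}

// Subclass that lower-cases only the language part and leaves the rest untouched.
class LowercaseLanguageAdapter extends SketchTmxAdapter
{
    protected function _normalizeLocale($lang)
    {
        $parts = explode('_', $lang);
        $parts[0] = strtolower($parts[0]);
        return implode('_', $parts);
    }
}

$adapter = new LowercaseLanguageAdapter();
$adapter->add('EN', 'test message', 'test message - english');
echo $adapter->translate('en', 'test message'), "\n";
```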

Either way, I appreciate your time and please keep up the good work.

Thomas Weidner added a comment -

New feature implemented with r19261 as described before.

