Zend Framework

Use of PHP's setlocale() may break functionality in Zend_Locale_Format

Details

  • Type: Bug Bug
  • Status: Resolved Resolved
  • Priority: Major Major
  • Resolution: Fixed
  • Affects Version/s: 0.8.0
  • Fix Version/s: 1.0.3
  • Component/s: Zend_Locale
  • Labels:
    None

Description

Problems arise when PHP converts integers and floats to strings using decimal place characters other than a period ('.'), which is the only decimal place separater supported by BCMath functions and the Zend_Locale_Math "proxy" class.

To reproduce the errors below on the ZF Community Development Server:
1. Copy TestConfiguration.php.dist to TestConfiguration.php
2. Comment the first line below in TestConfiguration.php, and comment out the second line below:

//define('TESTS_ZEND_LOCALE_FORMAT_SETLOCALE', 'fr_FR@euro');
define('TESTS_ZEND_LOCALE_FORMAT_SETLOCALE', false);

3. cd tests/Zend/Locale
4. php AllTests.php

zfdev Locale 737$ pwd
/home/gavin/www/zftrunk/tests/Zend/Locale

zfdev Locale 738$ uname -a
Linux www.zfdev.com 2.6.9-42.EL #1 Wed Jul 12 23:16:43 EDT 2006 i686 i686 i386 GNU/Linux

zfdev Locale 739$ php -v
PHP 5.1.6 (cgi-fcgi) (built: Oct 24 2006 19:38:28)
Copyright (c) 1997-2006 The PHP Group
Zend Engine v2.1.0, Copyright (c) 1998-2006 Zend Technologies

zfdev Locale 740$ svn update
At revision 3991.

$ php AllTests.php
X-Powered-By: PHP/5.1.6
Content-type: text/html

PHPUnit 3.0.0 by Sebastian Bergmann.

.......F..F..F......F..FF

Time: 00:06

There were 6 failures:

1) testToNumber(Zend_Locale_FormatTest)
string 0,1234567 expected
Failed asserting that <string:0,1234567> is equal to <string:0,0000000>.
/var/www/html/zftrunk/tests/Zend/Locale/FormatTest.php:97
/var/www/html/zftrunk/tests/Zend/Locale/AllTests.php:46
/var/www/html/zftrunk/tests/Zend/Locale/AllTests.php:62

2) testToFloat(Zend_Locale_FormatTest)
string 0,1234567 expected
Failed asserting that <string:0,1234567> is equal to <string:0,0000000>.
/var/www/html/zftrunk/tests/Zend/Locale/FormatTest.php:177
/var/www/html/zftrunk/tests/Zend/Locale/AllTests.php:46
/var/www/html/zftrunk/tests/Zend/Locale/AllTests.php:62

3) testtoInteger(Zend_Locale_FormatTest)
value 1234567- expected
Failed asserting that <string:1234567-> is equal to <string:1234567>.
/var/www/html/zftrunk/tests/Zend/Locale/FormatTest.php:253
/var/www/html/zftrunk/tests/Zend/Locale/AllTests.php:46
/var/www/html/zftrunk/tests/Zend/Locale/AllTests.php:62

4) testToNumberFormat(Zend_Locale_FormatTest)
string 0,1234567 expected
Failed asserting that <string:0,1234567> is equal to <string:0,0000000>.
/var/www/html/zftrunk/tests/Zend/Locale/FormatTest.php:657
/var/www/html/zftrunk/tests/Zend/Locale/AllTests.php:46
/var/www/html/zftrunk/tests/Zend/Locale/AllTests.php:62

5) testToFloatSetlocale(Zend_Locale_FormatTest)
Failed asserting that <string:1Â 234,50> is equal to <string:0,00>.
/var/www/html/zftrunk/tests/Zend/Locale/FormatTest.php:758
/var/www/html/zftrunk/tests/Zend/Locale/AllTests.php:46
/var/www/html/zftrunk/tests/Zend/Locale/AllTests.php:62

6) testRound(Zend_Locale_MathTest)
Failed asserting that <string:513695,4> is equal to <string:513695.4>.
/var/www/html/zftrunk/tests/Zend/Locale/MathTest.php:5020
/var/www/html/zftrunk/tests/Zend/Locale/AllTests.php:46
/var/www/html/zftrunk/tests/Zend/Locale/AllTests.php:62

FAILURES!
Tests: 25, Failures: 6.

Issue Links

Activity

Hide
Thomas Weidner added a comment -

So the original problem is that php's BCMath implementation is not setLocale aware...

PHP converts the strings as demanded in setLocale and BCMath can only work with english locale.

Show
Thomas Weidner added a comment - So the original problem is that php's BCMath implementation is not setLocale aware... PHP converts the strings as demanded in setLocale and BCMath can only work with english locale.
Hide
Gavin added a comment -

Yes, BCMath functions are not locale-aware. Allowing PHP to perform implicit conversions of integers and floats to strings before using BCMath (and Zend_Locale_Math) functions only works, if the currently setlocale(LC_NUMERIC, 'XYZ') yields strings following the "C" locale convention, where 'XYZ' is the locale currently in effect. Explicit conversion using strval() does not bypass this issue, since strval() also localizes numbers converted to strings.

Conversion of floats and integers to strings depends on the currently in-effect locale:
http://www.php.net/setlocale

This locale setting is part of the standard C library. However, there are variances between end-users systems' behaviors for setlocale(), and there is no guarantee the localization performed by PHP functions will be consistent with CLDR. The locales deployed and supported by end-users systems for the setlocale() are not reliably predictable.

However, the "C" locale ought to always be present, per the ISO C standard. Thus, I propose a patch which essentially disables the user's current locale setting, according to PHP's setlocale(), for the duration of certain ZF i18n methods. The patch enables the i18n components to pass the new unit tests I recently added on my test systems.

More testing on other systems/platforms is needed, since the patch relies less on the behavior of PHP than on the behavior of the system's C library implementation of setlocale(). For example, should the patch also include testing if setlocale(LC_NUMERIC, 'POSIX') works, if setlocale(LC_NUMERIC, 'C') does not work?

Show
Gavin added a comment - Yes, BCMath functions are not locale-aware. Allowing PHP to perform implicit conversions of integers and floats to strings before using BCMath (and Zend_Locale_Math) functions only works, if the currently setlocale(LC_NUMERIC, 'XYZ') yields strings following the "C" locale convention, where 'XYZ' is the locale currently in effect. Explicit conversion using strval() does not bypass this issue, since strval() also localizes numbers converted to strings. Conversion of floats and integers to strings depends on the currently in-effect locale: http://www.php.net/setlocale This locale setting is part of the standard C library. However, there are variances between end-users systems' behaviors for setlocale(), and there is no guarantee the localization performed by PHP functions will be consistent with CLDR. The locales deployed and supported by end-users systems for the setlocale() are not reliably predictable. However, the "C" locale ought to always be present, per the ISO C standard. Thus, I propose a patch which essentially disables the user's current locale setting, according to PHP's setlocale(), for the duration of certain ZF i18n methods. The patch enables the i18n components to pass the new unit tests I recently added on my test systems. More testing on other systems/platforms is needed, since the patch relies less on the behavior of PHP than on the behavior of the system's C library implementation of setlocale(). For example, should the patch also include testing if setlocale(LC_NUMERIC, 'POSIX') works, if setlocale(LC_NUMERIC, 'C') does not work?
Hide
Thomas Weidner added a comment -

There is a simple solution I just found:

php's localeconv() gives you the values which are needed for knowing the decimal point and the seperator for a string converted number within the actual locale.

A simple str_replace would then do the rest !!

Show
Thomas Weidner added a comment - There is a simple solution I just found: php's localeconv() gives you the values which are needed for knowing the decimal point and the seperator for a string converted number within the actual locale. A simple str_replace would then do the rest !!
Hide
Gavin added a comment -

Yes, I looked at that earlier, but then we are trusting ourselves to successfully convert numbers expressed in all locales supported by all versions of setlocale() for all platforms (C libraries), including number grouping, spacing, decimal place character, thousands separators, signs for negative or positive numbers, etc. Using the "C" locale probably will produce more reliable results, because then we rely on the underlying platform to give us properly formatted results.

I prefer if we didn't try to make our own "to 'C' locale" converter for use with setlocale().

Show
Gavin added a comment - Yes, I looked at that earlier, but then we are trusting ourselves to successfully convert numbers expressed in all locales supported by all versions of setlocale() for all platforms (C libraries), including number grouping, spacing, decimal place character, thousands separators, signs for negative or positive numbers, etc. Using the "C" locale probably will produce more reliable results, because then we rely on the underlying platform to give us properly formatted results. I prefer if we didn't try to make our own "to 'C' locale" converter for use with setlocale().
Hide
Thomas Weidner added a comment -

I dont see the problem...

localeconv() returns the internal settings for setLocale().
All we have to do is

  • trim the string
  • convert the negative sign if present
  • delete thousand seperators/nuber grouping and spaces
  • convert the decimal seperator

And then we have a normalized english notation string...

The problem is:
Eigther we say that we relay on BCMath and this means we do not support setLocale(xxx) but only setLocale(C)
Or we normalize the value ourself.

We can not set a locale with setLocale ourself because this does not only effect the actual script but the whole process which could cause problems in other scripts !

I would not want to say our users that they have to do setLocale(LC_ALL,'C'); within their bootstrap file. This would be a big big nono for the useage of the complete I18N core.

Show
Thomas Weidner added a comment - I dont see the problem... localeconv() returns the internal settings for setLocale(). All we have to do is
  • trim the string
  • convert the negative sign if present
  • delete thousand seperators/nuber grouping and spaces
  • convert the decimal seperator
And then we have a normalized english notation string... The problem is: Eigther we say that we relay on BCMath and this means we do not support setLocale(xxx) but only setLocale(C) Or we normalize the value ourself. We can not set a locale with setLocale ourself because this does not only effect the actual script but the whole process which could cause problems in other scripts ! I would not want to say our users that they have to do setLocale(LC_ALL,'C'); within their bootstrap file. This would be a big big nono for the useage of the complete I18N core.
Hide
Gavin added a comment -

I believe the patch attached to this issue will prove more reliable than us creating our own normalization code to translate all known (and future) localized numbers to the "C" locale standard. I do not believe the patch has the problems mentioned above.

Show
Gavin added a comment - I believe the patch attached to this issue will prove more reliable than us creating our own normalization code to translate all known (and future) localized numbers to the "C" locale standard. I do not believe the patch has the problems mentioned above.
Hide
Thomas Weidner added a comment -

From PHP's documentation:

:::::::::::::::::::::::::
Warning
The locale information is maintained per process, not per thread. If you are running PHP on a multithreaded server api like IIS or Apache on Windows you may experience sudden changes of locale settings while a script is running although the script itself never called setlocale() itself. This happens due to other scripts running in different threads of the same process at the same time changing the processwide locale using setlocale().
::::::::::::::::::::::::::

This is why I would not want to have setLocale() changing the actual locale for this issue.

Show
Thomas Weidner added a comment - From PHP's documentation: ::::::::::::::::::::::::: Warning The locale information is maintained per process, not per thread. If you are running PHP on a multithreaded server api like IIS or Apache on Windows you may experience sudden changes of locale settings while a script is running although the script itself never called setlocale() itself. This happens due to other scripts running in different threads of the same process at the same time changing the processwide locale using setlocale(). :::::::::::::::::::::::::: This is why I would not want to have setLocale() changing the actual locale for this issue.
Hide
Gavin added a comment -

Threaded PHP processes are very strongly discouraged, especially with the extension needed by the ZF. Some extensions are not thread-safe.

Do you know anyone using a threaded installation of PHP?
Do you know anyone credible recommending using PHP in a threaded installation?

Show
Gavin added a comment - Threaded PHP processes are very strongly discouraged, especially with the extension needed by the ZF. Some extensions are not thread-safe. Do you know anyone using a threaded installation of PHP? Do you know anyone credible recommending using PHP in a threaded installation?
Hide
Thomas Weidner added a comment -

In my opinion we should not include problems if we can prevent them easily.

And just because some ZF components are using non-threadsave extensions does not mean that the I18N core should also not be thread save.
I would like to have the I18N core being used also in non-threadsave environments by users without problems.

The change I have in mind is very easy and works on all environments.

Show
Thomas Weidner added a comment - In my opinion we should not include problems if we can prevent them easily. And just because some ZF components are using non-threadsave extensions does not mean that the I18N core should also not be thread save. I would like to have the I18N core being used also in non-threadsave environments by users without problems. The change I have in mind is very easy and works on all environments.
Hide
Gavin added a comment -

I disagree, but if you choose to implement your own "convert to 'C' locale", then include unit tests covering number conversion for all locales to show your function works. Also include the ability to detect new locales that are not yet tested. Different platforms have different mixes of locales installed and supported by setlocale().

Show
Gavin added a comment - I disagree, but if you choose to implement your own "convert to 'C' locale", then include unit tests covering number conversion for all locales to show your function works. Also include the ability to detect new locales that are not yet tested. Different platforms have different mixes of locales installed and supported by setlocale().
Hide
Gavin added a comment -

Also, do not forget that different versions of different platforms may have different versions of locale data. Therefore, formatting differences may exist between these versions.

Show
Gavin added a comment - Also, do not forget that different versions of different platforms may have different versions of locale data. Therefore, formatting differences may exist between these versions.
Hide
Thomas Weidner added a comment -

Problems solved with SVN 5806.

Have to wait until Community tests are approved to close this issue.
Tests on community server have proved ok.

Show
Thomas Weidner added a comment - Problems solved with SVN 5806. Have to wait until Community tests are approved to close this issue. Tests on community server have proved ok.

People

Vote (0)
Watch (1)

Dates

  • Created:
    Updated:
    Resolved: