I am also having this issue on RHEL5, PHP 5.1.6, PCRE 6.6. I checked the Red Hat default build spec for PCRE 6.6 and --enable-utf8 is specified. At this point I am not 100% sure it is UTF-8 related, but I do know that since ZF1.0RC3 the filters Zend_Filter_Alnum and Zend_Filter_Alpha are not matching strings that should be valid. These patterns continually fail regardless of the input value:
$pattern = '/[^\p{L}\p{N}' . ($this->allowWhiteSpace ? '\s' : '') . ']/u';
So I modified the pattern to use the old alnum class, which works both with and without UTF-8 specified. These patterns are working:
$pattern = '/[^[:alnum:]' . ($this->allowWhiteSpace ? '\s' : '') . ']/u';
The Unicode property patterns (the ones beginning with \p) will not work if UTF-8 support is unavailable in PCRE.
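The two patterns above could be combined by probing PCRE at run time and falling back to the POSIX class when property escapes are unavailable. The following is only a minimal sketch: the function names unicodePropertiesAvailable() and buildAlnumPattern() are hypothetical and are not part of Zend Framework, and dropping the /u modifier in the fallback is an assumption made so that the pattern also compiles on builds without UTF-8 support.

```php
<?php
// Hypothetical helper (not a Zend Framework API): true when the PCRE
// library behind preg_* accepts Unicode property escapes such as \pL.
function unicodePropertiesAvailable()
{
    // On a build without property support the pattern fails to compile:
    // preg_match() raises a warning (suppressed here) and returns false.
    return @preg_match('/\pL/u', 'a') === 1;
}

// Hypothetical helper: builds the pattern that matches every character
// which is NOT alphanumeric (and optionally not white space).
function buildAlnumPattern($allowWhiteSpace)
{
    $whiteSpace = $allowWhiteSpace ? '\s' : '';
    if (unicodePropertiesAvailable()) {
        return '/[^\p{L}\p{N}' . $whiteSpace . ']/u';
    }
    // Fallback: POSIX class without /u (assumption, see the note above).
    return '/[^[:alnum:]' . $whiteSpace . ']/';
}

// Strip everything the filter would reject from a sample value.
echo preg_replace(buildAlnumPattern(false), '', 'abc 123!'), "\n";
```

Note that the fallback only covers the characters PCRE's default tables treat as alphanumeric, so non-ASCII letters would be stripped on that path.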
From the PCRE man pages:
UTF-8 SUPPORT
To build PCRE with support for UTF-8 character strings, add
--enable-utf8
to the configure command. Of itself, this does not make PCRE treat
strings as UTF-8. As well as compiling PCRE with this option, you also
have to set the PCRE_UTF8 option when you call the pcre_compile()
function.
UTF-8 AND UNICODE PROPERTY SUPPORT
From release 3.3, PCRE has had some support for character strings
encoded in the UTF-8 format. For release 4.0 this was greatly extended
to cover most common requirements, and in release 5.0 additional support
for Unicode general category properties was added.
In order to process UTF-8 strings, you must build PCRE to include UTF-8
support in the code, and, in addition, you must call pcre_compile()
with the PCRE_UTF8 option flag. When you do this, both the pattern and
any subject strings that are matched against it are treated as UTF-8
strings instead of just strings of bytes.
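In PHP, the PCRE_UTF8 flag corresponds to the /u pattern modifier. As a small illustration of the difference between byte mode and UTF-8 mode (a sketch with an arbitrary sample character, not taken from the man page), the same two-byte subject can be matched against a single "." with and without the modifier:

```php
<?php
// "\xC3\xA9" is the two-byte UTF-8 encoding of U+00E9 (e with acute accent).
$subject = "\xC3\xA9";

// Without /u the subject is two separate bytes, so one "." cannot cover it.
$byteMode = preg_match('/^.$/', $subject);

// With /u (PCRE_UTF8) the same two bytes are read as one character.
// If PCRE was built without UTF-8 support, compilation fails and
// preg_match() returns false; the warning is suppressed with @.
$utf8Mode = @preg_match('/^.$/u', $subject);

echo 'byte mode: ', $byteMode, "\n";
echo 'UTF-8 mode: ', $utf8Mode === false ? 'unsupported' : $utf8Mode, "\n";
```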
If you compile PCRE with UTF-8 support, but do not use it at run time,
the library will be a bit bigger, but the additional run time overhead
is limited to testing the PCRE_UTF8 flag occasionally, so should not be
very big.
If PCRE is built with Unicode character property support (which implies
UTF-8 support), the escape sequences \p{..}, \P{..}, and \X are supported.
The available properties that can be tested are limited to the
general category properties such as Lu for an upper case letter or Nd
for a decimal number, the Unicode script names such as Arabic or Han,
and the derived properties Any and L&. A full list is given in the
pcrepattern documentation. Only the short names for properties are supported.
For example, \p{L} matches a letter. Its Perl synonym, \p{Letter},
is not supported. Furthermore, in Perl, many properties may
optionally be prefixed by "Is", for compatibility with Perl 5.6. PCRE
does not support this.
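To see which of these two build options a given PHP installation actually has, a small probe script along the following lines may help. It is only a sketch: pcreCapabilities() is a hypothetical name, and the PCRE_VERSION constant is not defined on older PHP releases, hence the defined() guard.

```php
<?php
// Hypothetical diagnostic helper: reports what the PCRE library used by
// this PHP build supports, by trying to compile small test patterns.
function pcreCapabilities()
{
    return array(
        // PCRE_VERSION is missing on older PHP releases, so guard it.
        'version' => defined('PCRE_VERSION') ? PCRE_VERSION : 'unknown',
        // /u sets PCRE_UTF8; compilation fails without UTF-8 support.
        'utf8' => @preg_match('/a/u', 'a') === 1,
        // \p{L} additionally requires Unicode property support.
        'unicode_properties' => @preg_match('/\p{L}/u', 'a') === 1,
    );
}

foreach (pcreCapabilities() as $name => $value) {
    echo $name, ': ', is_bool($value) ? ($value ? 'yes' : 'no') : $value, "\n";
}
```

A result of "utf8: yes" together with "unicode_properties: no" would be consistent with a PCRE built with --enable-utf8 but without Unicode property support.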
From testing on my platforms (openSUSE/SUSE-OSS 10.0, 10.1, 10.2), I have the following results:
Up to (but not including) apache2-mod_php5-5.2.0-10.rpm testing against the
From mod_php5 version that ships with openSUSE 10.2 (5.2.0-10) to the
On openSUSE mod_php5-5.2.0-10 and greater is built against the system PCRE
Workaround, on openSUSE 10.2 an upgrade of the system PCRE to the latest
Zend_Filter_Alpha and Zend_Filter_Digits also use Unicode property matching.
Sorry, I forgot about those. It should be fixed now. Graham Anderson or anyone else running a non-UTF-8-enabled system, could you please run the unit tests for Zend_Filter_AllTests?
I'll await your response before I close the issue.
Thanks to Julian Davchev for testing this:
svn update
jmut@dexter:/storage/www/frameworks/zendframework$ php
.........................................
Time: 00:00
OK (109 tests)
I can see why this change was made, but for me it is causing problems on certain platforms. On some, the extended PCRE syntax "\p{}" doesn't seem to match anything. I couldn't quite figure out what exactly is causing this problem. So far, I have tried the following platforms:
Not working:
Fedora Core 5, PHP 5.1.6, PCRE 6.3
Fedora Core 6, PHP 5.1.6, PCRE 6.6
Working:
Fedora 7, PHP 5.2.2, PCRE 7.0
Solaris 10 x86, PHP 5.1.5, PCRE 6.6
Debian Etch, PHP 4.4.4, PCRE 6.7
Maybe you could amend the docs to state the exact prerequisites for getting Zend_Filter_Alnum et al. to work. That would surely help me a lot.