Issue Type: Improvement Created: 2006-07-19T02:18:05.000+0000 Last Updated: 2007-07-05T14:43:15.000+0000 Status: Resolved Fix version(s): - 1.0.0 RC3 (23/Jun/07)
Reporter: Georg von der Howen (ghowen) Assignee: Darby Felton (darby) Tags: - Zend_Filter
Related issues: - ZF-1483
If you try to filter or validate a string that contains, for example, a German umlaut, and the server uses UTF-8 encoding, you will find that filter and validation classes do not support such characters.
Posted by Mark Evans (sparky) on 2006-07-21T07:02:45.000+0000
The problem with the mb_* functions are they require a non standard extension to be enabled.
So we would need to check to make sure the mb_* functions are available before trying to use them which could get very messy.
Posted by Darby Felton (darby) on 2006-07-24T11:25:10.000+0000
Original comment by Lenny:
the Zend_Filter does not have a filter for checked the character strings containing of the accents. A solution exists ? So, if not i code it !
Posted by Georg von der Howen (ghowen) on 2006-07-25T01:03:50.000+0000
Mark, I see the problem. Although it might still be worthwile considering it because AFAIU the ZF is supposed to be full UTF-8 compatible, where this is a major break. An alternative might be to have something like Zend_Filter_Mb for those of us with a non ASCII tongue. What do you think?
Posted by Christopher Thompson (christopher) on 2006-07-25T18:17:35.000+0000
Perhaps a better solution would be to allow the replace function to be configurable in some way. This could be done in several ways, but implementing a replace() function within the class might be the place to start. At least then it could be overloaded and that function would be a place to centrally make whatever implmentation seems best.
Posted by Mark Evans (sparky) on 2006-07-26T03:38:21.000+0000
Reading the php manual, it looks like if we changed from preg_replace to ereg_replace the mbstring extension can automatically overload the functions.
not sure if this is a possible solution?
Posted by Georg von der Howen (ghowen) on 2006-07-26T04:11:27.000+0000
Mark's idea sounds good - but I also see a potential problem.
From my experience with shared hosting packages you don't normally have access to the php.ini. This would then be similiar to mb_* being a non-standard extension.
But maybe you can also set overwriting per directory in .htaccess, which would probably help most people in shared hosting environments.
Posted by Gavin (gavin) on 2006-08-02T19:37:30.000+0000
Currently, my best guess is that a large percentage of deployment environments (e.g. many web hosters) do not provide the mbstring PHP extension. Thus, the only reliable alternative I've seen is: http://framework.zend.com/wiki/x/sgo
Posted by Bill Karwin (bkarwin) on 2006-11-13T15:23:36.000+0000
Changing fix version to 0.9.0.
Posted by Darby Felton (darby) on 2007-03-28T11:31:18.000+0000
Updated the issue details.
Related original comment from [~gavin]:
Summarizing from historical threads:
* the mbstring extension is not deprecated and is the recommended way to fully UTF-8 enable an entire PHP application * the mbstring extension should not be required in ZF /library core code (the extension is not a ZF requirement) * the /u modifier for PCRE does not work in all conditions and situations (see past topic threads and pcre.org for details) * test suites might make use of more "tools" (e.g. mbstring) than are available in ZF /library core, but the tests then need to be optional
Posted by Thomas Weidner (thomas) on 2007-03-28T12:05:57.000+0000
Just a small comment:
With ``` you can receive a list of chars which are accepted and official supported within this language/locale.
German for example returns "[a-z ä ö ü ß]", english retunrs "[a-z]" and greek returns "[ΐά-ώ]"
Maybe this can be usefull for you.
Posted by Gavin (gavin) on 2007-03-28T12:59:26.000+0000
A link to the discussion thread summarizing the community's consensus:
Posted by Andries Seutens (andries) on 2007-04-12T09:29:25.000+0000
Those wo actually need the mb_string functions will most probably have the mb_* functions enabled. My idea is to work with "locales", and if one does not provide a locale, we will use "default behaviour".
Posted by Andries Seutens (andries) on 2007-06-18T09:08:22.000+0000
Darby, isn't this one resolved?
Posted by Darby Felton (darby) on 2007-06-18T10:49:07.000+0000
Since the issue does not name a specific unresolved item to address, I mark the resolution as incomplete. The issue can be reopened, naming a specific problem [set], if necessary.
Have you found an issue?
See the Overview section for more details.