ZF-6936: Zend Filter unit tests fail when Unicode disabled
Description
Our machine has the pcre binary compiled without Unicode support (thanks, Red Hat), and when running unit tests some Zend_Filter_Word ones fail. The failures are the same as those detailed in #ZF-2484. I traced the problems to the fallback regular expression in SeparatorToCamelCase.php. A patch is below.
--- a/phputil/Zend/library/Zend/Filter/Word/SeparatorToCamelCase.php +++ b/phputil/Zend/library/Zend/Filter/Word/SeparatorToCamelCase.php @@ -16,7 +16,7 @@ * @package Zend_Filter * @copyright Copyright (c) 2005-2008 Zend Technologies USA Inc. (http://www.zend.com) * @license http://framework.zend.com/license/new-bsd New BSD License - * @version $Id: SeparatorToCamelCase.php,v 1.1.2.1 2009/03/02 23:35:55 iyoung Exp $ + * @version $Id: SeparatorToCamelCase.php,v 1.1.2.2 2009/03/05 21:49:21 iyoung Exp $ */ /** @@ -42,7 +42,7 @@ class Zend_Filter_Word_SeparatorToCamelCase extends Zend_Filter_Word_Separator_A parent::setMatchPattern(array('#('.$pregQuotedSeparator.')(\p{L}{1})#e','#(^\p{Ll}{1})#e')); parent::setReplacement(array("strtoupper('\\2')","strtoupper('\\1')")); } else { - parent::setMatchPattern(array('#('.$pregQuotedSeparator.')([A-Z]{1})#e','#(^[a-z]{1})#e')); + parent::setMatchPattern(array('#('.$pregQuotedSeparator.')([a-zA-Z]{1})#e','#(^[a-z]{1})#e')); parent::setReplacement(array("strtoupper('\\2')","strtoupper('\\1')")); }
Comments
Posted by Thomas Weidner (thomas) on 2009-06-05T14:00:03.000+0000
Just to note: Attaching lowercased "a-z" to the preg-pattern has no effect on unicode.
But it seems as this would break the separator as it would allow "smallcase" when "Smallcase" should be used.
Posted by Ian Young (youngian) on 2009-06-05T14:17:58.000+0000
Notice the match pattern a few lines up that includes {{\p{L}{1}}}. That is the match pattern that is used when Unicode is enabled. When it is disabled, it falls back to the pattern I changed (thus why this problem doesn't manifest for most people who run the tests). Allowing lowercase letters in the match pattern means that we can match a string like {{Camel-cased-words}} and convert it to {{CamelCasedWords}}. This is the intent of the code, and is what the Unicode-enabled version does with {{\p{L}{1}}}, which matches a Unicode 'letter'.
Posted by Thomas Weidner (thomas) on 2009-06-25T11:58:22.000+0000
Already fixed in past with r16188