Issues

ZF-6936: Zend Filter unit tests fail when Unicode disabled

Description

Our machine has the pcre binary compiled without Unicode support (thanks, Red Hat), and some of the Zend_Filter_Word unit tests fail as a result. The failures are the same as those detailed in #ZF-2484. I traced the problem to the fallback regular expression in SeparatorToCamelCase.php. A patch is below.

--- a/phputil/Zend/library/Zend/Filter/Word/SeparatorToCamelCase.php
+++ b/phputil/Zend/library/Zend/Filter/Word/SeparatorToCamelCase.php
@@ -16,7 +16,7 @@
  * @package    Zend_Filter
  * @copyright  Copyright (c) 2005-2008 Zend Technologies USA Inc. (http://www.zend.com)
  * @license    http://framework.zend.com/license/new-bsd     New BSD License
- * @version    $Id: SeparatorToCamelCase.php,v 1.1.2.1 2009/03/02 23:35:55 iyoung Exp $
+ * @version    $Id: SeparatorToCamelCase.php,v 1.1.2.2 2009/03/05 21:49:21 iyoung Exp $
  */
 
 /**
@@ -42,7 +42,7 @@ class Zend_Filter_Word_SeparatorToCamelCase extends Zend_Filter_Word_Separator_A
             parent::setMatchPattern(array('#('.$pregQuotedSeparator.')(\p{L}{1})#e','#(^\p{Ll}{1})#e'));
             parent::setReplacement(array("strtoupper('\\2')","strtoupper('\\1')"));
         } else {
-            parent::setMatchPattern(array('#('.$pregQuotedSeparator.')([A-Z]{1})#e','#(^[a-z]{1})#e'));
+            parent::setMatchPattern(array('#('.$pregQuotedSeparator.')([a-zA-Z]{1})#e','#(^[a-z]{1})#e'));
             parent::setReplacement(array("strtoupper('\\2')","strtoupper('\\1')"));
         }

Comments

Just to note: adding lowercase "a-z" to the preg pattern has no effect on Unicode.

But it seems as if this would break the separator, as it would allow "smallcase" where "Smallcase" should be used.

Notice the match pattern a few lines up that includes {{\p{L}{1}}}. That is the match pattern used when Unicode is enabled. When it is disabled, the code falls back to the pattern I changed (which is why this problem doesn't manifest for most people who run the tests). Allowing lowercase letters in the match pattern means that we can match a string like {{Camel-cased-words}} and convert it to {{CamelCasedWords}}. This is the intent of the code, and is what the Unicode-enabled version does with {{\p{L}{1}}}, which matches a Unicode 'letter'.
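
To make the difference concrete, here is a minimal standalone sketch of the two fallback patterns. It is not the Zend code itself: {{separatorToCamelCase()}} is a hypothetical helper written for this comment, and it uses preg_replace_callback() in place of the {{#e}} modifier from the patch, because that modifier was removed in PHP 7. The letter class is passed in as a parameter so the old and patched behaviour can be compared side by side.

```php
<?php
// Hypothetical helper, NOT part of Zend Framework. It mimics the two-step
// replacement from the patch, with preg_replace_callback() standing in for
// the /e modifier used in the original code.
function separatorToCamelCase(string $value, string $separator, string $letterClass): string
{
    $quoted = preg_quote($separator, '#');

    // Step 1: drop each separator and uppercase the single letter after it.
    $value = preg_replace_callback(
        '#(' . $quoted . ')(' . $letterClass . ')#',
        function (array $m): string {
            return strtoupper($m[2]);
        },
        $value
    );

    // Step 2: uppercase a leading lowercase ASCII letter, if there is one.
    return preg_replace_callback(
        '#^[a-z]#',
        function (array $m): string {
            return strtoupper($m[0]);
        },
        $value
    );
}

// Old fallback class: a separator only matches when an uppercase letter follows.
echo separatorToCamelCase('Camel-cased-words', '-', '[A-Z]'), "\n";    // Camel-cased-words

// Patched fallback class: lowercase letters after the separator match too.
echo separatorToCamelCase('Camel-cased-words', '-', '[a-zA-Z]'), "\n"; // CamelCasedWords
```

With {{[A-Z]}} the input comes back unchanged, because neither {{-c}} nor {{-w}} matches. With {{[a-zA-Z]}} both separators are consumed and the following letters are uppercased.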

This was already fixed in the past, in r16188.