Zend Framework

Zend_Filter_StringTrim does not work correctly with a multibyte string

Details

  • Type: Improvement
  • Status: Resolved
  • Priority: Major
  • Resolution: Fixed
  • Affects Version/s: 1.8.2
  • Fix Version/s: 1.9.0
  • Component/s: Zend_Filter
  • Labels:
    None

Description

Zend_Filter_StringTrim uses PHP's trim(), which operates on bytes rather than characters, so there are two problems.

  • It does not strip characters such as U+0085 (next line) and U+00A0 (no-break space).
    • You can use preg_match('/^[\s\p{Zs}\p{Zl}\p{Zp}]+$/u', $str) to check whether a UTF-8 string consists only of whitespace characters.
  • If the trim_charlist parameter contains multibyte characters, it does not work as expected.
    • The returned string can be cut in the middle of a multibyte character.

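// Character list: the trim() defaults plus two multibyte characters.
$trim_charlist = " \t\n\r\0\x0B・。";
$filter = new Zend_Filter();
$filter->addFilter(new Zend_Filter_StringTrim($trim_charlist));
// $value is any UTF-8 input string.
$s = $filter->filter($value);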
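
The byte-level behaviour can be reproduced with plain trim(), without Zend Framework. The following is a minimal sketch; the sample value (U+3042, written as raw UTF-8 bytes) is an assumption chosen for illustration, and the character list holds only the two multibyte characters from the example above.

```php
<?php
// U+3042 HIRAGANA LETTER A is E3 81 82 in UTF-8 (sample value, chosen for illustration).
$value = "\xE3\x81\x82";

// U+30FB (E3 83 BB) and U+3002 (E3 80 82), written as raw bytes.
$charlist = "\xE3\x83\xBB\xE3\x80\x82";

// trim() treats $charlist as the byte set {E3, 83, BB, 80, 82}, so it strips
// the first and last byte of $value and leaves the middle byte behind.
$trimmed = trim($value, $charlist);

var_dump(bin2hex($trimmed));           // string(2) "81"
var_dump(preg_match('//u', $trimmed)); // bool(false): no longer valid UTF-8

// With the default character list, U+00A0 (C2 A0) is left in place.
var_dump(trim("\xC2\xA0abc") === "\xC2\xA0abc"); // bool(true)
```

Neither character from the list appears in the input, yet the input is damaged because it shares bytes with them.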

PHP core has neither an mb_trim() nor an iconv_trim() function, so the trimming has to be done with preg_replace() or something similar.
@see
http://bugs.php.net/bug.php?id=23501
http://php.oregonstate.edu/manual/en/ref.mbstring.php#87047
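
One possible shape for a preg_replace()-based trim is sketched below. The function name unicode_trim_sketch() and its default character class are assumptions made for illustration; this is not the code committed to Zend_Filter_StringTrim, and it does not support the "a..z" range syntax that trim() accepts.

```php
<?php
/**
 * Hypothetical helper for illustration only.
 * Trims whole UTF-8 characters instead of single bytes.
 */
function unicode_trim_sketch($value, $charlist = null)
{
    if ($charlist === null) {
        // Assumed default: what trim() strips, plus U+0085 and all Unicode separators.
        $class = '\s\x{0}\x{0B}\x{85}\p{Z}';
    } elseif ($charlist === '') {
        // Nothing to trim; an empty character class would be an invalid pattern.
        return $value;
    } else {
        // Escape regex metacharacters so every listed character is literal.
        $class = preg_quote($charlist, '/');
    }

    // /u makes PCRE read pattern and subject as UTF-8, so the class matches
    // whole characters. \A and \z anchor at the very start and very end.
    $pattern = '/\A[' . $class . ']+|[' . $class . ']+\z/u';
    $result = preg_replace($pattern, '', $value);

    // preg_replace() returns null on error (for example invalid UTF-8 input);
    // in that case the input is returned unchanged.
    return $result === null ? $value : $result;
}

// U+00A0 and a space in front, U+0085 behind.
var_dump(unicode_trim_sketch("\xC2\xA0 abc\xC2\x85") === "abc"); // bool(true)

// U+30FB + U+3042 + U+3002, trimmed with the two outer characters as the list.
var_dump(bin2hex(unicode_trim_sketch(
    "\xE3\x83\xBB\xE3\x81\x82\xE3\x80\x82",
    "\xE3\x83\xBB\xE3\x80\x82"
))); // string(6) "e38182"
```

Because the character class is matched in UTF-8 mode, a character that merely shares bytes with one in the list is left intact.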

Activity

Thomas Weidner added a comment -

Changed to improvement as even PHP itself does not support this feature

Thomas Weidner added a comment -

Feature enhancement added with r16191

