ZF-8715: Inconsistent encoding across several view helpers
Description
Zend_Dojo's _renderLayers() method includes a call to htmlentities(). Unfortunately, it does not pull the encoding from the view object, which means it will only work on ASCII characters, which can potentially open multibyte XSS vectors.
Zend_View_Helper_Placeholder_Container, line 29, hardcodes the encoding (instead of using the view object's), and should likely use htmlspecialchars() instead.
Zend_Form_Decorator_HtmlTag, Zend_Service_Twitter, Zend_Feed_Element, and Zend_View_Helper_Navigation_Sitemap hardcode htmlspecialchars() calls to use UTF-8.
Zend_Log_Formatter_Xml, Zend_Tag_Cloud_Decorator_HtmlTag, Zend_Tag_Cloud_Decorator_HtmlCloud, and Zend_View_Helper_HeadStyle do not pass encoding information at all.
Zend_Filter_HtmlEntities defaults to ISO-8859-1, but should default to UTF-8 (same applies to Zend_View). Additionally, for consistency, it should implement a setEncoding() method that proxies to setCharset() (or vice-versa).
Basically, all instances of htmlentities() and/or htmlspecialchars() should use the encoding argument (3rd parameter), defaulting to UTF-8 if no encoding is known.
Comments
Posted by Matthew Weier O'Phinney (matthew) on 2010-01-06T12:36:24.000+0000
Updated subject and description to be comprehensive of all reported components.
Posted by Matthew Weier O'Phinney (matthew) on 2010-01-06T12:39:50.000+0000
Actually, Zend_View_Helper_Placeholder_Container_Standalone does use the view by default; only in the absence of a view object does it call htmlentities with a hard-coded encoding. I think in this case, it fits fine.
Posted by Matthew Weier O'Phinney (matthew) on 2010-01-06T12:48:37.000+0000
Zend_Service_Twitter hardcodes the encoding due to expectations of the Twitter API, and, as such, is correct in the current implementation.
Posted by Matthew Weier O'Phinney (matthew) on 2010-01-06T13:28:43.000+0000
All code updated in both trunk and 1.9 release branch.
Posted by Matthew Weier O'Phinney (matthew) on 2010-01-06T13:29:23.000+0000
Added Zend_Form to list of components affected
Posted by Thomas Weidner (thomas) on 2010-01-06T14:17:43.000+0000
Hint: r20104 could be seen as BC break for Zend_Filter_HtmlEntities.
Problematic changes: 1.) Encoding defaults changed 2.) Protected members changed
This could lead to problems with existing classes extending this filter. Therefor a migration note should be added to the 1.10 migration chapter.
Posted by Thomas Weidner (thomas) on 2010-01-06T14:20:06.000+0000
As the change has also be backported to 1.9 series the migration note should be added to the 1.9 migration chapter.
Posted by Matthew Weier O'Phinney (matthew) on 2010-01-06T14:36:34.000+0000
The encoding issues are security related, so as such, necessary. The changes from CharSet to Encoding are more consistency, however -- but it made sense to deal with those at the same time.
I'll update the migration chapter in the morning prior to packaging the release.
As for the protected member changes, the change should be transparent due to how the setters were written and the use of the new accessors for retrieving the values (instead of using the protected members directly); it's unlikely developers are overriding the setters/accessors in most cases anyways. I'll still make a note, however.