<ac:macro ac:name="unmigrated-inline-wiki-markup"><ac:plain-text-body><![CDATA[
Zend_Utf8 is a simple component that offers escape and unescape functionality. It is intended as a replacement for code that already exists in ZF but is embedded in the Zend_Json and Zend_Serializer components. I recently published a post about it on my site: http://noteslog.com/post/escaping-and-unescaping-utf-8-characters-in-php/
The Zend_Utf8 class is simple, fully coded, and, I hope, ready for delivery. Note that as of the latest release, 1.11.2, the UTF-8 escaping feature in Zend_Json still does not handle all possible UTF-8 characters: it lacks any support for the so-called extended Unicode characters, those with a code point between 0x10000 and 0x10FFFF. This class supports all of Unicode. Encoding PHP values into another string format, such as JSON, can require escaping UTF-8 characters; likewise, decoding can require unescaping them. I think this sufficiently justifies a class for basic UTF-8 support in the Zend Framework. Once this class is available, the Zend_Json and Zend_Serializer modules should be refactored to call Zend_Utf8 methods where needed.
Zend Framework: Zend_Utf8 Component Proposal
| Proposed Component Name | Zend_Utf8 |
|---|---|
| Developer Notes | http://framework.zend.com/wiki/display/ZFDEV/Zend_Utf8 |
| Proposers | Andrea Ercolino |
| Zend Liaison | TBD |
| Revision | 1.0 - 11 January 2011: Initial Draft. (wiki revision: 12) |
Table of Contents
1. Overview
2. References
- Escaping and unescaping UTF-8 characters in PHP
- UTF-8 and Unicode FAQ
- Zend_Json_Encoder
- Zend_Json_Decoder
- Zend_Serializer_Adapter_PythonPickle
3. Component Requirements, Constraints, and Acceptance Criteria
4. Dependencies on Other Framework Components
5. Theory of Operation
Zend_Utf8 exposes six static functions: two main functions for escaping and unescaping strings, and four ancillary functions for mapping UTF-8 characters to Unicode integers (code points) and back. The main functions themselves show how the ancillary functions are used, so I'll describe only the usage of the main ones.
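To make the mapping between UTF-8 characters and Unicode integers concrete, here is a minimal, standalone sketch. It is not the Zend_Utf8 code: the function names `codePointToUtf8`, `utf8ToCodePoint`, and `escapeUnicode` are hypothetical, the input is assumed to be well-formed UTF-8, and the JSON-style `\uXXXX` output format is chosen only for illustration. It does show how a character above 0xFFFF can be escaped as a UTF-16 surrogate pair.

```php
<?php
// Hypothetical helper names for illustration; these are NOT the Zend_Utf8 methods.

/** Map a Unicode code point to its UTF-8 byte sequence. */
function codePointToUtf8(int $cp): string
{
    if ($cp < 0 || $cp > 0x10FFFF || ($cp >= 0xD800 && $cp <= 0xDFFF)) {
        throw new InvalidArgumentException("Invalid code point: $cp");
    }
    if ($cp < 0x80) {
        return chr($cp);                         // 1 byte: 0xxxxxxx
    }
    if ($cp < 0x800) {
        return chr(0xC0 | ($cp >> 6))            // 2 bytes: 110xxxxx 10xxxxxx
             . chr(0x80 | ($cp & 0x3F));
    }
    if ($cp < 0x10000) {
        return chr(0xE0 | ($cp >> 12))           // 3 bytes: 1110xxxx + 2 continuation bytes
             . chr(0x80 | (($cp >> 6) & 0x3F))
             . chr(0x80 | ($cp & 0x3F));
    }
    return chr(0xF0 | ($cp >> 18))               // 4 bytes: 11110xxx + 3 continuation bytes
         . chr(0x80 | (($cp >> 12) & 0x3F))
         . chr(0x80 | (($cp >> 6) & 0x3F))
         . chr(0x80 | ($cp & 0x3F));
}

/** Map one UTF-8 character to its code point (input is assumed to be well formed). */
function utf8ToCodePoint(string $char): int
{
    $bytes = array_values(unpack('C*', $char));
    $count = count($bytes);
    if ($count === 1) {
        return $bytes[0];
    }
    // The lead byte of an n-byte sequence carries (7 - n) payload bits.
    $cp = $bytes[0] & (0xFF >> ($count + 1));
    for ($i = 1; $i < $count; $i++) {
        $cp = ($cp << 6) | ($bytes[$i] & 0x3F);  // 6 payload bits per continuation byte
    }
    return $cp;
}

/** Escape every non-ASCII character as \uXXXX, using surrogate pairs above 0xFFFF. */
function escapeUnicode(string $utf8): string
{
    // The /u modifier makes PCRE split on UTF-8 characters rather than bytes.
    $chars = preg_split('//u', $utf8, -1, PREG_SPLIT_NO_EMPTY);
    if ($chars === false) {
        throw new InvalidArgumentException('Malformed UTF-8 input');
    }
    $out = '';
    foreach ($chars as $char) {
        $cp = utf8ToCodePoint($char);
        if ($cp < 0x80) {
            $out .= $char;
        } elseif ($cp < 0x10000) {
            $out .= sprintf('\\u%04x', $cp);
        } else {
            $cp -= 0x10000;                      // 20 bits left, split 10 + 10
            $out .= sprintf('\\u%04x\\u%04x', 0xD800 | ($cp >> 10), 0xDC00 | ($cp & 0x3FF));
        }
    }
    return $out;
}

// U+1D11E (musical G clef) lies in the extended range and needs a surrogate pair.
echo escapeUnicode("G clef: \xF0\x9D\x84\x9E"), "\n";   // G clef: \ud834\udd1e
```

Unescaping would run the same steps in reverse: recombine a surrogate pair into one code point, then call the code-point-to-UTF-8 mapping.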
Basic Usage
Here is a simple program that shows basic usage.
And this is its output.
Advanced Usage
If we do pass $options, we can, for example, escape and unescape UTF-8 as HTML entities in decimal format.
Whose output will be
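For readers unfamiliar with the decimal entity format itself, the following standalone sketch shows what such a round trip looks like. It does not use Zend_Utf8 or its $options: the names `escapeAsDecimalEntities` and `unescapeDecimalEntities` are hypothetical, and the code relies only on PHP's PCRE functions and `html_entity_decode`.

```php
<?php
// Hypothetical helper names; a sketch of the decimal-entity format, not of Zend_Utf8.

/** Replace every non-ASCII character with &#N;, where N is its decimal code point. */
function escapeAsDecimalEntities(string $utf8): string
{
    $result = preg_replace_callback('/[^\x00-\x7F]/u', function (array $m): string {
        $bytes = array_values(unpack('C*', $m[0]));
        // Payload bits of the lead byte depend on the sequence length (2 to 4 bytes).
        $cp = $bytes[0] & (0xFF >> (count($bytes) + 1));
        foreach (array_slice($bytes, 1) as $byte) {
            $cp = ($cp << 6) | ($byte & 0x3F);   // 6 payload bits per continuation byte
        }
        return '&#' . $cp . ';';
    }, $utf8);
    if ($result === null) {
        throw new InvalidArgumentException('Malformed UTF-8 input');
    }
    return $result;
}

/** Turn decimal numeric entities back into UTF-8, leaving named entities untouched. */
function unescapeDecimalEntities(string $escaped): string
{
    $result = preg_replace_callback('/&#[0-9]+;/', function (array $m): string {
        // html_entity_decode converts a numeric reference to the given charset.
        return html_entity_decode($m[0], ENT_QUOTES | ENT_HTML5, 'UTF-8');
    }, $escaped);
    return $result ?? $escaped;
}

$original = "caf\xC3\xA9 \xF0\x9D\x84\x9E";      // "café" plus U+1D11E
$escaped  = escapeAsDecimalEntities($original);
echo $escaped, "\n";                              // caf&#233; &#119070;
var_dump(unescapeDecimalEntities($escaped) === $original);   // bool(true)
```

Restricting the unescape step to the `&#N;` pattern keeps named entities such as `&amp;` as they are, so only what the escape step produced is reversed.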
6. Milestones / Tasks
- Milestone 1: [DONE] Working prototype
- Milestone 2: Unit tests exist, work, and are checked into SVN.
- Milestone 3: Initial documentation exists.
7. Class Index
- Zend_Utf8_Exception
- Zend_Utf8
8. Use Cases
| UC-01 |
|---|