
<p>This RFC proposes a class for detecting basic environment capabilities. The class should contain only detections that are used in several components and/or userland code <strong>AND</strong> are tricky to perform with plain PHP.</p>

<p>The main goal is to have a single, well-tested and reusable point of detection, to minimize code duplication and the issues that come with it.</p>

<p><strong>UPDATE:</strong><br />
After some discussion I prefer to add three classes: <code>BinaryUtils</code>, containing helper methods for working with binary strings (incl. mbstring.func_overload workarounds); <code>StringUtils</code>, containing helper methods for working with strings in different character sets (based on iconv/mbstring, incl. a PCRE Unicode check);<br />
and <code>Environment</code>, containing other detections such as the type of the operating system.</p>
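
<p>As a rough illustration of the kind of checks these classes would centralise, the sketch below shows how a few of them could be written in plain PHP. The free functions are hypothetical and only named after the proposed methods; the bodies are assumptions, not the proposed implementation.</p>
<ac:macro ac:name="code"><ac:plain-text-body><![CDATA[
<?php
// Hypothetical free functions named after the proposed methods.

function isWindows()
{
    // PHP_OS names the OS PHP was built for, e.g. "WINNT", "Linux", "Darwin".
    // Only the prefix is compared, because "Darwin" also contains "win".
    return strtoupper(substr(PHP_OS, 0, 3)) === 'WIN';
}

function is64Bit()
{
    // Reports the width of PHP integers, which can be smaller than the
    // CPU word size on some builds.
    return PHP_INT_SIZE === 8;
}

function isPcreUnicodeAware()
{
    // Without UTF-8 support in PCRE, preg_match() fails on the /u
    // modifier; @ silences the warning in that case.
    return @preg_match('/\pL/u', 'a') === 1;
}

var_dump(isWindows(), is64Bit(), isPcreUnicodeAware());
]]></ac:plain-text-body></ac:macro>
<p>Keeping such one-liners in one place is mostly about consistency: each check has subtle edge cases that are easy to get wrong when it is re-implemented per component.</p>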

<p>For the method names of <code>StringUtils</code> and <code>BinaryUtils</code> I avoided the word "string", to prevent confusion between the two classes while still using the same method names in both.</p>
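
<p>To make the shared naming concrete, here is a minimal sketch of a <code>length()</code> method on both classes. The class names <code>BinaryUtilsSketch</code> and <code>StringUtilsSketch</code> are hypothetical, and the bodies are only one possible approach under the assumption that mbstring or iconv is available for the character-aware variant.</p>
<ac:macro ac:name="code"><ac:plain-text-body><![CDATA[
<?php
// Hypothetical sketch classes; only length() is shown.

class BinaryUtilsSketch
{
    // Returns the number of bytes, even if mbstring overloads strlen().
    public static function length($input)
    {
        // Flag 2 of mbstring.func_overload replaces the str*() functions.
        // The setting was removed in PHP 8, where ini_get() returns false.
        $overloaded = extension_loaded('mbstring')
            && ((int) ini_get('mbstring.func_overload') & 2) !== 0;

        return $overloaded ? mb_strlen($input, '8bit') : strlen($input);
    }
}

class StringUtilsSketch
{
    // Returns the number of characters in the given character set.
    public static function length($input, $charset = 'utf-8')
    {
        if (function_exists('mb_strlen')) {
            return mb_strlen($input, $charset);
        }
        if (function_exists('iconv_strlen')) {
            return iconv_strlen($input, $charset);
        }
        throw new \RuntimeException('Neither mbstring nor iconv is available');
    }
}

$text = "h\xC3\xA9llo"; // "héllo" encoded as UTF-8
var_dump(BinaryUtilsSketch::length($text)); // int(6)
var_dump(StringUtilsSketch::length($text)); // int(5)
]]></ac:plain-text-body></ac:macro>
<p>The same call on the same input gives a byte count in one class and a character count in the other, which is the distinction the two class names are meant to carry.</p>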

<p><strong>NOTE:</strong> System memory detection should be removed from ZF completely, because it cannot be made to work reliably on all installations and no other component requires it.</p>

<li>unified way to detect OS, PHP version, extension compatibility, PCRE Unicode compatibility, machine byte order and other valuable environmental parameters. Current ZF code has:
<ul>
<li>OS detection: 22 usages</li>
<li>PHP version check: 3 usages</li>
<li>extension detection: 70 usages</li>
<li>PCRE Unicode support: 18 usages</li>
</ul></li>

<li>unified way to detect memory and disk capacity<br />
<em>Questionable: there is no clear, portable way to detect system memory (see the note above).</em></li>

<li>mbstring.func_overload<br />
Several components in ZF do not work properly when mbstring overloading is on. Some components use their own wrappers to address this (e.g. <code>Http\Client</code> and <code>Http\Response</code>, <code>OpenId\OpenId</code>), while others only make a note in a docblock (<code>Gdata\MediaMimeStream</code>).<br />
Either a single detection and wrapper should exist, or mbstring overloading should be considered a non-standard PHP setup and not addressed at all.</li>

<li>Unicode support<br />
Many components use their own Unicode support functions. There is no clear rule for which extension should be used for multibyte strings: some components use iconv, others mbstring.</li>

<li>byte order<br />
Many components need a byte-order-dependent way to work with binary data. Right now custom functions exist in: <code>Pdf/BinaryParser</code>, <code>Serializer</code>, <code>Amf/Util/BinaryStream</code>, <code>Filter/Compress</code>, and many others.</li>

<li>machine-dependent behaviour of <code>pack()</code>/<code>unpack()</code><br />
From PHP manual: "PHP internally stores integral values as signed. If you unpack a large unsigned long and it is of the same size as PHP internally stored values the result will be a negative number even though unsigned unpacking was specified."<br />
<code>BinaryUtils</code> could address that with a wrapper function.</li>

<li>unified way to get int/float from binary representation<br />
Right now custom functions for this are in: <code>Pdf/BinaryParser</code>, <code>Serializer</code></li>

<li>unified way to perform bitwise operations on binary data<br />
functions like <code>setBit</code>, <code>clearBit</code>, <code>testBit</code></li>
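
<p>The bit helpers named in the last item could look roughly like the following sketch. Only the three names come from the list above; the signatures, the helper <code>locateBit()</code> and the bit numbering (bit 0 is the least significant bit of the first byte) are assumptions made for illustration.</p>
<ac:macro ac:name="code"><ac:plain-text-body><![CDATA[
<?php
// Hypothetical helpers operating on binary strings.
// Assumed numbering: bit 0 is the least significant bit of the first byte.

function locateBit($data, $bit)
{
    $index = $bit >> 3; // byte that holds the bit
    if ($bit < 0 || !isset($data[$index])) {
        throw new OutOfRangeException("Bit $bit is outside the input");
    }
    return array($index, 1 << ($bit & 7)); // byte index and bit mask
}

function testBit($data, $bit)
{
    list($index, $mask) = locateBit($data, $bit);
    return (ord($data[$index]) & $mask) !== 0;
}

function setBit($data, $bit)
{
    list($index, $mask) = locateBit($data, $bit);
    $data[$index] = chr(ord($data[$index]) | $mask);
    return $data; // strings are passed by value; the caller's copy is untouched
}

function clearBit($data, $bit)
{
    list($index, $mask) = locateBit($data, $bit);
    $data[$index] = chr(ord($data[$index]) & ~$mask);
    return $data;
}

var_dump(bin2hex(setBit("\x00\x00", 9))); // string(4) "0002"
]]></ac:plain-text-body></ac:macro>
<p>String offsets always address single bytes, so this approach does not depend on the mbstring overloading setting.</p>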

<p><a href=""></a></p>
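
<p>For the byte order and <code>unpack()</code> items above, a wrapper could be sketched along these lines. Both function names are hypothetical, and the second function assumes its input is exactly 4 bytes long.</p>
<ac:macro ac:name="code"><ac:plain-text-body><![CDATA[
<?php
// Hypothetical helpers illustrating byte order detection and unsigned unpacking.

// pack('S') writes an unsigned 16-bit integer in machine byte order,
// so the position of the low byte reveals the endianness.
function isLittleEndian()
{
    return pack('S', 1) === "\x01\x00";
}

// Reads an unsigned 32-bit big-endian integer from exactly 4 bytes.
function unpackUInt32BE($bytes)
{
    $parts = unpack('N', $bytes);
    $value = $parts[1];
    if ($value < 0) {
        // Only reached where PHP integers are 32 bits wide: undo the
        // sign wrap-around; the result is then a float.
        $value += 4294967296;
    }
    return $value;
}

var_dump(isLittleEndian());
var_dump(unpackUInt32BE("\xFF\xFF\xFF\xFF") == 4294967295); // bool(true)
]]></ac:plain-text-body></ac:macro>
<p>The loose comparison in the usage line is deliberate: the result is an integer where PHP integers are 64 bits wide and a float where they are 32 bits wide.</p>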
<ac:macro ac:name="code"><ac:plain-text-body><![CDATA[
namespace Zend\Stdlib;

// API outline only: method bodies are omitted.

class Environment
{
    public static function is32Bit();
    public static function is64Bit();

    public static function isWindows();
    public static function isUnix();
    public static function isLinux();
    public static function isBsd();
    public static function isSolaris();
    public static function isDarwin();
    public static function isCygwin();
}

class StringUtils
{
    public static function hasMbString();
    public static function hasIconv();
    public static function isPcreUnicodeAware();

    public static function isSingleByteCharset($charset);

    public static function length($input, $charset = 'utf-8');
    public static function indexOf($haystack, $needle, $offset = 0, $charset = 'utf-8');
    public static function lastIndexOf($haystack, $needle, $offset = 0, $charset = 'utf-8');
    public static function subset($input, $offset, $length = 0, $charset = 'utf-8');

    // moved from Zend\Text\MultiByte
    public static function wordWrap($string, $width = 75, $break = "\n", $cut = false, $charset = 'utf-8');
    public static function strPad($input, $padLength, $padString = ' ', $padType = STR_PAD_RIGHT, $charset = 'utf-8');
}

class BinaryUtils
{
    const BYTEORDER_LE = 1;
    const BYTEORDER_BE = 2;

    public static function isLittleEndian();
    public static function isBigEndian();

    public static function length($input);
    public static function indexOf($haystack, $needle, $offset = 0);
    public static function lastIndexOf($haystack, $needle, $offset = 0);
    public static function subset($input, $offset, $length = 0);
}
]]></ac:plain-text-body></ac:macro>

  1. Apr 18, 2012

    <p>Can you provide some detail as to when you would use these utilities – i.e., what problems do they solve, exactly? (<em>I</em> know, but you can't assume everybody coming to the RFC knows.) Also, please indicate which <em>existing</em> components could make use of these, and also when existing components might choose between native PHP functions and those presented here.</p>

    1. Apr 18, 2012

      <p>On BinaryUtils<br />
      One of the ideas behind String/BinaryUtils was mbstring overloading. When mbstring.func_overload is set to overload the string functions (flag 2), strlen is no longer binary safe. There are 3 components that have custom strlen wrappers to address this: GData, Http, OpenId. I think either a common wrapper should exist, or mbstring overloading should not be addressed at all. </p>

      <p>Many components need an endianness-dependent way to work with binary data. Right now custom methods exist in: Pdf/BinaryParser, Serializer, Amf/Util/BinaryStream, Filter/Compress, and many others. </p>

      <p>Another issue to address is unpack behaviour on big-endian vs little-endian. </p>

      1. Apr 18, 2012

        <p>Right – my point is to incorporate that information <em>in</em> the RFC. <ac:emoticon ac:name="smile" /></p>