<h1>RFC: Object Creation and Configuration</h1>

<ac:macro ac:name="note"><ac:parameter ac:name="title">RFC Status: For discussion</ac:parameter><ac:rich-text-body>
<p>This RFC is currently under discussion and is not complete.</p>

<p>Please comment on this RFC on the mailing list.</p></ac:rich-text-body></ac:macro>

<h2>The ZF1 State of Things:</h2>

<p>Over the development of the past few components in ZF2, we've been exploring different patterns that deal with object creation and configuration. For most PHP developers these two concerns go hand-in-hand, given the more common styles of coding. In ZF1, the most prevalent pattern is the unified constructor.</p>

<ac:macro ac:name="code"><ac:plain-text-body><![CDATA[
class Foo
{
    public function __construct($config = array()) {}
}
]]></ac:plain-text-body></ac:macro>

<h3>The obvious pros:</h3>

<ul>
<li>ease of use, since array() is the most well-known and versatile structure in PHP</li>
</ul>


<h3>The obvious cons:</h3>

<ul>
<li>it is hard to know which keys are valid for $config</li>
<li>it is hard to know which keys are required vs. optional in $config</li>
<li>the key naming convention is not standardized</li>
<li>docs cannot be generated from this prototype/signature by existing tools</li>
<li>it is hard to tell a scalar value from an object dependency (another object)</li>
</ul>


<h3>The not so obvious cons:</h3>

<ul>
<li>objects do not have a known identity: it is not clear which instantiation-time values distinguish one object from another.</li>
</ul>


<ul>
<li>objects throughout the framework are too concerned with the instantiation of other objects (dependencies) in non-obvious locations, such as inside a getter or a setter.</li>
</ul>


<h3>Side effects of objects trying to centralize configuration are:</h3>

<ul>
<li>Objects assume that they need to be configured early on but used later, which leads developers to add lazy loading of dependencies as a feature of the object itself. This has the side effect of pushing the creation of other objects into a getter or a setter in some form.
<ul>
<li>(Lazy loading of dependencies should only be done by objects that are computationally expensive and/or part of the object &quot;graph building&quot; strategy)</li>
</ul>
</li>
</ul>


<h3>Important things to remember:</h3>

<ul>
<li>constructors are not subject to the Liskov Substitution Principle (even though PHP allows __construct() in an interface, having it there is considered bad practice and should be avoided anyway)</li>
</ul>


<h4>What does this mean? </h4>

<p>It means that any subclass should be allowed to change the signature of its constructor as the requirements of the sub-type dictate. Since sub-types can change their constructor to suit their own requirements, forcing them to comply with a parent's __construct($config = array()) should generally be considered a bad practice.</p>
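<p>As a minimal sketch of this point, the following shows a sub-type declaring a constructor with a different signature from its parent's. The class names Connection and PooledConnection are hypothetical and exist only for illustration:</p>

<ac:macro ac:name="code"><ac:plain-text-body><![CDATA[
// Hypothetical classes, for illustration only.
class Connection
{
    protected $dsn;

    public function __construct($dsn)
    {
        $this->dsn = $dsn;
    }

    public function getDsn()
    {
        return $this->dsn;
    }
}

class PooledConnection extends Connection
{
    protected $poolSize;

    // A different signature than the parent's constructor: PHP does not
    // enforce constructor compatibility between a class and its parent
    // (unless the constructor is declared abstract or in an interface).
    public function __construct($dsn, $poolSize)
    {
        parent::__construct($dsn);
        $this->poolSize = (int) $poolSize;
    }

    public function getPoolSize()
    {
        return $this->poolSize;
    }
}

$connection = new PooledConnection('mysql:host=localhost', 5);
]]></ac:plain-text-body></ac:macro>

<p>Had both classes been forced to accept a single $config array, the extra required value would be invisible in the signature.</p>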

<h2>What is the proposal?</h2>

<ul>
<li>Well-named factories should be used, together with constructors that describe an object's hard dependencies / required values and its optional dependencies.</li>
<li><strong>Objects with no hard or soft dependencies would not have constructors.</strong></li>
</ul>


<p>This means that if an object must have a name, then the constructor should be:</p>

<ac:macro ac:name="code"><ac:plain-text-body><![CDATA[
class Foo
{
    public function __construct($name, $value = null) {}
    public function setValue($value) {}
}
]]></ac:plain-text-body></ac:macro>

<ul>
<li>Factories should describe the source being used for object creation, for example:</li>
</ul>


<ac:macro ac:name="code"><ac:plain-text-body><![CDATA[
Baz::fromArray(array $array);
Baz::fromConfig(Zend\Config\Config $config);
Baz::fromString($string);

// used in Zend\Code
Baz::fromReflection(ReflectionFile $reflection);
// etc.
]]></ac:plain-text-body></ac:macro>

<p>The from&lt;source&gt;() pattern should only be used when these methods exist within the class/type being constructed.</p>

<p>This pattern is well defined on Wikipedia; see the &quot;Descriptive names&quot; section: <a class="external-link" href="http://en.wikipedia.org/wiki/Factory_method_pattern">http://en.wikipedia.org/wiki/Factory_method_pattern</a></p>

<p>It is understood that <strong>all</strong> factories within a given object will always produce the type used at call time. This is achieved through late static binding (LSB), introduced in PHP 5.3, so the factory also applies to subtypes:</p>

<ac:macro ac:name="code"><ac:plain-text-body><![CDATA[
public static function fromArray(array $array)
{
    // constructor param setup from array

    // static will always apply to extending classes
    $obj = new static(/* req. params */);

    // other wiring from array
    return $obj; // will always return subtype
}
]]></ac:plain-text-body></ac:macro>

<ac:macro ac:name="note"><ac:rich-text-body>
<p>This takes advantage of PHP's class-level visibility: the factories can interact with an instance's protected properties without having to go through accessors/mutators.</p></ac:rich-text-body></ac:macro>

<p>Example:</p>

<ac:macro ac:name="code"><ac:plain-text-body><![CDATA[
class Foo
{
    protected $value;

    public static function fromArray($array)
    {
        $obj = new static;

        // interact with protected member
        $obj->value = $array['value'];
        return $obj;
    }

    public function getValue()
    {
        return $this->value;
    }
}
]]></ac:plain-text-body></ac:macro>
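<p>To see the late static binding behaviour from the calling side, here is a small sketch in which a subtype inherits the factory unchanged and still gets an instance of itself back. Message and HtmlMessage are hypothetical names used only for this illustration:</p>

<ac:macro ac:name="code"><ac:plain-text-body><![CDATA[
// Hypothetical classes, for illustration only.
class Message
{
    protected $body;

    public static function fromArray(array $array)
    {
        // "static" resolves to the class the factory was called on
        $obj = new static;
        $obj->body = isset($array['body']) ? $array['body'] : null;
        return $obj;
    }

    public function getBody()
    {
        return $this->body;
    }
}

// The subtype does not redeclare the factory.
class HtmlMessage extends Message
{
}

$plain = Message::fromArray(array('body' => 'hello'));
$html  = HtmlMessage::fromArray(array('body' => '<p>hello</p>'));

echo get_class($plain), "\n"; // Message
echo get_class($html), "\n";  // HtmlMessage
]]></ac:plain-text-body></ac:macro>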

<ul>
<li>Dynamic/object factories will be allowed when one object is creating objects of a different type. These methods should <strong>NOT</strong> be static. The name of this factory object should contain the name 'Factory', for example:</li>
</ul>


<ac:macro ac:name="code"><ac:plain-text-body><![CDATA[
class FooFactory
{
    public function createBarFromArray(array $a)
    {
        // return type Bar
    }

    public function createBarFromConfig(Zend\Config\Config $b)
    {
        // return type Bar
    }
}
]]></ac:plain-text-body></ac:macro>

<p>The reasoning for preferring a factory object over a class full of static factory methods is this: if one has opted for a dynamic factory, the factory is doing some amount of configuration or state tracking (for example, using a short-name-based plugin loader). It is important that this state not be static, so that other consumers of the factory have a fair chance at having a &quot;default&quot; factory object.</p>

<ac:macro ac:name="note"><ac:rich-text-body><p>This model should only be used in complex instantiation scenarios.</p></ac:rich-text-body></ac:macro>
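<p>The following is a rough sketch of why instance state matters for a dynamic factory. It assumes a hypothetical AdapterFactory that keeps a short-name-to-class map as instance state. None of these class or method names come from the framework:</p>

<ac:macro ac:name="code"><ac:plain-text-body><![CDATA[
// Hypothetical classes, for illustration only.
interface Adapter {}
class FileAdapter implements Adapter {}
class MemoryAdapter implements Adapter {}
class NullAdapter implements Adapter {}

class AdapterFactory
{
    // Instance state (not static): short name => class name
    protected $plugins = array(
        'file'   => 'FileAdapter',
        'memory' => 'MemoryAdapter',
    );

    public function registerPlugin($shortName, $className)
    {
        $this->plugins[strtolower($shortName)] = $className;
        return $this;
    }

    public function createAdapterFromString($shortName)
    {
        $key = strtolower($shortName);
        if (!isset($this->plugins[$key])) {
            throw new InvalidArgumentException("Unknown adapter: $shortName");
        }
        $class = $this->plugins[$key];
        return new $class();
    }
}

$default = new AdapterFactory();
$custom  = new AdapterFactory();
$custom->registerPlugin('null', 'NullAdapter');

// Only the customized factory knows the 'null' short name;
// $default still behaves as a pristine "default" factory.
$adapter = $custom->createAdapterFromString('null');
]]></ac:plain-text-body></ac:macro>

<p>If the plugin map were static, the registerPlugin() call would leak into every other consumer of the factory.</p>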

<ul>
<li>Factories are capable of calling factories of a similar source type. For example, if Foo::fromArray($array) is called and the key 'bar' is present in $array, where Foo has a setBar(Bar $bar) method and Bar::fromArray() exists, Foo::fromArray() would use Bar::fromArray() to instantiate a Bar from the value of the 'bar' key. This solves the problem of nested configuration/arrays that model the configuration of an object graph.</li>
</ul>
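<p>A minimal sketch of one factory delegating to another for a nested array might look like the following. The 'label' key and the getter methods are assumptions added for the illustration:</p>

<ac:macro ac:name="code"><ac:plain-text-body><![CDATA[
// Hypothetical implementations of Foo and Bar, for illustration only.
class Bar
{
    protected $label;

    public static function fromArray(array $array)
    {
        $bar = new static;
        $bar->label = isset($array['label']) ? $array['label'] : null;
        return $bar;
    }

    public function getLabel()
    {
        return $this->label;
    }
}

class Foo
{
    protected $bar;

    public static function fromArray(array $array)
    {
        $foo = new static;
        if (isset($array['bar'])) {
            $bar = $array['bar'];
            // Delegate to the factory of the same source type
            // unless a ready-made Bar was supplied.
            $foo->setBar(($bar instanceof Bar) ? $bar : Bar::fromArray($bar));
        }
        return $foo;
    }

    public function setBar(Bar $bar)
    {
        $this->bar = $bar;
    }

    public function getBar()
    {
        return $this->bar;
    }
}

$foo = Foo::fromArray(array(
    'bar' => array('label' => 'nested'),
));

echo $foo->getBar()->getLabel(), "\n"; // nested
]]></ac:plain-text-body></ac:macro>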


<ul>
<li>Factories should throw exceptions when not enough information is provided.</li>
</ul>


<ul>
<li>Objects should be completely valid and ready to do their job after instantiation</li>
</ul>


<ul>
<li>All required dependencies should be fulfilled at instantiation time</li>
</ul>


<ul>
<li>The special factory createDefaultInstance() should create a poka-yoke (mistake-proof) instance with all dependencies pre-configured with sane defaults. For example:</li>
</ul>


<ac:macro ac:name="code"><ac:plain-text-body><![CDATA[
class Foo
{
    public static function createDefaultInstance()
    {
        return new static(new Bar, new Baz);
    }

    public function __construct(Bar $bar, Baz $baz) {}
}
]]></ac:plain-text-body></ac:macro>

<h2>Concerns Left To Other Components</h2>

<ul>
<li>Lazy loading is not something any one object should be concerned with. Within an application, lazy loading can be achieved by using a Service Locator. In other environments, it can also be solved by using a Dependency Injection container. See the above note on the special &quot;createDefaultInstance()&quot; factory.</li>
</ul>
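<p>To illustrate how lazy loading can live outside the object itself, here is a deliberately small service locator sketch. SimpleServiceLocator and Mailer are hypothetical and are not Zend Framework APIs; the $created counter exists only to make the deferred instantiation visible:</p>

<ac:macro ac:name="code"><ac:plain-text-body><![CDATA[
// Hypothetical minimal locator, for illustration only.
class SimpleServiceLocator
{
    protected $factories = array();
    protected $instances = array();

    // Register a callable that knows how to build the service.
    public function set($name, $factory)
    {
        $this->factories[$name] = $factory;
        return $this;
    }

    // Build the service on first request, then reuse the instance.
    public function get($name)
    {
        if (!isset($this->instances[$name])) {
            if (!isset($this->factories[$name])) {
                throw new InvalidArgumentException("No service named $name");
            }
            $this->instances[$name] = call_user_func($this->factories[$name], $this);
        }
        return $this->instances[$name];
    }
}

class Mailer
{
    public static $created = 0;

    public function __construct()
    {
        self::$created++;
    }
}

$services = new SimpleServiceLocator();
$services->set('mailer', function ($services) {
    return new Mailer();
});

echo Mailer::$created, "\n"; // 0, nothing instantiated yet
$mailer = $services->get('mailer');
echo Mailer::$created, "\n"; // 1, created on first request
$services->get('mailer');
echo Mailer::$created, "\n"; // 1, the same instance is reused
]]></ac:plain-text-body></ac:macro>

<p>The Mailer class itself contains no lazy-loading logic; the deferral is entirely the locator's concern.</p>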


<h2>Configuration Keys:</h2>

<p>Since array-based factories are localized and not spread out across the class, the source for the keys is localized as well, which means we can combine techniques to automate finding and understanding the valid keys. First, the keys can be documented inside a docblock. Second, that docblock can be scanned by a docblock scanner and then formatted for use in API docs, manual docs, etc. Here is an example of such code:</p>

<ac:macro ac:name="code"><ac:plain-text-body><![CDATA[
namespace Foo;

class Bar
{
    protected $name;
    protected $value;
    protected $baz;

    /**
     * @configkey name string [required] Some name
     * @configkey value int Some Value
     * @configkey baz Baz|array A baz object
     *
     * @static
     * @throws Exception\InvalidArgumentException
     * @param array $array
     * @return Bar
     */
    public static function fromArray(array $array)
    {
        if (!isset($array['name'])) {
            throw new Exception\InvalidArgumentException('Bar requires that a name is provided for this object');
        }
        $bar = new static($array['name']);
        foreach ($array as $name => $value) {
            // normalize key
            switch (strtolower(str_replace(array('.', '-', '_'), '', $name))) {
                case 'value':
                    $bar->setValue($value);
                    break;
                case 'baz':
                    // accept either a ready-made Baz or an array describing one
                    $bar->setBaz(($value instanceof Baz) ? $value : Baz::fromArray($value));
                    break;
            }
        }
        return $bar;
    }

    public function __construct($name)
    {
        $this->name = $name;
    }

    public function setValue($value)
    {
        $this->value = $value;
    }

    public function setBaz(Baz $baz)
    {
        $this->baz = $baz;
    }
}
]]></ac:plain-text-body></ac:macro>
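<p>The scanning step could be sketched roughly as follows, using PHP's reflection API to read the docblock and a regular expression to pull out the @configkey tags. The scanConfigKeys() helper, the Widget class, and the assumed tag layout (name, type, optional [required], description) are illustrative assumptions, not an existing tool:</p>

<ac:macro ac:name="code"><ac:plain-text-body><![CDATA[
// Hypothetical example class; only its docblock matters here.
class Widget
{
    /**
     * @configkey name string [required] Some name
     * @configkey value int Some value
     *
     * @param array $array
     * @return Widget
     */
    public static function fromArray(array $array)
    {
        return new static;
    }
}

/**
 * Hypothetical helper: extract @configkey tags from a factory docblock.
 * Assumed tag layout: @configkey <name> <type> [required] <description>
 */
function scanConfigKeys($class, $method = 'fromArray')
{
    $reflection = new ReflectionMethod($class, $method);
    $docblock   = $reflection->getDocComment();
    if ($docblock === false) {
        return array(); // no docblock available
    }

    $pattern = '/@configkey[ \t]+(\S+)[ \t]+(\S+)[ \t]*(\[required\])?[ \t]*(.*)$/m';
    preg_match_all($pattern, $docblock, $matches, PREG_SET_ORDER);

    $keys = array();
    foreach ($matches as $match) {
        $keys[$match[1]] = array(
            'type'        => $match[2],
            'required'    => $match[3] !== '',
            'description' => trim($match[4]),
        );
    }
    return $keys;
}

$keys = scanConfigKeys('Widget');
echo implode(', ', array_keys($keys)), "\n"; // name, value
]]></ac:plain-text-body></ac:macro>

<p>The resulting array could then be handed to whatever generates the API or manual documentation.</p>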