Current version by Shahar Evron on Feb 26, 2012 10:37.

<p>This RFC attempts to consolidate and formalize several past discussions and efforts around refactoring Zend_Http_Client and the related classes. It is largely based on the now somewhat outdated Zend\Http\Client 2.0 proposal (<a href="http://framework.zend.com/wiki/display/ZFPROP/Zend_Http_Client+2.0+-+Shahar+Evron">http://framework.zend.com/wiki/display/ZFPROP/Zend_Http_Client+2.0+-+Shahar+Evron</a>), and on experiments and work done and committed to <a href="https://github.com/shevron/zf2/tree/feature/http-client-rewrite">https://github.com/shevron/zf2/tree/feature/http-client-rewrite</a>, as well as on discussions in the zf-contributors mailing list and IRC channel.</p>

<h2>Why Rewrite Zend_Http_Client?</h2>
<p>Over time, the Zend_Http_Client component in ZF 1.x (and 0.x) has grown beyond its originally simple design to support a multitude of different use cases, and several issues with its design have been raised again and again. The main ones are:</p>
<ul>
<li>Writing data from an HTTP server into a stream, or sending data from a stream, was an afterthought. As a result, it is inconvenient to use the HTTP client in settings where memory usage might be a problem.</li>
<li>The HTTP client is designed mainly to work with the default Socket adapter. The curl adapter (and potentially additional adapters) was an afterthought, and as a result adapters other than the socket adapter do too much work decomposing and reconstructing requests in a form suitable for their use.</li>
<li>There is no Request object, so there is no way to abstract and override the way HTTP requests are described to the transport adapter.</li>
<li>Handling of cookies is problematic (e.g. value encoding)</li>
</ul>

<p>In addition, some advancements in PHP have made it possible to offer additional features (a transport adapter based on pecl_http, support for multiple concurrent requests). Changes in Zend Framework's standards require some refactoring (introduction of DI patterns, switching to Options objects for configuration). Changes in the MVC layer demand better interoperability and code sharing between the MVC layer and the HTTP client stack.</p>

<p>For all of these reasons, the HTTP client stack appears to need refactoring into smaller, reusable and overridable components. While some old code can be reused, the changes will likely require an almost complete rewrite of the HTTP client.</p>

<h2>Proposed Architecture for the HTTP client stack</h2>
<p>The following new architecture is proposed. This builds on work already done for the MVC layer:</p>

<h3>HTTP Request and response objects</h3>
<ul>
<li>Requests and responses are represented by the Http\Request and Http\Response classes.</li>
<li>All properties specific to the PHP environment (e.g. representation of the $_SERVER, $_ENV and $_FILES super-globals) will be moved to subclasses of Request and Response under the Zend\Http\PhpEnvironment namespace (this is partially done, but needs more work)</li>
<li>HTTP headers are represented by Zend\Http\Header objects. These objects are responsible for encoding and decoding headers and implementing header-specific logic such as parsing and understanding quality factor attributes and Cookie parameters</li>
<li>HTTP request and response body (referred to as &quot;Entity&quot; in the HTTP RFC) will be represented by Zend\Http\Entity objects. This is done so that request and response objects could contain entities more complex than a simple in-memory string: an open stream, a multipart message containing both in-memory strings and file streams (e.g. a multipart/form-data based file upload), etc. For backwards compatibility reasons and to simplify things, representing the HTTP message entity as a simple string will still be possible.</li>
<li>By default, a 'php://temp' stream based entity object will be used when receiving responses. This allows decent performance for small responses, with a fallback to temporary files on disk for memory-hungry responses. Users can proactively set the response entity class before sending a request if they anticipate a response of a particular nature.</li>
</ul>
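<p>The entity idea above can be illustrated with a minimal sketch. It is not the proposed implementation: <code>EntityInterface</code>, <code>StringEntity</code>, <code>TempStreamEntity</code> and all of their methods are hypothetical names chosen for this example. The only real PHP feature relied on is the <code>php://temp/maxmemory:N</code> stream wrapper, which keeps up to N bytes in memory and then spills to a temporary file.</p>

```php
<?php
// Hypothetical sketch: none of these class or method names come from Zend\Http.

interface EntityInterface
{
    /** Return the next chunk of the body, or '' once it is exhausted. */
    public function read(int $length = 8192): string;

    /** Return the whole body as one string (may be memory hungry). */
    public function toString(): string;
}

/** Body held entirely in memory: the simple, backwards-compatible case. */
final class StringEntity implements EntityInterface
{
    private $data;
    private $offset = 0;

    public function __construct(string $data)
    {
        $this->data = $data;
    }

    public function read(int $length = 8192): string
    {
        $chunk = (string) substr($this->data, $this->offset, $length);
        $this->offset += strlen($chunk);
        return $chunk;
    }

    public function toString(): string
    {
        return $this->data;
    }
}

/** Body backed by php://temp, which moves to a temp file past a threshold. */
final class TempStreamEntity implements EntityInterface
{
    private $stream;

    public function __construct(int $maxMemoryBytes = 2097152)
    {
        $this->stream = fopen('php://temp/maxmemory:' . $maxMemoryBytes, 'r+');
    }

    /** Append a chunk; a transport would call this while receiving. */
    public function write(string $chunk): void
    {
        fwrite($this->stream, $chunk);
    }

    /** Move the read position back to the start of the body. */
    public function rewind(): void
    {
        rewind($this->stream);
    }

    public function read(int $length = 8192): string
    {
        $chunk = fread($this->stream, $length);
        return $chunk === false ? '' : $chunk;
    }

    public function toString(): string
    {
        rewind($this->stream);
        return stream_get_contents($this->stream);
    }
}

// Usage: write past the 1 KiB in-memory threshold, then read back in chunks.
$entity = new TempStreamEntity(1024);
$entity->write(str_repeat('a', 1500));
$entity->write('tail');
$entity->rewind();
$first = $entity->read(10);
```

<p>Both classes expose the same read interface, so code consuming a response body would not need to know whether it lives in memory or on disk.</p>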

<h3>HTTP transport layer adapters</h3>
<ul>
<li>These will replace Zend_Http_Client's 'Adapter' classes, and will be responsible for sending and receiving HTTP messages.</li>
<li>The interface for these transport adapters will be very simple: they will communicate via HTTP request and response objects, providing a single &quot;send&quot; method, which consumes a request and returns a response object. This design allows offloading a lot of the &quot;heavy lifting&quot; from the HTTP client, request and response classes into the transport layer, allowing reuse of capabilities implemented in PHP extensions such as curl and pecl_http for transport adapters that will be based on them. This design is similar to the new Zend\Mail\Transport design.</li>
<li>Some of the responsibilities taken on today by the client and response classes, such as decoding transport-layer encodings and compression, will be moved into the transport layer</li>
<li>For simple use cases, it will be possible to instantiate and use a Transport object to send a request and receive a response without instantiating an HTTP client object</li>
<li>Adapters supporting multiple/concurrent request handling will provide an additional sendMulti() or similar method, designating such support.</li>
<li>Adapters supporting connections through an HTTP proxy server will offer this capability as a configuration option - no special &quot;Proxy&quot; adapter will be provided.</li>
<li>By default a native sockets based adapter and a 'curl' based adapter will be provided, in addition to a 'Test' adapter that could be used to mock out the HTTP layer for Web service client unit testing purposes. The default transport object will be the 'Socket' one, as today.</li>
</ul>
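<p>A rough sketch of what such a transport contract might look like follows. Only the <code>send</code> method and the idea of a <code>sendMulti()</code>-style method and a 'Test' adapter come from the list above. The class names, constructor signatures and the <code>enqueue</code> helper are assumptions made for illustration.</p>

```php
<?php
// Hypothetical sketch: SketchRequest, SketchResponse and the interface names
// are illustrative stand-ins, not the real Zend\Http classes.

final class SketchRequest
{
    public $method;
    public $uri;
    public $headers;
    public $body;

    public function __construct(string $method, string $uri, array $headers = [], string $body = '')
    {
        $this->method = $method;
        $this->uri = $uri;
        $this->headers = $headers;
        $this->body = $body;
    }
}

final class SketchResponse
{
    public $status;
    public $headers;
    public $body;

    public function __construct(int $status, array $headers = [], string $body = '')
    {
        $this->status = $status;
        $this->headers = $headers;
        $this->body = $body;
    }
}

/** The whole contract: consume a request, return a response. */
interface TransportInterface
{
    public function send(SketchRequest $request): SketchResponse;
}

/** Optional capability for adapters that can run requests concurrently. */
interface MultiTransportInterface extends TransportInterface
{
    /** @param SketchRequest[] $requests @return SketchResponse[] */
    public function sendMulti(array $requests): array;
}

/** Test transport: replays queued responses and records requests, no network I/O. */
final class TestTransport implements TransportInterface
{
    private $queue = [];
    public $sent = [];

    public function enqueue(SketchResponse $response): void
    {
        $this->queue[] = $response;
    }

    public function send(SketchRequest $request): SketchResponse
    {
        $this->sent[] = $request;
        if ($this->queue === []) {
            throw new RuntimeException('No canned response queued');
        }
        return array_shift($this->queue);
    }
}

// Usage: a transport can be used directly, without any client object.
$transport = new TestTransport();
$transport->enqueue(new SketchResponse(200, ['Content-Type' => 'text/plain'], 'hello'));
$response = $transport->send(new SketchRequest('GET', 'http://example.com/'));
```

<p>Code can check <code>instanceof MultiTransportInterface</code> to discover whether an adapter supports concurrent requests, which is one possible reading of "designating such support".</p>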

<h3>The HTTP Client class</h3>
<ul>
<li>Much of the work done today by Zend_Http_Client will be moved to distinct components such as the Transport objects, Request and Response objects, Entity objects and CookieStore objects (see below)</li>
<li>The HTTP client will take on the responsibilities related to state maintenance between multiple requests, among others:
<ul>
<li>Handling HTTP redirect responses</li>
<li>Handling complex request procedures such as &quot;100 continue&quot; responses</li>
<li>Maintaining client-side cookie state (through the use of CookieStore objects)</li>
<li>Handling and maintaining the state of HTTP authentication (through the use of Authentication objects)</li>
<li>Simplifying the submission of multiple requests with similar properties such as common headers (automatically setting headers such as User-Agent, Date, Accept-* etc. on all requests passing through the client)</li>
</ul>
</li>
</ul>
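<p>To make the division of labour concrete, here is a small sketch of a client that adds default headers and follows redirects on top of an arbitrary transport. It is an illustration under simplifying assumptions, not the proposed design: requests and responses are plain arrays, the transport is any callable, <code>SketchClient</code> is a hypothetical name, header names are treated as case-sensitive, and <code>Location</code> is assumed to be an absolute URI.</p>

```php
<?php
// Hypothetical sketch: arrays stand in for Request/Response objects and a
// callable stands in for a Transport object, to keep the example short.

final class SketchClient
{
    private $transport;
    private $defaultHeaders;
    private $maxRedirects;

    public function __construct(callable $transport, array $defaultHeaders = [], int $maxRedirects = 5)
    {
        $this->transport = $transport;
        $this->defaultHeaders = $defaultHeaders;
        $this->maxRedirects = $maxRedirects;
    }

    public function send(array $request): array
    {
        // Array union: headers set on the request win over client-wide defaults.
        $request['headers'] = ($request['headers'] ?? []) + $this->defaultHeaders;

        for ($hops = 0; ; $hops++) {
            $response = ($this->transport)($request);

            $isRedirect = in_array($response['status'], [301, 302, 303, 307], true)
                && isset($response['headers']['Location']);
            if (!$isRedirect) {
                return $response;
            }
            if ($hops >= $this->maxRedirects) {
                throw new RuntimeException('Too many redirects');
            }

            // Simplification: Location is assumed to be an absolute URI.
            $request['uri'] = $response['headers']['Location'];
            if ($response['status'] === 303) {
                // "See Other" is followed with a body-less GET.
                $request['method'] = 'GET';
                unset($request['body']);
            }
        }
    }
}

// Usage with a fake transport that redirects /old to /new.
$log = [];
$transport = function (array $request) use (&$log): array {
    $log[] = $request['method'] . ' ' . $request['uri'];
    if ($request['uri'] === 'http://example.com/old') {
        return ['status' => 301, 'headers' => ['Location' => 'http://example.com/new'], 'body' => ''];
    }
    return ['status' => 200, 'headers' => [], 'body' => 'User-Agent: ' . $request['headers']['User-Agent']];
};

$client = new SketchClient($transport, ['User-Agent' => 'SketchClient/0.1']);
$response = $client->send(['method' => 'GET', 'uri' => 'http://example.com/old']);
```

<p>The transport never sees the redirect logic or the default headers as special cases. It only receives fully formed requests, which is the separation the list above describes.</p>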

<h3>HTTP CookieStore objects</h3>
<ul>
<li>The HTTP client will depend on a single CookieStore object to store and retrieve cookies received in responses and sent in requests. This is similar to today's CookieJar concept but will have a more modular design and will rely on Cookie and Set-Cookie header objects.</li>
<li>By default a simple in-memory storage CookieStore class will be provided - users will be able to create more complex CookieStore objects storing cookies in a database or on disk, or with slightly different rules for which cookies are accepted from responses and added back to requests passed through the CookieStore object by the HTTP client.</li>
</ul>
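<p>A minimal in-memory store along these lines might look like the sketch below. The interface, its method names and the <code>cookieHeaderValue</code> helper are assumptions for illustration. Matching is deliberately simplified: expiry, the Secure flag and host-only cookies are ignored, and path matching is a plain prefix test.</p>

```php
<?php
// Hypothetical sketch: interface and method names are illustrative only.

interface CookieStoreInterface
{
    /** Remember a cookie received in a response. */
    public function store(string $name, string $value, string $domain, string $path = '/'): void;

    /** Return name => value pairs that apply to the given host and path. */
    public function match(string $host, string $path): array;
}

class InMemoryCookieStore implements CookieStoreInterface
{
    private $cookies = [];

    public function store(string $name, string $value, string $domain, string $path = '/'): void
    {
        $domain = strtolower(ltrim($domain, '.'));
        // Same domain, path and name overwrites the previous value.
        $this->cookies[$domain . '|' . $path . '|' . $name] = compact('name', 'value', 'domain', 'path');
    }

    public function match(string $host, string $path): array
    {
        $host = strtolower($host);
        $matches = [];
        foreach ($this->cookies as $cookie) {
            $suffix = '.' . $cookie['domain'];
            $domainOk = $host === $cookie['domain']
                || substr($host, -strlen($suffix)) === $suffix;
            // Simplified path rule: the cookie path must be a prefix of the request path.
            $pathOk = strpos($path, $cookie['path']) === 0;
            if ($domainOk && $pathOk) {
                $matches[$cookie['name']] = $cookie['value'];
            }
        }
        return $matches;
    }
}

/** Build the value of a Cookie request header from name => value pairs. */
function cookieHeaderValue(array $cookies): string
{
    $pairs = [];
    foreach ($cookies as $name => $value) {
        $pairs[] = $name . '=' . $value;
    }
    return implode('; ', $pairs);
}

// Usage: store two cookies, then ask which apply to a request.
$store = new InMemoryCookieStore();
$store->store('sid', 'abc123', 'example.com');
$store->store('lang', 'en', '.example.com', '/docs');
$cookies = $store->match('www.example.com', '/docs/intro');
$header = cookieHeaderValue($cookies);
```

<p>A database-backed or stricter store would implement the same interface, or extend the in-memory class and override <code>match</code>, without the client needing to change.</p>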


<h3>HTTP Authentication objects</h3>
<ul>
<li>Responsible for maintaining HTTP client authentication state, and for adding relevant information (usually in headers) to HTTP requests sent out by the client to ensure they are properly authenticated</li>
<li>By default at least the &quot;Basic&quot; and &quot;Digest&quot; authentication schemes should be supported. Web service components could introduce vendor-specific authentication schemes through this mechanism.</li>
<li>These are still to be designed</li>
</ul>
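<p>Since these objects are still to be designed, the following is only one possible shape, shown for the "Basic" scheme. <code>AuthenticationInterface</code>, <code>BasicAuthentication</code> and the array-based request are hypothetical. The header format itself (the word <code>Basic</code> followed by the base64 encoding of <code>user:password</code>) is the standard one.</p>

```php
<?php
// Hypothetical sketch: the interface and class names are assumptions, and a
// plain array stands in for a Request object.

interface AuthenticationInterface
{
    /** Return a copy of the request with authentication information added. */
    public function authenticate(array $request): array;
}

final class BasicAuthentication implements AuthenticationInterface
{
    private $user;
    private $password;

    public function __construct(string $user, string $password)
    {
        $this->user = $user;
        $this->password = $password;
    }

    public function authenticate(array $request): array
    {
        // Basic scheme: base64 of "user:password" in the Authorization header.
        $credentials = base64_encode($this->user . ':' . $this->password);
        $request['headers']['Authorization'] = 'Basic ' . $credentials;
        return $request;
    }
}

// Usage: the client would pass each outgoing request through the object.
$auth = new BasicAuthentication('Aladdin', 'open sesame');
$request = $auth->authenticate([
    'method' => 'GET',
    'uri' => 'http://example.com/private',
    'headers' => [],
]);
```

<p>A Digest or vendor-specific scheme would need to see the server's challenge as well, so a real interface would probably also accept the previous response. That part is left out here.</p>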


<h2>Changes to current (ZF2.0 as of Feb 26th 2012) Http classes</h2>
<p>In addition to the architecture proposed above, some fixes and changes to the current Zend\Http classes provided in ZF 2.0 are proposed:</p>
<ul>
<li>Moving all PHP-Environment specific members to the PhpEnvironment namespace for requests and responses</li>
<li>Allowing more flexibility on HTTP request methods, response codes, header names etc. as per the HTTP RFC (today's implementation breaks the HTTP RFC by being over-strict, only accepting a whitelist of HTTP response codes and methods)</li>
<li>Providing an ability to quickly create and send simple HTTP requests via a simplified, domain-specific API, either in the Request class or the Client class (or both).</li>
</ul>
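<p>As an illustration of the second point, validation could follow the grammar in the HTTP RFC instead of a fixed list: a method is any "token" and a status code is any three digits. The sketch below assumes that reading. The function names are hypothetical and this is not the current Zend\Http code.</p>

```php
<?php
// Hypothetical sketch: function names are illustrative only.

/**
 * A method is valid if it is an RFC 2616 "token": one or more US-ASCII
 * characters excluding control characters and separators.
 */
function isValidMethod(string $method): bool
{
    return preg_match('/\A[!#$%&\'*+.^_`|~0-9A-Za-z-]+\z/', $method) === 1;
}

/** A status code is valid if it is exactly three digits (extension codes included). */
function isValidStatusCode($code): bool
{
    return preg_match('/\A[0-9]{3}\z/', (string) $code) === 1;
}

// Usage: extension methods and unlisted status codes are accepted,
// while syntactically invalid values are still rejected.
$checks = [
    'PROPFIND' => isValidMethod('PROPFIND'),
    'GET POST' => isValidMethod('GET POST'),
    '299'      => isValidStatusCode(299),
    '42'       => isValidStatusCode(42),
];
```

<p>With checks like these, a WebDAV method or a non-standard status code from a server passes through, and only values that could not appear in a well-formed message are refused.</p>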


<h2>External Links</h2>
<ul>
<li>The HTTP 1.1 RFC - <a href="http://www.w3.org/Protocols/rfc2616/rfc2616.html">http://www.w3.org/Protocols/rfc2616/rfc2616.html</a></li>
</ul>