Zend_Cloud_Document is a component in the Simple Cloud API that provides a common API for cloud application table storage services, such as Amazon SimpleDB and Windows Azure Table Storage. The design goals of this API include simplicity, portability across cloud table storage services, stability, and extensibility for future expansion.
3. Component Requirements, Constraints, and Acceptance Criteria
This component must support creating collections.
This component must support deleting collections.
This component must support listing all documents in a collection.
This component must support inserting documents into a collection.
This component must support updating documents in a collection.
This component must support deleting documents from a collection.
This component must support querying for documents in a collection.
This component must support querying of documents from one collection in a query.
It would be nice if this component supported querying of documents from two or more collections in a query.
This component must support filtering of documents in a query.
This component must support limiting of results in a query based on simple lexicographical comparisons.
This component must support sorting on one field in a query.
It would be nice if this component supported sorting by more than one field in a query.
It would be nice if this component supported transactions.
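The query requirements above (one collection, filters, a result limit, and a single sort field) could be modeled roughly as follows. This is a minimal sketch, not the proposed API: Sketch_Query, sketch_matches(), and sketch_run_query() are hypothetical names, the operator set is invented, and it assumes that "simple lexicographical comparisons" means comparing field values as strings.

```php
<?php
// Hypothetical sketch: Sketch_Query and sketch_run_query() are invented names,
// not part of the proposed Zend_Cloud_Document API.

class Sketch_Query
{
    public $collection = null;
    public $filters = array();   // each entry: array(field, operator, value)
    public $orderField = null;   // at most one sort field
    public $limit = null;

    public function from($collection)
    {
        $this->collection = $collection;
        return $this;
    }

    public function where($field, $operator, $value)
    {
        $this->filters[] = array($field, $operator, (string) $value);
        return $this;
    }

    public function order($field)
    {
        $this->orderField = $field;
        return $this;
    }

    public function limit($count)
    {
        $this->limit = (int) $count;
        return $this;
    }
}

// True when the document's fields satisfy every filter (filters are ANDed).
function sketch_matches(array $fields, array $filters)
{
    foreach ($filters as $filter) {
        list($field, $operator, $value) = $filter;
        if (!array_key_exists($field, $fields)) {
            return false;
        }
        // Compare as strings (assumed lexicographical semantics).
        $cmp = strcmp((string) $fields[$field], $value);
        switch ($operator) {
            case '=': $ok = ($cmp === 0); break;
            case '<': $ok = ($cmp < 0); break;
            case '>': $ok = ($cmp > 0); break;
            default:
                throw new InvalidArgumentException("Unsupported operator: $operator");
        }
        if (!$ok) {
            return false;
        }
    }
    return true;
}

// $store maps collection name => (document id => fields).
function sketch_run_query(array $store, Sketch_Query $query)
{
    $documents = isset($store[$query->collection]) ? $store[$query->collection] : array();
    $matches = array();
    foreach ($documents as $id => $fields) {
        if (sketch_matches($fields, $query->filters)) {
            $matches[$id] = $fields;
        }
    }
    if ($query->orderField !== null) {
        $field = $query->orderField;
        uasort($matches, function ($a, $b) use ($field) {
            $left  = isset($a[$field]) ? (string) $a[$field] : '';
            $right = isset($b[$field]) ? (string) $b[$field] : '';
            return strcmp($left, $right);
        });
    }
    if ($query->limit !== null) {
        $matches = array_slice($matches, 0, $query->limit, true);
    }
    return $matches;
}

$store = array('users' => array(
    'u1' => array('name' => 'carol', 'age' => '31'),
    'u2' => array('name' => 'alice', 'age' => '25'),
    'u3' => array('name' => 'bob',   'age' => '42'),
));

$query = new Sketch_Query();
$query->from('users')->where('age', '>', '30')->order('name')->limit(1);
$result = sketch_run_query($store, $query);   // only 'u3' (bob) remains
```

Because values are compared as strings, numeric fields would need zero-padding to sort and filter as numbers; a real adapter would have to document or hide that detail.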
4. Dependencies on Other Framework Components
Zend_Exception
Cloud document database client libraries under the Zend_Service namespace.
5. Theory of Operation
Although there are some differences in the document database APIs offered by cloud vendors, there is a set of common runtime operations that can be supported in one API. Note that unique value-add offerings such as transaction support in Azure's API are not must-have features.
The API follows a procedural model rather than an object-oriented one, so that it maps more easily onto services exposed through RESTful, SOAP, and RPC interfaces. Such an API should add negligible performance overhead, and users can still build their own object-oriented data access libraries on top of the simple methods it exposes.
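To illustrate the layering described here, the following sketch puts a small object-oriented repository on top of a procedural-style adapter. All names (Sketch_DocumentAdapter, Sketch_ArrayAdapter, Sketch_User, Sketch_UserRepository) and method signatures are hypothetical, and the in-memory adapter merely stands in for a real cloud service client.

```php
<?php
// Hypothetical sketch: every class and method name below is invented for
// illustration and is not taken from the proposal.

// Procedural-style adapter: plain method calls on arrays, no domain objects.
interface Sketch_DocumentAdapter
{
    public function insertDocument($collection, $id, array $fields);
    public function listDocuments($collection);
}

// In-memory stand-in for a cloud service client.
class Sketch_ArrayAdapter implements Sketch_DocumentAdapter
{
    private $collections = array();   // collection name => (document id => fields)

    public function insertDocument($collection, $id, array $fields)
    {
        $this->collections[$collection][$id] = $fields;
    }

    public function listDocuments($collection)
    {
        return isset($this->collections[$collection])
            ? $this->collections[$collection]
            : array();
    }
}

// A user-defined domain object.
class Sketch_User
{
    public $id;
    public $name;

    public function __construct($id, $name)
    {
        $this->id = $id;
        $this->name = $name;
    }
}

// The object-oriented layer a user might build on top of the adapter.
class Sketch_UserRepository
{
    private $adapter;

    public function __construct(Sketch_DocumentAdapter $adapter)
    {
        $this->adapter = $adapter;
    }

    public function save(Sketch_User $user)
    {
        $this->adapter->insertDocument('users', $user->id, array('name' => $user->name));
    }

    public function all()
    {
        $users = array();
        foreach ($this->adapter->listDocuments('users') as $id => $fields) {
            $users[] = new Sketch_User($id, $fields['name']);
        }
        return $users;
    }
}

$repository = new Sketch_UserRepository(new Sketch_ArrayAdapter());
$repository->save(new Sketch_User('u1', 'alice'));
$users = $repository->all();
```

The adapter only ever sees collection names, ids, and field arrays, so swapping in a different service would not touch the repository or the domain class.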
This proposal covers only components implemented for Zend Framework, but these interfaces and classes have been designed to be implementable in other languages and/or package structures.
6. Milestones / Tasks
Milestone 1: [DONE] Simple Cloud API project launched.
Milestone 2: Community review sufficient to submit proposal for recommendation.
Milestone 3: Implement API and adapters in Zend Framework trunk.
Milestone 4: Document API.
Milestone 5: Address issues as necessary for production release.
7. Class Index
Zend_Cloud_DocumentService
Zend_Cloud_Document_Document
Zend_Cloud_Document_Query
Zend_Cloud_Document_Factory
Zend_Cloud_Document_Exception
8. Use Cases
UC-01 - Get instance of adapter
Call static method on factory class
Use it
UC-02 - Create collection
UC-01 - Get instance of adapter
Call create collection with name
UC-03 - Delete collection
UC-01 - Get instance of adapter
Call delete collection with name
UC-04 - List collections
UC-01 - Get instance of adapter
Call list collections
UC-05 - Insert document
UC-01 - Get instance of adapter
Call insert document with document id, collection name, and field names=>values
UC-06 - Update document
UC-01 - Get instance of adapter
Call update document with document id, collection name, and field names=>values
UC-07 - Delete document
UC-01 - Get instance of adapter
Call delete document with document id and collection name
UC-08 - Query documents
UC-01 - Get instance of adapter
Call query with collection name, filters, sorting fields, and result limits
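The use cases above can be walked through end to end against an in-memory stand-in. This is only a sketch under stated assumptions: Sketch_Factory, Sketch_MemoryDocumentService, the 'adapter_class' option, the argument order, and the equality-only filters are all hypothetical and do not describe the classes listed in the Class Index.

```php
<?php
// Hypothetical in-memory stand-in. Sketch_Factory, Sketch_MemoryDocumentService,
// the 'adapter_class' option, and all method signatures are assumptions.

class Sketch_MemoryDocumentService
{
    private $collections = array();   // collection name => (document id => fields)

    public function createCollection($name)
    {
        if (!isset($this->collections[$name])) {
            $this->collections[$name] = array();
        }
    }

    public function deleteCollection($name)
    {
        unset($this->collections[$name]);
    }

    public function listCollections()
    {
        return array_keys($this->collections);
    }

    public function insertDocument($collection, $id, array $fields)
    {
        $this->requireCollection($collection);
        $this->collections[$collection][$id] = $fields;
    }

    public function updateDocument($collection, $id, array $fields)
    {
        $this->requireCollection($collection);
        if (!isset($this->collections[$collection][$id])) {
            throw new RuntimeException("No such document: $id");
        }
        // Merge so that fields not named in the update keep their old values.
        $this->collections[$collection][$id] =
            array_merge($this->collections[$collection][$id], $fields);
    }

    public function deleteDocument($collection, $id)
    {
        $this->requireCollection($collection);
        unset($this->collections[$collection][$id]);
    }

    // $filters is field => value; a document matches when every pair is equal.
    // Assumes every matching document carries the sort field.
    public function query($collection, array $filters = array(), $sortField = null, $limit = null)
    {
        $this->requireCollection($collection);
        $matches = array();
        foreach ($this->collections[$collection] as $id => $fields) {
            if (count(array_intersect_assoc($filters, $fields)) === count($filters)) {
                $matches[$id] = $fields;
            }
        }
        if ($sortField !== null) {
            uasort($matches, function ($a, $b) use ($sortField) {
                return strcmp((string) $a[$sortField], (string) $b[$sortField]);
            });
        }
        return $limit === null ? $matches : array_slice($matches, 0, $limit, true);
    }

    private function requireCollection($name)
    {
        if (!isset($this->collections[$name])) {
            throw new RuntimeException("No such collection: $name");
        }
    }
}

class Sketch_Factory
{
    // UC-01: a static factory method that returns an adapter instance.
    public static function getAdapter(array $options)
    {
        $class = $options['adapter_class'];
        return new $class();
    }
}

$service = Sketch_Factory::getAdapter(array('adapter_class' => 'Sketch_MemoryDocumentService'));
$service->createCollection('users');                                   // UC-02
$service->createCollection('scratch');
$service->deleteCollection('scratch');                                 // UC-03
$collections = $service->listCollections();                            // UC-04
$service->insertDocument('users', 'u1', array('name' => 'alice', 'city' => 'Oslo'));   // UC-05
$service->insertDocument('users', 'u2', array('name' => 'bob',   'city' => 'Oslo'));
$service->insertDocument('users', 'u3', array('name' => 'carol', 'city' => 'Rome'));
$service->updateDocument('users', 'u2', array('city' => 'Rome'));      // UC-06
$service->deleteDocument('users', 'u1');                               // UC-07
$found = $service->query('users', array('city' => 'Rome'), 'name', 1); // UC-08
```

Every call after UC-01 goes through the same adapter instance, which is what lets application code stay unchanged when the factory is configured for a different service.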
9 Comments
Sep 30, 2009
David Jung
<p>What is the reasoning behind the choice of "Document" for this interface? </p>
<p>I'm not even sure what you mean by that, but looking at the API my uncertain guess would be that Documents are meant to correspond to AWS SimpleDB Items, is that right? In Azure Table terminology they're Entities (and MS even refers to them as Rows sometimes).</p>
<p>If that's the intention, we have something like:</p>
<table><tbody>
<tr>
<th><p>SimpleDB</p></th>
<th><p>AzureTable</p></th>
<th><p>SimpleCloudAPI</p></th>
<th><p>RDBMS</p></th>
</tr>
<tr>
<td><p>domain</p></td>
<td><p>table</p></td>
<td><p>collection</p></td>
<td><p>table</p></td>
</tr>
<tr>
<td><p>item</p></td>
<td><p>entity</p></td>
<td><p>document</p></td>
<td><p>row</p></td>
</tr>
<tr>
<td><p>attribute (multi-valued)</p></td>
<td><p>property (rich value)</p></td>
<td><p>?</p></td>
<td><p>column (single-valued)</p></td>
</tr>
</tbody></table>
<p>My feeling is that the "Document" terminology will be confusing. Although many advocate steering clear of RDBMS terminology to make it clear these services don't behave like an RDBMS at all, I think most developers who will use this API will have already had some contact with an RDBMS and will find it much easier to 'map' their concepts across if the terminology is similar (e.g. collection|table, row, column) while just making the non-RDBMS semantics clear. It will be difficult to confuse an API like this with an RDBMS anyway - since there are obviously lots of things that RDBMSs have that this doesn't - such as complex queries, joins, counting and sorting perhaps.</p>
<p>My only other comment is that the API should be refined/abstracted a bit more for queries. Just having a query() method that passes the query through to the underlying service doesn't really provide much abstraction at all. Perhaps some minimal structure for query specification, like 'table' name, set of where 'AND' conditions, sort order spec if supported - things like that. A set of classes implementing a LINQ-like interface might be nice and doable, but probably goes beyond the 'simple' designation :)</p>
<p>I do still agree that an 'escape hatch' through to the underlying service is required - but not as something that needs to be used for almost all use-cases. The API should also describe some lowest-common-denominator semantics, such as BASE semantics, while allowing querying for particular capabilities - such as AzureTable's ACID semantics for single entities within a single table, SimpleDB's domain data-size and item-count limits etc.</p>
<p>Will be nice to have this solidified so we can start writing to it! (at least as a version 1.0 which can continue to be supported while 2.0 is debated). :)</p>
Oct 05, 2009
Wil Sinclair
<p>You're correct about the terminology. The full terminology mapping is:</p>
<table><tbody>
<tr>
<th><p>SimpleDB</p></th>
<th><p>AzureTable</p></th>
<th><p>SimpleCloudAPI</p></th>
<th><p>RDBMS</p></th>
</tr>
<tr>
<td><p>domain</p></td>
<td><p>table</p></td>
<td><p>collection</p></td>
<td><p>table</p></td>
</tr>
<tr>
<td><p>item</p></td>
<td><p>entity</p></td>
<td><p>document</p></td>
<td><p>row</p></td>
</tr>
<tr>
<td><p>attribute (multi-valued)</p></td>
<td><p>property (rich value)</p></td>
<td><p>field</p></td>
<td><p>column (single-valued)</p></td>
</tr>
</tbody></table>
<p>As far as I'm aware, there isn't an agreed-upon name for these services. "Document" reflects their document-oriented nature. The other names are the ones we felt were the most descriptive. We're certainly open to other suggestions, however.</p>
<p>There is a query class in this proposal, although it is definitely a tricky part of the API. Should we only allow certain clauses and constructs in the query class and enforce them there, or should the query class simply contain strings for each clause? The first is harder to implement, more bug-prone, but maintains a clear picture of portability for the user. The second is easier to implement, thinner (and therefore less bug-prone), but opens up a lot of lock-in possibilities similar to arbitrary SQL strings in code.</p>
<p>I'm getting the code cleaned up for check in. :)</p>
Oct 05, 2009
Tim Fountain
<p>Like David, I'm a little confused by the terminology here. I have a ZF app already using Rackspace's cloud files API for storing/delivering CDN content (using my own quick 'n dirty cloud files API), so I would be keen to replace this with an 'official' implementation. But for using Cloud Files would I be using this component (Zend_Cloud_Document) or Zend_Cloud_Storage? Or some combination of the two?</p>
<p>Let's say I have a video I want to store on the Cloud Files CDN, does the video become the 'document' in this case? </p>
<p>I'm sure the use cases will clear things up a bit more! This is a great initiative though, I'm looking forward to seeing it progress.</p>
Oct 05, 2009
Wil Sinclair
<p>Unfortunately, the community hasn't settled on a name for this type of database. I'm open to a name change, but all of the names I could think of were not exactly right for one reason or the other.</p>
<p>You would use Zend_Cloud_Storage for Cloud Files.</p>
<p>Have you checked out this lib: <a class="external-link" href="http://github.com/rackspace/php-cloudfiles">http://github.com/rackspace/php-cloudfiles</a>?</p>
<p>,Wil</p>
Oct 05, 2009
Tim Fountain
<p>Yes, I've seen that library (and that's the one Rackspace publicise on their site). I wrote my own to integrate a little better with my ZF app (e.g. it uses Zend_Http_Client), as I had a few custom requirements. Thanks for clearing things up though!</p>
Oct 06, 2009
Wil Sinclair
<p>Rackspace has actually expressed some interest in contributing their libraries to Zend Framework. They evaluated Zend_Http_Client and had some issues with it. Would you be interested in talking to their PHP client developer about how you hooked it all up?</p>
<p>,Wil</p>
Nov 23, 2009
Wil Moore III (wilmoore)
<p>Hi Wil,</p>
<p>I had actually started working on a proposal for this back in July: <a class="external-link" href="http://framework.zend.com/wiki/display/ZFPROP/Zend_Service_Rackspace_CloudFiles+-+Wil+Moore+III">http://framework.zend.com/wiki/display/ZFPROP/Zend_Service_Rackspace_CloudFiles+-+Wil+Moore+III</a></p>
<p>My effort halted due to client work and 9-5 work increasing. I expected that load to decrease; however, it has not. That being said, I am going to withdraw my proposal to make way for someone else to provide this component (I've seen a couple of drafts out on github lately).</p>
<p>-Wil Moore III</p>
Nov 23, 2009
Wil Sinclair
<p>Any update on the Cloud Files proposal? Have you talked with Rackspace about finishing it? They were very stoked to work with ZF community members on their client libs last time I spoke with them.</p>
<p>,Wil</p>
Feb 08, 2011
Dolf Schimmel (Freeaqingme)
<p>Archiving this proposal, feel free to recover it when you want to work on it again. For more details see <a href="http://framework.zend.com/wiki/display/ZFDEV/Archiving+of+abandoned+proposals+(Feb+5+2011)">this email</a>.</p>