ZF-9814: Zend_Db_Table doesn't scale to i5/OS

Description

The query in Zend_Db_Adapter_Db2::describeTable() for i5/OS runs very slow. Though we have over 20 million distinct column names on the machine I'm using. That's not uncommon for IBM i servers since they are so (overly?) database-integrated.

Comments

I'm closing this as not an issue. To prevent the describe table query being needed repeatedly, you may either use a meta-data cache, or hardcode the columns for each table into your Zend_Db_Table definition.

I was able to improve performance significantly by removing the 4 calls to the scalar function UPPER() and the use of SELECT DISTINCT.

Unfortunately, this loses case-folding (case-insensitive) nature of Zend_Db_Table, since SQL is case-insensitive with identifiers. Seems ok as long as we enforce case-sensitivity in our code and database guidelines. We'd rather leverage Zend_Db_Table than not.

I'll inline the attached patch below:

2010-05-06  Aaron S. Hawley  

    * Zend/Db/Adapter/Db2.php (describeTable): Improve performance of
    i5/OS query by avoiding UPPER() and DISTINCT.  It's not uncommon
    for AS/400 machines to have distinct columns numbering in the tens
    of millions.

--- Zend/Db/Adapter/Db2.php 2010-04-29 12:37:50.021322300 -0400
+++ Zend/Db/Adapter/Db2.php 2010-05-06 11:44:19.584306100 -0400
@@ -399,7 +399,7 @@
         } else {
 
             // DB2 On I5 specific query
-            $sql = "SELECT DISTINCT C.TABLE_SCHEMA, C.TABLE_NAME, C.COLUMN_NAME, C.ORDINAL_POSITION,
+            $sql = "SELECT C.TABLE_SCHEMA, C.TABLE_NAME, C.COLUMN_NAME, C.ORDINAL_POSITION,
                 C.DATA_TYPE, C.COLUMN_DEFAULT, C.NULLS ,C.LENGTH, C.SCALE, LEFT(C.IDENTITY,1),
                 LEFT(tc.TYPE, 1) AS tabconsttype, k.COLSEQ
                 FROM QSYS2.SYSCOLUMNS C
@@ -411,10 +411,10 @@
                        AND C.TABLE_NAME = k.TABLE_NAME
                        AND C.COLUMN_NAME = k.COLUMN_NAME)
                 WHERE "
-                 . $this->quoteInto('UPPER(C.TABLE_NAME) = UPPER(?)', $tableName);
+                 . $this->quoteInto('C.TABLE_NAME = ?', $tableName);
 
             if ($schemaName) {
-                $sql .= $this->quoteInto(' AND UPPER(C.TABLE_SCHEMA) = UPPER(?)', $schemaName);
+                $sql .= $this->quoteInto(' AND C.TABLE_SCHEMA = ?', $schemaName);
             }
 
             $sql .= " ORDER BY C.ORDINAL_POSITION FOR FETCH ONLY";

Thanks for the quick reply. I'm not sure the caching would work because I should have said "never returns" instead of "very slow". So that really only leaves the option of hardcoding the meta data.

Could we keep this issue open to see if other people have the issue as well? It might take awhile. Those of us writing PHP for the i are likely few and far between. Thanks.

Duplicate of ZF-6606

Metadata caching does work for me on i5. To reply to Aaron's comment about his script never returning, he may need to allow a larger timeout value in the script to allow time for the cache to be written (the first time through).

Even though metadata caching for Zend_Db_Table does work for me on i5, and speeds up subsequent queries, I will avoid Zend_Db_Table when I can. I always fear that a real user will be the one to experience that slow first time through, as the cache is being written, causing a time-out for the unlucky user. I know that I can carefully make sure the cache is there before users hit the site, but I'd rather avoid even the possibility if I can.

That said, this technique does work, but if you use it on a production site, make sure all cached metadata is there before any users arrive.

With Aaron proposed "fix" my queries went from ~85sec to < 2sec

I also added caching in my boostrap and now queries are almost instantaneous.