Issues

ZF-853: MS SQL Server: limit() function does not behave as expected when no order is applied

Description

(!) The approach of having the adapter help limit the query makes it much more difficult to accomplish, and it might be better for items like the select class to have adapter specific construction of queries. People writing their own queries will unlikely call the limit method directly, so only auto-generated queries will need it. Therefore make the query building smarter rather than trying to make the limit() function so smart.

The MS SQL Server limit() function in class Zend_Db_Adapter_Pdo_Mssql is not correct.

For example, it takes the source query:


select * from HumanResources.Employee emp;

and with a limit applied of 10 count records at offset 5 generates:


SELECT * FROM (SELECT TOP 10 * FROM (SELECT TOP 15 * from HumanResources.Employee emp) AS inner_tbl) AS outer_tbl

In effect the generated query selects the first 10 rows rather than the 5th-15th. To break it down, the inner most query returns the top 15 rows, then the next outer takes those top 15 and grabs the top 10 which makes the most inner query meaningless.

This is due to the fact that the system relies on the "order by" clause of the query to get the records it wants and without the clause being present it does not add one (what would it order by anyway?)

(!) If it can't handle a query that does not have "order by" and it is deemed that limit() only works with ordered queries, then an exception should be thrown indicating the query is invalid just as the other parameters are checked at the top of the method.

Also, for SQL Server 2005 the best approach for paging like this is (which really should be special cased since it is a special feature just for this purpose):


SELECT * FROM
(SELECT ROW_NUMBER() OVER (ORDER BY EmployeeID) AS rnum, * FROM HumanResources.Employee) AS EmpWithRNUM
where rnum >= 5 and rnum <= 15

For older versions or cases where you have no known "order by", the best way is really to use temporary tables or one of the other odd paging choices.

You can try this style if you know the primary key. This version does anti-joins with index scans while the original version (as implemented) causes 2 sorts followed by an index scan.


select top 15 * from HumanResources.Employee emp
where emp.EmployeeID not in (select top 5 emp2.EmployeeID from HumanResources.Employee emp2)

(!) Other things this limit() function breaks on is "union" queries and possibly other queries where it will place an accidental "order by" into the wrong place.

Comments

As possible pre-solution we should just add an paragraph to the documentation describing problems without orderby with MSSQL adapter.

The limit function also does not work if you are sorting by multiple columns. This change I've made seems to work okay but there might be a way to make it more efficient:


     public function limit($sql, $count, $offset = 0)
     {
        $count = intval($count);
        if ($count <= 0) {
            /** @see Zend_Db_Adapter_Exception */
            require_once 'Zend/Db/Adapter/Exception.php';
            throw new Zend_Db_Adapter_Exception("LIMIT argument count=$count is not valid");
        }

        $offset = intval($offset);
        if ($offset < 0) {
            /** @see Zend_Db_Adapter_Exception */
            require_once 'Zend/Db/Adapter/Exception.php';
            throw new Zend_Db_Adapter_Exception("LIMIT argument offset=$offset is not valid");
        }

        $orderby = stristr($sql, 'ORDER BY');
        if ($orderby !== false) {
            $order = str_ireplace('ORDER BY', '', $orderby);
        }

        $sql = preg_replace('/^SELECT\s/i', 'SELECT TOP ' . ($count+$offset) . ' ', $sql);

        $sql = 'SELECT * FROM (SELECT TOP ' . $count . ' * FROM (' . $sql . ') AS inner_tbl';
        if ($orderby !== false) {
            $sql .= ' ORDER BY ';
            // PHP doesn't avoid re-replacing strings when you pass an array
            // to its string replacement functions, so add junk that doesn't match
            // the original regular expression and then run an extra pass
            $outer_order = preg_replace(array('/\bASC\b/', '/\bDESC\b/'), array('_DESC', '_ASC'), $order);
            $outer_order = str_replace(array('_DESC', '_ASC'), array('DESC', 'ASC'), $outer_order);
            $sql .= $outer_order . ' ';
        }
        $sql .= ') AS outer_tbl';
        if ($orderby !== false) {
            $sql .= ' ORDER BY ' . $order;
        }

        return $sql;
    }

Whoops. And those definitely should be case-insensitive regular expressions.


not

This issue should have been fixed for the 1.5 release.

Please categorize/fix as needed.

This doesn't appear to have been fixed in 1.5.0. Please update if this is not correct.

All three are concern ORDER BY behavior in limit() method

I will attempt to test this issue within 2 weeks.

Fixed in 1.9.2 with ms sql limit() updates