Issue Type: Bug
Created: 2012-03-19T19:15:50.000+0000
Last Updated: 2012-05-30T12:40:29.000+0000
Status: Resolved
Fix version(s): - 1.12.0 (27/Aug/12)
Reporter: Martin Hujer (mhujer)
Assignee: Martin Hujer (mhujer)
Tags: - Zend_Dom_Query
Related issues: - ZF-11376
Attachments: - ZF-12106.patch
I couldn't get assertQuery from Zend_Test_PHPUnit working. I then dug into the ZF code and found that XPath returns no results when DOMDocument raises warnings while loading the document:
<pre class="highlight">
$doc = new DOMDocument();
$doc->loadXML('<root><node>Fine</node></root>');
$xpath = new DOMXPath($doc);
echo 'Count: ' . $xpath->evaluate('count(/root/node)') . "\n"; // 1

$doc = new DOMDocument();
$doc->loadXML('<root><node>Unknown entity &copy;</node></root>');
$xpath = new DOMXPath($doc);
echo 'Count: ' . $xpath->evaluate('count(/root/node)') . "\n"; // 0
</pre>
When Zend_Test_PHPUnit_Constraint_DomQuery instantiates Zend_Dom_Query, which loads the document (with unknown entities), no warnings are produced (although the errors are populated in the documentErrors property).
The assert just fails as if the XPath returned no results.
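One way to make the failure visible, sketched here with plain DOMDocument and the libxml error functions rather than Zend_Dom_Query itself: collect the parser errors and throw if the document did not load cleanly. The helper name `loadXmlOrFail` is hypothetical, the sample documents are assumed from the `/root/node` XPath in the report, and the code targets PHP 7 or later.

```php
<?php
// Hypothetical helper (not part of Zend Framework): load XML and fail loudly
// instead of leaving an empty document for XPath to query.
function loadXmlOrFail(string $xml): DOMDocument
{
    // Collect libxml errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    libxml_clear_errors();

    $doc    = new DOMDocument();
    $loaded = $doc->loadXML($xml);
    $errors = libxml_get_errors();

    // Restore the previous libxml error handling state.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if (!$loaded || count($errors) > 0) {
        $messages = array_map(
            static function (LibXMLError $error): string {
                return trim($error->message) . ' (line ' . $error->line . ')';
            },
            $errors
        );
        throw new RuntimeException(
            'Document is malformed, cannot evaluate the XPath expression: '
            . implode('; ', $messages)
        );
    }

    return $doc;
}

// Well-formed input loads and can be queried as usual.
$doc   = loadXmlOrFail('<root><node>Fine</node></root>');
$xpath = new DOMXPath($doc);
echo 'Count: ' . $xpath->evaluate('count(/root/node)') . "\n";

// An undefined entity now raises an exception instead of a silent zero count.
try {
    loadXmlOrFail('<root><node>Unknown entity &copy;</node></root>');
} catch (RuntimeException $e) {
    echo $e->getMessage() . "\n";
}
```

This sketch treats any reported libxml error as fatal; a more lenient variant could inspect `$error->level` and only throw on `LIBXML_ERR_FATAL`.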
Posted by Matthew Weier O'Phinney (matthew) on 2012-03-19T21:17:23.000+0000
If the document is malformed, why would it return results? Or are you suggesting we throw an exception?
Posted by Tomáš Fejfar (email@example.com) on 2012-03-19T22:07:45.000+0000
It should do something. Either try to recover and ignore the unknown entity while issuing a warning, or fail completely and throw a "malformed document" exception. The only thing it SHOULD NOT do is fail silently, IMO.
Posted by Martin Hujer (mhujer) on 2012-03-20T09:15:53.000+0000
Thanks for the quick answer!
The first problem is that it fails with a misleading error.
I have a test that calls assertQuery() on the response (using PHPUnit and Zend_Test), and when the returned webpage is not valid XML (although the w3.org validator reports it as valid XHTML), the assertion fails ("failed asserting that sth is present"). I would have expected a message saying "Document is malformed, couldn't evaluate the XPath expression.".
Another problem is that the document type detection performed in Zend_Dom_Query is based on the document content, which resolves to XML even though the webpage returns the header "Content-Type:text/html". (It parses fine when I hack the ZF code to force parsing as HTML, and the XPath works fine in that case.)
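To illustrate the second point, here is a minimal sketch of choosing the parser from the Content-Type header instead of sniffing the body. It uses plain DOMDocument, not the actual Zend_Dom_Query detection code; the helper name `loadByContentType` and the sample markup are assumptions made for illustration (PHP 7 or later).

```php
<?php
// Hypothetical helper: pick the DOM parser from the Content-Type header value
// rather than from the document content.
function loadByContentType(string $body, string $contentType): DOMDocument
{
    // Keep only the media type, e.g. "text/html; charset=utf-8" -> "text/html".
    $mediaType = strtolower(trim(explode(';', $contentType)[0]));

    // Collect libxml errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    libxml_clear_errors();

    $doc = new DOMDocument();
    if ($mediaType === 'text/html') {
        // The HTML parser knows the HTML named entities such as &copy;.
        $loaded = $doc->loadHTML($body);
    } else {
        // The XML parser rejects entities that are not declared.
        $loaded = $doc->loadXML($body);
    }

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if (!$loaded) {
        throw new RuntimeException('Document could not be parsed as ' . $mediaType);
    }

    return $doc;
}

$page = '<html><body><p>&copy; 2012</p></body></html>';

// Served as text/html, the page parses and the XPath finds the paragraph.
$doc   = loadByContentType($page, 'text/html; charset=utf-8');
$xpath = new DOMXPath($doc);
echo 'Paragraphs: ' . $xpath->evaluate('count(//p)') . "\n";
```

The same markup passed with an XML media type would go through `loadXML()` and be rejected because `&copy;` is not declared, which matches the behaviour described in the report.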
Posted by Martin Hujer (mhujer) on 2012-05-23T13:45:16.000+0000
Attaching a patch that changes the regex added in ZF-11376 (plus a test).
It fixes this issue and does not break any tests for Zend_Dom.
Posted by Martin Hujer (mhujer) on 2012-05-23T13:47:35.000+0000
Adam, can you please review the patch? (Assigning it to you because you have worked on the closely related issue ZF-11376.)
Posted by Adam Lundrigan (adamlundrigan) on 2012-05-30T12:40:29.000+0000
Fixed in trunk r24820. Thanks, Martin!