ZF-5554: records missed in returned when doing a search with two or more strings with AND

Issue Type: Bug Created: 2009-01-15T05:05:09.000+0000 Last Updated: 2012-04-24T11:03:05.000+0000 Status: In Progress Fix version(s): Reporter: Richard So (richso) Assignee: Alexander Veremyev (alexander) Tags: - Zend_Search_Lucene

  • After1.12.0
  • zf-crteam-padraic
  • zf-crteam-priority

Related issues: - ZF-5545



I have found cases that some records has missed in return to a search with two or more strings with AND criteria (even using generic search API); I've traced into the code and found the following facts:

  1. the score of a MultiTerm search becomes zero (the terms when searched separately can return results, i.e. score > 0)
  2. some of the $this->_termsFreqs[$termId][$docId] using in the for loop of member function "_conjunctionScore" is null and caused $reader->getSimilarity()->tf($this->_termsFreqs[$termId][$docId]) [MultiTerm.php line 473] to return 0
  3. the "used" $docsFilter created in [MultiTerm.php line 330] in member function "_calculateConjunctionResult" caused the last loop for the termFreqs failed to be assigned values

As I have only limited time for reverse engine, I can't fully understand the use of "DocsFilter" at the moment, I am wondering whether this can be corrected by simply modify the member function as follows:

private function _calculateConjunctionResult(Zend_Search_Lucene_Interface $reader)
    $this->_resVector = null;

    if (count($this->_terms) == 0) {
        $this->_resVector = array();

    // Order terms by selectivity
    $docFreqs = array();
    $ids      = array();
    foreach ($this->_terms as $id => $term) {
        $docFreqs[] = $reader->docFreq($term);
        $ids[]      = $id; // Used to keep original order for terms with the same selectivity and omit terms comparison
    array_multisort($docFreqs, SORT_ASC, SORT_NUMERIC,
                    $ids,      SORT_ASC, SORT_NUMERIC,

    $docsFilter = new Zend_Search_Lucene_Index_DocsFilter();
    foreach ($this->_terms as $termId => $term) {
        $termDocs = $reader->termDocs($term, $docsFilter);
    // Treat last retrieved docs vector as a result set
    // (filter collects data for other terms)
    $this->_resVector = array_flip($termDocs);

    $docsFilter = new Zend_Search_Lucene_Index_DocsFilter();                                               // <---------------------- added to re-initialize the $docsFilter
    foreach ($this->_terms as $termId => $term) {
        $this->_termsFreqs[$termId] = $reader->termFreqs($term, $docsFilter);




Posted by Sjoerd van Noort (seven007) on 2009-05-29T02:06:01.000+0000

Reinitialising the DocsFilter as mentioned here solves the Undefined offset notice from issue ZF-5545 for me. Although I don't understand how/if this influences search result. It seems to me, although I don't understand the issue fully, that with this fix the function $reader->termFreqs has to redo some work because it does not have the Docfilter data from the previous $reader->termDoc calls

Have you found an issue?

See the Overview section for more details.


© 2006-2019 by Zend, a Rogue Wave Company. Made with by awesome contributors.

This website is built using zend-expressive and it runs on PHP 7.