ZF-2055: numerical values (e.g. phone numbers) are not searchable

Issue Type: Bug
Created: 2007-10-10T05:30:46.000+0000
Last Updated: 2009-03-18T11:29:01.000+0000
Status: Resolved
Fix version(s): 1.8.0 (30/Apr/09)

Reporter: Wladimir Schwitin (s_t_a_l_k_e_r)
Assignee: Alexander Veremyev (alexander)
Tags: Zend_Search_Lucene

Related issues: Attachments:


Numerical values, e.g. phone numbers, are not searchable. The field types "text", "keyword" and "unstored" were tested. The following PHPUnit test case demonstrates the problem:

<pre class="highlight">
set_include_path('.' . PATH_SEPARATOR . '/opt/lampp/lib/php/' . PATH_SEPARATOR . '../application/library/');

require_once 'PHPUnit/Extensions/PerformanceTestCase.php';
require_once 'Zend/Search/Lucene.php';
require_once 'Zend/Search/Lucene/Search/Query/Boolean.php';

/**
 * This TestCase tests the search behavior in fields of type "keyword",
 * "text" and "unstored".
 */
class BugExploitTest extends PHPUnit_Extensions_PerformanceTestCase {
    private $index = null;
    // private $numericalValue = 'Zziqwez'; // found in fields of any type
    // private $numericalValue = 'Hallowe'; // not found in field of type "text". why?
    private $numericalValue = '12345678'; // not found

    /**
     * Creates an index and adds a document to it.
     */
    protected function setUp() {
        try {
            $this->index = Zend_Search_Lucene::open("/tmp/index");
        } catch (Exception $e) {
            $this->index = Zend_Search_Lucene::create("/tmp/index");
        }
        $doc = new Zend_Search_Lucene_Document();
        $doc->addField(Zend_Search_Lucene_Field::Keyword('keyword', $this->numericalValue));
        $doc->addField(Zend_Search_Lucene_Field::Text('text', $this->numericalValue));
        $doc->addField(Zend_Search_Lucene_Field::UnStored('unstored', $this->numericalValue));
        $this->index->addDocument($doc);
    }

    /**
     * Shuts down the index.
     */
    protected function tearDown() {
        // The original body was lost; committing and releasing the index is an assumption.
        $this->index->commit();
        $this->index = null;
    }

    /**
     * Searching in the field of type "keyword".
     * Our index should have at least one document.
     */
    public function testSearchKeyword() {
        $this->searchInField('keyword');
    }

    /**
     * Searching in the field of type "text".
     * Our index should have at least two documents
     * (tearDown does not delete any document).
     */
    public function testSearchText() {
        $this->searchInField('text');
    }

    /**
     * Searching in the field of type "unstored".
     * Our index should have at least two documents
     * (tearDown does not delete any document).
     */
    public function testSearchUnStored() {
        $this->searchInField('unstored');
    }

    private function searchInField($fieldName) {
        $userQuery = Zend_Search_Lucene_Search_QueryParser::parse($this->numericalValue);
        $hits = $this->index->find($userQuery);
        // after adding a document we expect at least one search result
        $this->assertNotEquals(0, count($hits));
    }
}
</pre>


Posted by Thomas Weidner (thomas) on 2007-10-15T14:05:29.000+0000

Assigned to Alexander

Posted by Marc Boeren (mc) on 2007-10-29T11:42:46.000+0000

The default is to look for text (a-zA-Z) only; you can change this to include numbers by setting a different default analyzer:

<pre class="highlight">
Zend_Search_Lucene_Analysis_Analyzer::setDefault(
  new Zend_Search_Lucene_Analysis_Analyzer_Common_TextNum_CaseInsensitive()
);
</pre>

If you need more flexibility in searching, try creating your own analyzer based on e.g. Zend_Search_Lucene_Analysis_Analyzer_Common_Text.

That said, the default setting to search for text only is perhaps confusing for first-time users.
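To make the effect concrete, here is a minimal sketch in plain PHP of why a purely numeric value produces nothing to search for. The regular expressions only approximate the letters-only and letters-plus-digits rules described above; they are not the Zend_Search_Lucene implementation, and `sketchTokenize` is a hypothetical helper made up for this illustration.

```php
<?php
// Illustrative only: these patterns approximate the two tokenization rules
// discussed in this thread. They are NOT the actual analyzer code.

/**
 * Hypothetical helper: split $text into lowercase tokens matching $pattern.
 */
function sketchTokenize(string $text, string $pattern): array
{
    preg_match_all($pattern, strtolower($text), $matches);
    return $matches[0];
}

$lettersOnly      = '/[a-z]+/';    // assumed stand-in for the default text analyzer
$lettersAndDigits = '/[a-z0-9]+/'; // assumed stand-in for the TextNum analyzer

print_r(sketchTokenize('12345678', $lettersOnly));      // no tokens at all
print_r(sketchTokenize('12345678', $lettersAndDigits)); // one token: 12345678
print_r(sketchTokenize('Call 555 now', $lettersOnly));  // tokens: call, now
```

Under the letters-only rule the phone number yields no tokens, so both the indexed text and the parsed query end up empty for that value, which matches the behavior reported in this issue.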

I had a slightly different problem. I was trying to search for words_with_underscores in a Keyword field. The keyword field is indexed but not tokenized, so I expected to get an exact match when I did a search for words_with_underscores. Instead, a search for 'words with underscores' was performed, yielding no matches, as the keyword field wasn't tokenized. My solution was to create a ...TextCode.... analyzer.
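The mismatch can be sketched in plain PHP as follows. This only simulates the idea of comparing query tokens against a single untokenized term; the helper names and the underscore-aware pattern are assumptions for illustration and do not reproduce the analyzer mentioned above.

```php
<?php
// Illustrative only: simulates matching query tokens against an untokenized
// Keyword value. Helper names and patterns are assumptions for this sketch.

/**
 * Hypothetical helper: tokenize a query string with the given pattern.
 */
function sketchQueryTokens(string $query, string $pattern): array
{
    preg_match_all($pattern, strtolower($query), $matches);
    return $matches[0];
}

/**
 * Hypothetical helper: an untokenized field holds exactly one term,
 * so only an identical query token can match it.
 */
function sketchKeywordMatches(string $storedKeyword, array $queryTokens): bool
{
    return in_array($storedKeyword, $queryTokens, true);
}

$stored = 'words_with_underscores';

// Letters-only rule: the query is split at every underscore.
$lettersOnly = sketchQueryTokens($stored, '/[a-z]+/');
// Underscore-aware rule: the query survives as a single token.
$codeLike = sketchQueryTokens($stored, '/[a-z0-9_]+/');

var_dump(sketchKeywordMatches($stored, $lettersOnly)); // bool(false)
var_dump(sketchKeywordMatches($stored, $codeLike));    // bool(true)
```

The point of the sketch is that the stored value and the query must pass through compatible tokenization rules; otherwise an exact-match field can never be hit by a parsed query.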

Hope this helps!

Ciao, Marc.

Posted by Wladimir Schwitin (s_t_a_l_k_e_r) on 2007-11-07T02:26:25.000+0000

Thanks, Marc, for your helpful comment. I inserted your code in the setUp block and the test succeeds!

Posted by Alexander Veremyev (alexander) on 2009-03-18T11:28:57.000+0000

The default text analyzer skips numbers.

So you can either set another analyzer:

<pre class="highlight">
Zend_Search_Lucene_Analysis_Analyzer::setDefault(
   new Zend_Search_Lucene_Analysis_Analyzer_Common_TextNum_CaseInsensitive());
</pre>

or use the Keyword field type.

In the second case you have to use the search API to query keyword fields:

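<pre class="highlight">
$subquery1 = Zend_Search_Lucene_Search_QueryParser::parse($queryString);

$term  = new Zend_Search_Lucene_Index_Term('12345678', 'keyword');
$subquery2 = new Zend_Search_Lucene_Search_Query_Term($term);

$finalQuery = new Zend_Search_Lucene_Search_Query_Boolean();
// The addSubquery() calls are missing from the post; marking both as required (true) is an assumption.
$finalQuery->addSubquery($subquery1, true);
$finalQuery->addSubquery($subquery2, true);

$hits  = $index->find($finalQuery);
</pre>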
