API Documentation

Zend/Search/Lucene.php

Zend Framework

LICENSE

This source file is subject to the new BSD license that is bundled with this package in the file LICENSE.txt. It is also available through the world-wide-web at this URL: http://framework.zend.com/license/new-bsd If you did not receive a copy of the license and are unable to obtain it through the world-wide-web, please send an email to license@zend.com so we can send you a copy immediately.

Category
Zend  
Copyright
Copyright (c) 2005-2012 Zend Technologies USA Inc. (http://www.zend.com)  
License
New BSD License  
Package
Zend_Search_Lucene  
Version
$Id: Lucene.php 24594 2012-01-05 21:27:01Z matthew $  

\Zend_Search_Lucene

Package: Zend\Search\Lucene

Implements
\Zend_Search_Lucene_Interface
Category
Zend  
Copyright
Copyright (c) 2005-2012 Zend Technologies USA Inc. (http://www.zend.com)  
License
New BSD License  

Constants

Constant  FORMAT_PRE_2_1 = 0
Constant  FORMAT_2_1 = 1
Constant  FORMAT_2_3 = 2
Constant  GENERATION_RETRIEVE_COUNT = 10

Maximum number of attempts to retrieve the current generation

Constant  GENERATION_RETRIEVE_PAUSE = 50

Pause between generation retrieving attempts in milliseconds

Properties

Property private boolean $_closeDirOnExit = true

File system adapter closing option

Default value
true
Details
Type
boolean
Property private boolean $_closed = false

Flag indicating that the index is already closed: changes are committed and resources are released

Default value
false
Details
Type
boolean
Property private static string $_defaultSearchField = null

Default field name for search

Null means search through all fields

Default value
null
Details
Type
string
Property private \Zend_Search_Lucene_Storage_Directory $_directory = null

File system adapter.

Default value
null
Details
Type
\Zend_Search_Lucene_Storage_Directory
Property private integer $_docCount = 0

Number of documents in this index.

Default value
0
Details
Type
integer
Property private integer $_formatVersion

Index format version

Details
Type
integer
Property private integer $_generation

Current segment generation

Details
Type
integer
Property private boolean $_hasChanges = false

Flag for index changes

Default value
false
Details
Type
boolean
Property private integer $_refCount = 0

Number of references to the index object

Default value
0
Details
Type
integer
Property private static integer $_resultSetLimit = 0

Result set limit

0 means no limit

Default value
0
Details
Type
integer
Property private array $_segmentInfos = array()

Array of Zend_Search_Lucene_Index_SegmentInfo objects for the current version of the index.

Default value
array()
Details
Type
array
Property private static integer $_termsPerQueryLimit = 1024

Terms per query limit

0 means no limit

Default value
1024
Details
Type
integer
Property private \Zend_Search_Lucene_TermStreamsPriorityQueue $_termsStream = null

Terms stream priority queue object

Default value
null
Details
Type
\Zend_Search_Lucene_TermStreamsPriorityQueue
Property private \Zend_Search_Lucene_Index_Writer $_writer = null

Writer for this index, not instantiated unless required.

Default value
null
Details
Type
\Zend_Search_Lucene_Index_Writer

Methods

method public __construct( \Zend_Search_Lucene_Storage_Directory_Filesystem|string $directory = null, $create = false ) : void

Opens the index.

The constructor needs the index directory as a parameter: either a string containing the path to the index folder or a Directory object.

Parameters
Name Type Description
$directory \Zend_Search_Lucene_Storage_Directory_Filesystem|string
$create
Throws
Exception Description
\Zend_Search_Lucene_Exception
method public __destruct( ) : void

Object destructor

method private _close( ) : void

Close current index and free resources

method private _getIndexWriter( ) : \Zend_Search_Lucene_Index_Writer

Returns an instance of Zend_Search_Lucene_Index_Writer for the index

Returns
Type Description
\Zend_Search_Lucene_Index_Writer
method private _readPre21SegmentsFile( ) : void

Read segments file for pre-2.1 Lucene index format

Throws
Exception Description
\Zend_Search_Lucene_Exception
method private _readSegmentsFile( ) : void

Read segments file

Throws
Exception Description
\Zend_Search_Lucene_Exception
method private _updateDocCount( ) : void

Update document counter

method public addDocument( \Zend_Search_Lucene_Document $document ) : void

Adds a document to this index.

Parameters
Name Type Description
$document \Zend_Search_Lucene_Document
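
As a rough illustration of addDocument(), the sketch below builds a document and adds it to an index. Zend_Search_Lucene_Document and the Zend_Search_Lucene_Field factory methods belong to the same package but are not described on this page. The helper name indexArticle and the field names url, title and body are hypothetical, and the sketch assumes the Zend Framework classes are already loadable.

```php
<?php
/**
 * Adds one article to an index (illustrative helper, not part of the library).
 *
 * $index is assumed to be an opened Zend_Search_Lucene_Interface instance.
 */
function indexArticle($index, $url, $title, $body)
{
    $document = new Zend_Search_Lucene_Document();

    // Keyword: stored and indexed, but not tokenized (suits identifiers).
    $document->addField(Zend_Search_Lucene_Field::Keyword('url', $url));
    // Text: stored, indexed and tokenized.
    $document->addField(Zend_Search_Lucene_Field::Text('title', $title));
    // UnStored: indexed and tokenized, but the original text is not stored.
    $document->addField(Zend_Search_Lucene_Field::UnStored('body', $body));

    $index->addDocument($document);
}
```

A caller would typically follow a series of such additions with commit() so that the changes are written out.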
method public closeTermsStream( ) : void

Close terms stream

Should be used to release resources if the stream is not read to the end

method public commit( ) : void

Commit changes resulting from delete() or undeleteAll() operations.

Details
Todo
undeleteAll processing.  
method public count( ) : integer

Returns the total number of documents in this index (including deleted documents).

Returns
Type Description
integer
method public static create( mixed $directory ) : \Zend_Search_Lucene_Interface

Create index

Parameters
Name Type Description
$directory mixed
Returns
Type Description
\Zend_Search_Lucene_Interface
method public currentTerm( ) : \Zend_Search_Lucene_Index_Term|null

Returns the term at the current position

Returns
Type Description
\Zend_Search_Lucene_Index_Term|null
method public delete( integer|\Zend_Search_Lucene_Search_QueryHit $id ) : void

Deletes a document from the index.

$id is an internal document id

Parameters
Name Type Description
$id integer|\Zend_Search_Lucene_Search_QueryHit
Throws
Exception Description
\Zend_Search_Lucene_Exception
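
Because delete() works on internal document ids, a common pattern is to look the documents up with find() first. The following is a minimal sketch of that pattern; the helper name deleteMatching is hypothetical, and $index is assumed to be an opened index.

```php
<?php
/**
 * Deletes every document matching $query and returns how many were deleted.
 * Illustrative helper, not part of the library.
 */
function deleteMatching($index, $query)
{
    $deleted = 0;
    foreach ($index->find($query) as $hit) {
        // $hit->id is the internal document id expected by delete().
        $index->delete($hit->id);
        $deleted++;
    }
    // Commit the changes resulting from the delete() calls.
    $index->commit();

    return $deleted;
}
```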
method public docFreq( \Zend_Search_Lucene_Index_Term $term ) : integer

Returns the number of documents in this index containing the $term.

Parameters
Name Type Description
$term \Zend_Search_Lucene_Index_Term
Returns
Type Description
integer
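
To show how docFreq() relates to termFreqs() (documented further down this page), the sketch below collects simple statistics for one term. The helper name termStatistics and the result keys are hypothetical. Zend_Search_Lucene_Index_Term is constructed with the term text first and the field name second.

```php
<?php
/**
 * Returns the number of documents containing a term and the total number
 * of occurrences of that term. Illustrative helper, not part of the library.
 */
function termStatistics($index, $field, $text)
{
    $term = new Zend_Search_Lucene_Index_Term($text, $field);

    // termFreqs() yields array(docId => freq, ...), so summing the values
    // gives the total occurrence count across documents.
    $frequencies = $index->termFreqs($term);

    return array(
        'documents'   => $index->docFreq($term),
        'occurrences' => array_sum($frequencies),
    );
}
```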
method public find( \Zend_Search_Lucene_Search_QueryParser|string $query ) : array

Performs a query against the index and returns an array of Zend_Search_Lucene_Search_QueryHit objects.

Input is a string or Zend_Search_Lucene_Search_Query.

Parameters
Name Type Description
$query \Zend_Search_Lucene_Search_QueryParser|string
Returns
Type Description
array Zend_Search_Lucene_Search_QueryHit
Throws
Exception Description
\Zend_Search_Lucene_Exception
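
A minimal sketch of consuming the result of find() follows. It relies on the public id and score properties of Zend_Search_Lucene_Search_QueryHit, which are not described on this page. The helper name searchSummaries is hypothetical.

```php
<?php
/**
 * Runs a query and returns array('id' => ..., 'score' => ...) per hit.
 * Illustrative helper, not part of the library.
 */
function searchSummaries($index, $queryString)
{
    $summaries = array();
    foreach ($index->find($queryString) as $hit) {
        $summaries[] = array(
            'id'    => $hit->id,    // internal document id
            'score' => $hit->score, // relevance score of this hit
        );
    }

    return $summaries;
}
```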
method public static getActualGeneration( \Zend_Search_Lucene_Storage_Directory $directory ) : integer

Get current generation number

Returns the generation number. 0 means pre-2.1 index format; -1 means there are no segments files.

Parameters
Name Type Description
$directory \Zend_Search_Lucene_Storage_Directory
Returns
Type Description
integer
Throws
Exception Description
\Zend_Search_Lucene_Exception
method public static getDefaultSearchField( ) : string

Get default search field.

Null means that the search is performed across all fields by default

Returns
Type Description
string
method public getDirectory( ) : \Zend_Search_Lucene_Storage_Directory

Returns the Zend_Search_Lucene_Storage_Directory instance for this index.

Returns
Type Description
\Zend_Search_Lucene_Storage_Directory
method public getDocument( integer|\Zend_Search_Lucene_Search_QueryHit $id ) : \Zend_Search_Lucene_Document

Returns a Zend_Search_Lucene_Document object for the document number $id in this index.

Parameters
Name Type Description
$id integer|\Zend_Search_Lucene_Search_QueryHit
Returns
Type Description
\Zend_Search_Lucene_Document
Throws
Exception Description
\Zend_Search_Lucene_Exception Exception is thrown if $id is out of the range
method public getFieldNames( boolean $indexed = false ) : array

Returns a list of all unique field names that exist in this index.

Parameters
Name Type Description
$indexed boolean
Returns
Type Description
array
method public getFormatVersion( ) : integer

Get index format version

Returns
Type Description
integer
method public getGeneration( ) : integer

Get generation number associated with this index instance

The same generation number paired with a document number or query string is guaranteed to give the same result when reading the index, so it can be used for search result caching.

Returns
Type Description
integer
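
One way to apply the caching idea described for getGeneration() is to fold the generation number into the cache key, so that entries become unreachable once the index changes. This is only a sketch: the helper name searchCacheKey and the key layout are assumptions, and the cache storage itself is left out.

```php
<?php
/**
 * Builds a cache key that changes whenever the index generation changes.
 * Illustrative helper, not part of the library.
 */
function searchCacheKey($index, $queryString)
{
    // Same generation + same query string => same search result.
    return 'lucene_' . $index->getGeneration() . '_' . md5($queryString);
}
```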
method public getMaxBufferedDocs( ) : integer

Retrieve index maxBufferedDocs option

maxBufferedDocs is the minimum number of documents required before the buffered in-memory documents are written into a new Segment

Default value is 10

Returns
Type Description
integer
method public getMaxMergeDocs( ) : integer

Retrieve index maxMergeDocs option

maxMergeDocs is the largest number of documents ever merged by addDocument(). Small values (e.g., less than 10,000) are best for interactive indexing, as this limits the length of pauses while indexing to a few seconds. Larger values are best for batched indexing and speedier searches.

Default value is PHP_INT_MAX

Returns
Type Description
integer
method public getMergeFactor( ) : integer

Retrieve index mergeFactor option

mergeFactor determines how often segment indices are merged by addDocument(). With smaller values, less RAM is used while indexing, and searches on unoptimized indices are faster, but indexing speed is slower. With larger values, more RAM is used during indexing, and while searches on unoptimized indices are slower, indexing is faster. Thus larger values (> 10) are best for batch index creation, and smaller values (< 10) for indices that are interactively maintained.

Default value is 10

Returns
Type Description
integer
method public static getResultSetLimit( ) : integer

Get result set limit.

0 means no limit

Returns
Type Description
integer
method public static getSegmentFileName( integer $generation ) : string

Get segments file name

Parameters
Name Type Description
$generation integer
Returns
Type Description
string
method public getSimilarity( ) : \Zend_Search_Lucene_Search_Similarity

Retrieve the similarity used by the index reader

Returns
Type Description
\Zend_Search_Lucene_Search_Similarity
method public static getTermsPerQueryLimit( ) : integer

Get terms per query limit.

0 means no limit

Returns
Type Description
integer
method public hasDeletions( ) : boolean

Returns true if any documents have been deleted from this index.

Returns
Type Description
boolean
method public hasTerm( \Zend_Search_Lucene_Index_Term $term ) : boolean

Returns true if the index contains documents with the specified term.

Is used for query optimization.

Parameters
Name Type Description
$term \Zend_Search_Lucene_Index_Term
Returns
Type Description
boolean
method public isDeleted( integer $id ) : boolean

Checks whether a document is deleted

Parameters
Name Type Description
$id integer
Returns
Type Description
boolean
Throws
Exception Description
\Zend_Search_Lucene_Exception Exception is thrown if $id is out of the range
method public maxDoc( ) : integer

Returns one greater than the largest possible document number.

This may be used to, e.g., determine how big to allocate a structure which will have an element for every document number in an index.

Returns
Type Description
integer
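
Since maxDoc() is one greater than the largest possible document number, it can bound a loop over every document id. The sketch below combines it with isDeleted() to collect the ids of documents that are still live. The helper name liveDocumentIds is hypothetical.

```php
<?php
/**
 * Returns the ids of all documents not marked as deleted.
 * Illustrative helper, not part of the library.
 */
function liveDocumentIds($index)
{
    $ids = array();
    $limit = $index->maxDoc();
    // Document ids run from 0 up to maxDoc() - 1.
    for ($id = 0; $id < $limit; $id++) {
        if (!$index->isDeleted($id)) {
            $ids[] = $id;
        }
    }

    return $ids;
}
```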
method public nextTerm( ) : \Zend_Search_Lucene_Index_Term|null

Scans terms dictionary and returns next term

Returns
Type Description
\Zend_Search_Lucene_Index_Term|null
method public norm( integer $id, string $fieldName ) : float

Returns a normalization factor for "field, document" pair.

Parameters
Name Type Description
$id integer
$fieldName string
Returns
Type Description
float
method public numDocs( ) : integer

Returns the total number of non-deleted documents in this index.

Returns
Type Description
integer
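
count() includes deleted documents while numDocs() does not, so the difference between them is the number of documents currently marked as deleted. A minimal sketch, with the hypothetical helper name deletedDocumentCount:

```php
<?php
/**
 * Returns how many documents are currently marked as deleted.
 * Illustrative helper, not part of the library.
 */
function deletedDocumentCount($index)
{
    // count(): all documents, deleted ones included.
    // numDocs(): non-deleted documents only.
    return $index->count() - $index->numDocs();
}
```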
method public static open( mixed $directory ) : \Zend_Search_Lucene_Interface

Open index

Parameters
Name Type Description
$directory mixed
Returns
Type Description
\Zend_Search_Lucene_Interface
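
The static open() and create() methods are often combined into an "open, or create if missing" step. The sketch below assumes that open() fails with Zend_Search_Lucene_Exception when no index exists at the given path. Other failures (for example, permission problems) would raise the same exception type, so production code may want a more careful check. The helper name openOrCreateIndex is hypothetical.

```php
<?php
/**
 * Opens the index at $path, creating it when it cannot be opened.
 * Illustrative helper, not part of the library.
 */
function openOrCreateIndex($path)
{
    try {
        return Zend_Search_Lucene::open($path);
    } catch (Zend_Search_Lucene_Exception $e) {
        // Assumption: the failure means that no index exists at $path yet.
        return Zend_Search_Lucene::create($path);
    }
}
```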
method public optimize( ) : void

Optimize index.

Merges all segments into one

method public resetTermsStream( ) : void

Reset terms stream.

method public static setDefaultSearchField( string $fieldName ) : void

Set default search field.

Null means that the search is performed across all fields by default

Default value is null

Parameters
Name Type Description
$fieldName string
method public setFormatVersion( int $formatVersion ) : void

Set index format version.

The index is converted to this format at the next update

Parameters
Name Type Description
$formatVersion int
Throws
Exception Description
\Zend_Search_Lucene_Exception
method public setMaxBufferedDocs( integer $maxBufferedDocs ) : void

Set index maxBufferedDocs option

maxBufferedDocs is the minimum number of documents required before the buffered in-memory documents are written into a new Segment

Default value is 10

Parameters
Name Type Description
$maxBufferedDocs integer
method public setMaxMergeDocs( integer $maxMergeDocs ) : void

Set index maxMergeDocs option

maxMergeDocs is the largest number of documents ever merged by addDocument(). Small values (e.g., less than 10,000) are best for interactive indexing, as this limits the length of pauses while indexing to a few seconds. Larger values are best for batched indexing and speedier searches.

Default value is PHP_INT_MAX

Parameters
Name Type Description
$maxMergeDocs integer
method public setMergeFactor( $mergeFactor ) : void

Set index mergeFactor option

mergeFactor determines how often segment indices are merged by addDocument(). With smaller values, less RAM is used while indexing, and searches on unoptimized indices are faster, but indexing speed is slower. With larger values, more RAM is used during indexing, and while searches on unoptimized indices are slower, indexing is faster. Thus larger values (> 10) are best for batch index creation, and smaller values (< 10) for indices that are interactively maintained.

Default value is 10

Parameters
Name Type Description
$mergeFactor
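
The trade-off described for mergeFactor and maxBufferedDocs suggests raising both for batch index creation. The sketch below shows one possible arrangement. The values 100 and 50 are arbitrary examples rather than recommendations, and the helper name addDocumentsInBatch is hypothetical.

```php
<?php
/**
 * Adds many documents with settings biased towards batch indexing.
 * Illustrative helper, not part of the library.
 */
function addDocumentsInBatch($index, array $documents)
{
    // Example values only: buffer more documents in memory and merge
    // segments less often (mergeFactor > 10 favours batch indexing).
    $index->setMaxBufferedDocs(100);
    $index->setMergeFactor(50);

    foreach ($documents as $document) {
        $index->addDocument($document);
    }

    // Commit pending changes, then merge all segments into one.
    $index->commit();
    $index->optimize();
}
```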
method public static setResultSetLimit( integer $limit ) : void

Set result set limit.

0 (default) means no limit

Parameters
Name Type Description
$limit integer
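
Because the result set limit is a static setting, changing it affects every index used afterwards in the same process. The sketch below changes the limit and hands back the previous value so that a caller can restore it later. The helper name limitSearchResults is hypothetical.

```php
<?php
/**
 * Sets the static result set limit and returns the previous limit.
 * Illustrative helper, not part of the library.
 */
function limitSearchResults($limit)
{
    $previous = Zend_Search_Lucene::getResultSetLimit();
    // 0 would mean "no limit".
    Zend_Search_Lucene::setResultSetLimit($limit);

    return $previous;
}
```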
method public static setTermsPerQueryLimit( integer $limit ) : void

Set terms per query limit.

0 means no limit

Parameters
Name Type Description
$limit integer
method public skipTo( \Zend_Search_Lucene_Index_Term $prefix ) : void

Skip terms stream up to the specified term prefix.

Prefix contains fully specified field info and portion of searched term

Parameters
Name Type Description
$prefix \Zend_Search_Lucene_Index_Term
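
The terms stream methods (resetTermsStream(), skipTo(), currentTerm(), nextTerm() and closeTermsStream()) are meant to be used together. The following sketch collects the terms of one field that start with a given prefix. It assumes that the stream is ordered by field and then by term text, and that Zend_Search_Lucene_Index_Term exposes public field and text properties. The helper name termsWithPrefix is hypothetical.

```php
<?php
/**
 * Returns the text of every term in $field that starts with $prefix.
 * Illustrative helper, not part of the library.
 */
function termsWithPrefix($index, $field, $prefix)
{
    $matches = array();

    $index->resetTermsStream();
    // The term constructor takes the text first, then the field name.
    $index->skipTo(new Zend_Search_Lucene_Index_Term($prefix, $field));

    while (($term = $index->currentTerm()) !== null
        && $term->field === $field
        && strncmp($term->text, $prefix, strlen($prefix)) === 0
    ) {
        $matches[] = $term->text;
        $index->nextTerm();
    }

    // The stream was not read to the end, so release its resources.
    $index->closeTermsStream();

    return $matches;
}
```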
method public termDocs( \Zend_Search_Lucene_Index_Term $term, \Zend_Search_Lucene_Index_DocsFilter|null $docsFilter = null ) : array

Returns IDs of all documents containing term.

Parameters
Name Type Description
$term \Zend_Search_Lucene_Index_Term
$docsFilter \Zend_Search_Lucene_Index_DocsFilter|null
Returns
Type Description
array
method public termDocsFilter( \Zend_Search_Lucene_Index_Term $term, \Zend_Search_Lucene_Index_DocsFilter|null $docsFilter = null ) : \Zend_Search_Lucene_Index_DocsFilter

Returns documents filter for all documents containing term.

It performs the same operation as termDocs, but returns the result as a Zend_Search_Lucene_Index_DocsFilter object

Parameters
Name Type Description
$term \Zend_Search_Lucene_Index_Term
$docsFilter \Zend_Search_Lucene_Index_DocsFilter|null
Returns
Type Description
\Zend_Search_Lucene_Index_DocsFilter
method public termFreqs( \Zend_Search_Lucene_Index_Term $term, \Zend_Search_Lucene_Index_DocsFilter|null $docsFilter = null ) : array

Returns an array of all term frequencies.

Result array structure: array(docId => freq, ...)

Parameters
Name Type Description
$term \Zend_Search_Lucene_Index_Term
$docsFilter \Zend_Search_Lucene_Index_DocsFilter|null
Returns
Type Description
array
method public termPositions( \Zend_Search_Lucene_Index_Term $term, \Zend_Search_Lucene_Index_DocsFilter|null $docsFilter = null ) : array

Returns an array of all term positions in the documents.

Result array structure: array(docId => array(pos1, pos2, ...), ...)

Parameters
Name Type Description
$term \Zend_Search_Lucene_Index_Term
$docsFilter \Zend_Search_Lucene_Index_DocsFilter|null
Returns
Type Description
array
method public terms( ) : array

Returns an array of all terms in this index.

Returns
Type Description
array
method public undeleteAll( ) : void

Undeletes all documents currently marked as deleted in this index.

Details
Todo
Implementation  
Documentation was generated by DocBlox 0.15.1.