API Documentation

Zend/Search/Lucene/Index/SegmentInfo.php

Zend Framework

LICENSE

This source file is subject to the new BSD license that is bundled with this package in the file LICENSE.txt. It is also available through the world-wide-web at this URL:
http://framework.zend.com/license/new-bsd
If you did not receive a copy of the license and are unable to obtain it through the world-wide-web, please send an email to license@zend.com so we can send you a copy immediately.

Category
Zend  
Copyright
Copyright (c) 2005-2012 Zend Technologies USA Inc. (http://www.zend.com)  
License
New BSD License  
Package
Zend_Search_Lucene  
Subpackage
Index  
Version
$Id: SegmentInfo.php 24594 2012-01-05 21:27:01Z matthew $  

\Zend_Search_Lucene_Index_SegmentInfo

Package: Zend\Search\Lucene\Index

Implements
\Zend_Search_Lucene_Index_TermsStream_Interface

Constants

Constant  FULL_SCAN_VS_FETCH_BOUNDARY = 5

"Full scan vs fetch" boundary.

If filter selectivity is less than this value, then full scan is performed (since term entries fetching has some additional overhead).

Constant  SM_TERMS_ONLY = 0

Scan modes

Constant  SM_FULL_INFO = 1
Constant  SM_MERGE_INFO = 2

Properties

Property private integer $_delGen

Delete file generation number

-2 means autodetect the latest delete generation
-1 means there is no delete file
0 means a pre-2.1 format delete file
Any other value X is the generation of the delete file in use

Details
Type
integer
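
The special values above can be read as a small lookup. The following is a minimal sketch only; describeDelGen() is a hypothetical helper written for illustration and is not part of Zend_Search_Lucene.

```php
<?php
// Hypothetical helper (not a Zend_Search_Lucene function): turns a delete
// generation number into the meaning documented for $_delGen.
function describeDelGen($delGen)
{
    switch ($delGen) {
        case -2:
            return 'autodetect latest delete generation';
        case -1:
            return 'no delete file';
        case 0:
            return 'pre-2.1 format delete file';
        default:
            // Any other value names the delete file generation in use.
            return 'delete file generation ' . $delGen;
    }
}

echo describeDelGen(-1), PHP_EOL;
echo describeDelGen(3), PHP_EOL;
```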
Property private mixed $_deleted = null

List of deleted documents.

A bitset if the bitset extension is loaded, or an array otherwise.

Default value: null
Details
Type
mixed
Property private boolean $_deletedDirty = false

Update flag for $this->_deleted

Default value: false
Details
Type
boolean
Property private \Zend_Search_Lucene_Storage_Directory_Filesystem $_directory

File system adapter.

Property private integer $_docCount

Number of docs in a segment

Details
Type
integer
Property private array|null $_docMap = null

Map of document IDs. Used to get the new docID after removing deleted documents.

It is not memory-efficient, but it is much faster than other methods.

Default value: null
Details
Type
array|null
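
One way to picture such a map is an array from old document IDs to compacted ones. This is a sketch under assumptions: buildDocMap() is a hypothetical helper, and the deleted list is taken to be an array keyed by document ID. The class may build its map differently.

```php
<?php
// Hypothetical helper: maps each surviving old docID to its new docID
// once deleted documents are dropped. $deleted is keyed by docID.
function buildDocMap($docCount, array $deleted)
{
    $docMap = array();
    $newId  = 0;

    for ($oldId = 0; $oldId < $docCount; $oldId++) {
        if (isset($deleted[$oldId])) {
            continue; // deleted documents get no new ID
        }
        $docMap[$oldId] = $newId++;
    }

    return $docMap;
}

// Five documents, of which 1 and 3 are deleted.
$docMap = buildDocMap(5, array(1 => true, 3 => true));
print_r($docMap);
```

The lookup is a single array access per document, which is the speed-for-memory trade the description refers to.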
Property private array $_fields

Segment fields. Array of Zend_Search_Lucene_Index_FieldInfo objects for this segment

Details
Type
array
Property private array $_fieldsDicPositions

Field positions in a dictionary.

(The term dictionary contains fields ordered by name.)

Details
Type
array
Property private \Zend_Search_Lucene_Storage_File $_frqFile = null

Frequencies File object for stream-like terms reading

Default value: null
Details
Type
\Zend_Search_Lucene_Storage_File
Property private integer $_frqFileOffset

Actual offset of the .frq file data

Details
Type
integer
Property private boolean $_hasSingleNormFile

Segment has single norms file

If true, one .nrm file is used for all fields. Otherwise, .fN files are used.

Details
Type
boolean
Property private integer $_indexInterval

Segment index interval

Details
Type
integer
Property private boolean $_isCompound

Use compound segment file (*.cfs) to collect all other segment files (excluding .del files)

Details
Type
boolean
Property private \Zend_Search_Lucene_Index_Term $_lastTerm = null

Last Term in a terms stream

Default value: null
Details
Type
\Zend_Search_Lucene_Index_Term
Property private \Zend_Search_Lucene_Index_TermInfo $_lastTermInfo = null

Last TermInfo in a terms stream

Default value: null
Details
Type
\Zend_Search_Lucene_Index_TermInfo
Property private array|null $_lastTermPositions

An array of all term positions in the documents.

Array structure: array( docId => array( pos1, pos2, ...), ...)

Is set to null if term positions loading has to be skipped

Details
Type
array|null
Property private string $_name

Segment name

Details
Type
string
Property private array $_norms = array()

Normalization factors.

An array of fieldName => normVector pairs. normVector is a binary string: each byte corresponds to an indexed document in the segment and encodes a normalization factor (a float value encoded by Zend_Search_Lucene_Search_Similarity::encodeNorm()).

Default value: array()
Details
Type
array
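
Because a norm vector holds one byte per document, the encoded norm of a document can be read by indexing the string with the document ID. The sketch below uses made-up byte values and a hypothetical helper, encodedNormAt(). Converting the byte back to a float is left to the similarity class and is not reproduced here.

```php
<?php
// Made-up norm vector for a three-document segment (one byte per document).
$normVector = chr(124) . chr(120) . chr(116);

// Hypothetical helper: returns the encoded norm byte (0-255) for a document.
function encodedNormAt($normVector, $docId)
{
    return ord($normVector[$docId]);
}

echo encodedNormAt($normVector, 1), PHP_EOL;
```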
Property private \Zend_Search_Lucene_Storage_File $_prxFile = null

Positions File object for stream-like terms reading

Default value: null
Details
Type
\Zend_Search_Lucene_Storage_File
Property private integer $_prxFileOffset

Actual offset of the .prx file in the compound file

Details
Type
integer
Property private array $_segFileSizes

Associative array where the key is the file name and the value is the file size within the compound segment file (.cfs).

Details
Type
array
Property private array $_segFiles

Associative array where the key is the file name and the value is the data offset in a compound segment file (.cfs).

Details
Type
array
Property private $_sharedDocStoreOptions
Property private integer $_skipInterval

Segment skip interval

Details
Type
integer
Property private integer $_termCount = 0

Actual number of terms in term stream

Default value: 0
Details
Type
integer
Property private array $_termDictionary

Term Dictionary Index

Array of arrays (Zend_Search_Lucene_Index_Term objects are represented as arrays for performance reasons):
[0] -> $termValue
[1] -> $termFieldNum

The corresponding Zend_Search_Lucene_Index_TermInfo object is stored in $_termDictionaryInfos.

Details
Type
array
Property private array $_termDictionaryInfos

Term Dictionary Index TermInfos

Array of arrays (Zend_Search_Lucene_Index_TermInfo objects are represented as arrays for performance reasons):
[0] -> $docFreq
[1] -> $freqPointer
[2] -> $proxPointer
[3] -> $skipOffset
[4] -> $indexPointer

Details
Type
array
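
The two parallel arrays can be sketched as follows. The sample values are invented, and describeIndexEntry() is a hypothetical helper that only attaches readable keys to the documented positional layout.

```php
<?php
// Invented sample data in the documented positional layout.
$termDictionary = array(
    array('apple', 0),   // [0] term value, [1] field number
    array('banana', 0),
);
$termDictionaryInfos = array(
    // docFreq, freqPointer, proxPointer, skipOffset, indexPointer
    array(3, 0, 0, 0, 0),
    array(1, 12, 40, 0, 96),
);

// Hypothetical helper: combines the entries found at the same index
// of both arrays into one associative array.
function describeIndexEntry(array $term, array $info)
{
    return array(
        'termValue'    => $term[0],
        'termFieldNum' => $term[1],
        'docFreq'      => $info[0],
        'freqPointer'  => $info[1],
        'proxPointer'  => $info[2],
        'skipOffset'   => $info[3],
        'indexPointer' => $info[4],
    );
}

$entry = describeIndexEntry($termDictionary[1], $termDictionaryInfos[1]);
print_r($entry);
```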
Property private array $_termInfoCache = array()

TermInfo cache

Size is 1024. Numbers are used instead of class constants for performance reasons.

Default value: array()
Details
Type
array
Property private integer $_termNum = 0

Overall number of terms in term stream

Default value: 0
Details
Type
integer
Property private integer $_termsScanMode

Terms scan mode

Values:

self::SM_TERMS_ONLY - terms are scanned, no additional info is retrieved
self::SM_FULL_INFO - terms are scanned, frequency and position info is retrieved
self::SM_MERGE_INFO - terms are scanned, frequency and position info is retrieved, and document numbers are compacted (shifted if the segment has deleted documents)

Details
Type
integer
Property private \Zend_Search_Lucene_Storage_File $_tisFile = null

Term Dictionary File object for stream-like terms reading

Default value: null
Details
Type
\Zend_Search_Lucene_Storage_File
Property private integer $_tisFileOffset

Actual offset of the .tis file data

Details
Type
integer
Property private boolean $_usesSharedDocStore

True if segment uses shared doc store

Details
Type
boolean

Methods

method public __construct( \Zend_Search_Lucene_Storage_Directory $directory, string $name, integer $docCount, integer $delGen = 0, array|null $docStoreOptions = null, boolean $hasSingleNormFile = false, boolean $isCompound = null ) : void

Zend_Search_Lucene_Index_SegmentInfo constructor

Parameters
Name Type Description
$directory \Zend_Search_Lucene_Storage_Directory
$name string
$docCount integer
$delGen integer
$docStoreOptions array|null
$hasSingleNormFile boolean
$isCompound boolean
method private _cleanUpTermInfoCache( ) : void

method private _deletedCount( ) : integer

Returns number of deleted documents.

Returns
Type Description
integer
method private _detectLatestDelGen( ) : integer

Detect latest delete generation

Actually used from the writeChanges() method, or from the constructor if it is invoked from the index writer. In both cases the index write lock has already been obtained, so locking is not a concern here.

Returns
Type Description
integer
method private _getFieldPosition( integer $fieldNum ) : integer

Get field position in a fields dictionary

Parameters
Name Type Description
$fieldNum integer
Returns
Type Description
integer
method private _load21DelFile( ) : mixed

Load 2.1+ format deletions file

Returns bitset or an array depending on bitset extension availability

Returns
Type Description
mixed
method private _loadDelFile( ) : mixed

Load deletions file

Returns bitset or an array depending on bitset extension availability

Returns
Type Description
mixed
Throws
Exception Description
\Zend_Search_Lucene_Exception
method private _loadDictionaryIndex( ) : void

Load terms dictionary index

Throws
Exception Description
\Zend_Search_Lucene_Exception
method private _loadNorm( integer $fieldNum ) : void

Load normalization factors from an index file

Parameters
Name Type Description
$fieldNum integer
Throws
Exception Description
\Zend_Search_Lucene_Exception
method private _loadPre21DelFile( ) : mixed

Load pre-2.1 deletions file

Returns bitset or an array depending on bitset extension availability

Returns
Type Description
mixed
Throws
Exception Description
\Zend_Search_Lucene_Exception
method public closeTermsStream( ) : void

Close terms stream

Should be used for resources clean up if stream is not read up to the end

method public compoundFileLength( string $extension ) : integer

Get compound file length

Parameters
Name Type Description
$extension string
Returns
Type Description
integer
method public count( ) : integer

Returns the total number of documents in this segment (including deleted documents).

Returns
Type Description
integer
method public currentTerm( ) : \Zend_Search_Lucene_Index_Term|null

Returns term in current position

Returns
Type Description
\Zend_Search_Lucene_Index_Term|null
method public currentTermPositions( ) : array

Returns an array of all term positions in the documents.

Return array structure: array( docId => array( pos1, pos2, ...), ...)

Returns
Type Description
array
method public delete( integer $id ) : void

Deletes a document from the index segment.

$id is an internal document id

Parameters
Name Type Description
$id integer
method public getDelGen( ) : integer

Returns actual deletions file generation number.

Returns
Type Description
integer
method public getField( integer $fieldNum ) : \Zend_Search_Lucene_Index_FieldInfo

Returns field info for specified field

Parameters
Name Type Description
$fieldNum integer
Returns
Type Description
\Zend_Search_Lucene_Index_FieldInfo
method public getFieldInfos( ) : array

Returns array of FieldInfo objects.

Returns
Type Description
array
method public getFieldNum( string $fieldName ) : integer

Returns field index or -1 if field is not found

Parameters
Name Type Description
$fieldName string
Returns
Type Description
integer
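
The "-1 if not found" convention can be illustrated with a plain array lookup. This is a sketch only: findFieldNum() is a hypothetical stand-in that searches a list of field names, whereas the real method works on the segment's FieldInfo objects.

```php
<?php
// Hypothetical stand-in: returns the index of $fieldName in $fieldNames,
// or -1 when the name is absent.
function findFieldNum(array $fieldNames, $fieldName)
{
    $fieldNum = array_search($fieldName, $fieldNames, true);

    // array_search() returns false on a miss; index 0 is a valid hit,
    // so the comparison has to be strict.
    return ($fieldNum === false) ? -1 : $fieldNum;
}

$fieldNames = array('title', 'contents');
echo findFieldNum($fieldNames, 'contents'), PHP_EOL;
echo findFieldNum($fieldNames, 'author'), PHP_EOL;
```

Callers should compare the result with -1 rather than testing it for truthiness, because 0 is a valid field number.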
method public getFields( boolean $indexed = false ) : array

Returns array of fields.

If the $indexed parameter is true, only indexed fields are returned.

Parameters
Name Type Description
$indexed boolean
Returns
Type Description
array
method public getName( ) : string

Return segment name

Returns
Type Description
string
method public getTermInfo( \Zend_Search_Lucene_Index_Term $term ) : \Zend_Search_Lucene_Index_TermInfo

Scans terms dictionary and returns term info

Parameters
Name Type Description
$term \Zend_Search_Lucene_Index_Term
Returns
Type Description
\Zend_Search_Lucene_Index_TermInfo
method public hasDeletions( ) : boolean

Returns true if any documents have been deleted from this index segment.

Returns
Type Description
boolean
method public hasSingleNormFile( ) : boolean

Returns true if segment has single norms file.

Returns
Type Description
boolean
method public isCompound( ) : boolean

Returns true if segment is stored using compound segment file.

Returns
Type Description
boolean
method public isDeleted( integer $id ) : boolean

Checks whether a document is deleted

Parameters
Name Type Description
$id integer
Returns
Type Description
boolean
method public nextTerm( ) : \Zend_Search_Lucene_Index_Term|null

Scans terms dictionary and returns next term

Returns
Type Description
\Zend_Search_Lucene_Index_Term|null
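
The stream methods (resetTermsStream(), currentTerm(), nextTerm(), closeTermsStream()) are meant to be used together in a loop. The sketch below mimics that calling pattern with a hypothetical in-memory class, ArrayTermsStream, which holds plain strings instead of Zend_Search_Lucene_Index_Term objects. It assumes that resetting the stream positions it on the first term; it is not the segment's implementation.

```php
<?php
// Hypothetical in-memory terms stream used only to show the calling pattern.
class ArrayTermsStream
{
    private $_terms;
    private $_position = null; // null means the stream is closed

    public function __construct(array $terms)
    {
        sort($terms); // terms are read in sorted order
        $this->_terms = $terms;
    }

    public function resetTermsStream()
    {
        $this->_position = 0;
    }

    public function currentTerm()
    {
        if ($this->_position === null || !isset($this->_terms[$this->_position])) {
            return null;
        }
        return $this->_terms[$this->_position];
    }

    public function nextTerm()
    {
        if ($this->_position === null) {
            return null;
        }
        $this->_position++;
        return $this->currentTerm();
    }

    public function closeTermsStream()
    {
        $this->_position = null;
    }
}

$stream = new ArrayTermsStream(array('pear', 'apple', 'fig'));
$stream->resetTermsStream();

$seen = array();
while (($term = $stream->currentTerm()) !== null) {
    $seen[] = $term;
    $stream->nextTerm();
}
$stream->closeTermsStream();

print_r($seen);
```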
method public norm( integer $id, string $fieldName ) : float

Returns the normalization factor for the specified document

Parameters
Name Type Description
$id integer
$fieldName string
Returns
Type Description
float
method public normVector( string $fieldName ) : string

Returns norm vector, encoded in a byte string

Parameters
Name Type Description
$fieldName string
Returns
Type Description
string
method public numDocs( ) : integer

Returns the total number of non-deleted documents in this segment.

Returns
Type Description
integer
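
The difference between count() and numDocs() is easiest to see next to delete(). The following is a minimal sketch with a hypothetical class, SegmentSketch, that keeps deleted IDs as array keys (the array fallback described for $_deleted). It does not read or write any index files.

```php
<?php
// Hypothetical stand-in for the deletion bookkeeping of a segment.
class SegmentSketch
{
    private $_docCount;
    private $_deleted = array(); // docID => true

    public function __construct($docCount)
    {
        $this->_docCount = $docCount;
    }

    public function delete($id)
    {
        $this->_deleted[$id] = true; // deleting twice has no extra effect
    }

    public function isDeleted($id)
    {
        return isset($this->_deleted[$id]);
    }

    public function hasDeletions()
    {
        return count($this->_deleted) > 0;
    }

    // Total number of documents, deleted ones included.
    public function count()
    {
        return $this->_docCount;
    }

    // Number of documents that are not deleted.
    public function numDocs()
    {
        return $this->_docCount - count($this->_deleted);
    }
}

$segment = new SegmentSketch(4);
$segment->delete(2);
$segment->delete(2);

echo $segment->count(), ' ', $segment->numDocs(), PHP_EOL;
```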
method public openCompoundFile( string $extension, boolean $shareHandler = true ) : \Zend_Search_Lucene_Storage_File

Opens an index file stored within the compound index file

Parameters
Name Type Description
$extension string
$shareHandler boolean
Returns
Type Description
\Zend_Search_Lucene_Storage_File
Throws
Exception Description
\Zend_Search_Lucene_Exception
method public resetTermsStream( ) : integer

Reset terms stream

$startId - id for the first document
$compact - remove deleted documents

Returns start document id for the next segment

Returns
Type Description
integer
Throws
Exception Description
\Zend_Search_Lucene_Exception
method public skipTo( \Zend_Search_Lucene_Index_Term $prefix ) : void

Skip terms stream up to the specified term prefix.

Prefix contains fully specified field info and portion of searched term

Parameters
Name Type Description
$prefix \Zend_Search_Lucene_Index_Term
Throws
Exception Description
\Zend_Search_Lucene_Exception
method public termDocs( \Zend_Search_Lucene_Index_Term $term, integer $shift = 0, \Zend_Search_Lucene_Index_DocsFilter|null $docsFilter = null ) : array

Returns IDs of all the documents containing term.

Parameters
Name Type Description
$term \Zend_Search_Lucene_Index_Term
$shift integer
$docsFilter \Zend_Search_Lucene_Index_DocsFilter|null
Returns
Type Description
array
method public termFreqs( \Zend_Search_Lucene_Index_Term $term, integer $shift = 0, \Zend_Search_Lucene_Index_DocsFilter|null $docsFilter = null ) : array

Returns term freqs array.

Result array structure: array(docId => freq, ...)

Parameters
Name Type Description
$term \Zend_Search_Lucene_Index_Term
$shift integer
$docsFilter \Zend_Search_Lucene_Index_DocsFilter|null
Returns
Type Description
array
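
The result structure can be sketched with plain arrays. The role of $shift is not described on this page; the example assumes it is an offset added to each segment-local document ID, and shiftFreqs() is a hypothetical helper written only to show that assumption.

```php
<?php
// Hypothetical helper. ASSUMPTION: $shift is added to every segment-local
// docID so that results from several segments do not collide.
function shiftFreqs(array $freqs, $shift)
{
    $result = array();
    foreach ($freqs as $docId => $freq) {
        $result[$docId + $shift] = $freq;
    }
    return $result;
}

// Invented data: docId => freq within one segment.
$segmentFreqs = array(0 => 2, 3 => 1);

print_r(shiftFreqs($segmentFreqs, 100));
```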
method public termPositions( \Zend_Search_Lucene_Index_Term $term, integer $shift = 0, \Zend_Search_Lucene_Index_DocsFilter|null $docsFilter = null ) : array

Returns term positions array.

Result array structure: array(docId => array(pos1, pos2, ...), ...)

Parameters
Name Type Description
$term \Zend_Search_Lucene_Index_Term
$shift integer
$docsFilter \Zend_Search_Lucene_Index_DocsFilter|null
Returns
Type Description
array
Documentation was generated by DocBlox 0.15.1.