API Documentation

Zend/Search/Lucene/Analysis/Analyzer.php

Zend Framework

LICENSE

This source file is subject to the new BSD license that is bundled with this package in the file LICENSE.txt. It is also available through the world-wide-web at this URL: http://framework.zend.com/license/new-bsd If you did not receive a copy of the license and are unable to obtain it through the world-wide-web, please send an email to license@zend.com so we can send you a copy immediately.

Category
Zend  
Copyright
Copyright (c) 2005-2012 Zend Technologies USA Inc. (http://www.zend.com)  
License
New BSD License  
Package
Zend_Search_Lucene  
Subpackage
Analysis  
Version
$Id: Analyzer.php 24594 2012-01-05 21:27:01Z matthew $  

\Zend_Search_Lucene_Analysis_Analyzer

Package: Zend\Search\Lucene\Analysis

An Analyzer is used to analyze text.

It thus represents a policy for extracting index terms from text.

Note: The Java implementation of Lucene is stream-oriented, which lets it work efficiently with very large documents (more than 20 MB). The engine itself, however, is not designed for such documents, so the Zend_Search_Lucene analysis API works with data strings and sets (arrays) instead.

Children
\Zend_Search_Lucene_Analysis_Analyzer_Common

Properties

Property private \Zend_Search_Lucene_Analysis_Analyzer $_defaultImpl = ''
static

The Analyzer implementation used by default.

Property protected string $_encoding = ''

Input string encoding

Default value
''
Type
string
Property protected string $_input = null

Input string

Default value
null
Type
string

Methods

method public getDefault( ) : \Zend_Search_Lucene_Analysis_Analyzer
static

Return the default Analyzer implementation used by indexing code.

Returns
Type Description
\Zend_Search_Lucene_Analysis_Analyzer
method public nextToken( ) : \Zend_Search_Lucene_Analysis_Token|null
abstract

Tokenization stream API: get the next token. Returns null at the end of the stream.

Tokens are returned in UTF-8 (internal Zend_Search_Lucene encoding)

Returns
Type Description
\Zend_Search_Lucene_Analysis_Token|null
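
The stream API described here (set the input, then call nextToken() until it returns null) can be illustrated with a minimal, self-contained sketch. SketchToken and SketchWhitespaceAnalyzer are hypothetical stand-ins written for this example; they are not the library's classes, and the whitespace splitting and lowercasing are assumptions chosen only to keep the example short.

```php
<?php
// Hypothetical stand-in for a token: the term text plus its byte offsets.
final class SketchToken
{
    public $text;
    public $start;
    public $end;

    public function __construct($text, $start, $end)
    {
        $this->text  = $text;
        $this->start = $start;
        $this->end   = $end;
    }
}

// Hypothetical analyzer exposing setInput(), reset() and nextToken().
final class SketchWhitespaceAnalyzer
{
    private $input = null;
    private $position = 0;

    public function setInput($data, $encoding = '')
    {
        // Assumption: the input is already UTF-8, so $encoding is ignored here.
        $this->input = $data;
        $this->reset();
    }

    public function reset()
    {
        // Rewind to the start of the current input.
        $this->position = 0;
    }

    public function nextToken()
    {
        if ($this->input === null) {
            return null;
        }
        // Find the next run of non-whitespace bytes at or after the current position.
        if (!preg_match('/\S+/', $this->input, $m, PREG_OFFSET_CAPTURE, $this->position)) {
            return null; // end of stream
        }
        list($text, $start) = $m[0];
        $this->position = $start + strlen($text);
        return new SketchToken(strtolower($text), $start, $this->position);
    }
}

$analyzer = new SketchWhitespaceAnalyzer();
$analyzer->setInput('Quick brown Fox');

// Drain the stream: nextToken() signals the end by returning null.
$terms = array();
while (($token = $analyzer->nextToken()) !== null) {
    $terms[] = $token->text;
}
```

Calling reset() afterwards rewinds the sketch to the first token, so the same input can be read again without another setInput() call.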
method public reset( ) : void
abstract

Reset token stream

method public setDefault( $analyzer ) : void
static

Set the default Analyzer implementation used by indexing code.

Parameters
Name Type Description
$analyzer
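
The static getDefault() / setDefault() pair amounts to a class-level slot holding one shared analyzer. The sketch below shows that pattern with hypothetical stand-in classes (SketchAnalyzer and its two subclasses). The lazy fallback inside getDefault() and the simplified tokenize() that returns plain strings are assumptions made for illustration, not documented behaviour.

```php
<?php
// Hypothetical base class mirroring the static getDefault()/setDefault() pair.
abstract class SketchAnalyzer
{
    // Shared slot for the analyzer that indexing code would pick up.
    private static $defaultImpl = null;

    public static function setDefault(SketchAnalyzer $analyzer)
    {
        self::$defaultImpl = $analyzer;
    }

    public static function getDefault()
    {
        // Assumption: fall back lazily to a built-in analyzer when none was set.
        if (self::$defaultImpl === null) {
            self::$defaultImpl = new SketchLowercaseAnalyzer();
        }
        return self::$defaultImpl;
    }

    // Simplified for the sketch: returns plain strings rather than token objects.
    abstract public function tokenize($data, $encoding = '');
}

final class SketchLowercaseAnalyzer extends SketchAnalyzer
{
    public function tokenize($data, $encoding = '')
    {
        return preg_split('/\s+/', strtolower($data), -1, PREG_SPLIT_NO_EMPTY);
    }
}

final class SketchCaseSensitiveAnalyzer extends SketchAnalyzer
{
    public function tokenize($data, $encoding = '')
    {
        return preg_split('/\s+/', $data, -1, PREG_SPLIT_NO_EMPTY);
    }
}

// Code that asks for the default sees whichever analyzer was set last.
$before = SketchAnalyzer::getDefault()->tokenize('Hello World');
SketchAnalyzer::setDefault(new SketchCaseSensitiveAnalyzer());
$after = SketchAnalyzer::getDefault()->tokenize('Hello World');
```

Because the slot is static, replacing the default affects every later caller in the same process, so in this sketch it would make sense to set it once before any indexing starts.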
method public setInput( string $data, $encoding = '' ) : void

Tokenization stream API: set the input.

Parameters
Name Type Description
$data string
$encoding
method public tokenize( string $data, $encoding = '' ) : array

Tokenize text into terms. Returns an array of Zend_Search_Lucene_Analysis_Token objects.

Tokens are returned in UTF-8 (internal Zend_Search_Lucene encoding)

Parameters
Name Type Description
$data string
$encoding
Returns
Type Description
array
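
To illustrate what "tokens are returned in UTF-8" can mean for input in another encoding, here is a minimal sketch of a tokenize-style function. sketchTokenize() is a hypothetical helper, not part of Zend_Search_Lucene. It assumes that an empty $encoding means the data is already UTF-8, uses iconv() for the conversion, and returns plain strings instead of token objects.

```php
<?php
// Hypothetical helper: splits a string into terms and returns them as UTF-8.
function sketchTokenize($data, $encoding = '')
{
    // Assumption: an empty $encoding means the data is already UTF-8.
    if ($encoding !== '' && strcasecmp($encoding, 'UTF-8') !== 0) {
        $data = iconv($encoding, 'UTF-8', $data);
    }
    // Split on runs of whitespace; the /u flag makes PCRE treat $data as UTF-8.
    return preg_split('/\s+/u', $data, -1, PREG_SPLIT_NO_EMPTY);
}

// "café au lait" encoded as ISO-8859-1 (the accented e is the single byte 0xE9).
$latin1 = "caf\xE9 au lait";
$terms  = sketchTokenize($latin1, 'ISO-8859-1');
```

After the call, the first term holds the two-byte UTF-8 sequence for the accented character, whatever encoding the caller supplied.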
Documentation was generated by DocBlox 0.15.1.