Interface Tokenizer
-
public interface Tokenizer
An interface for objects that take a String and produce TokenLists.
-
-
Field Summary
static java.lang.String TOKENIZER_DEFAULT
    The name of the default system tokenizer.
-
Method Summary
default Tokenizer getIngestTokenizer(SchemaField field, java.util.Locale locale)
    Get the underlying tokenizer to use for tokenizing fields in the ingest workflow.
default Tokenizer getQueryTokenizer(SchemaField field, java.util.Locale locale)
    Get the underlying tokenizer to use for tokenizing fields in the query workflow.
Phrase tokenize(SchemaField field, java.util.Locale locale, SearchTerm term)
    Tokenizes term into a Phrase for query processing.
Phrase tokenize(SchemaField field, java.util.Locale locale, TermRange range)
    Tokenizes range into a Phrase for query processing.
Phrase tokenize(SchemaField field, java.util.Locale locale, WildcardTerm term)
    Tokenizes a wildcard term into a Phrase for query processing.
void tokenize(SchemaField field, java.util.Locale locale, TokenList tokens)
    Tokenizes all tokens in tokens.
default TokenList tokenize(SchemaField field, java.util.Locale locale, java.lang.String value)
    Tokenizes value into a TokenList.
-
-
-
Field Detail
-
TOKENIZER_DEFAULT
static final java.lang.String TOKENIZER_DEFAULT
The name of the default system tokenizer.
See Also:
Constant Field Values
-
-
Method Detail
-
getIngestTokenizer
default Tokenizer getIngestTokenizer(SchemaField field, java.util.Locale locale) throws AttivioException
Get the underlying tokenizer to use for tokenizing fields in the ingest workflow. In general, this method should return this. Tokenizers that route to sub-tokenizers for handling different fields/locales should return the actual tokenizer that will be used.
Throws:
AttivioException
-
getQueryTokenizer
default Tokenizer getQueryTokenizer(SchemaField field, java.util.Locale locale) throws AttivioException
Get the underlying tokenizer to use for tokenizing fields in the query workflow. In general, this method should return this. Tokenizers that route to sub-tokenizers for handling different fields/locales should return the actual tokenizer that will be used.
Throws:
AttivioException
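The routing behaviour described for getIngestTokenizer and getQueryTokenizer can be sketched roughly as follows. This is only an illustration under stated assumptions: SimpleTokenizer, NamedTokenizer and RoutingTokenizer are hypothetical local stand-ins so the example compiles without the SDK, a plain String field name replaces SchemaField, and the throws AttivioException clause is omitted. Routing on the locale's language code is one possible strategy, not something this page prescribes.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Sketch only: every type here is a local stand-in, not the SDK's Tokenizer.
class RoutingTokenizerSketch {

    // Hypothetical reduced interface that keeps just the two lookup methods.
    interface SimpleTokenizer {
        String name();

        // Mirrors the documented default: a plain tokenizer returns itself.
        default SimpleTokenizer getIngestTokenizer(String field, Locale locale) {
            return this;
        }

        default SimpleTokenizer getQueryTokenizer(String field, Locale locale) {
            return this;
        }
    }

    // A leaf tokenizer that relies on the defaults above.
    static final class NamedTokenizer implements SimpleTokenizer {
        private final String name;

        NamedTokenizer(String name) {
            this.name = name;
        }

        @Override
        public String name() {
            return name;
        }
    }

    // A router that hands back the sub-tokenizer that would actually be used.
    static final class RoutingTokenizer implements SimpleTokenizer {
        private final Map<String, SimpleTokenizer> byLanguage = new HashMap<>();
        private final SimpleTokenizer fallback;

        RoutingTokenizer(SimpleTokenizer fallback) {
            this.fallback = fallback;
        }

        void register(String language, SimpleTokenizer tokenizer) {
            byLanguage.put(language, tokenizer);
        }

        @Override
        public String name() {
            return "router";
        }

        // The locale may be null, so fall back rather than dereference it.
        private SimpleTokenizer resolve(Locale locale) {
            if (locale == null) {
                return fallback;
            }
            return byLanguage.getOrDefault(locale.getLanguage(), fallback);
        }

        @Override
        public SimpleTokenizer getIngestTokenizer(String field, Locale locale) {
            return resolve(locale);
        }

        @Override
        public SimpleTokenizer getQueryTokenizer(String field, Locale locale) {
            return resolve(locale);
        }
    }

    public static void main(String[] args) {
        RoutingTokenizer router = new RoutingTokenizer(new NamedTokenizer("default"));
        router.register("ja", new NamedTokenizer("japanese"));
        System.out.println(router.getIngestTokenizer("title", Locale.JAPANESE).name());
        System.out.println(router.getQueryTokenizer("title", null).name());
    }
}
```

A caller that needs to know which tokenizer will really handle a field can ask the router instead of assuming the router itself does the work. In this sketch the ingest and query lookups share one resolution rule; a real implementation could route them differently.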
-
tokenize
void tokenize(SchemaField field, java.util.Locale locale, TokenList tokens) throws AttivioException
Tokenizes all tokens in tokens.
Parameters:
field - the schema field being tokenized (may be null)
locale - the Locale of the tokens (may be null)
tokens - the token list
Throws:
AttivioException - on an unrecoverable error
-
tokenize
default TokenList tokenize(SchemaField field, java.util.Locale locale, java.lang.String value) throws AttivioException
Tokenizes value into a TokenList.
Parameters:
field - the schema field being tokenized (may be null)
locale - the Locale of the tokens (may be null)
value - the string to tokenize
Throws:
AttivioException - on an unrecoverable error
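As a rough illustration of how the String overload might relate to the TokenList overload, the sketch below wraps the value in a list and delegates to the in-place method. Everything in it is an assumption made for the example: SimpleTokenizer and WhitespaceTokenizer are hypothetical stand-ins, List<String> replaces TokenList, a String field name replaces SchemaField, and the delegation itself is a guess at one reasonable default, not the SDK's documented behaviour. It also shows one way to tolerate the null field and null locale that the parameter notes allow.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Sketch only: local stand-ins, not the SDK's Tokenizer or TokenList.
class StringTokenizeSketch {

    interface SimpleTokenizer {
        // Stand-in for tokenize(field, locale, TokenList): rewrites the list in place.
        void tokenize(String field, Locale locale, List<String> tokens);

        // Stand-in for the default String overload.
        // ASSUMPTION: wrap the value in a one-element list and delegate.
        default List<String> tokenize(String field, Locale locale, String value) {
            List<String> tokens = new ArrayList<>();
            tokens.add(value);
            tokenize(field, locale, tokens);
            return tokens;
        }
    }

    // Hypothetical tokenizer: splits on whitespace and lower-cases each piece.
    static final class WhitespaceTokenizer implements SimpleTokenizer {
        @Override
        public void tokenize(String field, Locale locale, List<String> tokens) {
            // A null locale is allowed, so pick a neutral one for case folding.
            Locale effective = (locale == null) ? Locale.ROOT : locale;
            List<String> result = new ArrayList<>();
            for (String token : tokens) {
                for (String piece : token.trim().split("\\s+")) {
                    if (!piece.isEmpty()) {
                        result.add(piece.toLowerCase(effective));
                    }
                }
            }
            tokens.clear();
            tokens.addAll(result);
        }
    }

    public static void main(String[] args) {
        SimpleTokenizer tokenizer = new WhitespaceTokenizer();
        // Both field and locale are null here, which the parameter notes permit.
        System.out.println(tokenizer.tokenize(null, null, "Hello Tokenizer World"));
    }
}
```

With this shape an implementer only has to write the in-place method, and the String convenience overload comes for free.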
-
tokenize
Phrase tokenize(SchemaField field, java.util.Locale locale, SearchTerm term) throws AttivioException
Tokenizes term into a Phrase for query processing.
Parameters:
field - the schema field being tokenized (may be null)
locale - the Locale of the tokens (may be null)
term - the SearchTerm to tokenize
Throws:
AttivioException - on an unrecoverable error
-
tokenize
Phrase tokenize(SchemaField field, java.util.Locale locale, WildcardTerm term) throws AttivioException
Tokenizes a wildcard term into a Phrase for query processing.
Parameters:
field - the schema field being tokenized (may be null)
locale - the Locale of the tokens (may be null)
term - the WildcardTerm to tokenize
Throws:
AttivioException - on an unrecoverable error
-
tokenize
Phrase tokenize(SchemaField field, java.util.Locale locale, TermRange range) throws AttivioException
Tokenizes range into a Phrase for query processing.
Parameters:
field - the schema field being tokenized (may be null)
locale - the Locale of the tokens (may be null)
range - the TermRange to tokenize
Throws:
AttivioException - on an unrecoverable error
-
-