public class TokenList extends TokenSink implements Cloneable, Iterable<Token>, Externalizable
CAUTION: unless otherwise noted, any Token instances passed to methods of a TokenList or a TokenListIterator are NOT copied when added to this TokenList. You should therefore not modify any Token instances after you have passed them into this TokenList unless you are sure of what you are doing.
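The caution above describes the usual reference-aliasing hazard. The sketch below illustrates it with hypothetical stand-in classes (`SketchToken`, `SketchList`), not the real Token or TokenList, whose mutators and copy facilities are not documented on this page. It assumes a mutable token and a list that stores the reference it is given.

```java
import java.util.ArrayList;
import java.util.List;

class AliasingSketch {
    public static void main(String[] args) {
        SketchToken shared = new SketchToken("quick");

        // The first list keeps the caller's instance; the second gets its own copy.
        SketchList aliased = new SketchList().append(shared);
        SketchList copied = new SketchList().append(new SketchToken(shared));

        // Mutating the token after handing it over changes what the aliased list holds.
        shared.setText("slow");

        System.out.println(aliased.textAt(0)); // prints "slow"
        System.out.println(copied.textAt(0));  // prints "quick"
    }
}

// Hypothetical mutable token; an assumption for illustration only.
class SketchToken {
    private String text;

    SketchToken(String text) { this.text = text; }

    // Copy constructor used for the defensive copy.
    SketchToken(SketchToken other) { this.text = other.text; }

    String getText() { return text; }

    void setText(String text) { this.text = text; }
}

// Hypothetical list that stores token references without copying them.
class SketchList {
    private final List<SketchToken> tokens = new ArrayList<>();

    SketchList append(SketchToken t) {
        tokens.add(t); // stores the reference, not a copy
        return this;
    }

    String textAt(int index) { return tokens.get(index).getText(); }
}
```

If a token must be changed after it has been added, copying it before the call, as the second list does, keeps the stored instance independent.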
DEFAULT_POSITION_INCREMENT
Constructor and Description |
---|
TokenList(): The default constructor. |
TokenList(String... tokens): Construct a new TokenList with the specified tokens. |
TokenList(String token): Construct a new TokenList with a single token. |
TokenList(Token... tokens): Construct a new TokenList with the specified tokens. |
TokenList(Token token): Construct a new TokenList with a single token. |
Modifier and Type | Method and Description |
---|---|
void | add(Token t, int increment): Appends a token. |
TokenList | append(String t): Appends a token to this TokenList using the default position increment. |
TokenList | append(String t, int increment): Appends a token. |
TokenList | append(Token t): Appends a token using the default position increment. |
TokenList | append(Token t, int increment): Appends a token. |
void | clear() |
TokenList | clone() |
String | concat(): Represent all primary (i.e. top of the stack) tokens from this TokenList as a string. |
static String | concat(Iterable<Position> positions): Represent all primary (i.e. top of the stack) tokens from positions as a string. |
static String | concat(Iterator<Position> positions): Represent all primary (i.e. top of the stack) tokens from positions as a string. |
static String | concat(Position... positions): Represent all primary (i.e. top of the stack) tokens from positions as a string. |
static String | concat(PositionIterator positions): Represent all primary (i.e. top of the stack) tokens from positions as a string. |
boolean | containsWildcard(): Returns true if any Token in this TokenList contains ? or *. |
boolean | equals(Object other) |
int | getEndOffset(): Get the end offset for the last token that contains offset information. |
Token | getFirst(): Returns the first Token in this TokenList. |
Token | getLast(): Returns the last Token in this TokenList. |
int | getMaxRemainingTokens(): This value is a hint to tokenization. |
int | getPositionCount(): Returns the number of unique positions for this term. |
int | getStartOffset(): Get the start offset for the first token that contains offset information. |
int | hashCode() |
TokenIterator | iterator(): Returns an iterator over the Tokens in this TokenList. |
static String | join(Iterable<Token> tokens, char joinChar): Join the text for all tokens on joinChar. |
static String | join(Iterator<Token> tokens, char joinChar): Join the text for all tokens on joinChar. |
static String | join(Token[] tokens, int startIndex, int endIndex, char joinChar): Join the text for all tokens on joinChar. |
TokenIterator | listIterator(): Returns a list iterator over the Tokens in this TokenList. |
PositionIterator | positions(): Returns an iterator that iterates over the unique positions in this TokenList. |
TokenList | readExternal(DataInput in) |
void | readExternal(ObjectInput in) |
void | remove(TokenAnnotation annotation): Remove all tokens annotated with annotation. |
void | setMaxRemainingTokens(int value): Set the max remaining tokens allowed to be created. |
int | size(): Returns the number of Tokens in this TokenList. |
Phrase | toPhrase(): Convert this TokenList into a comparable Phrase query. |
Phrase | toPhrase(int offsetBase): Convert this TokenList into a comparable Phrase query. |
String | toQuotedString() |
String | toString() |
void | truncate(int maxSize) |
static TokenList | valueOf(String value): Parses the string representation of a TokenList back into a TokenList. |
void | writeExternal(DataOutput out) |
void | writeExternal(ObjectOutput out) |
protected void | writeExternalV0(DataOutput out): Deprecated. |
StringBuilder | writeTo(StringBuilder buffer): Write token list to buffer. |
void | writeTo(TokenListSerializer out): Write token list to out. |
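Methods inherited from class TokenSink: add, add, add, add, add, endScope, endScope, startLanguageRegion, startScope, startScope
Methods inherited from class java.lang.Object: finalize, getClass, notify, notifyAll, wait, wait, wait
Methods inherited from interface java.lang.Iterable: forEach, spliterator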
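public TokenList()
The default constructor.
public TokenList(String token)
Construct a new TokenList with a single token.
public TokenList(Token token)
Construct a new TokenList with a single token.
public TokenList(String... tokens)
Construct a new TokenList with the specified tokens.
public TokenList(Token... tokens)
Construct a new TokenList with the specified tokens.
public void clear()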
public void truncate(int maxSize)
public Phrase toPhrase()
public Phrase toPhrase(int offsetBase)
public int getMaxRemainingTokens()
public void setMaxRemainingTokens(int value)
public Token getFirst()
public Token getLast()
public int size()
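public int getStartOffset()
Get the start offset for the first token that contains offset information. Returns -1 if no tokens contain offset information.
public int getEndOffset()
Get the end offset for the last token that contains offset information. Returns -1 if no tokens contain offset information.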
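Because both offset getters use -1 as a sentinel, callers should check for it before using the result as a string index. The following is a minimal sketch of that contract, not the library's implementation: it models each token as a hypothetical `{start, end}` pair and assumes that a token without offset information carries -1 in both slots.

```java
class OffsetSketch {
    // Assumption: -1 marks "no offset information", mirroring the documented return value.
    static final int NO_OFFSET = -1;

    // Start offset of the first token that has offset information, or -1 if none does.
    static int startOffset(int[][] offsets) {
        for (int[] o : offsets) {
            if (o[0] != NO_OFFSET) {
                return o[0];
            }
        }
        return NO_OFFSET;
    }

    // End offset of the last token that has offset information, or -1 if none does.
    static int endOffset(int[][] offsets) {
        for (int i = offsets.length - 1; i >= 0; i--) {
            if (offsets[i][1] != NO_OFFSET) {
                return offsets[i][1];
            }
        }
        return NO_OFFSET;
    }

    public static void main(String[] args) {
        String source = "the quick brown fox";
        // The first and last tokens have no offsets; the middle two cover "quick brown".
        int[][] offsets = { {-1, -1}, {4, 9}, {10, 15}, {-1, -1} };

        int start = startOffset(offsets);
        int end = endOffset(offsets);

        // Guard against the sentinel before slicing the source text.
        if (start != NO_OFFSET && end != NO_OFFSET) {
            System.out.println(source.substring(start, end)); // prints "quick brown"
        }
    }
}
```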
public int getPositionCount()
public boolean containsWildcard()
public TokenList append(String t)
public void add(Token t, int increment)
public void remove(TokenAnnotation annotation)
Remove all tokens annotated with annotation.
NOTE: this removal is "position" aware.
public TokenIterator iterator()
public TokenIterator listIterator()
public PositionIterator positions()
public String toQuotedString()
public StringBuilder writeTo(StringBuilder buffer)
Write token list to buffer.
public void writeTo(TokenListSerializer out)
Write token list to out.
public static TokenList valueOf(String value)
Parses the string representation of a TokenList back into a TokenList.
Parameters:
value - the string value to parse
public static String join(Iterable<Token> tokens, char joinChar)
Join the text for all tokens on joinChar.
public static String join(Iterator<Token> tokens, char joinChar)
Join the text for all tokens on joinChar.
public static String join(Token[] tokens, int startIndex, int endIndex, char joinChar)
Join the text for all tokens on joinChar.
Parameters:
tokens - the array of tokens to join.
startIndex - the first element in tokens to join
endIndex - one past the last element in tokens to join
joinChar - the character to join all tokens on.
public String concat()
Represent all primary (i.e. top of the stack) tokens from this TokenList as a string.
NOTE: Uses Token.offsetGap(Token) to determine if space should be placed between tokens.
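The startIndex and endIndex parameters of join(Token[], int, int, char) describe a half-open range: startIndex is the first element joined and endIndex is one past the last. The sketch below shows that indexing on plain strings. `JoinSketch` is a hypothetical helper that stands in for the token text, not the library's code, and how the real method handles out-of-range indexes is not documented here.

```java
class JoinSketch {
    // Joins tokens[startIndex] .. tokens[endIndex - 1] with joinChar between neighbours.
    static String join(String[] tokens, int startIndex, int endIndex, char joinChar) {
        StringBuilder sb = new StringBuilder();
        for (int i = startIndex; i < endIndex; i++) {
            if (i > startIndex) {
                sb.append(joinChar); // separator only between elements, never leading or trailing
            }
            sb.append(tokens[i]);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] tokens = { "a", "b", "c", "d" };
        System.out.println(join(tokens, 1, 3, '_')); // prints "b_c"
        System.out.println(join(tokens, 0, tokens.length, ' ')); // prints "a b c d"
    }
}
```

With this convention an empty range (startIndex equal to endIndex) yields an empty string, and passing the array length as endIndex joins through the final element.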
public static String concat(PositionIterator positions)
Represent all primary (i.e. top of the stack) tokens from positions as a string.
NOTE: Uses Token.offsetGap(Token) to determine if space should be placed between tokens.
public static String concat(Iterable<Position> positions)
Represent all primary (i.e. top of the stack) tokens from positions as a string.
NOTE: Uses Token.offsetGap(Token) to determine if space should be placed between tokens.
public static String concat(Iterator<Position> positions)
Represent all primary (i.e. top of the stack) tokens from positions as a string.
NOTE: Uses Token.offsetGap(Token) to determine if space should be placed between tokens.
public static String concat(Position... positions)
Represent all primary (i.e. top of the stack) tokens from positions as a string.
NOTE: Uses Token.offsetGap(Token) to determine if space should be placed between tokens.
public void readExternal(ObjectInput in) throws IOException
Specified by: readExternal in interface Externalizable
Throws: IOException
public void writeExternal(ObjectOutput out) throws IOException
Specified by: writeExternal in interface Externalizable
Throws: IOException
public TokenList readExternal(DataInput in) throws IOException
Throws: IOException
public void writeExternal(DataOutput out) throws IOException
Throws: IOException
@Deprecated protected void writeExternalV0(DataOutput out) throws IOException
Throws: IOException
Copyright © 2018 Attivio, Inc. All Rights Reserved.