public class MinimumLengthTokenizer extends Tokenizer
Fields inherited from class Tokenizer: positions, tokenVector
Constructor and Description |
---|
MinimumLengthTokenizer(java.lang.String length): Constructor that sets the minimum token length to be considered |
Modifier and Type | Method and Description |
---|---|
void | tokenize(java.util.Iterator<java.lang.String> iterator): Eliminates tokens that are shorter than the length specified in the constructor |
void | tokenize(java.util.Iterator<java.lang.String> tokensIterator, java.util.Iterator<Pair<java.lang.Integer,java.lang.Integer>> positionsIterator): Eliminates tokens that are shorter than the length specified in the constructor |
Methods inherited from class Tokenizer: getPositionsVector, getTokenVector, iterator, position_iterator, printTokens, tokenize
public MinimumLengthTokenizer(java.lang.String length)
length - the string value of the minimum token length to be included
public void tokenize(java.util.Iterator<java.lang.String> iterator)
public void tokenize(java.util.Iterator<java.lang.String> tokensIterator, java.util.Iterator<Pair<java.lang.Integer,java.lang.Integer>> positionsIterator)
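The following is a minimal, standalone sketch of the filtering behaviour described above. It is not the library's implementation. The class `MinimumLengthFilterSketch` and the `Span` type are hypothetical stand-ins for `MinimumLengthTokenizer` and `Pair<Integer,Integer>`, and the sketch assumes that the constructor string is parsed with `Integer.parseInt` and that a token whose length equals the minimum is kept (only tokens that are strictly shorter are dropped). Results are collected in local lists rather than in the inherited `tokenVector` and `positions` fields.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for MinimumLengthTokenizer; not the library class.
class MinimumLengthFilterSketch {

    // Hypothetical stand-in for Pair<Integer,Integer> (start and end offsets).
    static final class Span {
        final int start;
        final int end;
        Span(int start, int end) { this.start = start; this.end = end; }
    }

    private final int minLength;
    final List<String> tokens = new ArrayList<>();
    final List<Span> positions = new ArrayList<>();

    // Assumption: the string argument holds a decimal integer.
    MinimumLengthFilterSketch(String length) {
        this.minLength = Integer.parseInt(length.trim());
    }

    // Keeps only tokens that are at least minLength characters long.
    void tokenize(Iterator<String> iterator) {
        tokens.clear();
        positions.clear();
        while (iterator.hasNext()) {
            String token = iterator.next();
            if (token.length() >= minLength) {
                tokens.add(token);
            }
        }
    }

    // Same filter, but each token's position is kept or dropped with it.
    void tokenize(Iterator<String> tokensIterator, Iterator<Span> positionsIterator) {
        tokens.clear();
        positions.clear();
        while (tokensIterator.hasNext() && positionsIterator.hasNext()) {
            String token = tokensIterator.next();
            Span span = positionsIterator.next();
            if (token.length() >= minLength) {
                tokens.add(token);
                positions.add(span);
            }
        }
    }

    public static void main(String[] args) {
        MinimumLengthFilterSketch filter = new MinimumLengthFilterSketch("4");
        filter.tokenize(Arrays.asList("the", "quick", "brown", "fox").iterator());
        System.out.println(filter.tokens); // prints [quick, brown]
    }
}
```

Advancing both iterators in lockstep keeps the token list and the position list aligned, so index `i` in one always refers to the same token as index `i` in the other.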