public class EnronEmailTokenizer extends Tokenizer
Fields inherited from class Tokenizer: positions, tokenVector
Constructor and Description |
---|
EnronEmailTokenizer() |
Modifier and Type | Method and Description |
---|---|
void | tokenize(java.util.Iterator<java.lang.String> iterator) Eliminates tokens related to attachments in the Enron data set. |
void | tokenize(java.util.Iterator<java.lang.String> tokensIterator, java.util.Iterator<Pair<java.lang.Integer,java.lang.Integer>> positionsIterator) Eliminates tokens related to attachments in the Enron data set. |
Methods inherited from class Tokenizer: getPositionsVector, getTokenVector, iterator, position_iterator, printTokens, tokenize
public void tokenize(java.util.Iterator<java.lang.String> iterator)
public void tokenize(java.util.Iterator<java.lang.String> tokensIterator, java.util.Iterator<Pair<java.lang.Integer,java.lang.Integer>> positionsIterator)
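The two-argument overload takes tokens and their positions as parallel iterators, so dropping a token presumably means dropping its position as well. The following is a minimal, standalone sketch of that lockstep filtering, not the library's implementation. The class `AttachmentFilterSketch`, the helper `span`, the `Filtered` holder, and the file-extension rule for spotting attachment tokens are all hypothetical, and `Map.Entry<Integer,Integer>` stands in for the library's `Pair<Integer,Integer>`.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Locale;
import java.util.Map;

class AttachmentFilterSketch {

    // ASSUMPTION: a token is "attachment-related" if it ends in one of these
    // extensions. The real EnronEmailTokenizer rule is not documented here.
    private static final List<String> EXTENSIONS =
            Arrays.asList(".doc", ".xls", ".pdf", ".ppt");

    /** Holds the surviving tokens and their positions, index-aligned. */
    static final class Filtered {
        final List<String> tokens = new ArrayList<>();
        final List<Map.Entry<Integer, Integer>> positions = new ArrayList<>();
    }

    /** Hypothetical stand-in for Pair<Integer,Integer>: a (start, end) offset pair. */
    static Map.Entry<Integer, Integer> span(int start, int end) {
        return new SimpleImmutableEntry<>(start, end);
    }

    /** Hypothetical predicate: case-insensitive check against the extension list. */
    static boolean looksLikeAttachment(String token) {
        String lower = token.toLowerCase(Locale.ROOT);
        for (String ext : EXTENSIONS) {
            if (lower.endsWith(ext)) {
                return true;
            }
        }
        return false;
    }

    /** Advances both iterators together and keeps only non-attachment tokens. */
    static Filtered filter(Iterator<String> tokensIterator,
                           Iterator<Map.Entry<Integer, Integer>> positionsIterator) {
        Filtered out = new Filtered();
        while (tokensIterator.hasNext() && positionsIterator.hasNext()) {
            String token = tokensIterator.next();
            Map.Entry<Integer, Integer> position = positionsIterator.next();
            if (!looksLikeAttachment(token)) {
                out.tokens.add(token);
                out.positions.add(position);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("please", "see", "budget.xls", "attached");
        List<Map.Entry<Integer, Integer>> positions =
                Arrays.asList(span(0, 6), span(7, 10), span(11, 21), span(22, 30));

        Filtered kept = filter(tokens.iterator(), positions.iterator());
        System.out.println(kept.tokens);
        System.out.println(kept.positions);
    }
}
```

Consuming both iterators in one loop keeps each token paired with its own position, which is why the sketch stops as soon as either iterator is exhausted.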