public class EnronStripMailHeaderTokenizer extends Tokenizer
Fields inherited from class Tokenizer: positions, tokenVector
Constructor and Description |
---|
EnronStripMailHeaderTokenizer(): Constructor to remove tokens from headers in the Enron data set |
Modifier and Type | Method and Description |
---|---|
void | tokenize(java.util.Iterator<java.lang.String> iterator): Eliminates tokens from headers in the Enron data set |
void | tokenize(java.util.Iterator<java.lang.String> tokensIterator, java.util.Iterator<Pair<java.lang.Integer,java.lang.Integer>> positionsIterator): Eliminates tokens from headers in the Enron data set |
Methods inherited from class Tokenizer: getPositionsVector, getTokenVector, iterator, position_iterator, printTokens, tokenize
public EnronStripMailHeaderTokenizer()
public void tokenize(java.util.Iterator<java.lang.String> iterator)
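This page does not say how header tokens are recognized, so the following is only a rough, self-contained sketch of the general idea, not the library's implementation. It assumes each string supplied by the iterator is one line of a message and that the header block ends at the first blank line (the usual mail convention). The class `HeaderStripSketch` and the method `stripHeader` are hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in, not the library class: the real header-detection
// rule of EnronStripMailHeaderTokenizer is not documented on this page.
public class HeaderStripSketch {

    // Assumption: each string is one line of a message, and the header
    // block ends at the first blank line.
    static List<String> stripHeader(Iterator<String> lines) {
        List<String> body = new ArrayList<>();
        boolean inHeader = true;
        while (lines.hasNext()) {
            String line = lines.next();
            if (inHeader) {
                if (line.trim().isEmpty()) {
                    inHeader = false; // blank separator: the body starts next
                }
                continue; // drop header lines and the separator itself
            }
            body.add(line);
        }
        return body;
    }

    public static void main(String[] args) {
        List<String> message = Arrays.asList(
            "Message-ID: <1234@example>",
            "From: alice@example.com",
            "Subject: Meeting",
            "",
            "See you at noon.",
            "Thanks");
        // prints [See you at noon., Thanks]
        System.out.println(stripHeader(message.iterator()));
    }
}
```

The sketch returns a new list, whereas the documented method returns void and presumably stores its result in the inherited token vector.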
public void tokenize(java.util.Iterator<java.lang.String> tokensIterator, java.util.Iterator<Pair<java.lang.Integer,java.lang.Integer>> positionsIterator)
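When tokens carry positions, a token and its position must be dropped together, otherwise the two sequences fall out of step. The sketch below illustrates that bookkeeping under stated assumptions: `Span` is a hypothetical stand-in for the library's `Pair<Integer,Integer>` (assumed here to hold start and end offsets, which this page does not confirm), and the header is assumed to end at the first empty token. None of the names below come from the library.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch, not the library implementation.
public class AlignedStripSketch {

    // Stand-in for Pair<Integer,Integer>; assumed to be (start, end) offsets.
    static final class Span {
        final int start;
        final int end;
        Span(int start, int end) { this.start = start; this.end = end; }
    }

    final List<String> keptTokens = new ArrayList<>();
    final List<Span> keptPositions = new ArrayList<>();

    // Assumption: the header block ends at the first empty token.
    void strip(Iterator<String> tokensIterator, Iterator<Span> positionsIterator) {
        boolean inHeader = true;
        while (tokensIterator.hasNext() && positionsIterator.hasNext()) {
            String token = tokensIterator.next();
            Span span = positionsIterator.next(); // advance both iterators in lockstep
            if (inHeader) {
                if (token.isEmpty()) {
                    inHeader = false; // separator reached: the body starts next
                }
                continue; // discard the header token and its position together
            }
            keptTokens.add(token);
            keptPositions.add(span);
        }
    }

    public static void main(String[] args) {
        AlignedStripSketch sketch = new AlignedStripSketch();
        sketch.strip(
            Arrays.asList("From: alice@example.com", "", "Hello").iterator(),
            Arrays.asList(new Span(0, 23), new Span(24, 24), new Span(25, 30)).iterator());
        System.out.println(sketch.keptTokens + " starts at " + sketch.keptPositions.get(0).start);
    }
}
```

Reading both iterators in the same loop iteration keeps index i of the kept tokens matched with index i of the kept positions.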