public class FileManglerTokenizer extends Tokenizer
Inherited fields: positions, tokenVector
Constructor and Description |
---|
FileManglerTokenizer(): Constructor that initializes the random number generator and clears the mangler settings |
FileManglerTokenizer(java.lang.String manglers): Constructor that sets the mangler settings |
Modifier and Type | Method and Description |
---|---|
void | disableMangler(): Disables the mangler in this tokenizer |
void | enableMangler(java.lang.String manglers): Enables the mangler with the given settings in this tokenizer |
void | enableMangler(java.lang.String manglers, java.util.List<java.lang.String> tokens): Enables the mangler with the given settings in this tokenizer |
static void | main(java.lang.String[] args) |
void | setRNG(java.util.Random random): Sets the random number generator to use with the manglers that are set in this tokenizer |
void | showSettings(): Prints mangler settings to standard output |
void | tokenize(java.util.Iterator<java.lang.String> iterator): Alters or eliminates certain tokens based on the given mangler settings |
Inherited methods: getPositionsVector, getTokenVector, iterator, position_iterator, printTokens, tokenize, tokenize
public FileManglerTokenizer()
public FileManglerTokenizer(java.lang.String manglers)
manglers
- the string of mangler settings; expects non-null mangler strings in the form "manglerName-manglerPercent", where manglerPercent is between 1 and 99 inclusive
public void enableMangler(java.lang.String manglers)
manglers
- the string of mangler settings; expects non-null mangler strings in the form "manglerName-manglerPercent", where manglerPercent is between 1 and 99 inclusive
public void enableMangler(java.lang.String manglers, java.util.List<java.lang.String> tokens)
manglers
- the string of mangler settings; expects non-null mangler strings in the form "manglerName-manglerPercent", where manglerPercent is between 1 and 99 inclusive
tokens
- the list of string tokens to put through this mangler
public void disableMangler()
public void setRNG(java.util.Random random)
random
- the random number generator to use with this tokenizer
public void showSettings()
public void tokenize(java.util.Iterator<java.lang.String> iterator)
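This page does not say how a mangler percent is applied to individual tokens, so the following is only a sketch of one plausible reading: each token is eliminated with probability percent/100, using a caller-supplied java.util.Random. The class PercentDropSketch and the method dropSome are hypothetical and are not part of FileManglerTokenizer. The sketch also illustrates why a method such as setRNG is useful: a seeded generator makes the mangled output reproducible.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

// Hypothetical illustration only; not the library's implementation.
public class PercentDropSketch {

    /**
     * Returns the tokens that survive when each one is dropped with
     * probability percent/100 (assumed semantics, not confirmed by the docs).
     */
    public static List<String> dropSome(List<String> tokens, int percent, Random rng) {
        List<String> kept = new ArrayList<>();
        for (String token : tokens) {
            // nextInt(100) is uniform over 0..99; keep the token when the draw
            // is at or above the percent threshold.
            if (rng.nextInt(100) >= percent) {
                kept.add(token);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("int", "x", "=", "42", ";");
        List<String> first = dropSome(tokens, 50, new Random(7L));
        List<String> second = dropSome(tokens, 50, new Random(7L));
        // Same seed, same sequence of draws, so both runs keep the same tokens.
        System.out.println(first.equals(second)); // prints true
        System.out.println(first);
    }
}
```

With the real class, the analogous step would presumably be to pass a seeded java.util.Random to setRNG before calling tokenize, so that repeated runs mangle the same tokens.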
public static void main(java.lang.String[] args)
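The "manglerName-manglerPercent" format expected by the constructor and by enableMangler can be illustrated with a small, self-contained parser. This is a sketch under stated assumptions, not the class's actual parsing code: the class ManglerSettingSketch, the method parse, and the mangler name "drop" are all hypothetical, and only a single setting is handled because this page does not say how several settings are combined in one string.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Map;

// Hypothetical illustration only; not the library's implementation.
public class ManglerSettingSketch {

    /**
     * Parses one "manglerName-manglerPercent" setting into a (name, percent)
     * pair, rejecting null input and percents outside 1..99 inclusive.
     */
    public static Map.Entry<String, Integer> parse(String setting) {
        if (setting == null) {
            throw new IllegalArgumentException("setting must be non-null");
        }
        // Split on the last hyphen so that a name containing hyphens still parses.
        int dash = setting.lastIndexOf('-');
        if (dash <= 0 || dash == setting.length() - 1) {
            throw new IllegalArgumentException("expected name-percent, got: " + setting);
        }
        String name = setting.substring(0, dash);
        int percent;
        try {
            percent = Integer.parseInt(setting.substring(dash + 1));
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("percent is not an integer: " + setting, e);
        }
        if (percent < 1 || percent > 99) {
            throw new IllegalArgumentException("percent must be between 1 and 99: " + percent);
        }
        return new SimpleImmutableEntry<>(name, percent);
    }

    public static void main(String[] args) {
        // "drop" is a made-up mangler name used only for this example.
        Map.Entry<String, Integer> setting = parse("drop-25");
        System.out.println(setting.getKey() + " at " + setting.getValue() + "%"); // prints "drop at 25%"
    }
}
```

Validating the range up front mirrors the documented contract (non-null, percent between 1 and 99 inclusive); whether the real class throws, ignores, or clamps invalid settings is not stated here.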