public class ArabicFileTokenizer extends FileTokenizer
mode
positions, tokenVector
Constructor and Description |
---|
ArabicFileTokenizer() |
Modifier and Type | Method and Description |
---|---|
static void | main(java.lang.String[] args) Tokenizes an Arabic text file and prints the resulting tokens to the screen. |
void | tokenize(java.lang.String filename) Splits the document into tokens. |
checkIndexing, getMode, readFile, setMode
getPositionsVector, getTokenVector, iterator, position_iterator, printTokens, tokenize, tokenize
public void tokenize(java.lang.String filename)
tokenize in class FileTokenizer
filename - the string filename of the Arabic language document to split into tokens
See Also: ArabicAnalyzer
public static void main(java.lang.String[] args)
args - array of string command line arguments; args[0] is the filename of the Arabic language document to split into tokens
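The read, split, print flow that main and tokenize describe can be sketched as follows. This is a hypothetical stand-in, not the source of ArabicFileTokenizer: the class name ArabicTokenizeSketch, the helpers splitTokens and tokenizeFile, the UTF-8 file encoding, and the rule that a token is a maximal run of letters, combining marks, and digits are all assumptions made for illustration. The real class may split text differently (for example by delegating to ArabicAnalyzer).

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for illustration only; NOT the library's ArabicFileTokenizer.
public class ArabicTokenizeSketch {

    // Assumption: a token is a maximal run of letters (\p{L}), combining marks
    // (\p{M}, which covers Arabic diacritics), and digits (\p{N}).
    static List<String> splitTokens(String text) {
        List<String> tokens = new ArrayList<>();
        for (String piece : text.split("[^\\p{L}\\p{M}\\p{N}]+")) {
            // split() can yield an empty leading piece when the text starts with a delimiter
            if (!piece.isEmpty()) {
                tokens.add(piece);
            }
        }
        return tokens;
    }

    // Reads the whole file (assumed to be UTF-8) and splits it into tokens.
    static List<String> tokenizeFile(String filename) throws IOException {
        byte[] bytes = Files.readAllBytes(Paths.get(filename));
        return splitTokens(new String(bytes, StandardCharsets.UTF_8));
    }

    // Mirrors the documented main: args[0] is the file to tokenize,
    // and each token is printed on its own line.
    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            System.err.println("usage: java ArabicTokenizeSketch <filename>");
            return;
        }
        for (String token : tokenizeFile(args[0])) {
            System.out.println(token);
        }
    }
}
```

With the real class, the equivalent call sequence would presumably be to construct an ArabicFileTokenizer, call tokenize(filename), and then read the result through an inherited accessor such as getTokenVector or printTokens; the return types of those accessors are not shown on this page.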