public class ArabicFileTokenizer extends FileTokenizer
mode_
tokenVectorMap
Constructor and Description |
---|
ArabicFileTokenizer() |
Modifier and Type | Method and Description |
---|---|
java.util.List<Token> | tokenize(java.lang.String filename) Splits the document into tokens. |
addTokenizers, readFile, setMode, tokenizeFile
getTokenVectorMap, iterator, printTokens, tokenize, toString
public java.util.List<Token> tokenize(java.lang.String filename)
tokenize in class FileTokenizer
filename - the string filename of the Arabic-language document to split into tokens
See Also: ArabicAnalyzer
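As a rough illustration of the call shape documented for tokenize(java.lang.String filename), the sketch below writes a short Arabic text to a temporary file and splits it into tokens. The library's package name and the accessors of Token are not shown on this page, so the sketch does not use the real class. Instead it defines a hypothetical stand-in, SketchFileTokenizer, which splits on whitespace and returns plain strings. The stand-in's splitting rule and its IOException are assumptions made for illustration and say nothing about how ArabicFileTokenizer works internally.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in, NOT the library class. It only mirrors the
// documented call shape: tokenize(String filename) returns a list.
class SketchFileTokenizer {
    // Reads the file as UTF-8 and splits on runs of ASCII whitespace.
    // The real method returns List<Token>; strings are used here because
    // Token's accessors are not shown on this page (assumption).
    List<String> tokenize(String filename) throws IOException {
        byte[] bytes = Files.readAllBytes(Paths.get(filename));
        String text = new String(bytes, StandardCharsets.UTF_8);
        List<String> tokens = new ArrayList<>();
        for (String piece : text.trim().split("\\s+")) {
            if (!piece.isEmpty()) {
                tokens.add(piece);
            }
        }
        return tokens;
    }
}

class TokenizeSketch {
    public static void main(String[] args) throws IOException {
        // Create a small Arabic-language document to tokenize.
        Path doc = Files.createTempFile("arabic-doc", ".txt");
        Files.write(doc, "مرحبا بالعالم".getBytes(StandardCharsets.UTF_8));

        // With the real library on the classpath, this line would instead be
        // new ArabicFileTokenizer().tokenize(doc.toString()).
        List<String> tokens = new SketchFileTokenizer().tokenize(doc.toString());
        for (String token : tokens) {
            System.out.println(token);
        }
        Files.delete(doc);
    }
}
```

The filename is passed as a plain string path, matching the documented parameter. Iterating over the returned list is the only operation used on the result, since nothing else about Token is documented here.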