java.lang.Object
  edu.georgetown.gucs.tokenizers.Tokenizer
    edu.georgetown.gucs.tokenizers.ArabicTokenizer
public class ArabicTokenizer extends Tokenizer
Splits contents of an Arabic text file into tokens using Apache Lucene's ArabicAnalyzer. Tokens are used for Dictionary creation.
Field Summary |
---|
Fields inherited from class edu.georgetown.gucs.tokenizers.Tokenizer |
---|
constructor, tokenVector |
Constructor Summary |
---|
ArabicTokenizer() |
Method Summary | |
---|---|
static void | main(java.lang.String[] args) |
void | tokenize(java.lang.String filename): Splits the document into tokens. |
Methods inherited from class edu.georgetown.gucs.tokenizers.Tokenizer |
---|
getConstructor, iterator, printTokens, tokenize |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
---|
public ArabicTokenizer()
Method Detail |
---|
public void tokenize(java.lang.String filename)
tokenize in class Tokenizer
Parameters: filename - the string filename of the document to split into tokens
See Also: ArabicAnalyzer
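To illustrate what "splitting the document into tokens" means, here is a minimal sketch using only the Java standard library. It is not the implementation of this class and does not use Lucene's ArabicAnalyzer: the class name `SimpleArabicTokenizerSketch` and its methods `tokenizeText` and `tokenizeFile` are hypothetical, and the splitting rule (maximal runs of letters, with combining marks such as Arabic short-vowel diacritics dropped) is an assumed simplification. A real analyzer typically also performs normalization, stop-word removal, and stemming, none of which is attempted here.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for illustration only; not the documented
// ArabicTokenizer and not Lucene's ArabicAnalyzer.
class SimpleArabicTokenizerSketch {

    // Splits text into maximal runs of letters. Non-spacing marks
    // (e.g. Arabic short-vowel diacritics) are dropped so that they do
    // not break a word apart; every other non-letter is a separator.
    static List<String> tokenizeText(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            if (Character.getType(cp) == Character.NON_SPACING_MARK) {
                continue; // skip the mark, keep building the current token
            }
            if (Character.isLetter(cp)) {
                current.appendCodePoint(cp);
            } else if (current.length() > 0) {
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    // Reads the named file as UTF-8 (an assumption about the encoding)
    // and tokenizes its full contents.
    static List<String> tokenizeFile(Path file) throws IOException {
        String text = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        return tokenizeText(text);
    }
}
```

Calling `SimpleArabicTokenizerSketch.tokenizeFile(java.nio.file.Paths.get(filename))` returns the tokens in document order, which a caller could then store in a collection playing the role of the inherited `tokenVector` field.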
public static void main(java.lang.String[] args)