public class Trial
extends java.lang.Object
implements java.lang.Runnable
Constructor and Description |
---|
Trial(int experimentID, boolean insert, DBInterface database, java.lang.String dataset, FileLister datasetFiles, java.io.File datasetPath, java.lang.String language, long randomSeed, double dictionaryParameter, double sampleParameter, double minIDF, double maxIDF, java.util.List<java.lang.String> tokenizers, java.lang.String manglers, java.lang.String fingerprinter, java.lang.String matcher, int matcherScore): Constructor that sets default values for this trial |

Modifier and Type | Method and Description |
---|---|
void | calculateTrialResults(): Computes the precision, recall, and f-score for each mangler |
int | getDictionarySize(): Provides the size of the dictionary used |
java.util.Map<java.lang.String,java.lang.Double[]> | getManglerResults(): Provides the results for each mangler; sleeps until the trial is finished |
TrialParameters | getTrialParameters(): Provides the parameters for this trial that are needed for results comparisons |
void | run(): Runs the trial |
int | runTrial(): Runs this trial |

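
The summary above says calculateTrialResults() computes precision, recall, and f-score for each mangler, and getManglerResults() returns them as a Map of Double[] values. The following is a minimal sketch of how such a triple could be computed from match counts. The class ManglerScoreSketch, the method score, the array order {precision, recall, f-score}, and the use of the balanced F1 measure are all assumptions made for illustration; they are not taken from the Trial source.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper for illustration; not part of the documented Trial API. */
final class ManglerScoreSketch {

    /**
     * Returns {precision, recall, fScore} for one mangler.
     * The array order and the balanced F1 formula are assumptions of this sketch.
     */
    static Double[] score(int truePositives, int falsePositives, int falseNegatives) {
        int retrieved = truePositives + falsePositives; // everything the matcher reported
        int relevant = truePositives + falseNegatives;  // everything it should have reported
        double precision = retrieved == 0 ? 0.0 : (double) truePositives / retrieved;
        double recall = relevant == 0 ? 0.0 : (double) truePositives / relevant;
        double sum = precision + recall;
        // Harmonic mean of precision and recall; defined as 0 when both are 0.
        double fScore = sum == 0.0 ? 0.0 : 2.0 * precision * recall / sum;
        return new Double[] {precision, recall, fScore};
    }

    public static void main(String[] args) {
        // Same map shape as the return type of getManglerResults().
        Map<String, Double[]> results = new LinkedHashMap<>();
        results.put("exampleMangler", score(8, 2, 8)); // made-up counts
        for (Map.Entry<String, Double[]> entry : results.entrySet()) {
            Double[] r = entry.getValue();
            System.out.printf("%s precision=%.3f recall=%.3f f=%.3f%n",
                    entry.getKey(), r[0], r[1], r[2]);
        }
    }
}
```

The zero checks avoid division by zero when a mangler produces no matches at all; whether the real implementation handles that case the same way is not stated in this documentation.
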
public Trial(int experimentID, boolean insert, DBInterface database, java.lang.String dataset, FileLister datasetFiles, java.io.File datasetPath, java.lang.String language, long randomSeed, double dictionaryParameter, double sampleParameter, double minIDF, double maxIDF, java.util.List<java.lang.String> tokenizers, java.lang.String manglers, java.lang.String fingerprinter, java.lang.String matcher, int matcherScore)
Parameters:
experimentID - the int value of the experiment id this trial is a part of; set to -1 if insert = false
insert - the boolean value of whether to insert the results from this trial into the database
database - the DBInterface database to insert trial results; null if insert = false
dataset - the string name of the database to insert trial results
datasetFiles - the FileLister containing the files to use for the dictionary and sample in this trial
datasetPath - the path to the files to use for the dictionary and sample in this trial
language - the string language of the files used in this trial
randomSeed - the long initial seed to use for the random number generator in this trial; can be used to enable repeatability across trials
dictionaryParameter - the double containing the count or percent to use for dictionary creation in this trial; numbers less than 1 are processed as percents, numbers greater than or equal to 1 are processed as counts
sampleParameter - the double containing the count or percent to use for the file sample size in this trial; numbers less than 1 are processed as percents, numbers greater than or equal to 1 are processed as counts
minIDF - the double minimum normalized IDF to keep in the dictionary; any token with a lower IDF will be discarded
maxIDF - the double maximum normalized IDF to keep in the dictionary; any token with a higher IDF will be discarded
tokenizers - the list of tokenizer strings to use in this trial
manglers - the list of different mangler strings to use in this trial
fingerprinter - the string name of the fingerprinter to use in this trial
matcher - the string name of the matcher to use for fingerprint comparisons in this trial
matcherScore - the int minimum score that the matcher considers a match

public TrialParameters getTrialParameters()

public java.util.Map<java.lang.String,java.lang.Double[]> getManglerResults()

public int getDictionarySize()
Returns: the int size of the dictionary in this trial

public int runTrial()
Returns: the int trialID from this trial

public void run()
Specified by: run in interface java.lang.Runnable

public void calculateTrialResults()
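
The constructor documentation states that dictionaryParameter and sampleParameter are read as percents when below 1 and as counts otherwise, and that tokens with a normalized IDF outside the minIDF to maxIDF range are discarded. The sketch below shows one way those two rules could be expressed. The class TrialParameterSketch and both of its methods are hypothetical, and the rounding of fractional results, the cap at the number of available files, and the inclusive IDF bounds are assumptions rather than documented behavior.

```java
/** Hypothetical helpers for illustration; not part of the documented Trial API. */
final class TrialParameterSketch {

    /**
     * Interprets a dictionaryParameter or sampleParameter value.
     * Values below 1 are treated as a fraction of the available files,
     * values of 1 or more as an absolute count.
     * Rounding and capping at totalFiles are assumptions of this sketch.
     */
    static int resolveCount(double parameter, int totalFiles) {
        if (parameter < 0.0 || totalFiles < 0) {
            throw new IllegalArgumentException("parameter and totalFiles must be non-negative");
        }
        double raw = parameter < 1.0 ? parameter * totalFiles : parameter;
        return (int) Math.min(totalFiles, Math.round(raw));
    }

    /**
     * Keeps a token only if its normalized IDF lies within [minIDF, maxIDF].
     * Treating both bounds as inclusive follows the wording "lower" and "higher".
     */
    static boolean keepToken(double normalizedIdf, double minIDF, double maxIDF) {
        return normalizedIdf >= minIDF && normalizedIdf <= maxIDF;
    }

    public static void main(String[] args) {
        int totalFiles = 200; // made-up dataset size
        System.out.println("0.25 -> " + resolveCount(0.25, totalFiles) + " files");
        System.out.println("50   -> " + resolveCount(50, totalFiles) + " files");
        System.out.println("keep idf=0.5 in [0.1, 0.9]: " + keepToken(0.5, 0.1, 0.9));
        System.out.println("keep idf=0.95 in [0.1, 0.9]: " + keepToken(0.95, 0.1, 0.9));
    }
}
```

Under these assumptions a value of exactly 1 is a count of one file, not 100 percent, which matches the documented rule that values greater than or equal to 1 are counts.
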