- main(String...) - Static method in class sdtext.CommandLine
-
- makeDictionary(String, String, String) - Method in class edu.georgetown.gucs.dictionary.Dictionary
-
Creates a dictionary from the given directory: reads an XML file that specifies which tokenizers to use and writes
the dictionary to a file.
- makeDictionary(List<String>, String, double, double, String) - Method in class edu.georgetown.gucs.dictionary.Dictionary
-
Creates a dictionary from the given directory: reads an XML file that specifies which tokenizers to use and writes a
trimmed dictionary to a file.
- manglerName - Variable in class edu.georgetown.gucs.configurations.FingerprintConfiguration
-
The name of the current mangler, if any.
- manglerName - Variable in class edu.georgetown.gucs.fingerprinter.Fingerprinter
-
The name of the current mangler, if any.
- manglerOn - Variable in class edu.georgetown.gucs.fingerprinter.Fingerprinter
-
- match(byte[], byte[]) - Method in class edu.georgetown.gucs.matcher.CosineSimilarityFingerprintMatcher
-
Determines that the two fingerprints match if their similarity score is at or above the minimum score for
this fingerprinter.
- match(byte[], byte[]) - Method in class edu.georgetown.gucs.matcher.FingerprintMatcher
-
Determines whether the two fingerprints' documents match; must be overridden in each specific matcher.
- match(String, String) - Method in class edu.georgetown.gucs.matcher.FingerprintMatcher
-
Determines if these two fingerprints' documents match.
- match(List<Fingerprint>) - Method in class edu.georgetown.gucs.matcher.FingerprintMatcher
-
Determines if these two fingerprints' documents match.
- matcher - Variable in class edu.georgetown.gucs.matcher.CompareFingerprint
-
- MAX_BIT_ARRAY_SIZE - Static variable in class edu.georgetown.gucs.bloomfilter.BloomFilterCalculations
-
- MAX_BITSET_SIZE_CHANGE - Static variable in class edu.georgetown.gucs.bloomfilter.BloomFilterCalculations
-
- MAX_SCORE - Static variable in class edu.georgetown.gucs.utility.Global
-
Value indicating the maximum similarity score
- MaximumLengthTokenizer - Class in edu.georgetown.gucs.tokenizers
-
Eliminates tokens that are longer than a given length
- MaximumLengthTokenizer(String) - Constructor for class edu.georgetown.gucs.tokenizers.MaximumLengthTokenizer
-
Constructor that sets the maximum token length to be considered
- maxNumberOfElementsForDesiredFalsePositiveProbability(double) - Static method in class edu.georgetown.gucs.bloomfilter.BloomFilterCalculations
-
Returns the maximum number of elements (n) the LongFastBloomFilter can hold
based on MAX_BIT_ARRAY_SIZE while still maintaining the desired false positive
probability.
- maxThread - Variable in class edu.georgetown.gucs.matcher.CompareFingerprint
-
- member(int) - Method in class edu.georgetown.gucs.utility.AntlrBitSet
-
Returns whether the given integer is a member of this bitset.
- MemoryTokenizer - Class in edu.georgetown.gucs.tokenizers
-
- MemoryTokenizer(String) - Constructor for class edu.georgetown.gucs.tokenizers.MemoryTokenizer
-
- MemoryTokenizer() - Constructor for class edu.georgetown.gucs.tokenizers.MemoryTokenizer
-
- mergeDictionary(Map<Token, DictionaryEntry>) - Method in class edu.georgetown.gucs.dictionary.Dictionary
-
Merges the given Map from Token to DictionaryEntry into this dictionary.
- MIN_SCORE - Static variable in class edu.georgetown.gucs.utility.Global
-
Value indicating the minimum similarity score
- minimum_score - Variable in class edu.georgetown.gucs.matcher.CompareFingerprint
-
- minimum_score - Variable in class edu.georgetown.gucs.matcher.FingerprintMatcher
-
The minimum score needed to be considered a match.
- MinimumLengthTokenizer - Class in edu.georgetown.gucs.tokenizers
-
Eliminates tokens that are shorter than a given length
- MinimumLengthTokenizer(String) - Constructor for class edu.georgetown.gucs.tokenizers.MinimumLengthTokenizer
-
Constructor that sets the minimum token length to be considered
- MOD_MASK - Static variable in class edu.georgetown.gucs.utility.AntlrBitSet
-
A precomputed mod mask.
- mode_ - Variable in class edu.georgetown.gucs.tokenizers.FileTokenizer
-
- moreFiles() - Method in class edu.georgetown.gucs.utility.FileLister
-
Tells a process that is taking files off the linked blocking queue whether more files
are coming; use it to test whether any more will arrive.
- MurmurHash - Class in edu.georgetown.gucs.bloomfilter
-
- MurmurHash() - Constructor for class edu.georgetown.gucs.bloomfilter.MurmurHash
-
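
As an illustration of the calculation described under maxNumberOfElementsForDesiredFalsePositiveProbability(double)
above, the sketch below uses the standard Bloom filter capacity estimate n = m (ln 2)^2 / ln(1/p), which assumes
the optimal number of hash functions. It is not taken from BloomFilterCalculations: the class name
BloomCapacitySketch, the method maxElements, and the ASSUMED_MAX_BITS value are hypothetical, since this index
gives neither the real value of MAX_BIT_ARRAY_SIZE nor the formula the library actually uses.

```java
public class BloomCapacitySketch {

    // Hypothetical stand-in for BloomFilterCalculations.MAX_BIT_ARRAY_SIZE;
    // the real constant's value is not listed in this index.
    static final long ASSUMED_MAX_BITS = 1L << 32;

    /**
     * Estimates the largest number of elements n that a Bloom filter with
     * the given number of bits can hold while keeping the false positive
     * probability at or below p, assuming the optimal number of hash
     * functions k = (m / n) * ln 2. Under that assumption
     * p = exp(-(m / n) * (ln 2)^2), so n = m * (ln 2)^2 / ln(1 / p).
     */
    static long maxElements(long bits, double p) {
        if (bits <= 0 || p <= 0.0 || p >= 1.0) {
            throw new IllegalArgumentException("need bits > 0 and 0 < p < 1");
        }
        double ln2 = Math.log(2.0);
        // -Math.log(p) equals ln(1 / p), which is positive for 0 < p < 1.
        return (long) Math.floor(bits * ln2 * ln2 / -Math.log(p));
    }

    public static void main(String[] args) {
        // Capacity at a 1% false positive rate for the assumed bit budget.
        System.out.println(maxElements(ASSUMED_MAX_BITS, 0.01));
    }
}
```

A stricter false positive probability lowers the capacity: each factor of ten in p adds ln 10 to the denominator,
so the same bit array holds fewer elements.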