A B C D E F G H I L M N O P R S T U V W X 

G

generateCreatingProgram() - Method in class edu.georgetown.gucs.fingerprinter.Fingerprinter
Determines the program that created this fingerprinter
generateXML(List<Fingerprint>, String) - Method in class edu.georgetown.gucs.fingerprinter.Fingerprinter
 
generateXML(List<Fingerprint>, String, String) - Method in class edu.georgetown.gucs.fingerprinter.Fingerprinter
Generates an XML file that contains this fingerprinter's digest
get(long) - Method in class edu.georgetown.gucs.bloomfilter.LongBitSet
Returns the value of the bit with the specified index.
getA() - Method in class edu.georgetown.gucs.utility.Pair
Provides the first object in this pair
getB() - Method in class edu.georgetown.gucs.utility.Pair
Provides the second object in this pair
getBase64Fingerprint() - Method in class edu.georgetown.gucs.fingerprinter.Fingerprint
 
getBase64Fingerprints() - Method in class edu.georgetown.gucs.fingerprinter.Fingerprinter
Gives this fingerprinter's fingerprint in Base64 encoding
getBitSetSize() - Method in class edu.georgetown.gucs.bloomfilter.LongFastBloomFilter
 
getConfigSpec() - Method in class sdtext.Arguments
 
getCreatingProgram() - Method in class edu.georgetown.gucs.dictionary.Dictionary
Provides the program that created this dictionary
getCreation() - Method in class edu.georgetown.gucs.dictionary.Dictionary
Provides the original creation date of this dictionary
getCurrentFalsePositiveProbability() - Method in class edu.georgetown.gucs.bloomfilter.LongFastBloomFilter
Returns the current false positive probability of the bloom filter based on how many elements have been added to the filter.
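For orientation: the textbook estimate of a Bloom filter's false positive probability after n insertions into m bits with k hash functions is (1 - e^(-kn/m))^k. The sketch below computes that estimate. It is not taken from LongFastBloomFilter, whose implementation is not shown in this index; the class, method, and parameter names are hypothetical.

```java
/** Hypothetical helper for illustration; not part of edu.georgetown.gucs.bloomfilter. */
final class BloomFalsePositiveSketch {
    private BloomFalsePositiveSketch() {}

    /**
     * Textbook estimate (1 - e^(-k*n/m))^k, where m is the bit set size,
     * k the number of hash functions, and n the number of elements added so far.
     */
    static double currentFalsePositiveProbability(long bitSetSize,
                                                  int numHashFunctions,
                                                  long numElements) {
        if (bitSetSize <= 0 || numHashFunctions <= 0 || numElements < 0) {
            throw new IllegalArgumentException("sizes must be positive and count non-negative");
        }
        double exponent = -((double) numHashFunctions * numElements) / bitSetSize;
        return Math.pow(1.0 - Math.exp(exponent), numHashFunctions);
    }

    public static void main(String[] args) {
        // The probability grows as more elements are added to a fixed-size filter.
        System.out.println(currentFalsePositiveProbability(9586, 7, 500));
        System.out.println(currentFalsePositiveProbability(9586, 7, 1000));
    }
}
```

An empty filter yields 0.0 under this formula, and the estimate rises monotonically with the element count.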
getCurrentNumberOfElements() - Method in class edu.georgetown.gucs.bloomfilter.LongFastBloomFilter
 
getDebugSpec() - Method in class sdtext.Arguments
 
getDictionary() - Method in class edu.georgetown.gucs.fingerprinter.Fingerprinter
Gives this fingerprinter's dictionary
getDictionaryFilename() - Method in class edu.georgetown.gucs.dictionary.Dictionary
Provides the filename of this dictionary, if this dictionary was loaded from or saved to a file
getDictionaryName() - Method in class edu.georgetown.gucs.dictionary.Dictionary
Provides the name of this dictionary
getDictionarySpec() - Method in class sdtext.Arguments
 
getDictNumTerms() - Method in class edu.georgetown.gucs.fingerprinter.Fingerprint
 
getDirectory() - Method in class edu.georgetown.gucs.utility.FileLister
Provides the path to the directory used for the initial list of files
getDirectorySpec() - Method in class sdtext.Arguments
 
getDocumentCount() - Method in class edu.georgetown.gucs.dictionary.DictionaryEntry
Provides the number of documents this token appears in
getEndByte() - Method in class edu.georgetown.gucs.fingerprinter.Fingerprint
 
getFileListerArray() - Method in class edu.georgetown.gucs.utility.FileLister
 
getFileListerList() - Method in class edu.georgetown.gucs.utility.FileLister
 
getFileName() - Method in class edu.georgetown.gucs.fingerprinter.Fingerprint
 
getFileName() - Method in class edu.georgetown.gucs.fingerprinter.Fingerprinter
 
getFileQueue() - Method in class edu.georgetown.gucs.utility.FileLister
 
getFilter(long, double) - Static method in class edu.georgetown.gucs.bloomfilter.LongFastBloomFilter
 
getFingerprint() - Method in class edu.georgetown.gucs.fingerprinter.Fingerprint
 
getFingerprintName() - Method in class edu.georgetown.gucs.fingerprinter.Fingerprinter
Gives this fingerprinter's name.
getFingerprintSpec() - Method in class sdtext.Arguments
 
getFrequencyCount() - Method in class edu.georgetown.gucs.dictionary.DictionaryEntry
Provides the frequency count for this entry
getGUID() - Method in class edu.georgetown.gucs.dictionary.Dictionary
Provides the unique identifier for this dictionary
getIDF() - Method in class edu.georgetown.gucs.dictionary.DictionaryEntry
Provides the IDF of this token
getInputSpec() - Method in class sdtext.Arguments
 
getLoaded() - Method in class edu.georgetown.gucs.utility.FileLister
Indicates whether all files have been loaded into the linked blocking queue
getLongBitSet() - Method in class edu.georgetown.gucs.bloomfilter.LongFastBloomFilter
 
getMangler() - Method in class edu.georgetown.gucs.fingerprinter.Fingerprinter
Returns the string of the current mangler setting
getMatch() - Method in class edu.georgetown.gucs.matcher.ScoreFingerprints
Depending on the matcher, returns string representations of either the boolean match or integer similarity of these two fingerprints
getMatcherName() - Method in class edu.georgetown.gucs.matcher.FingerprintMatcher
Provides the name of the matcher used to determine if the two fingerprints have matching documents
getMatcherSpec() - Method in class sdtext.Arguments
 
getMaxIDF() - Method in class edu.georgetown.gucs.dictionary.Dictionary
Provides the largest IDF value in this dictionary
getMaxIDFFound() - Method in class edu.georgetown.gucs.dictionary.Dictionary
Provides the largest IDF value in this dictionary
getMaxIdfSpec() - Method in class sdtext.Arguments
 
getMaxThread() - Method in class edu.georgetown.gucs.dictionary.Dictionary
Provides the maximum thread count to use for this dictionary
getMinBitArraySize(long, double) - Static method in class edu.georgetown.gucs.bloomfilter.BloomFilterCalculations
Returns the minimum bit array size (m) to satisfy the desired false positive probability based on the number of elements expected for the bloom filter.
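As a point of reference, the standard Bloom filter sizing rule gives m = ceil(-n ln p / (ln 2)^2) bits for n expected elements and a target false positive probability p, with k = (m / n) ln 2 hash functions. The following is a minimal sketch of that rule, assuming BloomFilterCalculations follows the usual derivation; the class and method names here are hypothetical and not part of the library.

```java
/** Hypothetical helper for illustration; not part of edu.georgetown.gucs.bloomfilter. */
final class BloomSizingSketch {
    private static final double LN2 = Math.log(2.0);

    private BloomSizingSketch() {}

    /** Smallest m satisfying the standard bound: m = ceil(-n * ln(p) / (ln 2)^2). */
    static long minBitArraySize(long expectedElements, double falsePositiveProbability) {
        if (expectedElements <= 0) {
            throw new IllegalArgumentException("expectedElements must be positive");
        }
        if (falsePositiveProbability <= 0.0 || falsePositiveProbability >= 1.0) {
            throw new IllegalArgumentException("probability must be in (0, 1)");
        }
        double bits = -expectedElements * Math.log(falsePositiveProbability) / (LN2 * LN2);
        return (long) Math.ceil(bits);
    }

    /** Number of hash functions that minimizes the error rate: k = round((m / n) * ln 2). */
    static int optimalNumHashFunctions(long bitArraySize, long expectedElements) {
        return Math.max(1, (int) Math.round((double) bitArraySize / expectedElements * LN2));
    }

    public static void main(String[] args) {
        long m = minBitArraySize(1000, 0.01);
        System.out.println("bits: " + m + ", hashes: " + optimalNumHashFunctions(m, 1000));
    }
}
```

Tightening the target probability increases the required bit count; the hash-function count depends only on the ratio m / n.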
getMinIDF() - Method in class edu.georgetown.gucs.dictionary.Dictionary
Provides the smallest IDF value in this dictionary
getMinIdfSpec() - Method in class sdtext.Arguments
 
getMinimumScore() - Method in class edu.georgetown.gucs.matcher.FingerprintMatcher
Provides the minimum score for these two fingerprints to be considered a match.
getMinScore() - Method in class sdtext.Arguments
 
getNames() - Method in class edu.georgetown.gucs.tokenizers.TokenizerList
Provides the ordered list of the tokenizer names in this list
getNormalizedIDF(Token) - Method in class edu.georgetown.gucs.dictionary.Dictionary
Provides the normalized inverse document frequency (IDF) of the given token in this dictionary.
getNormalizedIDF(DictionaryEntry) - Method in class edu.georgetown.gucs.dictionary.Dictionary
Provides the normalized inverse document frequency (IDF) of the given dictionaryEntry.
getNumHashFunctions() - Method in class edu.georgetown.gucs.bloomfilter.LongFastBloomFilter
 
getOutputSpec() - Method in class sdtext.Arguments
 
getPosition(Token) - Method in class edu.georgetown.gucs.dictionary.Dictionary
Provides the position of the given token in this dictionary
getPosition() - Method in class edu.georgetown.gucs.dictionary.DictionaryEntry
Provides the position of the token in this dictionary.
getProperty(String) - Static method in class edu.georgetown.gucs.utility.Global
Returns the property specified by the Ant build
getScore(byte[], byte[]) - Method in class edu.georgetown.gucs.matcher.CosineSimilarityFingerprintMatcher
Determines the cosine similarity score for these two fingerprints
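Cosine similarity of two equal-length vectors is their dot product divided by the product of their Euclidean norms. The sketch below applies that definition to two byte arrays, assuming each byte is treated as an unsigned vector component; how CosineSimilarityFingerprintMatcher actually interprets fingerprint bytes is not stated in this index, so both the class name and that interpretation are assumptions.

```java
/** Hypothetical helper for illustration; not part of edu.georgetown.gucs.matcher. */
final class CosineSimilaritySketch {
    private CosineSimilaritySketch() {}

    /** Returns dot(a, b) / (|a| * |b|), or 0.0 when either vector is all zeros. */
    static double cosineSimilarity(byte[] a, byte[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("fingerprints must have the same length");
        }
        long dot = 0;
        long normA = 0;
        long normB = 0;
        for (int i = 0; i < a.length; i++) {
            int x = a[i] & 0xFF; // assumption: bytes are unsigned components
            int y = b[i] & 0xFF;
            dot += (long) x * y;
            normA += (long) x * x;
            normB += (long) y * y;
        }
        if (normA == 0 || normB == 0) {
            return 0.0; // similarity is undefined for a zero vector; report no match
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        System.out.println(cosineSimilarity(new byte[] {1, 2, 3}, new byte[] {2, 4, 6}));
    }
}
```

With non-negative components the score falls between 0 and 1, so it can be compared directly against a minimum-score threshold.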
getScore(Fingerprint, Fingerprint) - Method in class edu.georgetown.gucs.matcher.CosineSimilarityFingerprintMatcher
 
getScore(byte[], byte[]) - Method in class edu.georgetown.gucs.matcher.FingerprintMatcher
Determines a similarity score for these two fingerprints; must be overridden in each specific matcher
getScore() - Method in class edu.georgetown.gucs.matcher.FingerprintMatcher
Determines a similarity score for these two fingerprints; overridden in GoogleAllPairs
getScore(String, String) - Method in class edu.georgetown.gucs.matcher.FingerprintMatcher
 
getScore(Fingerprint, Fingerprint) - Method in class edu.georgetown.gucs.matcher.FingerprintMatcher
 
getScore(String, String) - Method in class edu.georgetown.gucs.matcher.SdHashFingerprintMatcher
Takes two Lists that are of type Pair where the fingerprint is stored in byte form
getScoreXML(String, String) - Method in class edu.georgetown.gucs.matcher.FingerprintMatcher
Determines a similarity score for these two fingerprints.
getSdhash() - Method in class edu.georgetown.gucs.fingerprinter.Fingerprint
 
getSource() - Method in class edu.georgetown.gucs.dictionary.Dictionary
Provides the directory used for this dictionary
getStartByte() - Method in class edu.georgetown.gucs.fingerprinter.Fingerprint
 
getStatistics() - Method in class edu.georgetown.gucs.dictionary.Dictionary
Provides this dictionary's statistics, including: name; number of documents; number of tokens; whether this dictionary has been trimmed (including the IDF range, if trimmed); minimum and maximum IDF; list of tokenizers
getSublists(double, double) - Method in class edu.georgetown.gucs.utility.FileLister
Populates the dictionary and sample lists, waiting for all files to be loaded before randomly assigning files to one of the lists.
getSystemID() - Method in class edu.georgetown.gucs.dictionary.Dictionary
Provides the identifier for the system that this dictionary was created on
getToken() - Method in class edu.georgetown.gucs.dictionary.DictionaryEntry
Provides the token
getTokenizerNames() - Method in class edu.georgetown.gucs.dictionary.Dictionary
Provides the names of the tokenizers
getTokenizers() - Method in class edu.georgetown.gucs.dictionary.Dictionary
Provides the tokenizerList object used to create this dictionary
getTokenizers() - Method in class edu.georgetown.gucs.tokenizers.TokenizerList
 
getTokenList() - Method in class edu.georgetown.gucs.fingerprinter.Fingerprint
 
getTokens() - Method in class edu.georgetown.gucs.dictionary.Dictionary
Provides a vector of the tokens in this dictionary
getTokenVectorMap() - Method in class edu.georgetown.gucs.tokenizers.Tokenizer
Provides the list of each token in order of its appearance
getTotalDocuments() - Method in class edu.georgetown.gucs.dictionary.Dictionary
Provides the number of documents processed for this dictionary
Global - Class in edu.georgetown.gucs.utility
Global information
Global() - Constructor for class edu.georgetown.gucs.utility.Global
 
growToInclude(int) - Method in class edu.georgetown.gucs.utility.AntlrBitSet
Grows the set to a larger number of bits
GUID - Variable in class edu.georgetown.gucs.configurations.FingerprintConfiguration
the globally unique identifier for this fingerprinter
GUID - Variable in class edu.georgetown.gucs.fingerprinter.Fingerprinter
the globally unique identifier for this fingerprinter
GzippedFileTokenizer - Class in edu.georgetown.gucs.tokenizers
Splits contents of a gzipped text file into tokens based on whitespace or by line.
GzippedFileTokenizer() - Constructor for class edu.georgetown.gucs.tokenizers.GzippedFileTokenizer
Constructor that sets the token creation mode to split based on whitespace
GzippedFileTokenizer(String) - Constructor for class edu.georgetown.gucs.tokenizers.GzippedFileTokenizer
Constructor that sets the token creation mode.
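The two token creation modes described for GzippedFileTokenizer (whitespace or line) can be illustrated with the JDK's own gzip streams. This is a minimal sketch under stated assumptions, not the library's implementation: the class name, the boolean mode flag, the UTF-8 encoding, and the in-memory input are all hypothetical choices made for the example.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Hypothetical helper for illustration; not part of edu.georgetown.gucs.tokenizers. */
final class GzipTokenizerSketch {
    private GzipTokenizerSketch() {}

    /** Decompresses gzipped text and splits it by line (byLine) or by whitespace. */
    static List<String> tokenize(byte[] gzipped, boolean byLine) {
        List<String> tokens = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                new GZIPInputStream(new ByteArrayInputStream(gzipped)),
                StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (byLine) {
                    tokens.add(line);
                } else {
                    for (String token : line.trim().split("\\s+")) {
                        if (!token.isEmpty()) { // skip the empty token from blank lines
                            tokens.add(token);
                        }
                    }
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return tokens;
    }

    /** Compresses text so the example can run without a file on disk. */
    static byte[] gzip(String text) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(buffer)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        byte[] data = gzip("alpha beta\ngamma  delta\n");
        System.out.println(tokenize(data, false));
        System.out.println(tokenize(data, true));
    }
}
```

Reading from a file instead would only change the innermost stream, for example to a `java.io.FileInputStream` wrapped by the same `GZIPInputStream`.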