A B C D E F G H I L M N O P R S T U V W X 

S

saveDictionaryXML(String) - Method in class edu.georgetown.gucs.dictionary.Dictionary
Writes this dictionary as a serialized XML object.
saveScores(String) - Method in class edu.georgetown.gucs.matcher.CompareFingerprint
 
saveScores(String) - Method in class edu.georgetown.gucs.matcher.ExperimentN2
 
score - Variable in class edu.georgetown.gucs.matcher.FingerprintMatcher
 
ScoreFingerprints - Class in edu.georgetown.gucs.matcher
Compares two XML fingerprints and provides a score for their degree of similarity.
ScoreFingerprints(String, String, String) - Constructor for class edu.georgetown.gucs.matcher.ScoreFingerprints
Constructor that creates a matcher object and two Base64-encoded fingerprint string objects
scores - Variable in class edu.georgetown.gucs.matcher.CompareFingerprint
 
sdHashFilePath - Static variable in class edu.georgetown.gucs.utility.Global
File path to SdHash
SdhashFingerprinter - Class in edu.georgetown.gucs.fingerprinter
 
SdhashFingerprinter() - Constructor for class edu.georgetown.gucs.fingerprinter.SdhashFingerprinter
 
SdhashFingerprinter(String) - Constructor for class edu.georgetown.gucs.fingerprinter.SdhashFingerprinter
Constructor that loads a dictionary and its tokenizers.
SdHashFingerprintMatcher - Class in edu.georgetown.gucs.matcher
 
SdHashFingerprintMatcher() - Constructor for class edu.georgetown.gucs.matcher.SdHashFingerprintMatcher
The constructor that is called to inherit the Matcher parent class
sdtext - package sdtext
 
serialize(T, ByteArrayOutputStream) - Method in interface edu.georgetown.gucs.bloomfilter.ICompactSerializer
Serializes the specified type into the specified output stream instance.
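As an illustration of the serialize pattern described above, the following is a minimal sketch that writes an array of long words to a byte buffer through a DataOutputStream and reads it back. The class name LongArraySerializerSketch, the length-prefixed layout, and the deserialize helper are assumptions made for this example; the actual ICompactSerializer implementations may use a different format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical example class; not part of edu.georgetown.gucs.
final class LongArraySerializerSketch {
    // Writes the array length followed by each word (assumed layout).
    static byte[] serialize(long[] words) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeInt(words.length);
            for (long word : words) {
                out.writeLong(word);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Reads back an array written by serialize.
    static long[] deserialize(byte[] bytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            long[] words = new long[in.readInt()];
            for (int i = 0; i < words.length; i++) {
                words[i] = in.readLong();
            }
            return words;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long[] original = {1L, 0L, -1L};
        long[] copy = deserialize(serialize(original));
        System.out.println(Arrays.equals(original, copy));
    }
}
```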
serialize(LongBitSet, DataOutputStream) - Static method in class edu.georgetown.gucs.bloomfilter.LongBitSetSerializer
 
serializer() - Static method in class edu.georgetown.gucs.bloomfilter.LongFastBloomFilter
 
set(long) - Method in class edu.georgetown.gucs.bloomfilter.LongBitSet
Sets the bit at the specified index to true.
set(int, int) - Method in class edu.georgetown.gucs.bloomfilter.LongBitSet
Sets the bits between from (inclusive) and to (exclusive) to true.
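The inclusive/exclusive range convention described for set(int, int) is the same one used by the standard java.util.BitSet. The sketch below uses java.util.BitSet as a stand-in to show the semantics; it is not the LongBitSet implementation, and the class name BitRangeSketch is made up for this example.

```java
import java.util.BitSet;

// Hypothetical example class using java.util.BitSet as a stand-in for LongBitSet.
final class BitRangeSketch {
    // Sets the bits in [from, to): from is included, to is excluded.
    static BitSet setRange(int from, int to) {
        BitSet bits = new BitSet();
        bits.set(from, to);
        return bits;
    }

    public static void main(String[] args) {
        BitSet bits = setRange(2, 5);
        System.out.println(bits.get(2)); // lower bound is set
        System.out.println(bits.get(5)); // upper bound stays clear
    }
}
```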
setA(A) - Method in class edu.georgetown.gucs.utility.Pair
Sets the first object in this pair
setB(B) - Method in class edu.georgetown.gucs.utility.Pair
Sets the second object in this pair
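A mutable generic pair with setA and setB can be sketched as follows. The class name PairSketch, the constructor, and the getA/getB accessors are assumptions for illustration; only the setter names come from this index.

```java
// Hypothetical example class; mirrors only the setA/setB names listed above.
final class PairSketch<A, B> {
    private A a;
    private B b;

    PairSketch(A a, B b) {
        this.a = a;
        this.b = b;
    }

    // Replaces the first object in this pair.
    void setA(A a) { this.a = a; }

    // Replaces the second object in this pair.
    void setB(B b) { this.b = b; }

    A getA() { return a; }

    B getB() { return b; }

    public static void main(String[] args) {
        PairSketch<String, Integer> pair = new PairSketch<>("token", 1);
        pair.setA("word");
        pair.setB(2);
        System.out.println(pair.getA() + " " + pair.getB());
    }
}
```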
setConfigurations() - Method in class edu.georgetown.gucs.configurations.FingerprintConfiguration
 
setCreatingProgram(String) - Method in class edu.georgetown.gucs.dictionary.Dictionary
Sets the name of the program that created this dictionary
setDictionary(Dictionary) - Method in class edu.georgetown.gucs.fingerprinter.Fingerprinter
Sets the dictionary for this fingerprinter.
setDictionary(String) - Method in class edu.georgetown.gucs.fingerprinter.Fingerprinter
Sets the dictionary for this fingerprinter.
setEndByte(int) - Method in class edu.georgetown.gucs.fingerprinter.Fingerprint
 
setFileName(String) - Method in class edu.georgetown.gucs.fingerprinter.Fingerprint
 
setFingerprint(Fingerprint) - Method in class edu.georgetown.gucs.fingerprinter.Fingerprint
 
setFingerprint(byte[]) - Method in class edu.georgetown.gucs.fingerprinter.Fingerprint
 
setMangler(boolean) - Method in class edu.georgetown.gucs.fingerprinter.Fingerprinter
Removes the manglerToken (added by Evan)
setMangler(String, Dictionary) - Method in class edu.georgetown.gucs.fingerprinter.Fingerprinter
Passes the specified mangler settings and a set of tokens to the mangler for this fingerprinter.
setManglerRNG(Random) - Method in class edu.georgetown.gucs.fingerprinter.Fingerprinter
Sets a random number generator for the mangler to allow for repeatability.
setManglerRNG(Random) - Method in class edu.georgetown.gucs.tokenizers.TokenizerList
Sets the random number generator to use with a FileManglerTokenizer object in this list
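Supplying a random number generator gives repeatability because two java.util.Random instances created with the same seed produce the same sequence. The sketch below shows only that property; the class name SeededRngSketch and the draw helper are made up for this example, and how the mangler consumes the generator is not shown in this index.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical example class demonstrating seeded repeatability.
final class SeededRngSketch {
    // Draws n pseudo-random ints in [0, bound) from a generator with a fixed seed.
    static int[] draw(long seed, int n, int bound) {
        Random rng = new Random(seed);
        int[] values = new int[n];
        for (int i = 0; i < n; i++) {
            values[i] = rng.nextInt(bound);
        }
        return values;
    }

    public static void main(String[] args) {
        // Same seed, same sequence: prints true.
        System.out.println(Arrays.equals(draw(42L, 5, 100), draw(42L, 5, 100)));
    }
}
```

A caller would pass such a seeded Random to setManglerRNG so that repeated runs mangle the input in the same way.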
setMinimumScore(int) - Method in class edu.georgetown.gucs.matcher.FingerprintMatcher
Sets the minimum score for these two fingerprints to be considered a match.
setMode(String) - Method in class edu.georgetown.gucs.tokenizers.ChineseFileTokenizer
Sets the token creation mode.
setMode(String) - Method in class edu.georgetown.gucs.tokenizers.FileTokenizer
 
setOutput(boolean, boolean, boolean) - Method in class edu.georgetown.gucs.fingerprinter.Fingerprinter
Specifies which information to display in this fingerprinter's digest XML output
setPosition(int) - Method in class edu.georgetown.gucs.dictionary.DictionaryEntry
Sets the position of the token in this dictionary.
setRNG(Random) - Method in class edu.georgetown.gucs.tokenizers.FileManglerTokenizer
Sets the random number generator to use with the manglers that are set in this tokenizer
setSplitter(String) - Method in class edu.georgetown.gucs.fingerprinter.Fingerprinter
Sets the splitter for this fingerprinter
setSplitter(String) - Method in class edu.georgetown.gucs.tokenizers.TokenizerList
Sets the splitter for this list
setStartByte(int) - Method in class edu.georgetown.gucs.fingerprinter.Fingerprint
 
setTerse() - Method in class edu.georgetown.gucs.fingerprinter.Fingerprinter
Sets this fingerprinter's digest XML output to only display file and fingerprint information
setTokenizers(TokenizerList) - Method in class edu.georgetown.gucs.dictionary.Dictionary
Loads the tokenizers to use for this dictionary from a TokenizerList object.
setTokenizers(List<String>) - Method in class edu.georgetown.gucs.dictionary.Dictionary
Sets the tokenizers used by this dictionary; only works if no tokenizers have already been set or if the given list contains the same tokenizers as those that have already been set
setTokenizers(TokenizerList) - Method in class edu.georgetown.gucs.fingerprinter.SdhashFingerprinter
Tokenizes a file based on a list of tokenizers that are passed to the function
setTokenizers(String) - Method in class edu.georgetown.gucs.fingerprinter.SdhashFingerprinter
Sets the tokenizer configuration for the input file
setVerbose() - Method in class edu.georgetown.gucs.fingerprinter.Fingerprinter
Sets this fingerprinter's digest XML output to display all available information
showBloomFilter - Variable in class edu.georgetown.gucs.configurations.FingerprintConfiguration
whether or not the Bloom filter exists
showBloomFilter - Variable in class edu.georgetown.gucs.fingerprinter.Fingerprinter
whether or not the Bloom filter exists
showDataSource - Variable in class edu.georgetown.gucs.configurations.FingerprintConfiguration
whether to store data source information in the output for the fingerprint
showDataSource - Variable in class edu.georgetown.gucs.fingerprinter.Fingerprinter
whether to store data source information in the output for the fingerprint
showDictionary - Variable in class edu.georgetown.gucs.configurations.FingerprintConfiguration
whether to store the full dictionary used for the fingerprint in the output
showDictionary - Variable in class edu.georgetown.gucs.fingerprinter.Fingerprinter
whether to store the full dictionary used for the fingerprint in the output
showDigest - Variable in class edu.georgetown.gucs.configurations.FingerprintConfiguration
whether to store digest information in the output for the fingerprint
showDigest - Variable in class edu.georgetown.gucs.fingerprinter.Fingerprinter
whether to store digest information in the output for the fingerprint
size() - Method in class edu.georgetown.gucs.bloomfilter.LongBitSet
Returns the number of bits of space actually in use by this BitSet to represent bit values.
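The wording of this size() description matches java.util.BitSet.size(), which reports allocated space rather than the number of set bits. Assuming LongBitSet follows the same convention (this index does not confirm it), the difference can be sketched with the standard class; BitSetSizeSketch and its sample helper are made up for this example.

```java
import java.util.BitSet;

// Hypothetical example class using java.util.BitSet as a stand-in for LongBitSet.
final class BitSetSizeSketch {
    // Builds a bit set with room for at least 10 bits and sets bit 3.
    static BitSet sample() {
        BitSet bits = new BitSet(10);
        bits.set(3);
        return bits;
    }

    public static void main(String[] args) {
        BitSet bits = sample();
        System.out.println(bits.cardinality()); // number of bits set to true
        System.out.println(bits.length());      // index of the highest set bit plus one
        System.out.println(bits.size());        // allocated bits, implementation-dependent
    }
}
```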
size() - Method in class edu.georgetown.gucs.dictionary.Dictionary
Provides the number of tokens in this dictionary
size() - Method in class edu.georgetown.gucs.utility.AntlrBitSet
Provides the size of this bit set
size() - Method in class edu.georgetown.gucs.utility.FileLister
Provides the size of the list of all files
splitLines() - Method in class edu.georgetown.gucs.tokenizers.ParseTokenizers
 
Splitter - Class in edu.georgetown.gucs.tokenizers
Partitions a set of tokens into multiple pieces
Splitter(String, List<Token>) - Constructor for class edu.georgetown.gucs.tokenizers.Splitter
Constructor that takes a string indicating the type and percent for the splitter and a list of tokens to split
splitToken(Vector<Token>) - Method in class edu.georgetown.gucs.tokenizers.FileManglerTokenizer
 
start_byte() - Method in class edu.georgetown.gucs.utility.Token
Provides the starting byte location of this token
stdOutDictionaryXML() - Method in class edu.georgetown.gucs.dictionary.Dictionary
Writes this dictionary to standard output
StripMarkupTokenizer - Class in edu.georgetown.gucs.tokenizers
Eliminates tokens nested inside markup language tags; assumes that tokens have been split by line rather than using whitespace
StripMarkupTokenizer(String) - Constructor for class edu.georgetown.gucs.tokenizers.StripMarkupTokenizer
Constructor that specifies whether to keep tokens nested inside script tags; the default is to eliminate these tokens
StripPunctuationTokenizer - Class in edu.georgetown.gucs.tokenizers
Separates tokens based on punctuation and removes punctuation from tokens
StripPunctuationTokenizer() - Constructor for class edu.georgetown.gucs.tokenizers.StripPunctuationTokenizer
 
subset(AntlrBitSet) - Method in class edu.georgetown.gucs.utility.AntlrBitSet
Returns whether this bitset is contained within the given AntlrBitSet
subtractInPlace(AntlrBitSet) - Method in class edu.georgetown.gucs.utility.AntlrBitSet
Subtracts the elements of the given antlrBitSet from this bitset in-place (turn off all bits of this bitset that are in the given antlrBitSet)
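The subset test and in-place subtraction described above can both be expressed with the and-not operation. The sketch below uses java.util.BitSet as a stand-in; it is not the AntlrBitSet implementation, and the class name BitSetOpsSketch and its helpers are made up for this example.

```java
import java.util.BitSet;

// Hypothetical example class using java.util.BitSet as a stand-in for AntlrBitSet.
final class BitSetOpsSketch {
    // Builds a bit set with the given indexes turned on.
    static BitSet of(int... indexes) {
        BitSet bits = new BitSet();
        for (int index : indexes) {
            bits.set(index);
        }
        return bits;
    }

    // True when every bit set in inner is also set in outer.
    static boolean isSubset(BitSet inner, BitSet outer) {
        BitSet leftover = (BitSet) inner.clone();
        leftover.andNot(outer);
        return leftover.isEmpty();
    }

    // Turns off every bit of target that is set in other; modifies target.
    static void subtractInPlace(BitSet target, BitSet other) {
        target.andNot(other);
    }

    public static void main(String[] args) {
        BitSet all = of(1, 2, 3);
        System.out.println(isSubset(of(1, 3), all)); // every bit of {1, 3} is in {1, 2, 3}
        subtractInPlace(all, of(2, 3));
        System.out.println(all);                     // only bit 1 remains
    }
}
```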
summaryReduce(Vector<Token>) - Method in class edu.georgetown.gucs.tokenizers.FileManglerTokenizer
 
systemID - Variable in class edu.georgetown.gucs.configurations.FingerprintConfiguration
the fully qualified domain name for the local host IP address
systemID - Variable in class edu.georgetown.gucs.fingerprinter.Fingerprinter
the fully qualified domain name for the local host IP address