public class BitVectorFingerprinter extends Fingerprinter
Fields inherited from class Fingerprinter: base64BloomFilter, base64Fingerprints, bloomFilter, bloomFilterSize, byteRun, creatingProgram, creator, dictionary, diskImage, fileName, fingerprintName, GUID, manglerName, manglerOn, showBloomFilter, showDataSource, showDictionary, showDigest, systemID, targetFile, unknownTokens, version, volume
Constructor | Description |
---|---|
BitVectorFingerprinter() | Constructor that generates the fingerprint name, version, unique identifier (GUID), system identifier, and creating program. |
BitVectorFingerprinter(java.lang.String dictionaryFilename) | Constructor that loads a dictionary and its tokenizers. |
BitVectorFingerprinter(java.lang.String fingerprintFilename, java.lang.String dictionaryFilename) | Loads a previously generated fingerprinter digest from fingerprinter and dictionary XML files. |
Modifier and Type | Method | Description |
---|---|---|
java.util.List<Fingerprint> | computeFingerprint(java.lang.String filename) | Computes a byte array fingerprint indicating the presence or absence of each token in this dictionary. |
java.util.List<Fingerprint> | computeFingerprint(TokenizerList tokenizer, java.lang.String str) | Computes a byte array fingerprint indicating the presence or absence of each token in this dictionary. Does not read from a file; creates the fingerprints from the string and the tokenizers passed in. |
java.lang.String | toString() | A string representation of the BitVectorFingerprinter. |
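The idea of a fingerprint that records the presence or absence of each dictionary token can be illustrated with a small standalone sketch. It does not use this library: the class `PresenceVectorSketch`, the method `presenceVector`, the word-splitting rule, and the sample dictionary are all assumptions made for illustration, and the real class works through a `TokenizerList` and returns `Fingerprint` objects.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration; not part of the documented library.
public class PresenceVectorSketch {

    // Returns a BitSet in which bit i is set when dictionary.get(i) occurs in the input.
    public static BitSet presenceVector(List<String> dictionary, String input) {
        // Assumption: tokens are lowercase words separated by non-word characters.
        Set<String> seen = new HashSet<>(Arrays.asList(input.toLowerCase().split("\\W+")));
        BitSet bits = new BitSet(dictionary.size());
        for (int i = 0; i < dictionary.size(); i++) {
            if (seen.contains(dictionary.get(i))) {
                bits.set(i);
            }
        }
        return bits;
    }

    public static void main(String[] args) {
        List<String> dictionary = Arrays.asList("alpha", "beta", "gamma", "delta");
        BitSet bits = presenceVector(dictionary, "Gamma rays and alpha particles");
        System.out.println(bits); // prints {0, 2}
    }
}
```

Bit positions follow dictionary order, so two fingerprints are only comparable when they were computed against the same dictionary.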
Methods inherited from class Fingerprinter: addBloomFilter, computeFingerprintXML, generateCreatingProgram, generateXML, generateXML, getBase64Fingerprints, getDictionary, getFileName, getFingerprintName, getMangler, outputFields, setDictionary, setDictionary, setMangler, setMangler, setManglerRNG, setOutput, setSplitter, setTerse, setVerbose
public BitVectorFingerprinter()

public BitVectorFingerprinter(java.lang.String dictionaryFilename)
dictionaryFilename - the filename containing the dictionary to use for this fingerprinter

public BitVectorFingerprinter(java.lang.String fingerprintFilename, java.lang.String dictionaryFilename)
fingerprintFilename - the string name of the XML file containing the fingerprinter's digest
dictionaryFilename - the filename containing the dictionary to use for this fingerprinter

public java.util.List<Fingerprint> computeFingerprint(java.lang.String filename)
computeFingerprint in class Fingerprinter
filename - the string filename of the document to fingerprint

public java.util.List<Fingerprint> computeFingerprint(TokenizerList tokenizer, java.lang.String str)
computeFingerprint in class Fingerprinter
tokenizer - the TokenizerList that will be used to tokenize the input
str - the string that will be converted to a fingerprint

public java.lang.String toString()
toString in class java.lang.Object
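The inherited `base64Fingerprints` field and `getBase64Fingerprints` method suggest that fingerprints are serialized as Base64 text, although this page does not describe the encoding. The sketch below shows one way a bit vector could be packed into bytes and encoded using only the JDK. The class `FingerprintEncodingSketch` and its methods are hypothetical, and the bit order is whatever `BitSet.toByteArray` produces, which may differ from what the library actually writes.

```java
import java.util.Base64;
import java.util.BitSet;

// Hypothetical illustration; not part of the documented library.
public class FingerprintEncodingSketch {

    // Packs the bits into bytes (little-endian bit order, as defined by
    // BitSet.toByteArray) and encodes the bytes as Base64 text.
    public static String toBase64(BitSet bits) {
        return Base64.getEncoder().encodeToString(bits.toByteArray());
    }

    // Reverses toBase64: decodes the text and rebuilds the bit vector.
    public static BitSet fromBase64(String text) {
        return BitSet.valueOf(Base64.getDecoder().decode(text));
    }

    public static void main(String[] args) {
        BitSet bits = new BitSet();
        bits.set(0);
        bits.set(2);
        String encoded = toBase64(bits);
        System.out.println(encoded);                          // prints BQ==
        System.out.println(fromBase64(encoded).equals(bits)); // prints true
    }
}
```

`BitSet.toByteArray` drops trailing zero bytes, so the encoded text does not record the vector length. In this sketch the dictionary size would have to be stored separately for a decoder to know how many tokens the vector covers.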