Data Leak Prevention is set of technologies used to monitor information that is leaving corporate network, and taking actions in case sensitive information “leaves the building”. Detection of sensitive information is done usually using pattern matching with checksums (e.g. credit card numbers, which have quite a bit of redundant information for validation), content proximity heuristics (e.g. US social security numbers, which don’t have any redundant information for validation, and without taking content into equation can be recognized as Canadian driver license numbers), digital fingerprints and the combination of the above.
Birthday Attack is a way to meddle with digital signatures to greatly reduce key strength. This attack is based on (and takes its name from) the Birthday Paradox. It resorts to slightly tweaking (by inserting commas, reordering words and using synonyms) two documents – one which is considered to be correct, and one which is not – until both have same digital signature. Once digital signatures collide, one can use the “good” document (e.g. request for a day off) to generate digital signature, and then use “bad” document (e.g. approval for fund transfer) as legitimate. The Birthday Attack reduces key depth by two, effectively turning 64bit digital signatures int 32bit.
Returning to the DLP problem. Usually, corporations have some sort of legal banners, footers, headers and sidebars, which comprise all kinds of disclaimers, terms, conditions and threats. As these are usually not considered sensitive, good fingerprinting schemas will contain “black lists” for “sensitive” fingerprints, along with “white lists” for insensitive fingerprints.
If the potential “extruder” – the person who wants to get the information out from the organization is willing to get away with it – can slightly influence the white lists, he can use the Birthday Attack to mask sensitive bits as legalese, which provide plenty of opportunity for meddling by being inhumanely long and unreadable. Once fingerprint for the sensitive information collide with fingerprint for the legalese, the “extruder” can safely circumvent DLP system.
The DLP system, in order to prevent this kind of attack, should do at least one of following:
- Disallow meddling with white lists (keep lawyers on track)
- Use hash functions with sufficient key depth, so that keys will be cryptographically resistant to BA. For example, 2^64 is pretty strong key depth; this suggests that hash function with key depth of 128 will be used.
- Combine fingerprints with context information. Seeing one fingerprint from a banner in the middle of neutral text should set some bits on