Tuesday, March 8, 2016

A technical scheme for "watermarking" intrusions

A Sample Scenario

A commercial security company finds a trojan on one of the servers used by Turkey Point Nuclear Generating Station. None of the management machinery is compromised (in fact, it is not even computerized), but the server both holds sensitive information and conducts other sensitive operations. An analysis by a point team deployed from a National Lab indicates that, had those operations been compromised, there was a possibility of power loss from Turkey Point, although not a nuclear release.

What our policy-makers do in this event is often dictated by whether they know, for a fact, that the trojan was placed there by a participating nation state following acceptable norms, or whether it is potentially the work of a rogue nation or criminal group. In the future, that distinction will sometimes be the difference between "evacuate large cities" and "clean up and forget about it". The technical and political protocols in this post are a first-draft attempt at an initial, reasonable step towards a solution.

Some other solutions

One other major idea people want to implement is, of course, "no go zones" for intrusion. This is harder than it looks. Most important systems are dual use: collecting intelligence about a power plant is indistinguishable from being in a position to DoS it. So we fall back to a norm of "taking all due care" when on a sensitive system, which is nearly impossible to audit or manage. For this and other reasons not stated here, we recommend instead that a system of "anonymous Red Phones" be set up.

The Value of Multilateral as Opposed to Bilateral Norms

Assuming you have a perfect way to do the watermarking as described below, if the Norms Group has only two members, detecting the watermark provides attribution: a member that finds a watermark it did not place knows the other member placed it. Therefore having many members in the Norms Group is ideal.

Likewise, a group-level anonymous "red phone" can allow for back and forth over a contested issue without running into the attribution issue.

Real World Use Cases

This "international incident" related to the war in Ukraine was in fact, nothing of the sort.

A basic background in watermarking

Every watermarking specialist has spent hours looking at this image and can't even see it anymore, just patterns of high and low frequency data.

Steganography and Watermarking are very similar, but watermarking has one clear major difference, especially when used, as most people want to, to fingerprint movie files or images so you can tell which customer you sent them to.

This is the normal conceptual format for watermarking, which is how visually inspectable watermarks work. This works fine as the smart people at BangBus know very well.

But customers hate seeing watermarks all over the place and of course, visual watermarks are subject to tampering and removal using that advanced "crop" tool in Windows Paint (or more sophisticated techniques I won't go into). So what super smart PhD people do is an invisible watermark, using this basic format:

And lots of people do really good mathematical work making watermarks (all of which boils down to hiding information in the hair and feathers of Lena). After several pages of math doing statistical modeling, you can add your watermark to compressed data and still remain basically invisible to the human eye, while still being recoverable after display or just from the compressed data stream itself. You'll note that all schemes like this avoid using EXIF or other tag data parts of the image format, because you can't get a PhD by using the obvious solution. However, they have one simple problem, which is that they all fail in the exact same way:

This is just how information theory works, and no amount of PhDing can solve it, in my opinion (and hopefully in yours). For this reason, invisible watermarks historically only work when the world does not know you are doing them. We are not so lucky in our goals (dire music goes here).

What are our design goals and constraints

One key thing is that we don't need to watermark software in particular, but intrusions in general. And intrusions are large, complex things. Some of them involve exploits, some involve software implants ("Trojans") and some involve hardware implants. Many involve all three, and each of those three components has many sub-components. We expect the skilled intrusion detection team to have access to all of them when they conduct their analysis, but not necessarily immediately.

Significant intrusions get analyzed by teams of experts when they are discovered. But of course, signs of intrusions are being looked for by automated systems all the time. Our goal is to create a system that is detectable by a team of experts, but not by an automated system. An extremely robust system will be detectable just from an incident response report, without any access to the raw intrusion data at all, which has some political advantages.

Our particular technical options

The simplest way to indicate that we are a "nation-state" and not a "criminal group" is to share a private key and cryptographically sign a random data blob within as many sections of your intrusion chain as possible. The more places you sign, the more likely you are to be "validatable" by the Nation State Incident Response team (which also has the private key).
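As a rough sketch of what "sign a random blob" could look like, here is a toy version. One loud assumption: Python's standard library has no public-key signatures, so this swaps in an HMAC-SHA256 tag under a shared secret in place of a signature made with the shared private key. Since every group member holds the same key either way, the validation step has the same shape. `GROUP_KEY`, `make_mark` and `check_mark` are hypothetical names invented for this illustration, and the blob size is arbitrary.

```python
import hashlib
import hmac
import os

# Hypothetical shared group secret; stands in for the shared private key.
GROUP_KEY = b"example-group-key-not-a-real-one"

BLOB_LEN = 16  # size of the random blob in bytes (arbitrary choice)
TAG_LEN = 32   # HMAC-SHA256 output size in bytes


def make_mark(key: bytes = GROUP_KEY) -> bytes:
    """Return random blob || tag, to be embedded in one stage of the chain."""
    blob = os.urandom(BLOB_LEN)
    tag = hmac.new(key, blob, hashlib.sha256).digest()
    return blob + tag


def check_mark(mark: bytes, key: bytes = GROUP_KEY) -> bool:
    """Validate a candidate mark recovered during incident response."""
    if len(mark) != BLOB_LEN + TAG_LEN:
        return False
    blob, tag = mark[:BLOB_LEN], mark[BLOB_LEN:]
    expected = hmac.new(key, blob, hashlib.sha256).digest()
    return hmac.compare_digest(tag, expected)


mark = make_mark()
print(check_mark(mark))                    # holder of the group key validates it
print(check_mark(mark, key=b"wrong key"))  # anyone without the key cannot
```

Embedding a fresh mark in each stage (exploit, implant, and so on) is what gives the response team several independent chances to validate.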

Of course, to "hide" this from automated detection techniques, you could make both the Blob and the signature something computed by code that, in some cases, is never even run during normal use.

Encryption routines are common inside implants. Likewise, most implants gather data from the machines they are installed on, for use as a key to encryption routines. This is valuable to them because it makes incident response harder (even INNUENDO does this). This is valuable to us because it means that a signature cannot be stolen from one trojan, and added to another trojan on a different machine.
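A minimal sketch of that host binding, under stated assumptions: the host fingerprint here is just a hash of a hostname and a MAC address (a hypothetical choice, a real implant would gather whatever it already gathers), and an HMAC-SHA256 tag under a shared secret stands in for the signature. All names are invented for illustration.

```python
import hashlib
import hmac

# Hypothetical shared group secret; stands in for the shared private key.
GROUP_KEY = b"example-group-key-not-a-real-one"


def host_fingerprint(hostname: str, mac_address: str) -> bytes:
    """Hash a few host attributes (hypothetical selection) into 32 bytes."""
    material = f"{hostname}|{mac_address}".encode("utf-8")
    return hashlib.sha256(material).digest()


def host_bound_tag(blob: bytes, fingerprint: bytes, key: bytes = GROUP_KEY) -> bytes:
    """Tag covers the fixed-length host fingerprint followed by the blob."""
    return hmac.new(key, fingerprint + blob, hashlib.sha256).digest()


def verify_on_host(blob: bytes, tag: bytes, fingerprint: bytes,
                   key: bytes = GROUP_KEY) -> bool:
    """Recompute the tag from the host being examined and compare."""
    return hmac.compare_digest(tag, host_bound_tag(blob, fingerprint, key))


victim_a = host_fingerprint("plant-server-01", "00:11:22:33:44:55")
victim_b = host_fingerprint("other-server-07", "66:77:88:99:aa:bb")

blob = b"blob carried by the implant"
tag = host_bound_tag(blob, victim_a)

print(verify_on_host(blob, tag, victim_a))  # checks out on the original host
print(verify_on_host(blob, tag, victim_b))  # a transplanted copy does not
```

Because the tag depends on data from the machine itself, copying the blob and tag into a different trojan on a different machine produces a mark that no longer validates.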

Imagine an even tinier protocol, where you simply decide on a large set of 32-bit numbers, and if you see any three of them inside the analysis of a trojan, that trojan is part of the Group. There is plenty of cover data to hide these numbers in. They could be register values computed during an initialization operation, for example. Or even text included within the program as a "forgotten debug variable". This kind of protocol would be more vulnerable to a theoretical "automated detection", but is more resistant to other kinds of analysis (no way to steal a signature if you can't figure out what part of the code is the signature). Likewise, this scheme operates without needing additional information from the host systems. Another benefit to this kind of scheme is that it is applicable to "data in motion" (aka, Exploits) as well as data at rest.
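The "any three numbers" check is simple enough to sketch directly. The marker values, the little-endian encoding, and the names below are all assumptions made up for this example; a real set would be much larger, and the numbers might only show up as computed register values rather than as raw bytes.

```python
import struct

# Hypothetical agreed set of 32-bit marker values (a real set would be large).
GROUP_MARKERS = frozenset({
    0x3A91C4E7, 0x7F02B6D5, 0xC85E1A39, 0x15D7F08B, 0xE2649ACD,
})
THRESHOLD = 3  # distinct markers required, per the "any three" rule


def markers_found(data: bytes, markers: frozenset = GROUP_MARKERS) -> set:
    """Return the distinct markers present as little-endian 32-bit words,
    checking every byte offset since the values need not be aligned."""
    found = set()
    for offset in range(len(data) - 3):
        (word,) = struct.unpack_from("<I", data, offset)
        if word in markers:
            found.add(word)
    return found


def is_group_member(data: bytes, threshold: int = THRESHOLD) -> bool:
    """True if at least `threshold` distinct markers appear in the data."""
    return len(markers_found(data)) >= threshold


filler = b"\x90" * 5  # cover bytes, deliberately an unaligned length
sample = b"".join(
    filler + struct.pack("<I", m)
    for m in (0x3A91C4E7, 0xC85E1A39, 0xE2649ACD)
)
print(is_group_member(sample))
```

Requiring several distinct matches is what keeps chance hits down: a single random 32-bit word lands in a set of size n with probability n / 2^32, so one match in a large binary proves little, while three distinct ones are far less likely to be an accident.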

The end result may be a multi-layered scheme, with each layer operating at a different level of confidence and security.

In the end you get a "uniform" for your intrusion efforts, but one that has camouflage and is not transferable to criminal groups.

Social Protocol Design

In addition to a technical design, we also need to decide when and how things such as keys will be distributed, what does a revocation look like, what does a "challenge" look like in case you think someone is overstepping the acceptable norms, or failing to sign their work, etc. I will leave all thoughts of this to another paper as it largely depends on having a working technical solution first. But this protocol will initially offer at least the possibility of an anonymous group "Red Phone" to avoid crisis when we need it most. A worthy goal?
