HellaSec joins DARPA’s Cyber Fast Track program


I am very excited to announce that DARPA has awarded funding to HellaSec, my security R&D side business, through the Cyber Fast Track program. I’ll be working with a fellow security researcher to develop novel security technology, which we plan to open source at the conclusion of the program.

In a future blog post I will describe the technology we’re building. For this post though I want to build some context by discussing the Cyber Fast Track program and why it’s important.

What is DARPA?

To understand the Cyber Fast Track you first need to understand DARPA — the United States’ Defense Advanced Research Projects Agency. DARPA hires visionary scientists and engineers to pursue their own big ideas to benefit national security. By charter, DARPA goes after the high-risk, high-payoff R&D that other agencies avoid.

Among many fascinating projects, DARPA has led the development of the world’s fastest legged robot, the first stealth airplane, and the original Internet. Every research program is an outlandish experiment. Can we build a running robot faster than most humans? Can we hide from radar? Can we build a network capable of joining all other computer networks?

What is the Cyber Fast Track program?

In 2010 Mudge, the famed hacker of L0pht and cDc, joined DARPA as a program manager. What is his big idea?

Rather than proposing a specific new security technology or technique, Mudge’s big idea is to experiment with the R&D process itself. Leave it to a hacker to be so meta.

Mudge created the Cyber Fast Track program to fund lone hackers and small, agile security firms like my own. The hope is that these smaller efforts can rapidly advance the United States’ security capabilities while “leapfrogging” over the expensive, slow research efforts led by the traditional pool of large corporations.

I buy into Mudge’s vision and I am excited to be a part of his program. If all goes according to plan, HellaSec will contribute to the success of the experiment — paving the way for new mechanisms for the United States to keep pace with the security world.

Stay tuned

If you would like to keep posted, follow me on Twitter @MichaelNGagnon.


The Cycle of Gaining and Losing Knowledge


“There are known knowns; there are things we know we know. We also know there are known unknowns; that is to say we know there are some things we do not know. But there are also unknown unknowns – there are things we do not know we don’t know.” — Donald Rumsfeld

In a press briefing a little over 10 years ago, Rumsfeld popularized the concept of known knowns, known unknowns, and unknown unknowns. I hope my figure below communicates the problem with “unknown unknowns.”

[Figure: the universe of knowledge divided into a green slice of “known knowns,” a blue slice of “known unknowns,” and a red slice of “unknown unknowns”]

In general you want the small green slice of “known knowns” to be as big as possible; this represents your knowledge of the universe. For example,

I know that I know how to program in the Ruby programming language.

You also want the blue slice of “known unknowns” to be as big as possible. Once you are aware of your ignorance of a particular thing, you can decide to educate yourself (if the benefit of learning outweighs the cost of learning).

I didn’t always know how to program in Ruby, but I knew Ruby existed and had an idea of its value proposition.

Moving from “known unknowns” to “known knowns” is therefore relatively straightforward. However, that big red slice of “unknown unknowns” is really scary. You have no idea what’s in there and you have no idea how big it is.  You can’t really decide on your own to move from “unknown unknown” into a “known unknown,” because, well — you don’t even know what knowledge is in that category.

The best you can do with “unknown unknowns” is be aware that the category exists and maintain an open mind. That way, when information presents itself to you, you can consciously recognize that it was an “unknown unknown,” and it then moves into either the “known knowns” or the “known unknowns” category.

I didn’t always know that Ruby existed. One day someone told me about it, and I became aware of its value proposition.

These transitions of knowledge are presented in the figure below.

[Figure: the transitions between the three knowledge categories, drawn as green, blue, and red arrows]

I’ve already explained the blue and green arrows, but what the hell is that red arrow? The point of this blog post is to describe the arrow from “known knowns” to “unknown unknowns”: the transition from knowledge back to ignorance.

Becoming ignorant again

It’s often a long road becoming an expert on a topic.

I have been programming for 17 years. I’ve programmed in Ruby for several years and I know Ruby pretty well now.

Becoming an expert was hard. I remember what it was like to be a beginner — the perplexing error messages and the mystery of source code.

But today some information came my way that made me realize I don’t actually know what it’s like to be a beginner. I thought I knew what it was like; in actuality though, I didn’t know that I didn’t know what it was like to be a beginner.

I was reading an article in Slate about a journalist’s efforts to teach herself programming in Ruby. This is the line that shocked this piece of knowledge out of the “unknown unknown” category.

“I started off by saving [my first Ruby program] as a Word document” — Annie Lowrey

I never would have imagined someone would try that. From 17 years of experience, I know that you need to save your programs as plain text files, but I had no idea that a beginner wouldn’t know that.

Even worse than being ignorant, I had been deceiving myself into thinking that I remembered what it was like to be a beginner.

I do remember bits and pieces of learning to program, which fooled me into thinking I remembered what it was like to be a beginner. For instance, I remember that when I first started programming, the error messages overwhelmed me. Perhaps the only reason these memories stand out is that sometimes they still overwhelm me. Maybe I forgot about the text-file issue because it was one particular lesson I learned 17 years ago; once I learned it, I moved on and never thought about it again.

I can’t relate to what it’s like to be a beginner programmer. But at least now, I know that I don’t know.


The scientific method is not applicable to computer science (a patently false statement)


While I was a student in Margo Seltzer’s research-based Operating Systems class, she challenged each of us to write a one-paragraph essay arguing a “patently false statement.” It was an exercise in persuasive writing. Here is mine:

The scientific method is not applicable to computer science

Science only works because scientists share a common philosophical belief—the belief that the laws of the physical universe remain constant.  However, computers are man-made systems, whose laws remain at the whims of engineers and users.  This point is particularly emphasized by the failures of computer security: malicious users are very often capable of breaking the laws of computer systems.  Because the laws vary from one universe (computer system) to another, one cannot safely generalize the results of computer-science experiments.  Scientific induction simply does not apply and, ultimately, the scientific method is fundamentally not applicable to computer science.

What do you think?


What the flock?


Today, through the acquisition of my company Dasient, I have joined the flock at Twitter (they love bird metaphors). I am a proud new member of Twitter’s revenue-security team, and I’m excited!

Until I heard about Twitter’s interest in acquiring Dasient, I was only a light Twitter user. I didn’t really get it. And when I talk to some of my friends about Twitter, they don’t get it either. Let me share what I learned about Twitter that enlightened me.

Twitter helps me find and follow information I can’t get any other way

Just by looking at my Tweets, you can’t tell I’m a power user, because I don’t Tweet much. But I am a power user because I use Twitter to access information I can’t get any other way. Here are two personal examples of how Twitter helped me within the past couple of weeks:

(1) ATM outage

Last week at Bank of America every ATM was out of order and the bank lobby was closed. Strange.

In the days before smartphones I would normally drive around town looking for another ATM. But now I have a smartphone and understand the value of Twitter. Searching for “bank of america outage” on Twitter revealed 4 tweets from the past hour reporting Bank of America ATM outages all across the United States.

[Screenshot: Bank of America outage reports in Twitter search results]

No need to look for another ATM; the entire system was down.

Now that I get Twitter, I know how to find that kind of real-time information. But what would I have done if I hadn’t been a Twitter user? When I got back to the office, I did a little experiment. I tried looking up the ATM outage on Google. I searched the news, blogs, BofA’s own site, and the Web at large—and found nothing.

Twitter was the only medium that had the real-time information I needed.

(2) Algorithmic-complexity attacks in the wild

The primary topic of my graduate research at Harvard and MIT Lincoln Laboratory was algorithmic-complexity attacks (AC attacks), a particular type of denial-of-service (DoS) attack, which I have blogged about before.

While surveying the literature, I searched far and wide for public discussions of these attacks occurring in the wild. I found nothing. I felt sure they actually happened in the wild, because of their appealing features from an attacker’s perspective. I just couldn’t find any discussions. In my research paper describing my novel defense against such attacks, I couldn’t mention the occurrence of any real-world AC attacks in the wild because I was unaware of any.

My negative experience researching AC attacks without Twitter sharply contrasts with my recent positive experience researching AC attacks with Twitter. Specifically, several AC vulnerabilities were recently announced on the web. These particular vulnerabilities represent a subgenre of AC vulnerabilities known as HashDoS vulnerabilities. Discussions of the vulnerabilities emerged on Twitter using the #hashDoS hashtag. There is even a @HashDoS account on Twitter, which moderates and promotes such discussions.

Within minutes of beginning my search on Twitter, I found the gem I’d been searching for for years: a report of an AC attack against StackOverflow.com by an employee.

[Screenshot: tweet reporting an AC attack against StackOverflow.com]

Keep in mind: I am an expert on AC attacks, I am a Google power user, and I am a newcomer to Twitter.

To top it all off, now that I know who the other researchers are in this area, I don’t have to search anymore. I just follow them and their insights get pushed to me automatically.

Flock

I think most people are like me; they consume information more than they produce it. Consequently, thinking of Twitter as just a microblogging platform grossly understates its value. If you are not already, become a Twitter user and have the world’s real-time status at your fingertips. Join the flock and follow me @MichaelNGagnon.


The only brain teaser I use in interviews


Interviewing is a controversial topic in Silicon Valley. Should you attract and filter talent using puzzles? Should interviewers avoid puzzles and brain teasers altogether? Of course I don’t have the answer. But I have an opinion.

I always use exactly one puzzle when I interview candidates for tech positions (such as software engineering and security R&D). I believe the ability to solve this puzzle strongly indicates that the candidate thinks scientifically, which is very important in tech positions. I rarely recommend moving forward with a candidate unless he or she passes this puzzle.

The puzzle tests the candidate for a specific cognitive bias and works like this:

I will give you a sequence of three numbers (a triple). This triple adheres to a particular rule. Your goal is to figure out what this rule is. You will figure out what the rule is by proposing new triples to me. For each triple you give me, I will respond by telling you whether that triple adheres to the rule, or breaks the rule. You can give me as many triples as you like, but you only get one chance to guess the rule.

An example triple that follows the rule is 2, 4, 6 ✓. (The check mark indicates that the triple follows the rule.)

Confirmation bias

This puzzle tests candidates for their susceptibility to the confirmation bias.

When given the puzzle, people tend to develop an initial hypothesis such as “ascending even numbers” (since it seems to nicely generalize the example triple 2, 4, 6 ✓). Once candidates have an initial hypothesis, they can go two routes in their investigation: confirmation and falsification.

Confirmation

Scientists have studied the confirmation bias using this exact puzzle, and the research has shown that the average Joe tends to confirm hypotheses, rather than attempt to refute them. For example, if your initial hypothesis is “ascending even numbers” you could confirm your hypothesis by giving triples that follow the rule, such as:

  • 4, 6, 8 ✓
  • 108, 116, 140 ✓
  • -8, -6, -4 ✓

In the particular puzzle that I give, all of the above triples follow the rule. And if you fall victim to the confirmation bias, you might convince yourself that your hypothesis is correct based on the above evidence.

But in fact, the correct rule is not “ascending even numbers.”

So how do you solve the puzzle?

Falsification

To correctly solve the puzzle you must attempt to falsify your hypothesis (in addition to confirming it). For example, if your initial hypothesis is “ascending even numbers” you could refute your hypothesis by giving triples that attempt to break the rule, such as:

  • 6, 4, 2 ✗
  • 3, 2, 1 ✗
  • 1, 2, 3 ✓

In the above cases the last triple (1, 2, 3) follows the rule, while the first two break the rule. This evidence refutes the hypothesis (because if the hypothesis were true, 1, 2, 3 would break the rule).

Once you’ve falsified a hypothesis, you need to develop a new hypothesis and repeat the process.

Solution

The correct rule is “ascending numbers.” However, guessing the correct rule is not enough to get credit for solving the puzzle. To solve the puzzle you must scientifically attempt to confirm and refute your hypotheses until you’ve gathered enough evidence to confidently guess the correct rule.
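If you want to try the puzzle mechanics yourself, here is a minimal Ruby sketch of the oracle. The oracle lambda and the example probes are my own illustration, not part of the interview script.

    # Oracle for this puzzle: the (secret) rule is simply "ascending numbers".
    oracle = ->(triple) { triple.each_cons(2).all? { |a, b| a < b } }

    # Probes that merely CONFIRM the hypothesis "ascending even numbers":
    [[4, 6, 8], [108, 116, 140], [-8, -6, -4]].each do |t|
      puts "#{t.inspect} follows the rule? #{oracle.call(t)}"   # all true
    end

    # Probes that try to FALSIFY that hypothesis:
    [[6, 4, 2], [3, 2, 1], [1, 2, 3]].each do |t|
      puts "#{t.inspect} follows the rule? #{oracle.call(t)}"   # false, false, true
    end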

Relevance

As I mentioned before, research has shown that most people fail to solve this puzzle. What then is the relevance of identifying whether or not someone can pass this puzzle? If they fail, you might argue “So what? Everyone has cognitive biases.”

To succeed in a technical job you must be able to think scientifically, and overcoming the confirmation bias is a cornerstone of scientific thinking.

For example, when debugging faulty software, you might be inclined to stop as soon as the faulty behavior disappears. But what you really need to do is figure out what is causing the faulty behavior. Don’t trick yourself into thinking you’ve figured it out until you’ve homed in on the problem by rigorously testing your hypotheses.

Or, for example, say you’re writing unit tests for your software. You carefully wrote the software under test and you’re feeling good that it’s correct. You might be tempted to take it easy on the unit tests, but you really must try as hard as you can to write unit tests that break your code.
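As a concrete illustration (hypothetical, not from an actual interview), here is what the falsification mindset looks like in a tiny Ruby unit test: alongside the happy-path test, the suite deliberately probes inputs chosen to break the code.

    require "minitest/autorun"

    # Toy function under test: is the array in strictly ascending order?
    def ascending?(numbers)
      numbers.each_cons(2).all? { |a, b| a < b }
    end

    class TestAscending < Minitest::Test
      # Confirming test: the happy path.
      def test_sorted_input
        assert ascending?([1, 2, 3])
      end

      # Falsifying tests: inputs chosen to break the code.
      def test_descending_input
        refute ascending?([3, 2, 1])
      end

      def test_duplicates
        refute ascending?([1, 1, 2])   # strict ascent: duplicates should fail
      end

      def test_empty_input
        assert ascending?([])          # edge case: is this the behavior we want?
      end
    end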

So there it is.  Use this puzzle and gain confidence that the people you hire think scientifically!


How to defend against algorithmic-complexity attacks


Algorithmic-complexity attacks have been all over the news lately:

  • Supercolliding a PHP array
  • Effective DoS attacks against Web Application Plattforms – #hashDos
  • oCERT Advisory: multiple implementations denial-of-service via hash algorithm collision

This type of attack is well known, causes significant service failures, and can even happen accidentally (such as when Wikipedia crashed while covering news of Michael Jackson’s death [1], [2]).

Ever since I discovered my first algorithmic-complexity vulnerability while in undergrad, I’ve kept an eye out for these vulnerabilities. They’re everywhere. I’ve even used an algorithmic-complexity attack to take down a radar during a live-fire ballistic-missile-defense exercise. Our servers, operating systems, programming languages, and best practices do not prevent them.
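To make the mechanics concrete, here is a small Ruby sketch (my own illustration, not taken from any of the advisories above) that simulates a hash-collision AC attack by giving every key the same hash value, degrading each Hash insert from roughly constant time to a linear scan of one giant bucket:

    require "benchmark"

    # A key class whose hash always collides, simulating an attacker who crafts
    # inputs that all land in the same hash bucket.
    class CollidingKey
      attr_reader :id

      def initialize(id)
        @id = id
      end

      def hash
        0                      # every key collides
      end

      def eql?(other)
        id == other.id         # forces a scan of the whole bucket on each insert
      end
    end

    n = 10_000
    normal = Benchmark.realtime { h = {}; n.times { |i| h[i] = true } }
    evil   = Benchmark.realtime { h = {}; n.times { |i| h[CollidingKey.new(i)] = true } }

    puts "normal keys:    %.3f s" % normal
    puts "colliding keys: %.3f s" % evil   # dramatically slower: ~n^2 comparisons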

So while I was in grad school I set out to change that by developing a method to probabilistically guarantee immunity against algorithmic complexity attacks.

Here’s the gist of how you can defend your software from alg-complexity attacks.

Provably Protecting Servers From Algorithmic Complexity Attacks

  1. Enforce a minimum inter-arrival time between incoming requests (you really only need to do this during an overload)
  2. When the system becomes overloaded, do not queue up incoming requests. Rather, kick out old requests that are using up too many resources.
  3. By assuming a probability distribution on the resource requirements of legitimate requests, this method can provide guarantees of the form: with probability at least X, at least Y legitimate requests will complete on time.
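Here is a highly simplified Ruby sketch of ideas (1) and (2). The class name, the 10 ms inter-arrival floor, the overload threshold, and the eviction policy are all illustrative assumptions of mine, not the actual mechanism described in the technical report linked below.

    # Toy admission controller: enforce a minimum inter-arrival time while
    # overloaded, and evict (rather than queue behind) the in-flight request
    # that has consumed the most resources.
    class AdmissionController
      MIN_INTERARRIVAL = 0.010   # seconds (assumed value)
      MAX_ACTIVE       = 64      # "overloaded" above this many in-flight requests

      def initialize
        @last_arrival = 0.0
        @active = {}             # request id => accumulated CPU seconds
      end

      # Returns true if the request is admitted; call with a monotonic timestamp.
      def admit?(request_id, now)
        if overloaded?
          # (1) enforce the inter-arrival floor only during overload
          return false if now - @last_arrival < MIN_INTERARRIVAL
          # (2) don't queue: kick out the in-flight request using the most CPU
          victim, _cost = @active.max_by { |_id, cpu| cpu }
          @active.delete(victim)
        end
        @last_arrival = now
        @active[request_id] = 0.0
        true
      end

      # The server reports CPU consumed by a request as it runs.
      def charge(request_id, cpu_seconds)
        @active[request_id] += cpu_seconds if @active.key?(request_id)
      end

      def finish(request_id)
        @active.delete(request_id)
      end

      private

      def overloaded?
        @active.size >= MAX_ACTIVE
      end
    end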

You can read the nitty gritty details in this technical report (including a mathematical proof of the guarantee as well as experimental results from attacking Mediawiki).

To finish the work, I’m planning on (1) implementing my technique as an enhancement to FastCGI, (2) finding a bunch of real-world vulnerabilities, and (3) running experiments that demonstrate the effectiveness of my defense against 0-day attacks. Anyone interested in collaborating?

Hit me up with an email (mikegagnon at gmail) or follow me on Twitter.


Spontaneous radial symmetry


Minutes into January 1, 2009 my wife’s family and I spontaneously jumped together for a group photo.  Several of us threw our hands in the air independently.

Amazingly, our arms made exact 45-degree angles.

[Photo: the original group photo]

As soon as I saw the photo I immediately announced it was going to be the best picture of 2009. I still stand by that statement. I also knew that some image manipulation with gimp would transform this photograph into fine art.

[Image: “Spontaneous Radial Symmetry,” the gimp-manipulated version]

I hope it doesn’t scare you.


Platform-independent micro-benchmarks are inherently limited


On Quora someone asked, are there any purportedly objective benchmark tests or comparisons of operating system efficiency?

Last year when I was in grad school I took Margo Seltzer’s research-based Operating Systems class. Part of my thesis for my class project was that the accuracy of platform-independent micro-benchmarks suffers inherent limitations (as opposed to platform-specific micro-benchmarks, which do not suffer the same limitations).

Micro-benchmarks are benchmarks that measure the performance of primitive operations, such as system calls or page faults.

Platform-independent micro-benchmarks are micro-benchmarks designed to run on multiple platforms.
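To make that concrete, here is roughly what a naive platform-independent micro-benchmark of a single primitive (the stat system call, via Ruby's File.stat) looks like. The iteration count and the wall-clock timing are my own assumptions, and they already hint at the accuracy problems discussed below.

    # Naive, "portable" micro-benchmark: average latency of the stat() system
    # call, measured with wall-clock timing around a tight loop.
    ITERATIONS = 100_000

    start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    ITERATIONS.times { File.stat("/tmp") }
    elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - start

    # Loop overhead, timer resolution, and caching effects are all folded into
    # the result -- exactly the kind of inaccuracy discussed in this post.
    puts "stat latency: %.2f microseconds" % (elapsed / ITERATIONS * 1_000_000)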

To prove this part of my thesis I analyzed lmbench, the most popular OS benchmark suite that uses platform-independent micro-benchmarks.

“lmbench is a suite of simple, portable, ANSI/C microbenchmarks for UNIX/POSIX. In general, it measures two key features: latency and bandwidth. lmbench is intended to give system developers insight into basic costs of key operations.” — from the lmbench SourceForge page

True to my thesis, the platform-independent aspect of lmbench’s design caused its benchmarks to suffer inherent limitations. These limitations can only be fixed by abandoning platform-independence and developing platform-specific micro-benchmarks. The results of my study illustrate why comparative OS benchmarking is hard and why the results are often untrustworthy, which leads to my response to the Quora question…

Are there any purportedly objective benchmark tests or comparisons of operating system efficiency?

It depends on what you want to benchmark.

There are two things (that I can think of) that you might want to benchmark:

  1. The effect of the OS on application performance, and
  2. The performance of operating-system primitives such as system calls.

(1) is easier to measure because you can re-run the same application benchmarks on multiple OSes and compare results. However, (2) is much more difficult because it requires portable micro-benchmarks, and developing portable micro-benchmarks for OSes is inherently hard.

Lmbench is probably the most popular micro-benchmark suite that benchmarks operating-system primitives. Despite its popularity, lmbench produces inaccurate results on modern operating systems, due to the inherent difficulty of developing portable micro-benchmarks.