Friday, November 17, 2006

What scanners can and can't find. Who cares and why does it matter?

What vulnerabilities (blackbox / whitebox) scanners can and can't find is one of the most important topics in web application security. Innovation in this area will inevitably determine the industry-accepted vulnerability assessment methodology. Online business depends on this problem being addressed with the right blend of coverage, ease-of-use, and price. For us vendors it’s a battleground that will determine which solutions ultimately succeed in the market. Competitors who do not adapt and push the technology limits will not be around long. I’ve seen this coming for a while. To the delight of many and the frustration of some, I’ve offered presentations, released articles, and written blog posts on the subject.

Since founding WhiteHat Security I’ve long believed that there was no way a scanner, built by me or anyone else, could identify anywhere close to all the vulnerabilities in all websites. For years I had no good way to explain or justify my position. It wasn’t until I read a fascinating Dr. Dobb's Journal article (The Halting Problem) from 2003 that I established the basis of my current understanding. To quote:

"None other than Alan Turing proved that a Turing Machine (and, thus, all the computers we're interested in these days) cannot decide whether a given program will halt or run continuously. By extension, no program can detect all errors or run-time faults in another program."

Brilliant! This article instantly sparked my interest in the halting problem, undecidable problems, and a bunch of other mathematical proofs. Taking what I had learned, I later introduced the “technical vulnerabilities” and “business logic flaws” terminology during a 2004 Black Hat conference. I guess people enjoyed the terminology because I frequently see others using it. Loosely described, technical vulnerabilities are those that can be found by scanners, while business logic flaws must be found by humans (experts).
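The distinction can be made concrete with a toy sketch (the handlers below are invented for illustration, not from any real application): a technical vulnerability has a mechanical signature, while a logic flaw is only wrong in context.

```python
# Technical vulnerability: reflected XSS. A scanner can find this
# mechanically -- inject a marker string, check if it comes back unescaped.
def greet(name):
    return "<p>Hello, " + name + "</p>"   # unescaped input: scanner-detectable

# Business logic flaw: nothing is "malformed" here, but a negative
# quantity turns a charge into a refund. Only someone who understands
# what an order *means* will flag this.
def order_total(price, quantity):
    return price * quantity               # quantity = -5 -> negative total

marker = "<script>probe()</script>"
assert marker in greet(marker)            # automated signature check fires
assert order_total(10, -5) < 0            # no signature; requires context
```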

What needs to be understood is that finding each vulnerability class is not exclusive to a single method of identification. Scanners and humans can in fact identify both technical and logical vulnerabilities. How effective they are is the real question. Observe the following diagram. (Don’t take the numbers too literally; the diagram is meant to reinforce concepts more than precise measurements.)


  1. Scanners are far more adept at finding the majority of technical vulnerabilities, mostly because the vast number of tests required to be exhaustive is too time-consuming for a human (expert).
  2. Humans (experts) are much better suited to finding business logic flaws. The issues are highly complex and require contextual understanding, which scanners (computers) lack.
  3. Neither scanners nor humans will likely (or provably) reach 100% vulnerability coverage. Software has bugs (vulnerabilities), and that will probably remain the case for a long time to come.
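Point 1 is really about combinatorics. A back-of-the-envelope sketch (the site size and payload counts below are made up for illustration) shows why exhaustive testing is a machine's job:

```python
# Even a modest site makes exhaustive manual testing impractical.
pages = 50                      # distinct URLs
params_per_page = 5             # injectable inputs per URL
payloads = 200                  # XSS/SQLi/etc. test strings per input

requests = pages * params_per_page * payloads
print(requests)                 # 50000 requests

# At ~30 seconds per manual test, that's weeks of nonstop work;
# a scanner fires the same requests in hours.
hours_manual = requests * 30 / 3600
print(round(hours_manual))      # ~417 hours
```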
The coverage scales will slide in different directions with each website encountered. A while back I posted some stats on how vulnerabilities are identified here at WhiteHat. Based on 100 websites, here are the findings.



These numbers are interesting on a variety of levels. As more people dive into web application security, we’ll inevitably see more measurements, reviews, and statistics released. The cloud of the unknown will lift and the most effective assessment methodology will reveal itself. I welcome this trend, as I think I'm on the right track. Brass tacks...

From what I've seen, malicious web attacks typically target websites on a one-by-one basis rather than a shotgun-blast approach. The bad guys aren’t using commercial scanners, performing full-blown assessments, or even open source tools for that matter. Frankly, because they don’t need to. A web browser and a single vulnerability are all they really need to profit. That’s why I’ve been harping on comprehensiveness and measuring the effectiveness of assessment methodologies for so long. Finding some of the vulnerabilities, some of the time, on some of the websites - ain’t going to cut it. You will get hacked this way. We need to find them all, all of the time, and as fast as possible.

My crystal ball (1-3 years):
1) Standalone black box scanners will transfer from the hands of security personnel to those in development and QA – they’ll merge with the white box scanners and finally tightly integrate inside established IDEs.
2) The one-off vulnerability assessment market (professional services) will give way to a managed service model, just as it already has in the network VA world.
3) Major industry consolidation will occur as customers look for a single security vendor that can address the entirety of their vulnerability stack.

9 comments:

Ory said...

Quote: "Finding some of vulnerabilities, some of the time, on some of the websites - ain’t going to cut it. You will get hacked this way. We need to find them all, all of the time, and as fast as possible."

Jeremiah, this is a very interesting post. I have one small comment on the quote above -

IMHO, security is all about adding more and more layers of protection. There is no silver bullet solution. Your quote above reminds me of people who say that if you want to really secure a machine, then don't connect it to the internet. While this is true, the real world is not black and white, there are some shades of gray in between.

Anonymous said...

specifically regarding that same quote, i'd like to see metrics showing time-to-automated-exploit vs. remediation (per vulnerability, per criticality).

knowing your enemy (and their tools) is popular in the network VA world, but not as common in webappsec yet. notable exceptions are PHP HoP, honeytrap, and honeyclients.

a scanner can't get into the mind of an attacker. but an attacker can analyze and circumvent a scanner's checks. this is another area where the "creative process" stands to remain dominant.

the problem with all VA is that it rarely incorporates penetration testing, especially automated. i would argue that ethical hacking is weak and less creative than blackhat motivations and resources. for example: everyone wants to model the network/application (karalon, redseal), but rarely does anyone model a realistic botnet (trend intercloud, shadowserver, and simplicita do provide anti-botnet services for ISP's).

going back to my first paragraph, "automated vulnerability management" is another point you can add to the crystal ball. as attacker tools and vulnerability scanners become more sophisticated and automated, so should the remediation process. and i'm not just talking about patch management.

i could see vuln mgmt as a line item for a CMS feature. websites need control over which portions of their websites run unsafe or unknown (and changing) code vs. known, tested, and static code. minimizing the complexity will not only help when performing VA (blackbox, whitebox, automated, or not), but it also provides a path to future browser security features (e.g. httpOnly).

and webappsec vuln mgmt is not only about your HTML, JS, and SQL. certainly browsers should be listed (or sampled) as assets in a company, even a pure hosted web app. maybe server-side NAC holds a future for the DOA technology. if phishers can tell any given user's security posture, why can't a banking website?

Jeremiah Grossman said...

Hey Ory,

> Jeremiah, this is a very interesting post.

Thank you, I appreciate that.

> IMHO, security is all about adding more and more layers of protection. There is no silver bullet solution.

You're right, absolutely right. Unfortunately much of the webappsec industry, and even I from time to time, fall into the one-solution-fits-all bucket. Defense in depth needs to be embraced as webappsec matures. And you gave me a good idea for a future post.

Also, you should really consider starting a blog.

Jeremiah Grossman said...

> specifically regarding that same quote, i'd like to see metrics showing time-to-automated-exploit vs. remediation (per vulnerability, per criticality).

I'm not sure exactly what you mean. Can you clarify this a bit or maybe give an example?

> knowing your enemy (and their tools) is popular in the network VA world, but not as common in webappsec yet.

True, we really need this data. Unfortunately we don't have access to it yet.

> even a pure hosted web app. maybe server-side NAC holds a future for the DOA technology.

This part of the solution stack will have to come; it's just not quite there yet.

Thanks for commenting!

Anonymous said...

>> specifically regarding that same quote, i'd like to see metrics showing time-to-automated-exploit vs. remediation (per vulnerability, per criticality).
>
> I'm not sure exactly what you mean. Can you clarify this a bit or maybe give an example?

for example, let's say it takes one hour to create an exploit, and 1.5 days for 80% of the users to patch. how long does it take for the POC exploit to turn into something as complicated as sqlblaster, code red, or sasser? I say: one week?

obviously, the source of the vulnerability and, if available, POC are the most important factors here. vulnerabilities whose code is posted to the full-disclosure list or (god forbid) bugtraq are different from code stolen from sla.ckers.org, hackthissite, or in "the wild".

also related would be microsoft patches that can be reverse engineered. or instant POC's like XSS (let's argue about whether XSS are POC's or not!). so every vulnerability isn't critical, which is why things like CVSS exist. we need to patch the vulnerabilities that are easy to exploit or have an active exploit fast. the others can wait a little while, right?
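the triage idea above -- patch actively exploited, high-severity issues first -- can be sketched as a simple sort (the scores and CVE names below are invented for illustration):

```python
# Toy patch queue: order by "active public exploit" first, then severity.
vulns = [
    {"id": "CVE-A", "score": 9.0, "exploit_public": False},
    {"id": "CVE-B", "score": 6.5, "exploit_public": True},
    {"id": "CVE-C", "score": 9.8, "exploit_public": True},
]

# Tuple key: exploited vulns sort first (not True == False), then
# higher scores first (negated for ascending sort).
queue = sorted(vulns, key=lambda v: (not v["exploit_public"], -v["score"]))
print([v["id"] for v in queue])   # ['CVE-C', 'CVE-B', 'CVE-A']
```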

Jeremiah Grossman said...

> for example, let's say it takes one hour to create an exploit, and 1.5 days for 80% of the users to patch. how long does it take for the POC exploit to turn into something as complicated as sqlblaster, code red, or sasser? I say: one week?

I think it depends on the class of attack (XSS, SQL Injection, business logic flaw, etc.) For those I would say....

XSS: Minutes to hours, because a PoC is easy and generic to make.
SQL Inj: Hours to days
Business Logic Flaw: Immediately upon discovery

> obviously, the source of the vulnerability and, if available, POC are the most important factors here. vulnerabilities whose code is posted to the full-disclosure list or (god forbid) bugtraq are different from code stolen from sla.ckers.org, hackthissite, or in "the wild".

Definitely.

I think the problem with measuring exploit time for vulnerabilities in custom web applications is that we don't know when/how the issue was found or when it was exploited. We simply don't have access to the logs. This, I think, is going to be a cloudy area for some time, at least until the Distributed Open Proxy Honeypots project gets going.
http://www.webappsec.org/projects/honeypots/

Thorin said...

Quote ory: "IMHO, security is all about adding more and more layers of protection."

As long as these layers add to your "defense in depth" and do not become or attempt to implement "security through obscurity".

Jeremiah Grossman said...

Personally, I enjoy security WITH obscurity. That's kinda why I use OS X and Firefox. :) *oops*

Dan said...

"Security through obscurity is putting your money under your mattress.
Security WITH obscurity is putting your money in a safe concealed behind a painting."
Dan Ross 9/27/2004
http://www.owasp.org/images/3/32/OWASPSanAntonio_2006_05_ForcefulBrowsing_Content.pdf