Monday, September 08, 2008

WASC Web Application Security Statistics 2007

For those hungry for more web application security vulnerability data, WASC has released its Web Application Security Statistics report for 2007. Under the leadership of Sergey Gordeychik, and with broad participation from Booz Allen Hamilton, BT, Cenzic, dblogic.it, HP, Positive Technologies, Veracode, and WhiteHat Security, we’ve combined custom web application vulnerability data from roughly 32,000 websites totaling 70,000 vulnerabilities. Methodologies include white box and black box, automated and manual, all reported using the Web Security Threat Classification as a baseline. Excellent stuff.

Vulnerability frequency by types


The most prevalent vulnerabilities (BlackBox & WhiteBox)


Sergey did a masterful job coordinating all the vendors (whom we thank), compiling the data, and generating a report in a nicely readable format. I’d like to caution those who may read too deeply into the data and draw unfounded conclusions. It’s best to view reports such as these, where the true number and type of vulnerabilities are unknown, as the best-case scenario. There are certainly inaccuracies, such as with CSRF, but at the very least this gives us something to go on. Future reports will certainly become more complete and representative of the whole as additional sources of vulnerability data come onboard.

6 comments:

Andre Gironda said...

I find it increasingly interesting that almost every study like this consistently (over time) seems to list logic flaws at or around 14 percent of the total "vulnerability class" distribution. See also: MITRE "CVE classification" and CWE statistics.

Clearly attackers seem to prefer SQLi and URL redirection above other kinds of semantic web application attack vectors. For those that have seen your Blackhat 2008 talk, I guess they already know how some of this works.

MikeA said...

I'm going to stick my head out here (and maybe get my ass handed to me by you guys ;p) and say that although these stats are welcome, they appear (to me at least) to show "ease of finding" rather than "prevalence of vulnerabilities".

It's a real subtle difference, but what I think is being shown isn't vulnerability distribution in webapps (which is what we want to know, right?), but "where testers/tools are good at looking".

This isn't nearly as useful (although still "interesting"), as it only shows what the "low hanging fruit" are: XSS is a breeze for people/tools to test for, and info disclosure just requires knowing what you are looking for (and it's a wide category). SQLi is going down not only because of stored proc usage, but also because blind injection is required more often; that's a higher bar for people/tools to meet, so less time/effort is spent on checking whether there is a potential vuln in the app.

Sergey Gordeychik said...

>MikeA said...they appear (to me at least) to show "ease of finding"


Absolutely! But if we come down to earth, the probability of detection and exploitation of a vulnerability looks like one of the main business metrics for computing the resulting risk and deciding on remediation. Take CVSS, for example.
If there is one "very risky" bug that can be exploited only by one "BlackHatMegaguru", who cares about it? Or an SQLi/PHP injection that can be exploited by any script kiddie or mass-defacement bot? Which should we patch first?

>MikeA said... blind injection is required more and it's a higher bar for people/tools

In my experience, most SQLi is (or can be) detected by automated tools: a standard WA scanner, or a fuzzer plus trace analysis. In general, stored (persistent) XSS is a more complex task :)

MikeA said...

> [snip - business risks vs skill level of attack]

I have absolutely no argument about that. However, I think what we are seeing in these metrics is important to "frame" - it's not about number of vulnerabilities, it's about ease of discovery. We see a lot of XSS not (only) because a lot of sites are vulnerable, but because it's just easier to test. It may be very obvious, but this skews the results somewhat. Who's to say that AuthN bypass isn't as prevalent, but, given the skill of the testers/tools, it's just not being discovered as much? CSRF is another case in point.

I guess stats like these just show where the "low hanging fruit" are for both testers/attackers. They *don't* show how prevalent any of the main attack categories are. There are two things I like very much about WhiteHat's report published earlier.

a) because (as I understand) they have a very set methodology, with a lot automated, "it all comes out in the wash" - you do start to see a prevalence because there's very little changing between each test.

b) I *very* much like the "time to fix" stats (even though I'm seeing some odd figures in there that I don't understand), because it shows how clients see each of the vulns, and the priority they have in fixing them.

> In my experience, most SQLi is (or can be) detected by automated tools

I would have to disagree on that. For XSS, certainly, a lot of tools are doing a good job, but there are a number of SQLi issues I've seen that a tool simply fails to discover. Not that it's a failing of the tool (it is if it claims 100% success), but there is some application logic/output that gives a manual tester a "hint" to continue down a path, which a tool (without any advanced UI/training) simply wouldn't pick up.

Jeremiah Grossman said...

@mike & sergey, you guys both make very good points. Clearly the distinction between "prevalence" and "ease of discovery" is important, but we're also unlikely to get clarity on that point with custom webappsec vulns. We don't know what we don't know and all that.

We just wanted to see this project take the next step and measure more of whatever we could get our hands on. In this case, whatever methodology you choose, the question is "what vulnerabilities are you finding?", and we asked it of a large pool of different vendors. Would it match up with what the bad guys are finding/exploiting? Who knows.
And no doubt there will be holes in the stats that need filling.

We'll be taking in a lot of feedback to be integrated into the collection process the next time around. Fortunately we're now on the right track!

The Serrano Boy said...

i wanna download this software