Friday, September 03, 2010

Our Infrastructure -- Assessing Over 2,000 Websites

Recently I asked a colleague how desktop black box web application vulnerability scanners approach, from a scalability perspective, scanning large numbers of websites (e.g. 100 to 500+) simultaneously. I was curious how they address the physical infrastructure requirements of big enterprise deployments as compared to our own. Anyone with experience knows commercial desktop black box scanners can easily eat up several gigs of memory and disk space for even a single website. Nothing a high-end workstation can't handle, but multiplied out it's an entirely different story. He had an unexpected answer: "Desktop black box scanners don't have the use-case you [WhiteHat] do, their technology doesn't need to scale." Say Whaaa!?!

When I asked for clarification, he said to consider how black box scanners are normally used in the field. They are "developer" or "pen-test" tools where the use-case is one person, one machine, one website, one configured scan, which is then left to run for however many hours or days it takes to complete. Attempting to perform dozens or hundreds of scans at the same time would be exceedingly rare, if it ever happens, so the capability to do so doesn't need to exist. He said, "Who beside you guys [WhiteHat] needs to scan that many websites at a time?" To which I humbly replied, "the customer."

As we know, new Web attack techniques are published weekly and Web application code changes rapidly (Agile). Web applications, even those that are old and unchanged, need to be tested often for these issues. Testing once a year, or even once a month, isn't enough in an environment like the Web where daily attacks are the norm. So if an enterprise has, say, 10 or more websites, to say nothing of those with hundreds or thousands, mass scanning is essential to get through them all regularly. Burdening enterprises by having them wire scan nodes together with command-and-control systems to achieve scale is patently absurd. That's an inefficient one-to-one model: 100 simultaneous scans = 100 scan boxes. Of course, I'm sure they are happy to sell the hardware.

So yes, I was a bit surprised that the desktop scanner vendors haven't seen fit to tackle the technology scaling problem, even though two of them are mega corps. They above all should know that scaling must be addressed if performing routine vulnerability assessments on all of the Internet's most important websites is to become a reality. To be fair, we've never pulled back the curtain to show off our own infrastructure. Maybe it's time we did, because over the years we've invested heavily in it and it's something we're particularly proud of. I think others would be interested and impressed as well. The physical requirements for WhiteHat Sentinel, a SaaS-based website vulnerability management platform, are in a word -- massive.

Operationally we're assessing over 2,000 websites on a fairly routine basis (~weekly). A dedicated IT staff monitors the systems across over 300 points of interest (network utilization, CPU, memory, uptime, latency, etc.), ensuring everything runs smoothly 24x7. Metrics show that at any given moment roughly 450 scans are running concurrently, generating about 300 million HTTP requests per month and turning up 90,000 potential vulnerabilities per day to process. We preserve a copy of every request sent and every response received for audit, trending, tracking, and reporting purposes. The system itself is accessed by over 350 different customers with tens of thousands of individual Sentinel users.
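To give a rough feel for the kind of plumbing that implies, here is a minimal sketch (illustrative only, not our actual code) of how a scheduler might cap concurrency at 450 scans while thousands of sites wait their turn:

    # Illustrative sketch only -- not WhiteHat's actual code.
    from concurrent.futures import ThreadPoolExecutor

    MAX_CONCURRENT_SCANS = 450      # the concurrency level quoted above

    def scan_site(site):
        # Placeholder for the real work: crawl the site, run the tests,
        # and archive every request/response pair for audit and trending.
        return "%s: done" % site

    # The pool caps concurrency; queued sites wait for a free slot.
    sites = ["site%d.example.com" % i for i in range(2000)]
    with ThreadPoolExecutor(max_workers=MAX_CONCURRENT_SCANS) as pool:
        results = list(pool.map(scan_site, sites))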

CPU- and memory-wise, our ESX virtualization chassis let us control resource allocation and scale quickly across multiple scanning instances and load-balanced front-end and back-end Web servers. As you can see from the pictures, we have some serious storage requirements. Our clustered storage arrays have 250TB ready to go (with additional capacity available at a moment's notice), write about 500GB to disk per day, and are connected by dual 10Gb backplane Ethernet connections. Sick!
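As a back-of-the-envelope sanity check on those figures (my arithmetic only, assuming a 30-day month):

    # Back-of-the-envelope check on the figures quoted above.
    requests_per_month = 300 * 10**6
    gb_written_per_day = 500

    requests_per_day = requests_per_month / 30.0        # ~10 million
    kb_per_transaction = gb_written_per_day * 1024**2 / requests_per_day
    print("~%.0f million requests/day" % (requests_per_day / 1e6))
    print("~%.0f KB stored per request/response pair" % kb_per_transaction)

That works out to roughly 50KB of archived data per request/response pair, which is plausible for full HTTP transactions plus metadata.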

Oh, did I forget to mention the two 100MB links to the Internet? Also very important: the infrastructure is fully redundant. Pull any network cable, push any power button, and the system keeps on humming. I left out the pictures of the backside of the cages, which is every bit as cool as the front, but there are a lot of network cords, firewalls, routers, and other stuff we'd prefer to keep to ourselves. :) If someone else claims to have a SaaS scanning platform, I wonder if it looks anything remotely like ours.

The data center where everything is housed is SAS 70 Type II certified and state-of-the-art when it comes to power, fire protection, cabling, construction, cooling, and physical security. Guards are on site 24/7/365, actively patrolling both inside and outside the facility, with 54 closed-circuit video cameras covering the interior and exterior of the building. Getting access to our area requires an appointment, a government-issued ID, a thumbprint, and a retina scan, and only then do they hand over the key to our private space, which only two people at WhiteHat can access. I'm not one of them. :) Compare that to a scanner on a laptop sitting somewhere unguarded in the enterprise. Clearly we're not a desktop scanner behind a curtain like others out there. We're not playing around. We take this stuff extremely seriously.

22 comments:

Anonymous said...

Geeze, a webapp guy that knows his hardware. Who knew?

Jeremiah Grossman said...

@Anonymous: LOL, I had some help with the draft to make sure all the big words were accurate. Usually if a subject doesn't start with "cross" or end with "jacking", I'm lost. :)

Anonymous said...

Does the storage use any kind of encryption?

Ory said...

Jer,

For large enterprises that require scanning numerous applications in a recurring manner, we have a different product called AppScan Enterprise, which does exactly that. It sits on dedicated hardware, uses a robust database, and can handle the load mentioned.

-Ory

Jeremiah Grossman said...

@Anonymous: Yes, enough file-system level crypto is applied to cover the case of physical hardware theft, the risk of which is already extremely low. Application-level encryption is a more complicated subject because we must be able to read the data to perform our duties.

@Ory: Would you mind describing what AppScan Enterprise's hardware requirements would be when an organization needs to scan 100 sites simultaneously? And if the answer is "it depends," perhaps explain how such an estimate is approached.

@Dan: I'm seeing your comments emailed to me, but for some reason they are not being posted to Blogger. Not sure if this is a bug or you are deleting them. Either way, it's hard to respond to comments that no one else sees.

Jeremiah Grossman said...

@Dan: The data center provider handles the relationships and connections through multiple telcos.

Desktop scanner vendors claim their solution scales, when it clearly does not, at multiple levels. Secondly, in network VA a single scan box is capable of scanning a huge host/IP space. The perception among many is that the same can be done in webappsec. Obviously not true.

"In the real world", our experience has been that there are a great many organizations who are responsible for literally hundreds and if not thousands of websites. For those yes, I think we'd be a fine match. :)

I'll not be addressing the WAF issue here, except to say that something needs to be done with the vulns found.

Ory said...

Jer -

I'm sure no scanner vendor ever said that its desktop scanner scales to the point of 100 concurrent scans (at least not IBM). That's why there's an Enterprise scanner.

Do we really need to continue with the anti-scanner-vendor propaganda all the time? Or is that your way of protecting the business?

Jeremiah Grossman said...

@Ory: Propaganda? I'm specifically comparing against the invalid claims perpetuated by desktop scanner vendors, which include IBM. A conversation that yes, not only protects my business, but also protects customers against such false and misleading scalability claims.

I asked you a fair and direct question about the hardware requirements when deploying AppScan Enterprise to scan 100 sites simultaneously. You did not answer. If that's the way you protect your business, so be it.

Dan said...

As far as I know, I am not doing anything weird - entering the message, doing the captcha, hitting submit... So, not sure, but it isn't me. :)

Jeremiah Grossman said...

Posting for @Dan....

Other than the first three paragraphs, a nice post. (Although, 2 100MB links seems a bit odd - I'd think you'd be better off if you spread them out a bit between telcos for better peering - but maybe you are using someone like InterNAP.)

Now, about those three paragraphs... :)

>Say Whaaa!?!

Are you suggesting that "Desktop" scanners _should_ scale?

That doesn't make any sense.

If you need to scan 100 websites within tight time constraints, surely you might consider that a desktop tool is the wrong one to do so?

>He said, “Who beside you guys
>[WhiteHat] needs to scan that many
>websites at a time?” To which I
>humbly replied, “the customer.”

The customer of the "Desktop" tool vendor or a Whitehat customer?

If you meant the "Desktop" customer - they should consider an appropriate "Enterprise" tool. Strapping together 100 desktop tools seems too goofy to consider.

If you mean a WhiteHat SaaS customer - well, that's what you are there for. :)

In the real world, I don't think you will find too much demand for vast scaling to the degree that a SaaS vendor needs.

In my experience, even if the threats change daily, test schedules/windows, CM policies, approvals, backups, etc. slow things down to a more cautious pace.

Even when/if you do get authorization to do a BIG scan, a little bit of creative scheduling and risk assessment/prioritization will help with bottlenecks and keep your server and network teams happier with you anyway.

(This discussion almost ends up as an advertisement for WAF as 0-day protection.)

Ory said...

Jer -

Can you give more specific examples of false claims made by IBM? Any quotes? References? Supporting documents?

With regards to the hardware required by AppScan Enterprise, this information is freely available on the IBM and IBM Support sites.

-Ory

Jeremiah Grossman said...

Impossible-to-validate sales-process nonsense aside...

In the "Edition comparison" PDF (RAB14001USEN.PDF), linked from:
http://www-01.ibm.com/software/awdtools/appscan/enterprise/

says...

"The ability to scan and test thousands of applications simultaneously on a complex Web site and retest them frequently, following changes."

Let's try to define "scalable" in terms of hardware. The hardware requirements for AppScan Enterprise:

http://www-01.ibm.com/software/awdtools/appscan/enterprise/sysreq/?S_CMP=rnav

This gives no sense of how much hardware will be required when scaled out to 100 simultaneous scans, let alone the "thousands" claimed. What is the formula?

Same question, three times now, and the last time I'll be asking.

Ory said...

Hi,

I am not trying to stall or avoid giving an answer; I simply don't have the answer, since I do not deal with AppScan Enterprise. I'm mostly involved with the AppScan desktop product (Standard Edition), as you probably know.

Having said that, I guess our support team has a formula to help our (successful) customers with their scale-up questions.

What can I say - Whitehat does it all, it's the best company, with the best scanning solution on the planet. There you go, I admit it, we suck, you rule :-)

Geez.

Ory said...

I don't know why I keep coming back for more...but...it's stronger than me :-)

Back to my original point - we do not tell people that our !!desktop!! product can scan 100 applications, as your post suggests.

And now, I will go back to my quiet Sunday evening.

Dan said...

@Jeremiah - Thanks for posting for me!

@Ory - Maybe it can... Thinking about the problem a bit more...

Aside from the initial scan for a new effort, wouldn't "normal" operations be to scan as part of the ongoing lifecycle, periodically, and as the threat changes?

So, once you get past that first "Scan Everything for Everything" scan (which I can't imagine would really be done all at once on 100 servers anyway - given the constraints I mentioned earlier), then you have:

1. Full "as needed" scan on some subset of hosts as their code & environment changes,
2. full "periodic" scans,
3. and delta scans for specific newly discovered vulnerabilities & techniques

So, maybe you could reasonably cover 100 sites with a desktop tool, since the problem doesn't necessarily require a full crawl and scan over and over daily.

Running several new tests on 100 pre-crawled sites doesn't seem too far beyond the capabilities of desktop scanners - although they would probably be executed serially (?) rather than in parallel. Something like the sketch below.
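(A toy sketch of that delta-scan idea, purely illustrative - every name here is invented and this is no particular vendor's tool:)

    # Toy sketch: cache each site's crawl once, then replay only newly
    # published checks against the cached inventory, one site at a time.
    crawl_cache = {
        "shop.example.com": ["/search?q=", "/login", "/cart?item="],
        "blog.example.com": ["/post?id=", "/comment"],
    }

    def new_check(url):
        # Stand-in for a freshly published test; a real check would send
        # crafted requests and inspect the responses.
        return "?" in url    # e.g. only parameterized URLs are interesting

    def delta_scan(checks):
        findings = []
        for site, urls in crawl_cache.items():   # serial, site by site
            for url in urls:
                for check in checks:
                    if check(url):
                        findings.append((site, url))
        return findings

    print(delta_scan([new_check]))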

But I'm not sure the average company will get too excited about serial versus parallel execution. If you're that time-sensitive (and not a SaaS vendor) you probably need another control anyway, since fixes are going to take a while in an "Enterprise" environment.

Aside from SaaS vendors (little control of the schedule) and universities or hosting companies (little control of the content/configuration), what sort of users are you finding that require a high degree of parallel testing?

kingthorin said...

@Dan
I totally agree with this question.

"Aside from SaaS vendors (little control of the schedule) and universities or hosting companies (little control of the content/configuration), what sort of users are you finding that require a high degree of parallel testing?"

If a single customer has 100 (or more) apps to look at, then I think they have bigger business issues to consider.

Unknown said...

Hence why WhiteHat is an excellent choice for service providers like www.proactiverisk.com to leverage in helping clients meet enterprise needs -- no silver bullet, just a machine gun.

Home Depot has lots of HAMMERS; you have to pick the right one for the type of nail you want to hit ;)

Aaron Bryson said...

Hi Jeremiah,

The desktop web application scanning products definitely do not scale, and as mentioned, they are not supposed to, because that is not their intent.

As far as commercial scanning products go (and I have used and hold licenses to all the top commercial scanners), the vendors almost always offer two products: the desktop version and the "enterprise" version. The difference between the two tends to average about $20,000 for a desktop license versus $1,000,000 for the enterprise product.

The enterprise product has to scale, so it will definitely require more hardware. That being said, there are a couple of things I have seen.

Company A will offer the service in a SaaS manner, much like WhiteHat. This is great because it removes a lot of the hardware burden (managing & building labs, etc.) from the customer, who is already struggling as it is.

With Company B, the $1,000,000 price tag includes a company-wide license for unlimited scanning using the enterprise product, BUT they DO NOT provide the hardware/infrastructure/resources. The customer has to provide their own servers, virtual machines, etc. Company B sucks. : )

Then there are open-source web application vulnerability scanners. They don't scale in an enterprise fashion either, and because they are free there is little customer support. But for a single penetration tester with a single web application target, they are great!

Aaron Bryson said...

There is something else that is worth noting.

When talking about desktop and enterprise scanning products, it is VERY important that both use the same scanning engine so that results are consistent. Certain commercial scanning products use a different code base in their desktop and enterprise products. So what happens is you scan a web application with the enterprise and desktop products and end up with two different lists of vulnerabilities: false positives and, even worse, false negatives.

In addition, certain vendors' enterprise scanners do not have the same "nerd knobs" as the desktop product. That means there is very little ability to fine-tune the enterprise scanner to a particular web application with the same granularity.

So you get quantity and lose quality. Why can't I have both?!

Jeremiah Grossman said...

@Aaron: Great insights, thank you for sharing. "Nerd knobs", I like that. LOL! Who would have thought that scanning websites would require so much raw computing horsepower?

Anyway, the next challenge our customers are grappling with is how to tackle the mountain of vulnerability data.

Aaron Bryson said...

What vulnerabilities? ; ) Haha.

Well, I am curious as to what experience you and others have on the subject of handling the mountain of scan data and remediations.

Also, Brook Schoenfield says, "hello".

Jeremiah Grossman said...

@Aaron: Speaking from experience, when customers engage with us they quickly mature from the phase of just finding vulnerabilities to actually implementing a process to fix them.

"Fixes" are typically an application code change, a configuration change, or a web application firewall rule. Whatever the case may be the vulnerability details and recommended action (policy?) must filter down from the security team to the appropriate people in the organization.

The way many of our customers have done this is by using the open XML API in Sentinel. The results are automatically pulled into a bug-tracking system or a higher-level dashboard like Archer. Of course, the XML can also be automatically converted into WAF virtual-patch rules.
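As a rough illustration of that pull-and-file workflow (the URL, parameters, and element names below are invented for the example -- they are not Sentinel's actual API or schema):

    # Rough illustration only: the URL and XML element names are invented.
    import urllib.request
    import xml.etree.ElementTree as ET

    API_URL = "https://sentinel.example.com/api/vulns?format=xml&key=SECRET"

    def fetch_open_vulns():
        with urllib.request.urlopen(API_URL) as resp:
            root = ET.fromstring(resp.read())
        return [v.attrib for v in root.iter("vulnerability")
                if v.get("status") == "open"]

    def file_ticket(vuln):
        # Stand-in for the integration step: a real one would POST the
        # finding to a bug tracker or emit a WAF virtual-patch rule.
        print("ticket: [%s] %s%s" % (vuln.get("class"),
                                     vuln.get("site"), vuln.get("url")))

    for vuln in fetch_open_vulns():
        file_ticket(vuln)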
