Sunday, February 2, 2025

We Finally Put Up a WAF

Someone sent an awful lot of requests to a system, for long enough that management noticed. Working with the responsible admin, I ended up proposing AWS WAF “to see what would happen.”

What happened: WAF blocked 10,000 requests per minute, and someone got the message.  This released the pressure on the DynamoDB table behind the system, allowing it to drop straight from its maximum capacity to its minimum (1/16th of the maximum) after fifteen minutes.

It seems some automated vulnerability scanner had gotten into an infinite loop.  There were a lot of repeated URLs in the access logs, as if it wasn’t clearing pages from its queue when they got an OK response but unexpected data.  “Everything” returns OK because any unknown URL (outside of a specific static-content prefix) returns a page with the React app root, and lets JavaScript worry about rendering whatever should be there.
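
To make that fallback concrete, here is a minimal sketch of the routing rule, not the real server code.  The `/static/` prefix, the file names, and the HTML are all hypothetical, and I am assuming that a missing file under the static prefix gets a 404, which is why the prefix matters at all.

```python
from http import HTTPStatus

# Hypothetical values for illustration; the real prefix and files differ.
STATIC_PREFIX = "/static/"
KNOWN_STATIC_FILES = {"/static/app.js", "/static/app.css"}
APP_ROOT_HTML = (
    '<!doctype html><div id="root"></div>'
    '<script src="/static/app.js"></script>'
)


def respond(path: str) -> tuple[int, str]:
    """Return (status, body) for a GET request to `path`."""
    if path.startswith(STATIC_PREFIX):
        # Assumption: only the static prefix can produce a real 404.
        if path in KNOWN_STATIC_FILES:
            return HTTPStatus.OK.value, f"<contents of {path}>"
        return HTTPStatus.NOT_FOUND.value, "Not Found"
    # Anything else gets the app root with a 200, and the client-side
    # router decides what (if anything) to render.
    return HTTPStatus.OK.value, APP_ROOT_HTML


if __name__ == "__main__":
    # Typical scanner probes all look "successful" from the outside.
    for probe in ("/wp-login.php", "/.env", "/static/missing.js"):
        status, _body = respond(probe)
        print(probe, status)
```

A scanner that treats a 200 as “this page exists, keep digging” never gets a negative signal from a server like this, which would fit the repeated URLs in the logs.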

I went ahead and put the same WAF on my systems, promptly breaking them.  Meanwhile, our automated testing provider started reporting failures, with every request to the original system coming back Forbidden.

The testing platform… is a bot.  I had to write them an exception.

Turning my attention back to my systems, I put together a second WAF so I could have different policies.  My system includes an API or two, so I needed to allow HTTP libraries and other non-browser clients.  I linked in the exception for the testing platform as well.  Things went much more smoothly after that.
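
The two policies can be sketched as a toy model.  This is not the AWS API or my actual configuration: it only mimics how AWS WAF walks rules in priority order (lowest number first), stops at the first matching Allow or Block, and treats Count as non-terminating.  The matchers, rule names, and the `203.0.113.10` address are hypothetical stand-ins.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Request = Dict[str, str]  # simplified: {"source_ip": ..., "user_agent": ...}

# Hypothetical address for the testing platform (documentation range).
TESTING_PLATFORM_IPS = {"203.0.113.10"}


@dataclass
class Rule:
    name: str
    priority: int
    matches: Callable[[Request], bool]
    action: str  # "ALLOW", "BLOCK", or "COUNT"


def is_testing_platform(req: Request) -> bool:
    return req["source_ip"] in TESTING_PLATFORM_IPS


def looks_like_http_library(req: Request) -> bool:
    # Crude stand-in for real bot detection.
    return req["user_agent"].lower().startswith(("curl/", "python-requests/"))


def looks_like_non_browser(req: Request) -> bool:
    return "mozilla" not in req["user_agent"].lower()


def evaluate(rules: List[Rule], req: Request, default: str = "ALLOW") -> str:
    """First matching ALLOW/BLOCK wins; COUNT matches fall through."""
    for rule in sorted(rules, key=lambda r: r.priority):
        if rule.matches(req) and rule.action in ("ALLOW", "BLOCK"):
            return rule.action
    return default


# Shared exception, linked into both policies ahead of the bot rules.
TESTER_EXCEPTION = Rule("allow-testing-platform", 0, is_testing_platform, "ALLOW")

# First WAF: browser-facing site, so library and non-browser traffic is blocked.
SITE_RULES = [
    TESTER_EXCEPTION,
    Rule("http-library", 1, looks_like_http_library, "BLOCK"),
    Rule("non-browser", 2, looks_like_non_browser, "BLOCK"),
]

# Second WAF: API clients are expected, so the same matches are only counted.
API_RULES = [
    TESTER_EXCEPTION,
    Rule("http-library", 1, looks_like_http_library, "COUNT"),
    Rule("non-browser", 2, looks_like_non_browser, "COUNT"),
]
```

The part that bit me is the ordering: the exception only helps if it is evaluated before the rule that would otherwise block the testing platform.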

I know that the WAF is fundamentally “enumerating badness,” but it is clearly better than zero filtering.  It is also much less effort and risk than the alternatives, which is why this sort of thing persists.