Stopping IP Scanning with NGINX

Before putting a website online, I could see that my firewall logs were filled with blocks from scanners attempting to hit my WAN IP via telnet, ssh, http, rdp, etc. However, once I opened up my firewall for http and https, I realized I had another brute force problem on my hands: scanners were now attempting to walk directories and fingerprint my site:

Scanner looking for .env file

I could see scanners looking for common webmail resources, trying netgear router setup paths, trying random hex, etc. My NGINX configuration was pretty bare and would simply forward those requests to one of the sites it was proxying for.

In an attempt to lower the amount of garbage that was hitting my backend sites I decided to stop anyone who was attempting to scan my site via IP address instead of hostname. My thought was that humans definitely won’t be visiting my site by IP address so I should block any requests that come in that way. The simplest way to do this would be by inspecting the host header and not responding to any requests that had an IP address instead of my site name.

NGINX uses the server_name directive, matching it against the Host header of each request, to determine which server block (and so which site configuration) to use:

server_name example
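As a rough sketch of what such a block can look like: the hostnames and the backend address below are placeholders chosen for illustration, not values from my real configuration.

```nginx
# Hypothetical site: example.com and the backend address are placeholders.
server {
    listen 80;

    # Requests whose Host header is one of these names are handled here.
    server_name example.com www.example.com;

    location / {
        # Forward the request to the backend site this proxy fronts
        # (assumed address).
        proxy_pass http://127.0.0.1:8080;
    }
}
```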

To stop IP scanners, I could either manually set server_name to match my WAN IP address or use a regular expression to catch anything that generally matched an IP address. NGINX has a good section on how to configure regular expressions on their site that I referenced. I also decided that I didn’t want to waste any resources on responding to clients that reached me by IP address. NGINX has the handy non-standard 444 return code for this: it closes the connection without sending any response. My configuration ended up looking like this:

If IP, don’t respond
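For readers who cannot see the screenshot, here is a minimal sketch of the idea. The exact pattern is an assumption on my part: it is written in the same spirit as the one described here (four dot-separated groups of one to three digits) and may not match the original character for character.

```nginx
# Catch requests whose Host header looks like an IPv4 address.
server {
    listen 80;

    # A leading "~" marks the name as a regular expression. The value is
    # quoted because the pattern contains curly braces, which NGINX would
    # otherwise read as block delimiters.
    server_name "~^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}$";

    # 444 is NGINX-specific: close the connection without sending a response.
    return 444;
}
```

Requests that arrive with a real hostname in the Host header never match this block and fall through to the normal site configuration.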

My regular expression is a little lazy and will match 000.000.000.000 through 999.999.999.999; however, no one should be trying that anyway! Here is a good test to see it in action:

Curl and wget attempts to IP address
NGINX returning the 444 code

One thing I noticed with https is that when a client connects by IP address, so that no hostname picks the site, the certificate returned is determined by the site configuration file that is loaded first, which in practice means first in numerical/alphabetical order. So in my case I changed my blog’s configuration name to 00_rm322.conf so that its certificate was presented. If you host several sites, you may want to reorder them so that any internal hostnames are not shared with these scanners.
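Renaming the file works because NGINX treats the first server block it reads for a given listen address as the default. An alternative that does not depend on file names is the default_server parameter of the listen directive. The sketch below assumes a hypothetical hostname, certificate paths, and backend address; none of them come from my actual setup.

```nginx
# Explicitly mark the public blog as the default HTTPS server, so its
# certificate is the one presented when no hostname selects a site.
server {
    listen 443 ssl default_server;
    server_name blog.example.com;  # hypothetical hostname

    # Hypothetical certificate paths.
    ssl_certificate     /etc/nginx/certs/blog.example.com.crt;
    ssl_certificate_key /etc/nginx/certs/blog.example.com.key;

    location / {
        proxy_pass http://127.0.0.1:8080;  # assumed backend address
    }
}
```

Only one server block per listen address and port can carry default_server, so the other sites keep a plain `listen 443 ssl;`.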

While we’ll never stop the most persistent scanners, since implementing this change I now see over one hundred 444 entries in my proxy logs each day. This probably doesn’t seem like much, but it feels good to see changes in action:

444 for owa