So far so good on the referrer spam killer. Since I implemented it yesterday it’s killed 33 spam links that would have appeared in my referrer logs. That’s precisely what I ask of it, so that’s well and good.
Unfortunately I’ve also noticed some “false positives” that necessitate another adjustment to the mechanism. Last time, we adjusted it to let search engine referrers through, on the chance that someone would enter the site through a query. That’s fine as far as it goes, but I found a failed attempt to get in from a foreign version of Google. So, I’m adjusting the search engine conditions to be as follows:
RewriteCond %{HTTP_REFERER} !^http://([^/]+)google\..*$ [NC]
RewriteCond %{HTTP_REFERER} !^http://([^/]+)altavista\..*$ [NC]
RewriteCond %{HTTP_REFERER} !^http://([^/]+)yahoo\..*$ [NC]
RewriteCond %{HTTP_REFERER} !^http://([^/]+)msn\..*$ [NC]
RewriteCond %{HTTP_REFERER} !^http://([^/]+)a9\..*$ [NC]
RewriteCond %{HTTP_REFERER} !^http://([^/]+)lycos\..*$ [NC]
And again, my code formatting tends to eat the escaping backslashes. If they are missing above, imagine a backslash immediately after the name of each engine (e.g. right after “google”). If you have problems, the source display of the .htaccess rules shows them correctly.
In this instance, in addition to fixing the country-specific searches, I’m also being extra paranoid. I figure the spammers would eventually realize that they could easily bypass this protection by, for instance, inserting the word “google” somewhere in their URL, even as a subdirectory (e.g. http://texas-holdem-sucks-rocks.biz/google/). So, the new code looks for the search engine name only before the first slash. It can’t be a subdirectory. That should give them a few more headaches trying to do an end run around it.
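To see how the host-only match behaves, here is a minimal Python sketch that applies the same pattern to a few referrers. It is only an approximation of what mod_rewrite does with these conditions: the helper names `engine_pattern` and `is_search_referrer` are made up for illustration, and every sample URL except the texas-holdem one from above is hypothetical.

```python
import re

# Engine names taken from the RewriteCond lines above.
ENGINES = ["google", "altavista", "yahoo", "msn", "a9", "lycos"]


def engine_pattern(name):
    """Build the same pattern as the rules: the engine name must appear
    before the first slash after http://, followed by a literal dot."""
    # re.IGNORECASE stands in for the [NC] flag.
    return re.compile(r"^http://([^/]+)" + re.escape(name) + r"\..*$",
                      re.IGNORECASE)


PATTERNS = [engine_pattern(name) for name in ENGINES]


def is_search_referrer(referrer):
    """True if any engine pattern matches, i.e. the referrer is let through."""
    return any(p.match(referrer) for p in PATTERNS)


# A country-specific Google host is let through.
print(is_search_referrer("http://www.google.de/search?q=referrer+spam"))
# The engine name as a subdirectory is not.
print(is_search_referrer("http://texas-holdem-sucks-rocks.biz/google/"))
```

One thing this sketch suggests: because the pattern uses `[^/]+`, it needs at least one character in front of the engine name, so a bare host such as http://google.com/ would not match. Real search referrers normally carry a prefix like www., so that may never matter in practice.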
Again, this is a living, moving target. As I find weaknesses, I’ll update. As they find weaknesses, I’ll update. Hope this helps.




