Help A Noobie Out

This morning I got an email telling me that I was getting close to exceeding my bandwidth for the month. Interesting, that’s never happened before. So I checked my stats and sure enough, I’ve served up 8.6 GB out of my 10 GB allotted.

Things were running about normal until the 23rd of the month, and then usage more than quadrupled. Normally I was using between 150 and 200 MB a day, when all of a sudden it jumped to over 900 MB. Visits and hits stayed pretty much the same, but pages went way up. The biggest page served was “/archives/miatatude/”, which is automatically generated when requested.

Delving further into the stats, a lot of the external links (referrers) had web addresses with names like:

  • http://phentermine.us.tt
  • http://phentermine.dnc.pl
  • http://phentermine.rocken.de
  • http://phentermine.220v.org
  • http://party-poker.dnc.pl
  • http://www.cialis.wczasy.com
  • http://hgh.dnc.pl
  • http://hydrocodone.dnc.pl
  • http://www.rape.wczasy.com

Next I looked in the raw access logs and found a bunch of entries that looked like this:

    210.0.200.2 - - [26/Aug/2005:00:00:08 -0500] "GET /archives/miatatude/ HTTP/1.0" 200 26131 "http://phentermine.us.tt" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.5) Gecko/20031007 Firebird/0.7"

and

    148.244.150.58 - - [26/Aug/2005:00:02:00 -0500] "GET /archives/miatatude/ HTTP/1.0" 200 1723287 "http://phentermine.rocken.de" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.5) Gecko/20031007 Firebird/0.7"
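For anyone else squinting at entries like these, here is a minimal Python sketch that splits one line into named fields. It assumes the log uses the common "combined" layout (host, identity, user, timestamp, request, status, bytes, referrer, user agent), which the entries above appear to follow; the names `LOG_PATTERN` and `parse_line` are made up for illustration.

```python
import re

# Assumed layout ("combined" log format):
# host ident user [time] "request" status bytes "referrer" "user-agent"
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    entry["status"] = int(entry["status"])
    # A size of "-" means no body was sent; treat it as zero bytes.
    entry["size"] = 0 if entry["size"] == "-" else int(entry["size"])
    return entry

sample = (
    '210.0.200.2 - - [26/Aug/2005:00:00:08 -0500] '
    '"GET /archives/miatatude/ HTTP/1.0" 200 26131 '
    '"http://phentermine.us.tt" '
    '"Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.5) '
    'Gecko/20031007 Firebird/0.7"'
)

entry = parse_line(sample)
print(entry["ip"], entry["status"], entry["size"], entry["referrer"])
```

The two fields that matter most here are the byte count (the number right after the status code) and the referrer (the first quoted string after it).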

I did some reading up on deciphering that mumbo-jumbo, and what is really strange is that both requests are for the same web page, yet one returned far more data than the other (1,723,287 bytes versus 26,131). The big question is: what is going on here? I found the top ten or so IP addresses making these requests and denied them access, so they will get a 403 instead of content. What really worries me is that this looks a lot like comment spam roaches: you squash one and several more crawl out from the baseboards. Am I going to have to check my logs daily and ban IPs until I’ve blocked every one?
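One way to see how much of the bandwidth these requests account for, without eyeballing the log, is to total the bytes served per IP and per referrer host. The sketch below is only an illustration, not what I actually ran: it assumes combined-format log lines, and `SPAM_KEYWORDS`, `LINE`, and `summarize` are hypothetical names, with the keyword list lifted from the referrer names in the stats.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Hypothetical keyword list, taken from the referrer names seen in the stats.
SPAM_KEYWORDS = ("phentermine", "party-poker", "cialis", "hgh", "hydrocodone")

# Pulls the client IP, response size, and referrer out of a combined-format line.
LINE = re.compile(r'^(\S+) .*?" \d{3} (\d+|-) "([^"]*)"')

def summarize(lines):
    """Total bytes served per client IP and per referrer host,
    counting only requests whose referrer host contains a spam keyword."""
    bytes_by_ip = Counter()
    bytes_by_host = Counter()
    for line in lines:
        m = LINE.match(line)
        if m is None:
            continue
        ip, size, referrer = m.groups()
        host = urlparse(referrer).hostname or ""
        if any(word in host for word in SPAM_KEYWORDS):
            nbytes = 0 if size == "-" else int(size)
            bytes_by_ip[ip] += nbytes
            bytes_by_host[host] += nbytes
    return bytes_by_ip, bytes_by_host

sample_lines = [
    '210.0.200.2 - - [26/Aug/2005:00:00:08 -0500] '
    '"GET /archives/miatatude/ HTTP/1.0" 200 26131 "http://phentermine.us.tt" '
    '"Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.5) Gecko/20031007 Firebird/0.7"',
    '148.244.150.58 - - [26/Aug/2005:00:02:00 -0500] '
    '"GET /archives/miatatude/ HTTP/1.0" 200 1723287 "http://phentermine.rocken.de" '
    '"Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.5) Gecko/20031007 Firebird/0.7"',
]

by_ip, by_host = summarize(sample_lines)
for ip, total in by_ip.most_common(10):
    print(ip, total)
for host, total in by_host.most_common(10):
    print(host, total)
```

Grouping by referrer host as well as by IP shows whether a handful of referrer names covers most of the traffic, which would be a steadier thing to match on than a list of addresses that keeps changing.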

4 comments to Help A Noobie Out

  • Have you tried taking a look at the comment table in your DB? A search ordered by date could show you whether they’re seeding your past pages with comments pointing at their domains. Also, are you rewriting URLs placed in comments to add the rel="nofollow" attribute? This kills the benefit in terms of PageRank and Google, which is typically why you get this spam to begin with. Somehow I think you already know this stuff though… =)

  • This may be what they are trying to do, but it ain’t happening. I run some sort of MT plugin that closes comments once the entry falls off the front page (10 days?).

  • Go forth and acquire MT Blacklist.

    It’s a slick tool that will allow you to ban based on content, not IP. It will also help eliminate those pernicious trackback ping spams.

    J

  • Used to run Blacklist, but stopped once I installed MT Close Comments. You think that is what those are, trackback pings? That feature isn’t enabled on my blog…
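
For reference, the rel="nofollow" rewriting mentioned in the first comment can be sketched in a few lines of Python. This is a rough illustration only, not how MT or any plugin actually does it: `add_nofollow` is a hypothetical helper, it uses a simple regular expression rather than a real HTML parser, and it leaves alone any link that already carries a rel attribute.

```python
import re

# Matches an opening <a ...> tag and captures its attributes.
A_TAG = re.compile(r'<a\b([^>]*)>', re.IGNORECASE)

def add_nofollow(html):
    """Add rel="nofollow" to every <a> tag that has no rel attribute yet."""
    def fix(match):
        attrs = match.group(1)
        if re.search(r'\brel\s*=', attrs, re.IGNORECASE):
            return match.group(0)  # already has a rel attribute; leave as-is
        return '<a' + attrs + ' rel="nofollow">'
    return A_TAG.sub(fix, html)

comment = 'Nice post! <a href="http://example.com">my site</a>'
print(add_nofollow(comment))
```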