posted by dave on Wednesday, March 2, 2005 at 6:11 AM in category technology, website

Just a couple of small changes to the site that I want to make public. Very boring stuff.

First of all, I've recently been slammed by some searchbots that are retrieving, then ignoring, my robots.txt file. This file specifically lists the files and folders that I don't want indexed by 'bots. I make these areas off-limits for bandwidth reasons, or for simple site functionality purposes.

For example, I don't want 'bots bumping up any 'blog entries - those are for actual people who like the entries.

I also don't want 'bots searching the raw 'blog files themselves, and I don't want them downloading all of my movie files.

Well, I've become sick of 'bots ignoring these rules, so I've decided to block the offenders completely. This was done with an easy addition to my .htaccess file:

order allow,deny
deny from
deny from
deny from
allow from all
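
The actual addresses are left out of the listing above. For anyone who wants to try the same thing, here's a rough sketch of what a filled-in version might look like. The addresses are placeholders I've made up from the ranges reserved for documentation (192.0.2.0/24 and 198.51.100.0/24), not the real offenders, and I'm assuming the old-style Order/Allow/Deny access directives that the listing above uses:

```apache
# Sketch only: placeholder addresses from reserved documentation ranges
Order Allow,Deny
Allow from all
# Block a single address
Deny from 192.0.2.15
# Block a whole range using CIDR notation
Deny from 198.51.100.0/24
```

With Order Allow,Deny the Allow lines are evaluated first and the Deny lines second, so anything matching a Deny line is rejected no matter where the lines sit in the file. Newer versions of Apache (2.4 and later) handle this with Require directives instead, so treat this as a starting point and check it against whatever your host is running.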

As I see new 'bots ignoring my simple robots.txt file I'll block their asses as well.

(Just after I posted this entry I got slammed by another asshole, so his address is now blocked as well.)

Another thing that's been bugging me lately is that I've been getting what's called referer spam. This is when assholes modify their browser to change where they appear to be coming from. Many people, for privacy reasons, will simply change this value to a blank or something. But these assholes are replacing the real referer value with a URL to a site they're hawking.

My .htaccess file can deal with these people as well:

RewriteCond %{HTTP_REFERER} ^.*viagra.*$
RewriteRule .* [R=301,L]

In the example above, I send anyone with the string viagra in their referer to a site that I'm hoping will get them fired if they hit it at work.
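
The redirect target is left out of the rule as shown. Here's a minimal sketch of a complete version, assuming mod_rewrite is available and using http://www.example.com/ purely as a placeholder for the destination (it is not the site I actually use):

```apache
# Sketch only: www.example.com is a placeholder destination
# mod_rewrite has to be switched on before any rules take effect
RewriteEngine On
# An unanchored pattern matches anywhere in the referer,
# and [NC] makes the match case-insensitive
RewriteCond %{HTTP_REFERER} viagra [NC]
# Permanently redirect every matching request, then stop processing rules
RewriteRule .* http://www.example.com/ [R=301,L]
```

The pattern viagra on its own matches the same referers as ^.*viagra.*$ does. The [NC] flag is my addition here, so that VIAGRA and Viagra get caught too.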

On a completely unrelated note, I've changed my site search function to just go to Google. My own home-made search script wasn't working correctly and I just haven't found the time to debug it. This change was for the site search only - the 'blog search is still home-grown.

And finally, I'm beginning to contemplate another site redesign. If I actually decide to do this then V5.00 will feature a cleaner design and will make even more use of CSS.

comments (3)

Keep up the great work on your blog. Best wishes WaltDe

Okay, I will. In fact I have. This entry is a year and a half old.

Your site is cool :))



This work is licensed under a Creative Commons License.