Category Archives: Email

Stopping WordPress spammers

The blog comment/trackback anti-spam refinement continues.

I’m testing the WP-Hashcash plugin, which inserts JavaScript into the comment form to calculate an authorisation code. Since comment spammers don’t actually use the comment forms (at least I hope not; not until they start using people to enter the comments), this means only real comments get through. Well, real comments from people with JavaScript running. If they don’t have JavaScript running, they may be out of luck. Hopefully that applies to nobody these days, and I think this solution is less painful than a captcha-based one.

But trackback spam is still a problem. One available option is to block direct access to the WordPress trackback PHP, but this isn’t very effective, since most current trackback spammers are clever enough to call the “real” URL.

A version of Auto shutoff comments modified to close trackbacks on posts older than 28 days, however, seems more effective. I don’t particularly want to shut comments off (especially since the above plugin effectively stops comment spam), but trackbacks are less compelling to keep open.
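The age cut-off boils down to a date comparison. Here is a minimal Python sketch of that rule, not the plugin’s actual code: the function name and the way post dates are passed in are assumptions for illustration, and only the 28-day figure comes from the text above.

```python
from datetime import datetime, timedelta

# 28 days is the cut-off mentioned in the post; everything else is illustrative.
TRACKBACK_WINDOW = timedelta(days=28)


def trackbacks_open(post_date: datetime, now: datetime) -> bool:
    """Return True while the post is still young enough to accept trackbacks."""
    return now - post_date <= TRACKBACK_WINDOW
```

Comments can stay open on the same post; only the trackback check would consult this function.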

Together with previously discussed .htaccess entries to block big bandwidth thieves, this appears to be a fairly effective set of anti-blog spam measures. For now.

Pirates! Spammers! Gyroscopes! Bandwidth thieves!

This is officially getting ridiculous. Not only are my blogs getting a lot of comment spam, but my personal blog site is burning huge amounts of bandwidth, as particular (I assume zombie) hosts hit the site.

Below are the top ten bandwidth users of danielbowen.com for June:

Top 10 of 15312 Total Sites By KBytes

 #    Hits      %   Files      %   KBytes      %   Visits      %   Hostname
 1   14380  4.10%    3801  1.77%   111235  2.22%      159  0.24%   host-148-244-150-58.block.alestra.net.mx
 2   17558  5.01%    3191  1.48%    99441  1.98%      157  0.24%   host-207-248-240-119.block.alestra.net.mx
 3    3927  1.12%    3640  1.69%    75989  1.51%        3  0.00%   csr010.goo.ne.jp
 4    3062  0.87%    2797  1.30%    74881  1.49%      171  0.26%   rrcs-24-97-174-130.nys.biz.rr.com
 5    3057  0.87%    2200  1.02%    62547  1.25%      392  0.60%   msnbot.msn.com
 6    2691  0.77%    2248  1.04%    60684  1.21%      153  0.23%   64.124.85.78.become.com
 7    2256  0.64%    2082  0.97%    56383  1.12%      124  0.19%   98-101-196-200.linkexpress.com.br
 8    2146  0.61%    2033  0.94%    51665  1.03%      279  0.43%   dsl-250-198.monet.no
 9    2001  0.57%    1755  0.82%    47605  0.95%       23  0.04%   host133.sprintnetops.net
10    1686  0.48%    1571  0.73%    35979  0.72%      325  0.50%   corporativos

It’s not like this site is hosting pr0n or something; there’s just no reason why any single host would need to grab 110MB of traffic in a single month. In total, traffic topped 4GB for the month, which is ludicrous for a diary site with a few photos on it. 4GB is actually my monthly limit. Thankfully my web ISP isn’t too strict about charging extra for hitting that, but if this keeps up there’s always the risk that it’ll start costing me real money.

As a result I’ve started a list of bandwidth hogs’ IP addresses, which I’m putting in the .htaccess file. Anything with lots of hits and grabbing above about 5MB per month is going onto the list, and the list is being duplicated (manually, unfortunately) across to the other WordPress sites that I run.
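For anyone wanting to automate the list, here is a rough Python sketch that turns per-host usage figures into deny lines. It is not the .htaccess actually in use here. The `HostUsage` type, the `min_hits` threshold and the example addresses are assumptions for illustration; only the “about 5MB” figure comes from the paragraph above. The `Order`/`Allow`/`Deny` directives are the older Apache access-control syntax; newer Apache releases use `Require` instead.

```python
from typing import Iterable, NamedTuple


class HostUsage(NamedTuple):
    """One row of monthly stats for a client (hypothetical structure)."""
    ip: str
    hits: int
    kbytes: int


def deny_block(usage: Iterable[HostUsage],
               min_kbytes: int = 5 * 1024,   # roughly the 5MB mentioned above
               min_hits: int = 1000) -> str:  # "lots of hits": an assumed cut-off
    """Build an .htaccess fragment denying hosts over both thresholds."""
    lines = ["Order Allow,Deny", "Allow from all"]
    for host in usage:
        if host.kbytes > min_kbytes and host.hits >= min_hits:
            lines.append(f"Deny from {host.ip}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Documentation-range addresses, not real offenders.
    sample = [HostUsage("192.0.2.10", 14380, 111235),
              HostUsage("198.51.100.7", 12, 300)]
    print(deny_block(sample))
```

Generating the fragment from one shared list would also avoid copying it by hand to each site.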

Inspection of the access_log is particularly enlightening: at present a staggering number of requests are coming in with a referer pointing at poker-related sites. Of the 6665 hits in the file for today (covering about 13 hours) there are 674 from texasholdemcenteral.com (note the wonky spelling) and 1212 from sportscribe.com. All of these are now being blocked too, with a 403 (Forbidden) via .htaccess.
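The referer tally can be scripted. The Python sketch below assumes the log is in Apache’s “combined” format (the post doesn’t say which format is in use) and that mod_rewrite is available; the function names are made up for illustration. It counts referer hostnames, then writes rewrite rules that answer matching requests with a 403.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Combined log format: ... "request" status bytes "referer" "user-agent"
REFERER_RE = re.compile(r'"[^"]*" \d{3} \S+ "([^"]*)"')


def referer_counts(log_lines):
    """Count hits per referer hostname; lines without a usable referer are skipped."""
    counts = Counter()
    for line in log_lines:
        match = REFERER_RE.search(line)
        if not match:
            continue
        host = urlsplit(match.group(1)).hostname  # None for "-" or junk
        if host:
            counts[host] += 1
    return counts


def referer_rules(domains):
    """Build mod_rewrite rules returning 403 for the given referer domains."""
    domains = list(domains)
    if not domains:
        return ""  # never emit a bare RewriteRule: it would block everything
    lines = ["RewriteEngine On"]
    for i, domain in enumerate(domains):
        flags = "[NC,OR]" if i < len(domains) - 1 else "[NC]"
        lines.append(f"RewriteCond %{{HTTP_REFERER}} {re.escape(domain)} {flags}")
    lines.append("RewriteRule .* - [F]")
    return "\n".join(lines) + "\n"
```

Something like `referer_counts(open("access_log")).most_common(10)` would list the worst offenders, and the chosen domains could then be fed to `referer_rules`.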

Sigh. I suppose it’s just too much to expect people to play nice?

.htaccess extract – Feel free to copy for your own site to block miscreants.

Recent spam stopping techniques

Okay, two techniques, one that’s going to be compromised sooner, one that’s going to be compromised later:

  1. A hidden field that must be supplied
  2. A JavaScript client-server MD5 one-way hash

I don’t see the second as a viable solution because it demands JavaScript (precluding certain users), and the first will be bested by the spammers when it becomes economically viable. Whether it’s adopted here depends, I guess, on the implementation cost.
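Neither technique is spelled out above, so the following Python sketch is only one plausible reading of each, covering the server-side checks. The field name `comment_token`, the per-process secret and the use of an HMAC for the hidden field are all assumptions. For the second technique, the browser’s JavaScript would be expected to compute the MD5 of a challenge string and send it back.

```python
import hashlib
import hmac
import secrets

# Hypothetical per-process secret; a real site would persist this somewhere.
SECRET = secrets.token_bytes(16)


def issue_token(post_id: int) -> str:
    """Value to embed in the hidden form field, tied to one post."""
    return hmac.new(SECRET, str(post_id).encode(), hashlib.sha256).hexdigest()


def hidden_field_ok(form: dict, post_id: int) -> bool:
    """Technique 1: reject submissions that omit or fake the hidden field."""
    supplied = form.get("comment_token", "")
    return hmac.compare_digest(supplied.encode(), issue_token(post_id).encode())


def expected_md5_response(challenge: str) -> str:
    """What the client-side JavaScript would be expected to compute."""
    return hashlib.md5(challenge.encode()).hexdigest()


def md5_response_ok(challenge: str, response: str) -> bool:
    """Technique 2: check the hash the browser sent back for the challenge."""
    return hmac.compare_digest(response.encode(),
                               expected_md5_response(challenge).encode())
```

Both checks only show that the form was loaded and its script was run; a spammer willing to fetch the page and execute the script gets through, which is the “economically viable” point above.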

Why Googlebomb?

Why are webloggers googlebombing online poker?

I assume it’s to reduce the attractiveness of spamming the blogs with the term. Wouldn’t you want positions 1-10, rather than just #1, and really shut the action down? I don’t see that it will. But Wikipedia will be regarded as a more relevant site, and that’s gotta be good, right? Speaking of which, I must go check for vandalism on my pages…

SMS spam from sms.ac

I got an invitation to join sms.ac. A quick Google seemed to indicate it’s not a great idea unless you want to give your mobile number to people who will SMS-spam you.

Further, if they convince you to reveal your Hotmail password (on the pretext of letting you read it from your mobile) they’ll also spam the people in your address book, inviting them to join. Delightful. And the person who “invited” me? She wasn’t even aware it had happened.

So remember kids: sms.ac is bad. Now email this warning to all your friends.

The power of spam

When I registered my first domain name, toxiccustard.com, in November 1996, I didn’t keep my email address secret. It wasn’t obvious (at least to me) that spammers were picking up any valid email addresses they could find, left right and centre. The address: dbowen@toxiccustard.com. I can quote it now because it hasn’t been valid for many years.

But they keep distributing it, and keep spamming it. I know this because my web ISP told me last week that toxiccustard.com is now getting about five thousand e-mail messages PER DAY. Ay caramba.

In fact so much mail is coming in that before they realised the nature of it, they were saying they’d have to decline to provide me with shared web hosting for that domain in the future, because of the impact on other customers. As it is they’ve said okay they’ll live with it, since they’ll be upgrading their systems shortly so bouncing mail doesn’t impact them as much.

I’ve disabled mail completely on that domain in Plesk, and I’m looking into fiddling with the MX records, which hopefully should stop dead any mail way before it reaches anybody and starts costing them money. This may involve moving the domain to a new registrar, since the current mob doesn’t appear to provide this level of customisation.

The lesson: keep your email address secret. Once the spammers have it, expect a snowball effect. It may take 9 years, but eventually it’ll be unusable.

Briefs

The weird bounces I was getting a while back are apparently due to a bug in QMail. They’re also causing some mails to be sent multiple times from webmail. Triffic. But I’ve switched webmails from SquirrelMail to IMP, and that seems to help. I don’t like IMP’s “This mail was sent by IMP” footer, but I do like its features, especially the timezone setting, which was never satisfactory in SquirrelMail.

A big batch of Microsoft patches is out. Though, as someone at work pointed out, they shouldn’t be due to buffer overflows, ’cos MS claimed years ago that they’d eliminated them in Windows XP. (Thanks Ian)

Mr 99Zeroes has apparently been sacked from Google. As Scoble remarks, the rule for blogging about work really needs to be: Don’t piss off your boss. The alternative is simply not to blog about work.

C/Net’s new online news/RSS reader/aggregator: NewsBurst. (via Steve Rubel who features on the latest G’day World podcast)

An Englishman was arrested after he used the text-only browser Lynx to donate money to a tsunami fundraiser. Apparently British Telecom technicians looking through the web site logs thought it was a hacking attempt.

Spam Karma

Well after deleting what seems like hundreds of bloody comment and trackback spams over the past week, I’ve installed Spam Karma (billed as a “fearless Spam Killing Machine”) on this blog. If it’s successful, I’ll be installing it on my other WordPress blogs.

It includes blacklists, captcha or email verification for suspicious comments, a myriad of settings, all that good stuff. For now I’ve set it to “lenient” mode until I get a feel for how strict it is. Feel free to leave junk comments here to see how it goes. (But beware of deliberately leaving spammy comments: for all I know it may decide to blacklist your IP address!)

PS. Tuesday 21:25. The manual install as in the ReadMe worked fine for me, except that you can’t get to the config page through the menus; you have to activate it from the plugins page, then go to the URL it quotes. (This is apparently a known thing with WP1.2, but I guess it applies to WP1.2.2 as well, which we’re running here. Presumably it doesn’t apply to the current nightly builds or to the future 1.5.)

Also be sure to try the test captcha page (linked off the config page) to make sure that bit works (e.g. that the correct PHP libraries are there somewhere). If they’re not, I guess you need to hassle your ISP. Works fine for me.

PS. Wednesday 21:15. There is a hitch: the e-mail it sends out summarising what it’s done is encoded with something. I think this is an incompatibility with the PHP setup on my ISP… the same thing happened with WordPress 1.2’s password reminder messages. I’ll have to dig around for a fix.

It should also be noted that Tony has tried to plonk it onto a blog he runs, and is having some issues. So it’s not all beer and skittles.

On the bright side, it tells me it caught 20 spam comments in the last 24 hours. I certainly haven’t seen any get let through.

PS. Thursday 20:05. Some are getting through, but evidently nowhere near the total number being caught. Hmmm.

Interview with a spammer

The Register’s Interview with a link spammer.

When Sam begins a spam run, he has one target, though he’ll accept any of six. Principal one: come top of the search engines for his chosen site’s phrase. “But you’ll accept coming in at 1,2 or 3, or if you come at 8,9 or 10. Actually, 8, 9 and 10 have better conversion rates. I don’t know why. Maybe the eyes fix on it when you scroll down the page.” And the cost of doing it? Once the code is written, pretty much zero. “Bandwidth is cheap,” he says. “You set it going in the evening and come back in the morning to see how it’s gone.”

So what beats them? Sounds like captchas (those distorted images requiring a human to type in the letters):

So what does put a link spammer off? It’s those trusty friends, captchas – test humans are meant to be able to do but computers can’t, like reading distorted images of letters.

There are several WP plug-ins that will do them; I haven’t tried one yet. But I will soon.

Comment spam vs nofollow

More comment spam hitting us at the moment, but curiously the comments don’t seem to have URLs with them, so I’m not sure what the point is. They’re all purporting to be from non-English-speaking e-mail addresses, and many in broken English, with a generic compliment about how marvellous your web site is. Odd.

Meanwhile, Google have come up with a new rel="nofollow" attribute for links to help fight comment spam. And they’ve got a bunch of blogging heavyweights to back it, too, including MT/TypePad, Blogger (duh), MSN Spaces and the WordPress gang, which might well cover a good proportion of blogs running today.
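In practice this means the blog software tags every link in a comment so that search engines give it no credit. Here is a rough Python sketch of the idea, not how any of the named platforms actually implement it. It uses regular expressions, which is fragile for real-world HTML (an attribute value containing “>” would trip it up), and the function name is made up.

```python
import re

# Opening <a ...> tags only; \b stops this from matching <abbr>, <address>, etc.
A_TAG = re.compile(r"<a\b([^>]*)>", re.IGNORECASE)
# An existing rel attribute, quoted or unquoted.
REL_ATTR = re.compile(r"""\srel\s*=\s*("[^"]*"|'[^']*'|[^\s>]+)""", re.IGNORECASE)


def add_nofollow(comment_html: str) -> str:
    """Force rel="nofollow" onto every link in a comment body."""
    def fix(match):
        attrs = REL_ATTR.sub("", match.group(1))  # drop any rel the commenter supplied
        return f'<a{attrs} rel="nofollow">'
    return A_TAG.sub(fix, comment_html)
```

Replacing any rel value the commenter supplied, rather than appending to it, keeps a spammer from slipping in a value of their own.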

Now, W3C ratification, anybody? Oh pah, who cares?