To Pwn a Spambot

Ah, comment spam, how do I love thee? Let me count the ways... I actually haven't had too much comment spam of late, the captchas and blocklist seem to have kept it out, but maybe I should. This blog should have more comment spam. Why? Because I want to mess with the spambots. I've been reading about the state of the spambot arms race and there's a fair bit of inventiveness on both sides. For one thing, they're getting smarter. Well-equipped comment-spambots can now read captchas, operate off botnets, parse JavaScript, convincingly emulate humans and sidestep a whole raft of systems designed to stop them. They're no longer simple little script-kiddie things that make HTTP posts to predictable pages.

If I were to set about making a spambot today, I would not be looking at a script (like most comment-spambots of the last several years); it would be a browser add-on. To convincingly emulate a human, the bot should use the same tools and surf similarly to a human. It must produce unique, relevant text in its comments and field a set of working email addresses to reply to verification forms. It should use BugMeNot or similar to bypass registration where required. It should use a large number of proxy servers or zombie machines and never the home IP, working gradually and returning often, keeping track of its progress and regularly checking up on comments posted. It should post *one* comment per site, not thousands, and do so very discreetly. Where possible it should analyse comment and posting trends on a site (time of day, comment length, grammar, etc.) to blend in. It would be formidable, and fortunately I have no interest in making it.

Mark my words, bots as I have described already exist and will become common, perhaps in the next two years. And they'll just keep getting smarter. AI will eventually be employed to this end. There's a lot of money in spam and with the blogosphere continuing to double every six months the potential markets become enormous. So long as dumb people continue to buy dumb stuff off the internet (honestly, who are these people buying all those penis pills, I don't know *anyone* that stupid), this form of spam will increase.

So, as a webmaster, what do you do about bots like that? The thing that struck me most when reading about the development of systems like SpamKarma is their focus on blocking either the bots or the comments, using Bayesian filters, P2P systems, etc. The focus is very much a 'close the door on them' approach. It won't work. They'll come through the windows, or dress up as your creepy Aunt Gladys. Bayesian filters can be poisoned, and P2P systems can be joined and co-opted (like Kazaa - served!).

My approach is different. Next time a telemarketer calls don't just hang up the phone, try this:

talk at length about your problems, and how depressing the world is

say 'No honey, get away from there.. sorry, my 3 year old is playing by the pool/at the top of the stairs/with my gun, be right back...' and see how long they'll hold.

try to gently initiate phone sex (bonus: with a member of the same sex)

ask questions. What does this person think about Sex and the City? Low rise jeans? It's really easy to do. They'll start asking questions about you, just respond with something like 'Yes, I could use some home insurance. Seems like a sensible, almost cosmopolitan thing to do. I bet Carrie has home insurance, do you think Sex and The City promotes things like that? What do you think about Carrie? Really, I've always liked Miranda better...'

Savour the fun you have with these poor people, because it won't last. In no time at all you'll be put on *their* do-not-call list, which is shared between companies, and you won't have any more phone marketers to play with. Likewise if you get Jehovah's Witnesses every Sunday, answer the door in your underwear and say 'Hey, can't chat, we're playing strip twister. Wanna come join us?'. People mention stuff like that to one another. Pretty soon a whole bunch of JWs will be praying for your wretched soul. But I digress, back to comment spambots.

The approach I want to take with them is basically to invite them in, and have some fun. The majority of these bots are still the older, dumber, script-based type that make a lot of HTTP posts and little else. Many methods to detect them already exist, so I'll skip over that. Once one is identified, there are a few things we could do. We can send the standard error/blocked/go away page, but that's boring. Things I'm considering:

Redirect the bot to a blog you don't like with an HTTP 302 response, and let some wingnut/racist/evangelical deal with it. Use the bot as a tool to some other end. The site you redirect to should run on the same blogging software as yours for best results.

Send a normal, correct HTTP header and XML document descriptor, and then an endless stream of invalid characters. If the bot is implemented as a shell script, this will make the terminal it's running on beep endlessly at the user, as well as holding up the bot. It may continue to download indefinitely, halting its run and, if we're lucky, crashing the program when it runs out of memory.

Three words: buffer overflow exploit :-) A computer virus, padded a certain way, can sometimes be run by overflowing the memory assigned to some data with JUMP instructions to the beginning of the virus code. As these instructions flow into code memory a process may hit one and launch the virus, possibly giving you control over the spammer's computer. Then the fun begins in earnest. Well-written programs are usually immune to this, but I'll bet my hat that most spambots are not very well written.

Send a 302 (temporary redirect) to another port on the next page viewed by the bot, and see if it follows. This is to mess with bots using unsecured HTTP proxies, many of which won't go to some random, high-numbered port. If badly coded, the bot may interpret this as a non-functioning proxy and remove that proxy from its list, making it think it has fewer resources than it does. Overall, a hindrance and also useful information on how a particular bot works.
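Two of the tamer ideas above, the redirect to an odd port and the endless stream of junk, could be sketched in Python roughly like this. It's only a sketch: the function names, the example port and the choice of the BEL character (the one that makes a terminal beep) are my own assumptions, and how you hook it into your blog software is left open.

```python
import itertools
from urllib.parse import urlsplit


def redirect_to_port(url, port):
    """Build a 302 response pointing at the same URL on another port."""
    parts = urlsplit(url)
    # Assumes a plain hostname or IPv4 address (no IPv6 brackets, no userinfo).
    target = parts._replace(netloc=f"{parts.hostname}:{port}").geturl()
    return 302, [("Location", target)]


def junk_stream(chunk_size=1024, max_chunks=None):
    """Yield chunks of BEL characters; endless unless max_chunks is given."""
    chunk = b"\x07" * chunk_size
    counter = itertools.count() if max_chunks is None else range(max_chunks)
    for _ in counter:
        yield chunk
```

A server would send the status and header from redirect_to_port as its reply, or write the chunks from junk_stream to the socket after a normal header and without a Content-Length. The max_chunks argument is only there so the thing can be tried out without running forever.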

And I'm sure there's more. Now, for the current generation of bots, the smart ones that load a page completely, images and all, and parse the JavaScript. First we'll want to tell actual browsers from browser emulators. This I think can be done best by giving the bot octal or DWORD URLs to follow. Most people have never heard of octal URLs, their use a forgotten memory of the dark ages of the internet, but Internet Explorer and other browsers still understand them. More on this. There is very little chance that a kiddie programmer writing a browser emulator will have included code to handle these, since no-one uses them anymore. So if the bot does follow such a link, we know we've got one using an actual browser to load the pages.

If it is, perhaps it's vulnerable to a browser exploit? Maybe we can install us a trojan. Another option is to include on a page a huuuuuge image or ten and kill the browser by using all its memory (and make Windows machines grind to a halt when we use up all the swap file). A PHP script could generate such an image on the fly without loading it into memory, just writing a picture file to the bot from a predefined JPEG header, so it's no work for our server but a huge load for the bot as it tries to draw the thing. The bot may have a maximum size for stuff it downloads so we'll leave off the Content-Length HTTP header field and let it guess how much data we're going to be sending.

Here's an idea for identifying the source of comment spam. If we suspect a bot, we can include an image in the page that's served off an FTP URL. Bots using HTTP proxies might not also have an FTP proxy set up, so while you can't tell where the webpage requests are originating from, the FTP connection of the browser getting the image might yield the IP address (and thus ISP and geographic location) of the spammer, a good clue to finding the actual person responsible. Then we cook up some mischief for them... a monkey for their back, if you will.
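The bookkeeping for the FTP image trick might look something like the Python sketch below. Everything named here is hypothetical: the ftp_beacon function, the pending dictionary and the ftp.example.org host are placeholders, and matching the token against your FTP server's log is left to you. It also assumes the client still fetches ftp:// images at all, which most current browsers no longer do.

```python
import secrets


def ftp_beacon(ftp_host, pending, request_id):
    """Return (token, img_tag) and remember which page request got the token."""
    # A random token in the file name ties the later FTP fetch to this request.
    token = secrets.token_hex(8)
    pending[token] = request_id
    tag = f'<img src="ftp://{ftp_host}/{token}.gif" width="1" height="1" alt="">'
    return token, tag


pending = {}
token, tag = ftp_beacon("ftp.example.org", pending, "suspect-request-1")
print(tag)
```

When a fetch for that file name turns up in the FTP log, the connecting address there can be compared with the proxy address the page request came from.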

Created 2006-08-30 03:44:27 by 520 and filed under hacking