It appears I’m not the only one who has come under attack from generic form spam over the past few days. I caught this post by Sam Ruby the other day. It looks like a totally generic form submission bot, but it’s pretty good. It’s been hitting the HelpSpot forums daily and making lots of work for me 🙁
My solution will be to add Bayesian filtering for the forums to HelpSpot. It’s already used for email and should work pretty well for the forums. I’ll need a new table to keep the forum word lists separate, though, since forum spam is a different type of spam than email spam and I’ve noticed a different set of words/links being used.
I’m curious why more weblogs don’t use Bayesian filters. They work really well because, at the end of the day, the spammers must link to their bogus sites, and the filter uses those links to weed them out. A filter is pretty easy to code. The hardest part is figuring out the odd Lisp dialect Paul Graham used to prototype his.
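For anyone curious what "pretty easy" looks like, here is a rough sketch in Python. It is not HelpSpot's code and not a port of Graham's prototype. It is a simplified variant that assumes add-one smoothing for unseen words and combines the most telling tokens in log space. The names `BayesFilter`, `tokenize`, `train`, and `spam_probability` are made up for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical tokenizer: keeps dots and hyphens so that link hosts
# such as "pills.example" survive as single tokens.
TOKEN_RE = re.compile(r"[a-z0-9$'.-]+")


def tokenize(text):
    """Return the set of distinct lowercase tokens in a message."""
    return set(TOKEN_RE.findall(text.lower()))


class BayesFilter:
    """One word list. Use a separate instance per kind of spam."""

    def __init__(self):
        self.spam_counts = Counter()  # token -> number of spam messages containing it
        self.ham_counts = Counter()   # token -> number of good messages containing it
        self.spam_total = 0
        self.ham_total = 0

    def train(self, text, is_spam):
        tokens = tokenize(text)
        if is_spam:
            self.spam_counts.update(tokens)
            self.spam_total += 1
        else:
            self.ham_counts.update(tokens)
            self.ham_total += 1

    def token_prob(self, token):
        # Add-one smoothing (an assumption of this sketch) keeps the
        # result strictly between 0 and 1, even for unseen tokens.
        spam_rate = (self.spam_counts[token] + 1) / (self.spam_total + 2)
        ham_rate = (self.ham_counts[token] + 1) / (self.ham_total + 2)
        return spam_rate / (spam_rate + ham_rate)

    def spam_probability(self, text, top_n=15):
        probs = [self.token_prob(t) for t in tokenize(text)]
        # Keep only the tokens whose probability is furthest from neutral.
        probs.sort(key=lambda p: abs(p - 0.5), reverse=True)
        probs = probs[:top_n]
        # Combine in log space to avoid underflow on long messages.
        log_spam = sum(math.log(p) for p in probs)
        log_ham = sum(math.log(1 - p) for p in probs)
        return 1 / (1 + math.exp(log_ham - log_spam))
```

Keeping the word lists separate is then just a matter of creating one `BayesFilter` for email and another for the forums, training each on its own messages, and flagging a forum post when `spam_probability` crosses whatever threshold suits you. A message with no known tokens comes out at a neutral 0.5.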