How to get rid of SPAM in forms
08-06-2008, 06:59 PM
I have a few PHP forms on my site where people can leave comments. Unfortunately, I'm getting a lot of junk in the comments, with links and advertisements like spam emails.
I added image verification to the page, and it works, but the spam is still arriving!
How does this work? I mean, they are probably using a script to post this junk, and I know some scripts can read captcha images, but I'm getting 15 posts every day...
Does anybody know how they do this? I think there may be an error in the logic of my PHP page, but it seems fine.
If anybody wants to take a look, the worst one is http://www.novatechinfo.com/docs/troubleshooting.php
There are other forms, but since adding the image verification I'm only getting junk in this one.
As always, many thanks for any help.
08-06-2008, 07:13 PM
OK, I think I know how they are doing this. I'm saving the image verification code in a 'hidden' field so that, when the form is submitted, I can check whether the user's entry is right.
If you view the page source in the browser, you can see the 'hidden' field.
I think that is what the spam script is doing.
What I don't know is how I am going to solve this...
08-07-2008, 07:46 AM
The easiest approach is probably to set up an index and use that in your hidden field. For example:
So if you have 'uib8J' in the captcha, you pass '2' in the hidden field. Your server-side processing reads the '2' and therefore compares the verification field contents to 'uib8J' (the entry stored under that index). It should only take a couple of extra lines of code.
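Here is a minimal sketch of that index idea, written in Python for brevity (the same logic carries over to PHP). The list contents, apart from 'uib8J' at index 2, and the function names are made up for illustration.

```python
import secrets

# Hypothetical pool of pre-generated captcha texts. The position in the
# list is the index that goes into the hidden field; the text itself
# never leaves the server.
CAPTCHA_TEXTS = ["k3Tq9", "Zp4mW", "uib8J", "r7XaL"]


def pick_captcha():
    """Choose a captcha and return (index for the hidden field, text to render)."""
    index = secrets.randbelow(len(CAPTCHA_TEXTS))
    return index, CAPTCHA_TEXTS[index]


def verify_captcha(hidden_index, typed_text):
    """Look up the expected text by index and compare it to what was typed."""
    try:
        index = int(hidden_index)
    except (TypeError, ValueError):
        return False  # hidden field was tampered with or missing
    if not 0 <= index < len(CAPTCHA_TEXTS):
        return False
    return typed_text == CAPTCHA_TEXTS[index]
```

With a small fixed pool, a bot could eventually learn which index maps to which text, so the pool would need to be large or rotated.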
08-07-2008, 10:48 AM
Yeah, it's a good idea. I'm probably going to do something like this.
The only drawback is that I have to generate all the images and store them. It's no big deal, but I think the original idea was cleaner, if faulty...
08-07-2008, 11:14 AM
Another option is to store a hash of the image text in the hidden field.
When the user enters the captcha text, compare the hash of what they entered to the hidden hash.
This has the advantages of your first approach without giving away the password.
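A rough sketch of the hash approach, in Python for brevity. One deliberate change from what is described above: it uses a keyed hash (HMAC) rather than a plain hash, because a bot that knows the hash function could recompute a plain hash of a short string. The secret key and function names are assumptions for illustration.

```python
import hashlib
import hmac

# Assumption: a secret known only to the server (not part of the thread).
SECRET_KEY = b"change-me"


def captcha_token(captcha_text):
    """Return the hex digest to place in the hidden form field."""
    return hmac.new(SECRET_KEY, captcha_text.encode("utf-8"),
                    hashlib.sha256).hexdigest()


def verify_hashed_captcha(hidden_token, typed_text):
    """Hash what the user typed and compare it to the hidden token."""
    expected = captcha_token(typed_text)
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(expected.encode("ascii"),
                               hidden_token.encode("utf-8"))
```

Note that a bot which solves one captcha by hand could still replay that token and answer pair, unless the token is also tied to a session or a timestamp.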
08-07-2008, 12:23 PM
Tom's idea is a good one.
On the same note, you could just do a CRC of the string and use that as the comparison. It all depends on what is easiest for you and what libraries you have (I'm a Perl guy and only hack at PHP).
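For comparison, a CRC version might look like the sketch below (Python's `zlib.crc32`; the function names are invented here). Keep in mind that a CRC is a checksum, not a secret: anyone who guesses the scheme can compute it too, so this only hides the text from scripts that read the hidden field literally.

```python
import zlib


def captcha_crc(text):
    """Return the CRC-32 checksum of the captcha text as an unsigned integer."""
    return zlib.crc32(text.encode("utf-8"))


def verify_crc_captcha(hidden_crc, typed_text):
    """Compare the CRC of the typed text with the value from the hidden field."""
    try:
        expected = int(hidden_crc)
    except (TypeError, ValueError):
        return False  # hidden field was tampered with or missing
    return captcha_crc(typed_text) == expected
```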
08-07-2008, 05:01 PM
Oh, I like it. The hash and CRC ideas are good, and I think pretty easy to implement.
08-07-2008, 05:51 PM
To stop some spambots, you can create an input field that is hidden via CSS, so it is only visible to bots, not visitors. If there is any $_POST data in that field, you know a spambot is trying to use your form.
It only works until the spambot figures out what you're doing though.
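The server-side check for such a hidden field is tiny. A sketch in Python, where a plain dict stands in for the submitted form data (PHP's $_POST) and the field name "website" is just an assumed example:

```python
# Hypothetical field name; the input is hidden from people with CSS,
# so a real visitor leaves it empty.
HONEYPOT_FIELD = "website"


def is_probable_bot(form_data):
    """Return True when the CSS-hidden honeypot field came back non-empty."""
    value = form_data.get(HONEYPOT_FIELD, "")
    return bool(value.strip())
```

For example, `is_probable_bot({"comment": "hi", "website": "http://spam.example"})` flags the post, while a submission with an empty or missing "website" field passes.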
Some sites are replacing Captcha with human readable questions that require correct answers.
You can set up a series of questions in a database and randomly present one to users who are posting, such as:
"If you mix the colors blue and red, what color do you get?"
"Which of the following is a fruit? (purple, orange, red)"
You'd have to come up with enough questions, though, that the spambot gives up guessing answers or screen-scraping all your questions.
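A small sketch of the question idea, in Python, with an in-memory list standing in for the database table. The two questions are the ones quoted above; the accepted answers and function names are assumptions.

```python
import secrets

# (question, set of accepted answers). The accepted answers are assumed.
QUESTIONS = [
    ("If you mix the colors blue and red, what color do you get?",
     {"purple", "violet"}),
    ("Which of the following is a fruit? (purple, orange, red)",
     {"orange"}),
]


def pick_question():
    """Return (question id, question text); the id goes back with the form."""
    qid = secrets.randbelow(len(QUESTIONS))
    return qid, QUESTIONS[qid][0]


def check_answer(question_id, answer):
    """Case-insensitive check of the answer for the given question id."""
    try:
        qid = int(question_id)
    except (TypeError, ValueError):
        return False
    if not 0 <= qid < len(QUESTIONS):
        return False
    return answer.strip().lower() in QUESTIONS[qid][1]
```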
08-07-2008, 10:10 PM
The number one thing is to count the "http://" strings in your form submissions. They almost always have a bunch of them.
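Counting links takes only a few lines. A Python sketch follows; the threshold of 2 is an arbitrary assumption, and matching "https://" as well goes slightly beyond the suggestion above.

```python
import re

# Matches http:// and https://, ignoring case.
LINK_PATTERN = re.compile(r"https?://", re.IGNORECASE)

# Assumed threshold: more links than this and the post is treated as spam.
MAX_LINKS = 2


def count_links(text):
    """Count occurrences of http:// or https:// in the submitted text."""
    return len(LINK_PATTERN.findall(text))


def looks_like_link_spam(text, max_links=MAX_LINKS):
    """Return True when the text contains more links than allowed."""
    return count_links(text) > max_links
```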
08-08-2008, 10:35 AM
Uh, ok. I didn't know that.
08-10-2008, 01:20 PM
OK guys, I updated my web pages, just adding the hash for comparison. It's a very clean solution: I changed just 2 lines in my code, and so far, no more junk in the posts!
By the way, if somebody wants the code for the captcha (verification) images, just let me know.
08-12-2008, 03:41 AM
What do y'all think about the idea of using a simple math problem, such as 3x5=fifteen...two plus five=7...something like that? I've seen this before, but I don't remember where, except that it was on some politician's site where you can offer your opinions on one thing or another.
08-12-2008, 01:24 PM
Yes, that's a good idea too (SneakyDave also wrote about it), and easier to implement than the CAPTCHA images, but I already finished the script and it's working properly now, after the hash thing.
Maybe next time I'll do something like that, but on second thought there are some implementation problems. If you only use numbers, it's probably easy to write a script that reads them and does the calculation. You have to use other options like colors, fruits, etc.
08-12-2008, 05:49 PM
"If you only use numbers, it's probably easy to write a script that reads them and does the calculation. You have to use other options like colors, fruits, etc."
Oh, I was thinking that if you mixed numerals with spelled-out numbers, that would be harder to get around. But let's say you have four bananas (graphics) + three bananas = 7? Or seven? Or is that too complicated to code, especially if you have rotating fruit as well as numbers of fruit?
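The text-only part of this (mixing numerals with spelled-out numbers) is not hard to code. A Python sketch, leaving out the fruit graphics; the number range and the function names are assumptions, and the expected sum would have to stay on the server (e.g. in the session), not in a hidden field.

```python
import random

# Words for 0..10; operands are 1..5, so the sum never exceeds 10.
NUMBER_WORDS = ["zero", "one", "two", "three", "four", "five",
                "six", "seven", "eight", "nine", "ten"]


def make_problem(rng=random):
    """Return (question text, expected sum), randomly spelling out operands."""
    a = rng.randint(1, 5)
    b = rng.randint(1, 5)
    left = NUMBER_WORDS[a] if rng.random() < 0.5 else str(a)
    right = NUMBER_WORDS[b] if rng.random() < 0.5 else str(b)
    return f"{left} plus {right}", a + b


def check_sum(expected, answer):
    """Accept the answer either as digits ("7") or as a word ("seven")."""
    text = answer.strip().lower()
    if text.isascii() and text.isdigit():
        return int(text) == expected
    return text in NUMBER_WORDS and NUMBER_WORDS.index(text) == expected
```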
08-12-2008, 05:55 PM
One of my financial institutions uses a photo and a description field in the form. You pick the picture you want as part of the verification process and put the description (boat, car, house, etc.) next to the photo when you log in.
08-12-2008, 06:00 PM
Nope, it's not complicated, and it's a "fresh" idea. I'll probably give this a try in a future project.
08-12-2008, 07:56 PM
I came across a discussion about this on the Craigslist blog (http://blog.craigslist.org/2008/06/fight-spam-digitize-books/#comment-3789). They looked more than a little frustrated.
08-13-2008, 12:45 PM
I read the Craigslist discussion and yes, it's a big problem. But the truth is that after I added the hash to my form the problem was solved, so the spammer was apparently not using a captcha-reading script.
For a client, I just had to change the form page by adding a few "must fill" selects, and this solved the spam problem for him (no captcha or anything else).
But this worked on pages with few visitors. I'm sure it wouldn't work on highly visible ones.
It's always like this: the community finds a solution, and the spammer/hacker/idiot next door finds a way to bypass it. The community looks for a new solution, and there we go again.
08-13-2008, 01:04 PM
Just search for "http://" I tell ya.
08-13-2008, 05:27 PM
08-14-2008, 02:31 AM
Don't all your spam form submissions contain a bunch of links?
A solution I came up with recently that avoids modifying the form submission program (i.e. to search for link code) is to set up sorting filters in my email client. Anything with the pre-set subject(s) gets searched for common spam items, such as href code, that have no business showing up in any legitimate email. The drop-off in spam to manually delete is a breath of fresh air!
12-14-2008, 12:24 PM
Anyone know why some forms are targeted and others aren't? I have forms on multiple sites, and only one gets form spam. It is on a site with relatively high page rank, but there's a second form on the same site which doesn't get any spam.
I've just added a hidden field on the targeted form which is processed by NMS form mail. If text is added to the hidden field, NMS form mail sends the poster to a projecthoneypot.org honey pot instead of sending me the spam. I had been getting about a dozen spams a day before the modification. I haven't gotten any spam since the modifications which I finished about 12 hours ago, and three visitors have been sent to the honey pot.
I basically followed the instructions posted by rwedge at codingforums.com for how to modify NMS form mail (http://codingforums.com/showthread.php?t=113863) - but put a link to the honey pot as a destination for rejected posts. I haven't installed a local honey pot (just asked project honey pot for a quick link), but I'm planning this as a next step. And if the spam gets a lot worse, I'll probably figure out how to block the IP addresses which Project Honey Pot has identified as likely spam sources.
For a while, I was able to block form spam by blocking specific IP addresses, but more recently (last couple of months) it seems as though the offending IP addresses change daily. Also, it used to be that the fields were filled out with strings of random letters. More recently, the background level of random letter spam has been drowned out by the more traditional web link type comment spam.
Anyone else have experiences similar to these? Are there form address lists out there that are circulated by comment spammers (similar to email address lists circulated by email spammers) that would account for these patterns?