Have you ever wondered why CAPTCHA codes are so hard to enter? Cheaters can use bots to enter sweepstakes for them or vote for their contest entries, exploiting code to enter more often than the rules allow. But don’t CAPTCHA codes prevent bots from being used on sweepstakes forms?
Well, they’re supposed to, but it’s a game of cat and mouse; cheaters are always trying to crack CAPTCHAs, and companies are trying to strengthen their security to make them harder to get around (while still letting regular people enter).
Understanding the methods that spammers use to circumvent CAPTCHA sheds light on why those CAPTCHA codes are getting harder to enter.
1. Avoiding CAPTCHA with OCR
OCR, which stands for Optical Character Recognition, is a way for computers to identify text from images. If you want to scan a document into your computer and edit it like any of your electronic documents, you’ll scan the image into the computer and then use OCR software to convert the image into text.
If you have a nice, clear text CAPTCHA, cheaters can use OCR software to break the code.
This is why so many CAPTCHA codes are blurry, have wavy lines behind them, turn the letters sideways, or otherwise make the text hard to read.
If you’ve ever scanned documents, you’ll have noticed that while many words come through without problems, any smears or smudges on the paper, or anything else that makes the text a little unclear, will cause the OCR software to make errors and confuse the words.
When CAPTCHA codes are hard to read, it increases the chance that cheaters’ OCR software won’t be able to break the code.
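To see why a few smudges are enough to throw OCR off, here is a toy sketch in Python. It is not how real OCR software works; it simply compares a tiny letter picture against three made-up templates and picks the closest one. The 3x5 font and the names `TEMPLATES`, `pixel_distance` and `recognize` are all invented for this illustration.

```python
# Toy 3x5 letter templates: '#' is ink, '.' is background (a made-up font).
TEMPLATES = {
    "O": ("###", "#.#", "#.#", "#.#", "###"),
    "C": ("###", "#..", "#..", "#..", "###"),
    "L": ("#..", "#..", "#..", "#..", "###"),
}


def pixel_distance(a, b):
    """Count the pixels that differ between two equally sized glyphs."""
    return sum(
        pa != pb
        for row_a, row_b in zip(a, b)
        for pa, pb in zip(row_a, row_b)
    )


def recognize(glyph):
    """Return the template letter whose pixels are closest to the glyph."""
    return min(TEMPLATES, key=lambda letter: pixel_distance(glyph, TEMPLATES[letter]))


clean_c = TEMPLATES["C"]
# The same C with two stray "smudge" pixels on its open side.
smudged_c = ("###", "#.#", "#.#", "#..", "###")

print(recognize(clean_c))    # prints C
print(recognize(smudged_c))  # prints O
```

Two stray pixels are enough to turn this toy C into an O. Real OCR is far more robust than this, but the sketch shows why added noise makes a machine's guess less reliable.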
2. Displaying CAPTCHA Codes on Other Websites
CAPTCHAs are designed to be easy for humans to solve, but very hard for computers to enter automatically. But that doesn’t help if it’s humans who are unwittingly solving the CAPTCHAs.
Cheaters and spammers have gotten around CAPTCHAs by passing the code to another website, where people enter the code to get access to some other feature. For example, the people think they’re solving a puzzle or typing a code to get access to an (often pornographic) picture.
This is one reason why some CAPTCHAs expire so quickly. If a new CAPTCHA needs to be entered every few seconds, it reduces the odds that cheaters can trick someone into typing the response quickly enough.
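As a rough sketch of how a fast-expiring CAPTCHA might be checked on the server, the Python below rejects any answer that arrives too late, even if it is correct. The 30-second lifetime and the names `issue_challenge` and `check_response` are assumptions made up for this example, not taken from any real CAPTCHA product.

```python
import time

# Hypothetical lifetime; a real site would tune this value.
CAPTCHA_TTL_SECONDS = 30


def issue_challenge(answer, now=None):
    """Record the expected answer together with the time it was issued."""
    issued_at = time.time() if now is None else now
    return {"answer": answer, "issued_at": issued_at}


def check_response(challenge, response, now=None):
    """Accept only a correct answer that arrives before the challenge expires."""
    now = time.time() if now is None else now
    if now - challenge["issued_at"] > CAPTCHA_TTL_SECONDS:
        return False  # too slow: relaying the image to someone else takes time
    return response.strip().lower() == challenge["answer"].lower()


challenge = issue_challenge("x7kq", now=1000.0)
print(check_response(challenge, "X7KQ", now=1010.0))  # answered after 10 seconds
print(check_response(challenge, "x7kq", now=1100.0))  # answered after 100 seconds
```

The `now` parameter exists only so the example can be run with fixed timestamps; in normal use it would be left out and the current time used.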
3. Paying People to Crack CAPTCHAs
Some companies offer services that let cheaters crack CAPTCHAs for $1 or less per crack. They work much like the trick above, but they pass the CAPTCHA codes to people working in sweatshops in third-world countries to solve. A fast-expiring CAPTCHA can also fight this kind of crack.
4. Exploiting Poorly Coded CAPTCHAs
Some CAPTCHAs are not coded correctly, making it possible to guess the expected answer from the site’s code or to have the same CAPTCHA accepted over and over again. Luckily, sweepstakes companies can avoid this problem by using free CAPTCHA services like Google’s reCAPTCHA.
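One way a site could avoid both mistakes is sketched below in Python: the expected answer stays on the server, and each challenge is thrown away after its first attempt so it cannot be replayed. The names `new_challenge` and `verify` and the in-memory `_pending` store are hypothetical, chosen only for this illustration.

```python
import secrets

# Challenge id -> expected answer. Kept on the server only, so the answer
# never appears in the page that is sent to the browser.
_pending = {}


def new_challenge(answer):
    """Store the answer under a random id and return only the id."""
    challenge_id = secrets.token_urlsafe(16)
    _pending[challenge_id] = answer
    return challenge_id


def verify(challenge_id, response):
    """Check a response; every challenge can be attempted exactly once."""
    expected = _pending.pop(challenge_id, None)  # removed on the first attempt
    if expected is None:
        return False  # unknown id, or a challenge that was already used
    return secrets.compare_digest(response.encode(), expected.encode())


challenge_id = new_challenge("w4vy")
print(verify(challenge_id, "w4vy"))  # first use of the challenge
print(verify(challenge_id, "w4vy"))  # replay of the same challenge
```

Because the id is random and the answer is never sent to the browser, there is nothing in the page to guess from, and a replayed id simply fails.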
The courts have found that circumventing CAPTCHA violates the DMCA, making it illegal. You can read more about the issues involved in this Wired article: Is Breaking CAPTCHA a Crime?
As long as there’s profit in circumventing CAPTCHAs, criminals will always look for new ways to crack them, while companies will try new methods to boost security. If you’re having trouble with specific sweepstakes, read How to Solve Tricky CAPTCHAs.
Spammer bots are a problem, yes. But Captchas are a problem, too. They’re a bother, they’re not foolproof and they assume that everyone is guilty until proven innocent. What Captcha really stands for, in other words, is Computers Annoying People with Time-Wasting Challenges That Howl for Alternatives.
I don’t remember ever being truly baffled by a captcha a couple of years ago. In fact, reCAPTCHA was one of the better systems I’d seen. It wasn’t difficult to solve, and it seemed to work when I used it on my own websites.
Fast forward to 2012, and I am trying to log in to my Envato Marketplace account on Graphic River. I haven’t been there in a few months, and recently I’ve been working on changing my passwords to be unique per site. Understandably, I forgot my password.
But I didn’t entirely forget my password: I knew there were three possible passwords across two possible usernames. Rather than going through the entire password reset process, which is a hassle and a last resort, I decided to try to guess. After a couple of failed attempts, I was presented with a reCAPTCHA.
Normally I don’t have an issue with this. After all, I was guessing the password to an account, and I applaud Envato for trying to protect mine. But this time, I couldn’t read the captcha.
While the word “secretary” is perfectly visible, albeit faded, the first word is more of a puzzle. “Onightsl”? “Onighisl”? Are those even words?
It’s important to note the way reCAPTCHA works. Each user (or bot) is presented with two words: a control word, which is already known to Google (who runs reCAPTCHA), and a word that OCR could not recognize. If you get the control word right, it is assumed that you got the other word right as well. So, in reality, you only need to type the control word correctly.
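That grading rule can be sketched in a few lines of Python. This is only an illustration of the idea, not Google’s actual code; the function name `grade_two_word_challenge` and its parameters are made up, and it assumes the server knows which position holds the control word.

```python
def grade_two_word_challenge(typed, control_word, control_position):
    """Grade a two-word answer by checking only the control word.

    Returns (passed, guess_for_unknown_word). The unknown word is never
    checked, because the server has no known answer for it.
    """
    words = typed.split()
    if len(words) != 2:
        return False, None
    if words[control_position].lower() != control_word.lower():
        return False, None
    # The other word is accepted as the user's reading of the unknown word.
    return True, words[1 - control_position]


# If "secretary" is the control word, anything typed for the other word passes.
print(grade_two_word_challenge("anything secretary", "secretary", 1))
# If the unreadable word is the control, a correct "secretary" does not help.
print(grade_two_word_challenge("wrong secretary", "onightsl", 0))
```

This is why guessing the hard word only works when the readable word happens to be the control.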
I decided to just guess the first word and hope “secretary” was the control. It wasn’t.
Now, not only did I not know whether the password I entered was correct, I had to solve another captcha.
Wonderful. This was near impossible to solve, and instead of wasting my time, I hit the refresh button on reCAPTCHA to get a new image.
Seriously, I am now wasting my time. Refresh.
Ok, so this is a little bit better. “Proximity” and… “rsgsrem”? Or was that “rsgmem”? Refresh.
Another cut off word. “and”? Possibly. Refresh.
You can see where this was heading.
Again, and again, and again. The captchas were not only difficult for a computer to read, but impossible for a human.
The problem is, computers are getting better at guessing captchas.
In August of 2010, Chad Houck presented at DEF CON 18 with a system that beat reCAPTCHA’s visual system 10% of the time. Google modified its system prior to Houck’s presentation, but Houck quickly defeated it and described the modified system as “easier” to crack.
The audio captcha system is even worse: in May 2012, Adam, C-P and Jeffball presented at LayerOne (a hacker conference), showing a program that beat Google’s audio system 99.1% of the time.
In our attempt to distinguish humans from bots, we have only proved that bots can be just as human as we are— at least when it comes to solving these captchas.