Goodbye CAPTCHAs, hello Distributed Porn-Powered Processing
There are some things that machines are better at doing than people, and vice versa. Automation is all about the former and CAPTCHAs - those little mangled-text images that you have to type in before you're allowed a free email account - are all about the latter.
The purpose of CAPTCHAs is to foil automated attempts by spammers to harvest tons of free email accounts. The trouble is that, as was identified over a year ago, you can automate circumvention, if you're clever about how you harness and use human processing power. In this case, you set up a site with content that people really want to get. (Porn, or warez, or... you get the idea.) In order for people to get to the content, they have to go through a CAPTCHA test - except that the CAPTCHA is actually grabbed from the web service whose defenses you want to breach. Your eager porn-surfing visitors are doing all the hard work for you.
I'm writing about this now in response to this post by Jon Udell, which discusses some of the pros and cons of CAPTCHAs. The main downside he identifies is that, in order to withstand computational defeat, some CAPTCHAs have become so hard that the average human can't pass them. Similarly, as Matt May points out in this excellent post, CAPTCHAs are an accessibility black hole. While these are notable problems, I think they're pretty minor compared to the CAPTCHA-farming idea I've outlined above, which lowers the CAPTCHA barrier to a trivially-breakable level.
To sum up: CAPTCHAs are a pain to users, they trample all over good accessibility practice and, most importantly, they're useless as a defense against automation. So why the hell are Yahoo et al still using them? Am I wrong in calling CAPTCHA a dead duck? (I have no metrics to back this up, and invite any web techies from large CAPTCHA-using services to contradict me.)
I took note of the CAPTCHA-farming idea when I saw it because it's an ingenious way of harnessing large amounts of brainpower in tiny chunks, for which there are all kinds of applications. Here's an example: instead of making CAPTCHA-style image tests which look like this...

... make ones that look like this...

... and then you can lay off half of your data entry & verification staff. (The above image is an excerpt from a census form on this Lockheed Martin press release, which claims that they have handwriting-recognition up to 85% accuracy. That still leaves a ton of human intervention if you're dealing with 100,000 forms)
Okay, I'm not being entirely serious with that example, but there are industries out there that exist entirely to harness the power of web surfers who've lost their way. Prime example: those websites full of secondary link lists that exist purely to show up in Google results and act as a banner-loaded intermediary before sending them on their way to buy a digital camera, via an affiliate link. Popular Power - the late lamented startup that wanted to sell spare cycles of desktop computers to computationally-hungry customers - was aiming at the wrong resource. Distributed CPU cycles are worthless unless you're SETI or Pixar. Distributed brain cycles... now that's a much more intriguing proposition.
Or, to put it another way...
Tired: Third-world data-processing sweatshops
Wired: Thousands of clueless web surfers + a good aggregation engine

data entry & verification?
So when the user types 'John smith' how do you know they are right?
Just to get this into the Lazyweb: it's a perfect dovetail with the Project Gutenberg Distributed Proofreading Project: at least, I think so.
Have wiggle-room for difficult captchas, use it as part of the cross-checking process, and turn a volunteer process into a massive distributed squinting at scans of 19th-century Bodoni-heavy texts.
>> So when the user types 'John smith' how do you know they are right?
You can't. But you can have a preliminary set of criteria that have to be satisfied which leaves enough wiggle-room.
If you used it with Project Gutenberg's Distributed Proofreading, you'd have an 'initial scan' string for this sort of thing. I presume that you'd have an initial scan from the 'John Smith' handwriting recognition stuff as well. So you use that as your base captcha entry, and allow a certain amount of leeway on what's entered.
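As a rough illustration of the 'leeway' idea, here is a minimal Python sketch that accepts a typed answer if it is close enough to the machine's initial scan. The function names and the 0.8 threshold are made up for this example, and the similarity measure (the standard library's difflib) is just one convenient choice, not something any of the projects mentioned are known to use.

```python
from difflib import SequenceMatcher


def normalise(text: str) -> str:
    """Lower-case the text and collapse runs of whitespace."""
    return " ".join(text.lower().split())


def within_leeway(initial_scan: str, typed: str, threshold: float = 0.8) -> bool:
    """Return True if the typed answer is similar enough to the initial scan.

    The threshold is an arbitrary assumption: 1.0 demands an exact match
    (after normalisation), lower values allow more disagreement.
    """
    ratio = SequenceMatcher(None, normalise(initial_scan), normalise(typed)).ratio()
    return ratio >= threshold


if __name__ == "__main__":
    # The scan and the human agree apart from capitalisation.
    print(within_leeway("John Smith", "John smith"))
    # The human has typed something unrelated, so the entry is rejected.
    print(within_leeway("John Smith", "xyz"))
```

Where to set the threshold is the real design question: too strict and the human can never correct the scanner, too loose and any junk gets through.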
heh, how about drug prescription verification? that could be a possibly life-saving use for this specific example....
>> So when the user types 'John smith' how do you know they are right?
As well as the criteria method that Nick suggests, there's the old-fashioned way used by many big data entry departments: have the data entered twice and compare. If they differ, have it entered a third time.
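The enter-twice-and-compare routine could be sketched in Python along these lines. The get_entry callback is a hypothetical stand-in for "show the image to another person and collect what they type", and returning None when all three entries disagree is my own assumption about what should happen next.

```python
from typing import Callable, Optional


def double_entry(get_entry: Callable[[], str]) -> Optional[str]:
    """Collect two independent entries and accept the value if they match.

    If they differ, collect a third entry as a tie-breaker. If the third
    matches neither, return None so the item can be escalated (assumption).
    """
    first = get_entry()
    second = get_entry()
    if first == second:
        return first
    third = get_entry()
    if third in (first, second):
        return third
    return None


if __name__ == "__main__":
    # Simulated typists: the second one makes a typo, the third breaks the tie.
    typed = iter(["John Smith", "Jon Smith", "John Smith"])
    print(double_entry(lambda: next(typed)))
```

The third entry is only requested when the first two disagree, so the extra cost is paid only on the difficult items.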
GMail only uses CAPTCHAs to prevent brute-force password guessing attacks. This seems to be a valid use.
The technology for this already exists, though their web site is pretty broke right now.
http://www.openmind.org
There are lots of silly online games that have been written to assign keywords to all the images on google, improve handwriting recognition, etc. They are based on multiple disconnected users viewing the same image and entering the same answer (for verification that their answer is correct) and recording the data.
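To make the "same answer from disconnected users" check concrete, here is a small Python sketch. It is not how any particular game works; the consensus function, the min_votes parameter and the rule that ties count as no agreement are all assumptions for illustration.

```python
from collections import Counter
from typing import Iterable, Optional


def consensus(answers: Iterable[str], min_votes: int = 2) -> Optional[str]:
    """Return the answer that enough independent users agree on, else None.

    Answers are compared after trimming whitespace and lower-casing.
    A tie for first place is treated as no consensus (assumption).
    """
    counts = Counter(a.strip().lower() for a in answers)
    ranked = counts.most_common(2)
    if not ranked:
        return None
    answer, votes = ranked[0]
    if votes < min_votes:
        return None
    if len(ranked) > 1 and ranked[1][1] == votes:
        return None
    return answer


if __name__ == "__main__":
    # Two of three users label the image the same way, so "cat" is recorded.
    print(consensus(["Cat", "cat ", "dog"]))
    # No two users agree, so nothing is recorded yet.
    print(consensus(["cat", "dog"]))
```

Raising min_votes trades throughput for accuracy, since each image has to be shown to more people before its label is accepted.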
A scientific application -- I'll try to explain this one quickly. We have tens of thousands of fragments of DNA that we've run on gels, and we want to know what size they are. The size of the DNA can be determined by comparing the position of the DNA band to markers of known sizes. I want to get all of this information into a database, so we don't have to manually look up the size of the fragment in question.
I could imagine scanning in all the images and then making them into captchas, with lots of redundancy for improved accuracy. But first you'd have to get your porn-surfing/game-playing audience to learn how to read gels. Hmm...
CAPTCHAs aren't "useless". Do you think someone is going to go to the extreme effort of building an automated circumvention system, which may take them several days/months/years and might not even work?
Plus, even with this automated system you speak of, they still have to do the extreme work of emailing all these people who are going to do this work for them, and they may get caught spamming while doing so. And why would they even bother emailing people to verify CAPTCHAs if they could just email the people about the offer they are trying to sell, instead? It's double the work for nothing.
The point is: it deters a significant amount of spamming. It's not like the door locks on your house are useless now that you've heard on the news that one person's house got broken into even though it was locked.
You can say it's possible for someone to break into your home if you have door locks and a security system. But isn't it harder for them to break in than if you left the door wide open?
L: Firstly, yes, people *are* going to build automated circumvention systems. If attackers weren't automating already, CAPTCHAs wouldn't have been invented. It's an arms race - the attackers will just automate further.
Secondly, I think you've misunderstood how CAPTCHA farms work - "they still have to do the extreme work of emailing all these people who are going to do this work for them." No, they don't. I'm not sure why you think CAPTCHA farming involves email. It involves websites. Email has very little to do with it.
CAPTCHAs are not useless. In fact, the "automated system" of which you speak seems to trace to a rather famous blog entry by Cory Doctorow of Boing Boing. In the lovely echochamber of the blogosphere, Cory's "[s]omeone told me" about the porn-defeats-CAPTCHAs idea has become gospel. That the techies have accepted this as fact is pathetic and laughable. (I thought some of you were trained in the scientific method?!)
http://www.boingboing.net/2004/01/27/solving_and_creating.html
CAPTCHAs can be done better, e.g. by combining visual CAPTCHAs with an audio alternative. Bottom line: it's about making it harder to abuse a system, and CAPTCHAs do that. There's no reason they can't be used in combination with other techniques as well. The near-religious hatred I see of this technology is a joke.
Firstly, Tom, it traces back several months earlier than Cory's post - back to a newspaper article from 2003, which I've linked to in my post and you seem to have entirely missed. But more importantly, you dismiss it as "pathetic and laughable" without giving any argument about *why* it won't work. As an experienced web techie I can see exactly how it'd work. Now, please stick to actual scientific method and disprove it with something more than bluster about the echo chamber.
Combining visual and audio CAPTCHAs may solve the problem of dealing with partially-sighted users but it still doesn't cope with the overall usability problem. And "near-religious hatred"? What the hell are you talking about?