13 Apr 2006
Heard about CAPTCHAs lately? They are a kind of Turing Test (the name is short for Completely Automated Public Turing test to tell Computers and Humans Apart).
On many web pages that offer free accounts, spammers try to spread their "message". So these pages try to tell humans apart from spambots. (Of course this is not the full Turing Test, which is based on a real conversation - that difference nicely reflects what has changed since Good Old-Fashioned AI.)
Most current CAPTCHAs are pictures: you are asked to type in the letters that are shown. Many of these CAPTCHAs are not safe anymore, as the arms race proceeds - computers can read images, too.
On the official CAPTCHA site (by Carnegie Mellon) you will find more examples: audio tests, pictures with object recognition, and simple logical tasks.
I didn't like all of them. For example, I failed here. But the real problem of CAPTCHAs is not badly designed tasks. It's the arms race, which is ruled by one simple principle: if you let a computer generate the task, a computer might be able to solve it, and eventually some hacker will come up with a program that actually does. And some go even further: they delegate the task to humans (and pay them with access to porn, so the deal is: give one access, get another access - seems fair :-)).
I've been thinking about this lately and came up with some things that might be needed for a successful (that means: unbreakable) CAPTCHA:
  1. You need to choose a domain where computers are not even close to human performance. Object recognition is worth a try, but it is already a domain where computers are scoring their first successes. Why not try reading people's intentions? Present situations with several participants/objects involved and let the user guess what person A is about to do. This task - well designed - can be solved by first-graders, but not within ten years by a computer: reading intentions is out of reach.
  2. The fact that everything encoded computationally can, in theory, be decoded computationally needs to be taken care of. The ingredients of the task need to be combined at runtime. That is a combinatorial issue, and cryptography has thought it through: a 128-bit key is theoretically computable, but in practice that takes far too long. A situation like the one described in point 1 is a good model for that...
  3. There remains the problem that spammers might use humans to solve those tasks: they delegate the actual CAPTCHA task to some human who gets paid a little money (I don't know if that is actually cost-effective for them) or who happens to want to see some delicate content that the spammer provides at that moment. What happens here is that "our" machine asks "their" machine to do the task, and "their" machine delegates the task to some human, just like a proxy server. So we need to make that delegation difficult. Maybe that would need some anti-Semantic-Web approach: data that is well-formed, but definitely not machine-readable or delegatable. That is: the bot might transport a bunch of data to the human, but not the sense of the task.
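To illustrate the runtime-combination idea from point 2, here is a minimal sketch (all names, ingredient pools, and sentence templates are invented for illustration, not from any real CAPTCHA system): the task is assembled from independent ingredient pools at request time, so the number of distinct tasks grows multiplicatively and there is no fixed list an attacker could scrape and pre-solve.

```python
import random

# Hypothetical ingredient pools for an intention-reading task.
# Real pools would be much larger and the pairings more carefully designed.
ACTORS = ["Anna", "Ben", "Carla", "David", "Eva"]
OBJECTS = ["a ladder", "a bucket", "a map", "a key", "a rope"]
PLACES = ["the garden", "the cellar", "the roof", "the garage", "the attic"]
INTENTIONS = ["climb up", "fetch water", "find the way",
              "open a door", "tie something down"]

def generate_task(rng: random.Random):
    """Combine ingredients at runtime into a prompt plus answer options."""
    actor = rng.choice(ACTORS)
    obj = rng.choice(OBJECTS)
    place = rng.choice(PLACES)
    answer = rng.choice(INTENTIONS)
    prompt = f"{actor} carries {obj} towards {place}. What is {actor} about to do?"
    # Mix the right answer with three decoys, shuffled per request.
    options = rng.sample([i for i in INTENTIONS if i != answer], 3) + [answer]
    rng.shuffle(options)
    return prompt, options, answer

# The space of distinct tasks is the product of the pool sizes:
space = len(ACTORS) * len(OBJECTS) * len(PLACES) * len(INTENTIONS)
print(space)  # 625 with these toy pools
```

Even these toy pools yield 625 distinct tasks; scaling each pool up (and adding more independent ingredients) pushes the space toward the "practically uncomputable" territory that the 128-bit-key analogy points at.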
All in all, I think this is a field where we can try out what modern AI is about and do something practical at the same time. It's interesting, though, that it is not we who are working on the machine that wants to pass the Turing Test - the spammers are. We are just defining the game (I still prefer this side).

A last point: it should not be forgotten that a lot of people out there cannot judge pictures because they cannot see them (they are blind). Some people have raised this issue, especially the W3C, but I haven't heard a convincing way out of this dilemma.

[Update 17.07.2006] This is also a nice idea: select the three "hot" people out of a bunch of pics from the dating service "hot or not" to prove you're human...
# lastedited 17 Jul 2006