This page is my "blog".
It's just a place to leave some thoughts and things that are going on. Some will be about software, some about humans and some about both. I'll try not to post about the brand of my new toothbrush unless it's really important :-)
10 Jun 2007
For my web CMS "Polypager", I recently put a CAPTCHA mechanism into use (CAPTCHAs help to block comment spam by asking commenters to identify text in pictures, a task at which humans are still much better than computers - I wrote about them in a previous blog post).
This is what it looks like:
It's a cool one, a service by Carnegie Mellon University which lets people identify two words. One is to make sure you are human and the other helps digitize books from libraries: Sometimes the software that scans in those books is not sure what to make of a word so people who want to write a comment can help.
So in this example, typing in "overlooks" may assure my CMS software you're a human while typing in "inquiry" helps to identify this word, which is a problem for the book scan machine of Carnegie Mellon (it could also be the other way round, who knows?).
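To make the two-word trick concrete, here is a rough sketch of how I imagine such a check could work. This is not the actual service's code or API: the function names, the `votes` store and the id "scan-17" are all made up for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical store: all guesses collected so far for each unknown word image.
votes = defaultdict(Counter)


def check_two_words(control_answer, control_guess, unknown_id, unknown_guess):
    """Return True if the known (control) word was typed correctly.

    Only in that case is the guess for the unknown word recorded as a vote.
    """
    is_human = control_guess.strip().lower() == control_answer.lower()
    if is_human:
        votes[unknown_id][unknown_guess.strip().lower()] += 1
    return is_human


def best_reading(unknown_id):
    """Return the most frequent guess for an unknown word, or None if there is none."""
    counter = votes[unknown_id]
    return counter.most_common(1)[0][0] if counter else None


# "overlooks" is the control word, "inquiry" is the word the scanner could not read.
print(check_two_words("overlooks", "overlooks", "scan-17", "inquiry"))
print(best_reading("scan-17"))
```

The commenter never learns which of the two words is the control word, which is why both have to be typed carefully.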
For a year or so, I had tried other tricks to prevent comment spam that required no extra work from the user. But the spammers seem to have recently cleared those hurdles. As hundreds of spam comments rolled in, I had to take some action.
Anyway, though it's a really nice idea and all, I do fear that this secret from postsecret.com is really widespread:
I think the point behind this secret is that simply reading and entering text is too dumb a task - solving captchas should be fun, not work. Even the Hot-or-not approach is more work than fun. It's always the same task. Repetition is never fun for humans. Only machines like it.
Let's crunch some numbers: The reCAPTCHA guys claim that "About 60 million CAPTCHAs are solved by humans around the world every day." Say that each of them takes 5 seconds to solve. That amounts to 300 million seconds, or more than 3,400 days of human time, spent every single day - and it seems that people regard this time as labor!
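For anyone who wants to check the arithmetic, here it is spelled out. The 60 million figure is the quoted claim; the 5 seconds per CAPTCHA is just my guess.

```python
CAPTCHAS_PER_DAY = 60_000_000   # figure quoted above
SECONDS_PER_CAPTCHA = 5         # my assumption, not a measured value
SECONDS_PER_DAY = 24 * 60 * 60

total_seconds = CAPTCHAS_PER_DAY * SECONDS_PER_CAPTCHA
total_days = total_seconds / SECONDS_PER_DAY
total_years = total_days / 365

print(f"{total_seconds:,} seconds per day")
print(f"about {total_days:,.0f} days, or roughly {total_years:.1f} years, every day")
```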
What ideas could we think of that make assuring that you're human more fun - like a little 5-second-game?
Here is one: You have heard of this place called Web 2.0 where thousands of users generate content and the site creators do nothing but provide the platform?
Why not create a CAPTCHA platform like this?
Users submit little riddles, you know, obvious ones that you can solve really quickly. Just a little picture and a one-word solution to it.
Here, I made a quick example (using Inkscape):
I know, it's not beautiful and not funny. But it's different. Everyone will submit different riddles. That's what humans like: "Gee, I wonder what kind of riddle I get this time!".
The fun part here is that they all come from different sources. Each one is different: some are funny, some have a message, others are beautiful ... you get it.
Let's assume it would work in the Web 2.0 manner and thousands or millions of riddle CAPTCHAs come together, with new ones arriving all the time. They would be very, veheery hard to crack using machines. The spammers would have a hard time defining the problem space. They just wouldn't know what to adjust their algorithms to.
I hear you say that people would never submit riddles to a platform in sufficient numbers - but that is the same argument people brought forward when Wikipedia and YouTube started.
So, the riddle CAPTCHA platform already has some Web 2.0 features like user-created (and user-owned) content. We don't need to add social networking, but let's add the Architecture Of Participation:
Riddles could be ranked. Each commenter is asked how (s)he liked the riddle (this requires just one click and is not mandatory). This would create the popularity contest that Web 2.0 people like so much. However, the popularity of a riddle would not be reflected in the number of times it gets used. That would make things too easy for the spammers.
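A minimal sketch of how ranking and selection could be kept apart, assuming a simple in-memory pool of riddles. The `Riddle` class, its fields and the file names are invented for this example: ratings only feed the ranking, while the riddle shown to a commenter is picked uniformly at random.

```python
import random
from dataclasses import dataclass


@dataclass
class Riddle:
    image_url: str      # hypothetical location of the riddle picture
    answer: str         # the one-word solution
    likes: int = 0
    dislikes: int = 0


def pick_riddle(pool, rng=random):
    """Pick uniformly at random; ratings deliberately play no role here."""
    return rng.choice(pool)


def check_answer(riddle, guess):
    """Compare the guess with the solution, ignoring case and stray spaces."""
    return guess.strip().lower() == riddle.answer.lower()


def rate(riddle, liked):
    """Record the optional one-click rating."""
    if liked:
        riddle.likes += 1
    else:
        riddle.dislikes += 1


def ranking(pool):
    """The popularity contest: best-liked riddles first."""
    return sorted(pool, key=lambda r: r.likes - r.dislikes, reverse=True)


pool = [Riddle("sun.png", "sun"), Riddle("house.png", "house")]
rate(pool[1], liked=True)
print([r.answer for r in ranking(pool)])
print(check_answer(pick_riddle(pool), "sun"))
```

Because `pick_riddle` ignores the votes, a spammer gains nothing by learning only the most popular riddles.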
Alright, that's how far I'm pushing this idea forward for now.
If anyone sees reasons why this is a bad idea, please comment.
lastedited 11 Jun 2007
I still like the idea, but I have another one.
Machines can't think very well yet, right?
Humans don't like to perform brainless tasks, especially if they take long, right?
Most humans read pretty fast and do not mind pressing "submit" buttons, right?
Machines only learn when humans teach them, so using the same type of captcha (distorted digits) all over is not too smart, is it?
Here is the idea: Have six to ten (or even 100) "submit" buttons. Only one works, but it is a different one each time. Above the buttons you get a message (or a distorted image) saying: "To submit, please press the third button from the left in the second row from the top".
Less accessible but similar: Buttons have different colours or sizes or texts and then the message could read "press green button" or even "press the smallest green button in the second row from the bottom that does not have an x in its text".
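The positional variant (one working button in a grid) could be sketched roughly like this. It is only an illustration under my own assumptions: the function names and the two-by-five grid are made up, and a real site would have to keep the expected position on the server, for example in the session.

```python
import random

ORDINALS = ["first", "second", "third", "fourth", "fifth"]


def make_challenge(rows=2, cols=5, rng=random):
    """Choose one working button and build the instruction for it.

    Returns the expected (row, col) position, zero-based, and the message.
    """
    row = rng.randrange(rows)
    col = rng.randrange(cols)
    message = (
        f"To submit, please press the {ORDINALS[col]} button from the left "
        f"in the {ORDINALS[row]} row from the top"
    )
    return (row, col), message


def is_valid_submit(expected, pressed):
    """Accept the form only if the pressed button is the working one."""
    return pressed == expected


expected, message = make_challenge()
print(message)
print(is_valid_submit(expected, expected))
```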
Would that work?
Sentences like "To submit, please press the third button from the left in the second row from the top" are quite understandable for machines. A world of colored buttons reminds me of SHRDLU (http://en.wikipedia.org/wiki/SHRDLU). Note that spambots can already identify colours and positions of web page elements by inspecting the CSS.
If sentences are generated by machines, other machines (like spambots) will be able to understand them. It's naturally uttered speech that is (as yet) too rich and unpredictable.
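To show how little it takes, here is a sketch of a "bot" that reads the example sentence quoted above. It assumes the instruction always follows that one template; the pattern and the names are mine.

```python
import re

ORDINAL_TO_INDEX = {"first": 0, "second": 1, "third": 2, "fourth": 3, "fifth": 4}

# Matches the fixed template of the generated instruction.
PATTERN = re.compile(
    r"the (\w+) button from the left in the (\w+) row from the top"
)


def solve(message):
    """Return the zero-based (row, col) of the working button, or None."""
    match = PATTERN.search(message.lower())
    if match is None:
        return None
    col_word, row_word = match.groups()
    if col_word not in ORDINAL_TO_INDEX or row_word not in ORDINAL_TO_INDEX:
        return None
    return ORDINAL_TO_INDEX[row_word], ORDINAL_TO_INDEX[col_word]


sentence = ("To submit, please press the third button from the left "
            "in the second row from the top")
print(solve(sentence))
```

One regular expression per template is enough, so the effort for the spammer grows only with the number of templates, not with the number of buttons.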
I wouldn't have fun with 20 submit buttons. A machine would :-)
Some really interesting ideas. How about this as another. Present three paragraphs, one is a joke, and funny, the other two are not. The user has to select the funny one! Easy, and fun. Impossible for a computer?
At this point: yes, impossible. Machines don't know funny.
That could very well be one of those little games I was talking about, and jokes are a good example of how people could define their own ideas, making the problem space really, really huge.
Just one thing: The computer has a 1/3 chance of succeeding by submitting random answers (1, 2 or 3). Having to try three times as often is no big deal for bots.
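Put into numbers (a small sketch, assuming the bot guesses independently and uniformly at random on every attempt):

```python
def success_probability(options, attempts):
    """Chance that at least one of `attempts` random guesses is right."""
    return 1 - (1 - 1 / options) ** attempts


def expected_attempts(options):
    """Average number of random guesses until the first hit (geometric mean)."""
    return float(options)


for attempts in (1, 3, 10):
    print(attempts, round(success_probability(3, attempts), 3))
print(expected_attempts(3))
```

With three options, ten blind guesses already succeed more than 98% of the time, so the number of choices would have to be much larger to slow a bot down.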