CAPTCHA that is secure and useful

I stumbled over the site reCAPTCHA today, which offers free CAPTCHAs for everyone. That in itself is not extraordinary, but the CAPTCHA the site offers is quite special: it presents the user with words that OCR software could not recognize. The source is the Internet Archive, which archives thousands of books and makes them available to the public for free. Solving (and offering) the CAPTCHA therefore serves a rather noble cause, and it is far more useful than just annoying the user with a colored, hard-to-read sequence of numbers and letters.

On the technical side it works as follows: the word that could not be read is taken as an image from the book being digitized and is shown together with a random word from reCAPTCHA's database. Both words are rendered similarly, so the user cannot tell them apart. If the user types the database word correctly, reCAPTCHA assumes that the answer to the other word is correct too, and that answer is placed in the book.
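To make the two-word check more concrete, here is a minimal sketch in Python. It is not reCAPTCHA's actual code: the names Challenge, Verifier, control_word and unknown_image_id are made up for illustration, and the sketch assumes answers are compared case-insensitively after trimming whitespace. It only shows the idea that the known word decides whether the user passes, while the answer to the unread word is recorded as a transcription.

```python
from dataclasses import dataclass, field


@dataclass
class Challenge:
    control_word: str      # word from the database, answer already known
    unknown_image_id: str  # hypothetical ID of the scanned word OCR could not read


@dataclass
class Verifier:
    # Recorded answers for unread words, keyed by image ID (illustrative storage).
    guesses: dict = field(default_factory=dict)

    def check(self, challenge: Challenge, control_answer: str, unknown_answer: str) -> bool:
        """Pass the user only if the known word matches; then keep the other answer."""
        # Assumption: comparison ignores case and surrounding whitespace.
        if control_answer.strip().lower() != challenge.control_word.lower():
            return False
        # The known word was right, so the answer to the unread word is trusted too.
        self.guesses.setdefault(challenge.unknown_image_id, []).append(
            unknown_answer.strip().lower()
        )
        return True


verifier = Verifier()
challenge = Challenge(control_word="morning", unknown_image_id="scan-001")
passed = verifier.check(challenge, "Morning", "overlooks")
```

In this sketch a wrong answer to the known word rejects the user and records nothing, so only answers from users who got the known word right end up attached to the scanned word.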
