Captcha


Back when PPP started using CAPTCHA, I clicked around to find out what reCAPTCHA, the system they’re using, is all about. I was really intrigued by what I found, and I’ve been meaning to discuss it for a while.

Roughly 60 million CAPTCHAs are solved by users daily. reCAPTCHA puts that effort to an actual purpose: ‘reading’ books.

To archive human knowledge and to make information more accessible to the world, multiple projects are currently digitizing physical books that were written before the computer age. The book pages are being photographically scanned, and then, to make them searchable, transformed into text using “Optical Character Recognition” (OCR). The transformation into text is useful because scanning a book produces images, which are difficult to store on small devices, expensive to download, and cannot be searched. The problem is that OCR is not perfect.


So how does this work? reCAPTCHA takes one of the words that OCR couldn’t recognize and pairs it with a word whose text is already known. If the reader types the known word correctly, the system assumes their answer for the unknown word is probably correct too. It then shows the unknown word to a number of other users, and only treats the word as solved once their answers agree.
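For the curious, here is a rough Python sketch of that pairing idea. It is only my guess at the logic, not reCAPTCHA’s actual code: the class name, the method name, and the agreement threshold of 3 are all made up for illustration, and I’m assuming answers are compared case-insensitively.

```python
from collections import Counter, defaultdict

# Hypothetical threshold: how many matching answers are needed before an
# unknown word counts as solved. The real number is not given in the post.
REQUIRED_AGREEMENT = 3


class PairedWordChecker:
    """Toy model of pairing a known control word with an unknown word."""

    def __init__(self, required_agreement=REQUIRED_AGREEMENT):
        self.required_agreement = required_agreement
        self.votes = defaultdict(Counter)  # unknown word id -> answer counts
        self.solved = {}                   # unknown word id -> accepted text

    def submit(self, known_text, known_answer, unknown_id, unknown_answer):
        """Return True if the user passes the challenge."""
        # The user is judged only on the word whose text is already known.
        if known_answer.strip().lower() != known_text.lower():
            return False  # failed the control word, so ignore the other answer

        # The control word was right, so count the answer for the unknown word.
        guess = unknown_answer.strip().lower()
        if guess and unknown_id not in self.solved:
            self.votes[unknown_id][guess] += 1
            if self.votes[unknown_id][guess] >= self.required_agreement:
                self.solved[unknown_id] = guess
        return True
```

In this sketch, a user who mistypes the known word fails and their answer for the unknown word is thrown away, while an unknown word lands in `solved` only after enough users who passed have typed the same thing.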

I just think it’s really awesome that such a small, ordinary daily task has been turned into something really useful.

Source

One Response to “Captcha”

  1. Mr. Fabulous Says:

    Very cool. I had no idea.

    I learn so much when I come here!
