Google announced the acquisition of reCAPTCHA, a company that specializes in stopping spam bots in their tracks and digitizing books. In a blog post, Google says it plans to use reCAPTCHA's technology to improve the availability and accessibility of all the information on the Internet.
But there’s a twist — the words in many of the CAPTCHAs provided by reCAPTCHA come from scanned archival newspapers and old books. Computers find it hard to recognize these words because the ink and paper have degraded over time, but by typing them in as a CAPTCHA, crowds teach computers to read the scanned text.
In this way, reCAPTCHA’s unique technology improves the process that converts scanned images into plain text, known as Optical Character Recognition (OCR). This technology also powers large-scale text scanning projects like Google Books and Google News Archive Search.
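
As a rough illustration of how typed answers could be turned into text, here is a minimal Python sketch. It assumes that several people are shown the same scanned word and that a reading is accepted once enough of them agree. The function name `resolve_word` and both thresholds are hypothetical and are not taken from reCAPTCHA's actual system.

```python
from collections import Counter
from typing import List, Optional


def resolve_word(transcriptions: List[str],
                 min_votes: int = 3,
                 min_agreement: float = 0.6) -> Optional[str]:
    """Return the agreed reading of a scanned word, or None if undecided.

    Both thresholds are illustrative assumptions, not reCAPTCHA's real values.
    """
    # Normalize answers so that "Morning " and "morning" count as the same vote.
    cleaned = [t.strip().lower() for t in transcriptions if t.strip()]
    if len(cleaned) < min_votes:
        return None  # not enough human answers collected yet

    word, count = Counter(cleaned).most_common(1)[0]
    # Accept the top answer only when a clear majority typed the same thing.
    if count / len(cleaned) >= min_agreement:
        return word
    return None


if __name__ == "__main__":
    # Four people typed what they saw for one degraded word.
    print(resolve_word(["morning", "Morning ", "moming", "morning"]))
```

In this sketch a word with too few answers, or with no clear majority, stays unresolved and would simply be shown to more people.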