Linux Today: Linux News On Internet Time.

Google Code Blog: Announcing Tesseract OCR

Sep 05, 2006, 15:45
(Other stories by Luc Vincent)


"We wanted to let you all know that a few months ago we quietly released--or actually re-released--an Optical Character Recognition (OCR) engine into open source. You might wonder why Google is interested in OCR? In a nutshell, we are all about making information available to users, and when this information is in a paper document, OCR is the process by which we can convert the pages of this document into text that can then be used for indexing.

"This particular OCR engine, called Tesseract, was in fact not originally developed at Google! It was developed at Hewlett Packard Laboratories between 1985 and 1995..."

Complete Story
