Linux Today: Linux News On Internet Time.

Google Code Blog: Announcing Tesseract OCR

Sep 05, 2006, 15:45
By Luc Vincent

"We wanted to let you all know that a few months ago we quietly released--or actually re-released--an Optical Character Recognition (OCR) engine into open source. You might wonder why Google is interested in OCR? In a nutshell, we are all about making information available to users, and when this information is in a paper document, OCR is the process by which we can convert the pages of this document into text that can then be used for indexing.

"This particular OCR engine, called Tesseract, was in fact not originally developed at Google! It was developed at Hewlett Packard Laboratories between 1985 and 1995..."
