Linux Today: Linux News On Internet Time.

Google's Tesseract OCR Engine is a Quantum Leap Forward

Sep 29, 2006, 12:00
By Nathan Willis

"The open source optical character recognition (OCR) landscape got dramatically better recently when Google released the Tesseract OCR engine as open source software.

"The Tesseract code was written at Hewlett-Packard in the 1980s and '90s. In 1995, it was one of the top-tier performers at UNLV's OCR competition, but when HP withdrew from the OCR software marketplace, the code languished. Then in 2005, HP handed off the code to UNLV's Information Science Research Institute (ISRI), an academic center doing ongoing research into OCR and related topics. ISRI discovered that original Tesseract developer Ray Smith was now an employee at Google, and asked the search engine giant if it was interested in the code. Google spent a few months updating the code to compile on modern operating systems, and released it on"

