---

Tesseract-ocr: convert scanned images into editable documents on Linux

[ Thanks to linuxaria for this link. ]

“In other words, using the program Tesseract-ocr (which uses
this technology), if we take a piece of newspaper and scan it in
our scanner, we get an image file (jpeg, tiff, etc.) from
which we can extract the text and save it as a
normal txt file that we can change according to our convenience or
purpose.

“Hoping to make a useful thing, I tried to come up with a procedure
as simple and as non-invasive as possible, drawing on some material
on the web, to enable everyone interested in the subject to do with
Ubuntu, or Linux in general, what still keeps them tied to
Windows.

“In this guide, I’ve used Ubuntu 10.10, and in addition to
Tesseract-ocr and gImageReader I’ve also installed the program
Xsane, which I will use to scan documents.”
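
The workflow the excerpt describes (scanned image in, plain text file out) can be sketched with a small Python wrapper around the tesseract command line. This is only an illustration, not the procedure from the linked guide: the helper names and the file names "scan.tif" and "scan" are hypothetical, the "eng" language pack is assumed to be installed, and older Tesseract releases may accept only TIFF input.

```python
import shutil
import subprocess
from pathlib import Path


def build_tesseract_command(image_path, output_base, lang="eng"):
    """Return the argument list for: tesseract IMAGE OUTPUTBASE -l LANG.

    Hypothetical helper; it only assembles the command, it does not run it.
    """
    return ["tesseract", str(image_path), str(output_base), "-l", lang]


def image_to_text(image_path, output_base, lang="eng"):
    """Run Tesseract on a scanned image and return the recognized text.

    Tesseract appends ".txt" to output_base itself, so the result
    is read back from "<output_base>.txt".
    """
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    # check=True raises CalledProcessError if Tesseract exits with an error.
    subprocess.run(build_tesseract_command(image_path, output_base, lang), check=True)
    return Path(f"{output_base}.txt").read_text(encoding="utf-8")
```

With Tesseract installed, calling image_to_text("scan.tif", "scan") would leave the editable text in scan.txt and also return it as a string.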


