Linux Today: Linux News On Internet Time.







Tesseract-ocr: convert scanned images into editable documents on Linux

Apr 25, 2011, 19:07

[ Thanks to linuxaria for this link. ]

"In other words, using the program Tesseract-ocr (which uses this technology), if we take a piece of newspaper and scan it with our scanner, we get an image file (jpeg, tiff, etc.) from which we can extract a text document and save it as a normal txt file that we can change according to our convenience or purpose.

"Hoping to make something useful, I tried to come up with a procedure as simple and non-invasive as possible, drawing on some material from the web, to enable everyone interested in the subject to do with Ubuntu, or Linux in general, what still keeps them tied to Windows.

"In this guide, I've used Ubuntu 10.10, and in addition to Tesseract-ocr and gImageReader I've also installed the program Xsane, which I will use to scan documents."
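
The scan-to-text step described in the excerpt can be sketched in a few lines of Python. This is only an illustration, not the procedure from the linked guide: it assumes the `tesseract` command-line tool is installed and on the PATH, and the helper names `build_tesseract_command` and `ocr_image`, as well as the file names, are hypothetical. The only Tesseract behavior relied on is the basic invocation `tesseract <image> <output base> -l <language>`, which writes the recognized text to `<output base>.txt`.

```python
import shutil
import subprocess
from pathlib import Path


def build_tesseract_command(image_path, output_base, lang="eng"):
    """Return the argument list for a basic Tesseract run.

    Hypothetical helper: it only assembles the command
    `tesseract <image> <output base> -l <lang>`.
    """
    return ["tesseract", str(image_path), str(output_base), "-l", lang]


def ocr_image(image_path, output_base, lang="eng"):
    """Run Tesseract on a scanned image and return the recognized text.

    Assumes the `tesseract` executable is installed; Tesseract itself
    appends `.txt` to the output base name.
    """
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    # check=True raises CalledProcessError if Tesseract exits with an error.
    subprocess.run(build_tesseract_command(image_path, output_base, lang), check=True)
    return Path(str(output_base) + ".txt").read_text(encoding="utf-8")
```

With a scan saved from Xsane as, say, `scan.tif` (a placeholder name), calling `ocr_image("scan.tif", "scan_text")` would leave an editable `scan_text.txt` next to it and return its contents as a string.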

Complete Story
