a grey geek learning by mistakes

Ubuntu Learner

September 19th, 2007 at 12:16 pm

OCR on Ubuntu

Yes, me musing again. I suppose I’m wondering if I can dump XP altogether. OCR has been one stumbling block, as I need that fairly often.
The solution appears to be Tesseract, and here’s a tutorial on it. It seems pretty torturous, but I can’t say until I try, of course.

Optical Character Recognition With Tesseract OCR On Ubuntu 7.04 on HowtoForge

It says that

To get the best results from tesseract, you have to optimize the images.

I haven’t tried it yet, either, but Phatch might turn out to be very useful if one were scanning in a novel, for example, and had to optimize 250 images or so. Keep an eye on Phatch - it looks great.
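
For a pile of pages like that, the OCR step itself could probably be scripted too. Here is a rough Python sketch, not something from the tutorial. It assumes the optimized scans are already saved as .tif files in one folder, that the tesseract command is installed, and that the usual "tesseract image outputbase -l lang" command form applies. The function names and the folder name are made up for illustration.

```python
import subprocess
from pathlib import Path


def build_ocr_commands(image_dir, lang="eng"):
    """Return one tesseract command per .tif file in image_dir, sorted by name."""
    commands = []
    for image in sorted(Path(image_dir).glob("*.tif")):
        # tesseract adds the .txt extension to the output base on its own,
        # so page1.tif would end up as page1.txt next to the image.
        output_base = image.with_suffix("")
        commands.append(["tesseract", str(image), str(output_base), "-l", lang])
    return commands


def run_ocr(image_dir, lang="eng"):
    """Run tesseract on every page in turn (needs tesseract installed)."""
    for command in build_ocr_commands(image_dir, lang):
        # check=True stops the batch at the first page that fails.
        subprocess.run(command, check=True)
```

Calling run_ocr("scans") on a hypothetical "scans" folder would then work through every page and leave one text file beside each image.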

Tesseract OCR
Phatch

 
