Yes, me musing again. I suppose I’m wondering if I can dump XP altogether. OCR has been one stumbling block, as I need that fairly often.
The solution appears to be Tesseract, and here’s a tutorial on it. It seems pretty torturous, but I can’t say until I try, of course.
Optical Character Recognition With Tesseract OCR On Ubuntu 7.04 on HowtoForge
It says:
"To get the best results from tesseract, you have to optimize the images."
I haven’t tried it yet, either, but Phatch, a batch photo processor, might turn out to be very useful if one were scanning in a novel, for example, and had to optimize 250 images or so. Keep an eye on Phatch - it looks great.
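
For the OCR half of a job like that, here is a rough, untested-in-anger sketch of how one might run tesseract over a whole folder of already-optimized scans. It assumes the tesseract command is installed and on the PATH, and that the pages are saved as TIFF files. The function names build_ocr_commands and run_ocr are made up for illustration, and it relies only on the basic "tesseract image outputbase" form of the command.

```python
import subprocess
from pathlib import Path


def build_ocr_commands(image_dir, pattern="*.tif"):
    """Return one tesseract command per matching image, sorted by file name.

    Hypothetical helper: it only builds the command lists, it runs nothing.
    """
    commands = []
    for image in sorted(Path(image_dir).glob(pattern)):
        # tesseract adds ".txt" to the output base on its own,
        # so page1.tif ends up as page1.txt next to the image.
        output_base = image.with_suffix("")
        commands.append(["tesseract", str(image), str(output_base)])
    return commands


def run_ocr(image_dir, pattern="*.tif"):
    """Run tesseract on every matching image; stop at the first failure."""
    for command in build_ocr_commands(image_dir, pattern):
        subprocess.run(command, check=True)
```

Calling run_ocr("scans") would then work through every .tif file in a folder named scans (also just an example name), leaving a text file beside each page.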
