Ticket #500 (closed defect: worksforme)
tesseract 2.0.3 wants the source image filename to have a .tif extension
| Reported by: | giampaolo@… | Owned by: | decoder |
|---|---|---|---|
| Priority: | major | Milestone: | |
| Component: | Image Analysis | Version: | 3.5.1 |
| Keywords: | tesseract ABORT maketiff | Cc: | |
Description
As of v.2.0.3 of the tesseract package, the source image filename must end with a .tif extension, while FuzzyOcr ends it with .out.
This causes the following error messages:
Tesseract Open Source OCR Engine
name_to_image_type:Error:Unrecognized image type:/tmp/XXX/prep.maketiff.out
IMAGE::read_header:Error:Can't read this image type:/tmp/XXX/prep.maketiff.out
/usr/bin/tesseract:Error:Read of file failed:/tmp/XXX/prep.maketiff.out
Signal_exit 31 ABORT. LocCode: 3 AbortCode: 3
A "dirty patch" by Arjan Opmeer is available:
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=481383
The options seem to be either to fix FuzzyOcr so that it produces image filenames with the correct extension (at least in the TIFF case), or to work with the tesseract developers to have the .tif extension requirement removed.
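
To illustrate the first option, here is a minimal sketch in Python of staging the intermediate file under a .tif name before tesseract is called. It is not the FuzzyOcr code (FuzzyOcr is a Perl plugin) and it is not Arjan Opmeer's patch. The helper names `tif_name`, `stage_as_tif` and `tesseract_command` are hypothetical, and the sketch assumes the usual `tesseract <image> <outputbase>` command-line form.

```python
import os
import shutil
import tempfile


def tif_name(path):
    """Return path unchanged if it already ends in .tif, else append .tif."""
    if path.lower().endswith(".tif"):
        return path
    return path + ".tif"


def stage_as_tif(path):
    """Copy the image to a sibling file with a .tif extension.

    Returns the path that should be handed to tesseract. The original
    file is left in place, so other scanners that expect the .out name
    keep working.
    """
    target = tif_name(path)
    if target != path:
        shutil.copyfile(path, target)
    return target


def tesseract_command(image_path, output_base):
    """Build the argument list for a tesseract call (hypothetical helper).

    Assumes the command-line form: tesseract <image> <outputbase>
    """
    return ["tesseract", stage_as_tif(image_path), output_base]
```

With this in place, a file such as /tmp/XXX/prep.maketiff.out would be passed to tesseract as /tmp/XXX/prep.maketiff.out.tif. Copying rather than renaming is a deliberate choice here, on the assumption that other parts of the pipeline may still read the .out file.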
