Ticket #500 (closed defect: worksforme)

Opened 2 years ago

Last modified 13 months ago

tesseract 2.0.3 wants the source image filename to have a .tif extension

Reported by: giampaolo@…
Owned by: decoder
Priority: major
Milestone:
Component: Image Analysis
Version: 3.5.1
Keywords: tesseract ABORT maketiff
Cc:

Description

As of v2.0.3 of the tesseract package, the source image filename must end with a .tif extension, whereas FuzzyOcr ends it with .out.

This causes the following error messages:

Tesseract Open Source OCR Engine
name_to_image_type:Error:Unrecognized image type:/tmp/XXX/prep.maketiff.out
IMAGE::read_header:Error:Can't read this image type:/tmp/XXX/prep.maketiff.out
/usr/bin/tesseract:Error:Read of file failed:/tmp/XXX/prep.maketiff.out
Signal_exit 31 ABORT. LocCode: 3 AbortCode: 3

A "dirty patch" by Arjan Opmeer is available:

 http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=481383

The options seem to be either to fix FuzzyOcr so that it produces image filenames with the correct extension (at least in the TIFF case), or to work with the tesseract developers to have the .tif extension requirement removed.
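
As an illustration of the first option, here is a minimal sketch of a wrapper that hands tesseract a copy of the image under a .tif name. It is not the patch linked above and not FuzzyOcr code (FuzzyOcr itself is written in Perl). The function names with_tif_extension and run_tesseract are hypothetical, and the sketch assumes the usual "tesseract <image> <outputbase>" invocation, which writes its text to <outputbase>.txt.

```python
import os
import shutil
import subprocess
import tempfile


def with_tif_extension(image_path):
    """Return a path ending in .tif that holds the same image data.

    A path that already ends in .tif is returned unchanged; otherwise the
    file is copied to a new temporary file whose name ends in .tif.
    (Hypothetical helper, for illustration only.)
    """
    if image_path.lower().endswith(".tif"):
        return image_path
    fd, tif_path = tempfile.mkstemp(suffix=".tif")
    os.close(fd)
    shutil.copyfile(image_path, tif_path)
    return tif_path


def run_tesseract(image_path, output_base, tesseract_bin="tesseract"):
    """Run tesseract on image_path via a .tif-named copy if needed.

    Assumes tesseract is invoked as: tesseract <image> <outputbase>
    and writes the recognized text to <outputbase>.txt.
    """
    tif_path = with_tif_extension(image_path)
    try:
        subprocess.run([tesseract_bin, tif_path, output_base], check=True)
    finally:
        # Remove the temporary copy, but never the caller's original file.
        if tif_path != image_path:
            os.remove(tif_path)
    return output_base + ".txt"
```

With this sketch, a call such as run_tesseract("/tmp/XXX/prep.maketiff.out", "/tmp/XXX/ocr") would pass tesseract a temporary .tif copy instead of the .out file. Copying costs some extra I/O; a symlink or simply renaming the intermediate file would avoid that, at the price of touching the original path.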

Change History

Changed 17 months ago by Dan

Thank you!

I've been trying to figure out for 2½ days now what was wrong with my TIFF format, and then I found your bug report showing me it's simply a file-extension issue. Tesseract should at LEAST be more specific in its error code.

Regards

 Dan

Changed 17 months ago by decoder

  • status changed from new to closed
  • resolution set to worksforme