Ticket #36 (closed defect: fixed)

Opened 4 years ago

Last modified 17 months ago

Example of image spam that gocr only finds gibberish in, even with pnm processing

Reported by: adam@…
Owned by: decoder
Priority: minor
Milestone:
Component: Image Analysis
Version:
Keywords: example-spam
Cc: adam@…

Description

Sorry if this is the wrong way to bring this up, but I couldn't find anything in the FAQ.gz, or Wiki, or IRC channel (or the mailing list, since it's private) that suggests what to do.

Here's an example of an image spam that gocr always interprets as gibberish, even after converting to a 3-color pnm and running "gocr -l 180 -d 2". I'm hoping that by posting this here we can somehow improve the system to detect it well. After all, if thwarting FuzzyOcr is as simple as sending spam that looks like this...

By the way, why is the mailing list private? Is it to keep spammers from reading it? If so, isn't that like a very thin veil of security-by-obscurity?

Attachments

methodical.gif (10.9 KB) - added by adam@… 4 years ago.
Image spam received in Nov, 2006

Change History

Changed 4 years ago by adam@…

Image spam received in Nov, 2006

Changed 4 years ago by decoder

  • status changed from new to closed
  • resolution set to fixed

Hello,

I had a look at the picture. The problem is that it is white text on a dark background. You need either a pnminvert preprocessor or ocrad with the -i parameter (for invert). Such pictures are not new, and the newest version of gocr (0.43) even claims to have a built-in autodetection feature that automatically switches to an inverted mode when required :)

About the mailing list being private: yes, it isn't absolutely secure, but I guess it helps a bit :)

Best regards,

Chris
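
For readers who want to see what the inversion step amounts to, here is a minimal Python sketch of the idea behind a pnminvert-style preprocessor. It is not FuzzyOcr's or gocr's actual code. It assumes the image has already been converted to a binary greyscale PGM (magic number P5) with 8-bit samples and no comment lines in the header. The function names (parse_pgm, looks_light_on_dark, invert_pgm) and the "more than half the pixels are dark" heuristic are illustrative assumptions, not anything taken from those tools.

```python
import re

# Header of a binary PGM: magic, width, height, maxval, then exactly one
# whitespace byte before the pixel data. Assumption: no '#' comment lines.
_HEADER = re.compile(rb"P5\s+(\d+)\s+(\d+)\s+(\d+)\s")


def parse_pgm(data):
    """Split a binary PGM into (width, height, maxval, pixel bytes)."""
    match = _HEADER.match(data)
    if match is None:
        raise ValueError("not a comment-free binary PGM (P5)")
    width, height, maxval = (int(group) for group in match.groups())
    if maxval > 255:
        raise ValueError("only 8-bit samples are handled in this sketch")
    pixels = data[match.end():match.end() + width * height]
    if len(pixels) != width * height:
        raise ValueError("truncated pixel data")
    return width, height, maxval, pixels


def looks_light_on_dark(data):
    """Guess whether the image is light text on a dark background.

    Hypothetical heuristic: text usually covers less area than the
    background, so a mostly dark image is probably light-on-dark.
    """
    _, _, maxval, pixels = parse_pgm(data)
    dark = sum(1 for sample in pixels if sample < maxval / 2)
    return dark > len(pixels) / 2


def invert_pgm(data):
    """Return the image with every sample replaced by maxval - sample."""
    width, height, maxval, pixels = parse_pgm(data)
    # Clamp out-of-range samples so malformed input cannot go negative.
    inverted = bytes(maxval - min(sample, maxval) for sample in pixels)
    return b"P5\n%d %d\n%d\n" % (width, height, maxval) + inverted
```

A preprocessor built on this would call looks_light_on_dark on the converted image and pass the result of invert_pgm to the OCR engine only when it returns True, leaving ordinary dark-on-light images untouched.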
