Ticket #44 (new task)

Opened 2 years ago

Last modified 1 year ago

fuzzyocr : 3.4.2 : gocr : 0.43 : attached sample images not getting scanned

Reported by: indiannic
Assigned to: decoder
Priority: major
Milestone: Development Release Version 3.4
Component: Image Analysis
Version: 3.4
Keywords:
Cc:

Description

Hi,

The attached images are not being detected.

How can I tweak the system to detect them?

Rajesh

Attachments

trumpeter.gif (10.6 kB) - added by indiannic on 02.01.2007 04:05:57.
unseemly.gif (17.2 kB) - added by indiannic on 02.01.2007 04:08:08.
transmission.gif (15.7 kB) - added by jason@peaceworks.ca on 03.01.2007 17:00:12.
added another obfuscated image
spam1.gif (17.7 kB) - added by indiannic on 05.01.2007 14:45:34.
not detected
spam2.gif (16.4 kB) - added by indiannic on 05.01.2007 14:45:59.
not detected
mistreat.gif (18.4 kB) - added by thalamus on 10.01.2007 16:34:46.
similar image that isn't scanned

Change History

02.01.2007 04:05:57 changed by indiannic

  • attachment trumpeter.gif added.

02.01.2007 04:08:08 changed by indiannic

  • attachment unseemly.gif added.

03.01.2007 17:00:12 changed by jason@peaceworks.ca

  • attachment transmission.gif added.

added another obfuscated image

05.01.2007 14:36:54 changed by indiannic

hi

We use fuzzyocr 3.4.2 and gocr 0.43.

The attached file transmission.gif, added by jason@peaceworks.ca, is scanned correctly. The SpamAssassin report is below:

9.0 FUZZY_OCR BODY: Mail contains an image with common spam text inside Words found: "otc" in 1 lines "bullish" in 1 lines "target" in 1 lines "drug" in 1 lines "portfolio" in 1 lines "company" in 1 lines "trade" in 1 lines (7 word occurrences found)

rajesh mahadevan

05.01.2007 14:45:34 changed by indiannic

  • attachment spam1.gif added.

not detected

05.01.2007 14:45:59 changed by indiannic

  • attachment spam2.gif added.

not detected

10.01.2007 16:34:46 changed by thalamus

  • attachment mistreat.gif added.

similar image that isn't scanned

10.01.2007 16:36:54 changed by thalamus

This image wasn't detected at all on our installation. Versions:

p5-FuzzyOcr-2.3.b_2,1 gocr-0.43 p5-Mail-SpamAssassin-3.1.7_1

(running on FreeBSD)

01.02.2007 11:15:20 changed by simcop2387

I'm not sure how easy this is to add (or whether it has been done already; I'm running an old version), but I've done some experimenting with similar images. What seems to work is to desaturate the image and then increase its contrast (just like with many captchas). I've got a sample convert command that does just that and lets gocr read quite a bit of the text afterwards:

convert -modulate 100,0 -contrast -contrast -contrast -contrast -contrast -contrast -contrast -contrast -contrast -contrast -contrast -contrast -contrast -contrast -monochrome INPUT.GIF OUTPUT.GIF

(That's a lot of -contrast flags. Does anyone know how to specify multiple ones in a shorter line?)

I convert the image to monochrome because I found that this typically leaves fewer artifacts and helps gocr a lot.
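
On the question of shortening the line: one possible approach, sketched here assuming a POSIX shell, is to generate the repeated flag with a small helper instead of typing it out. `contrast_flags` is a hypothetical function name chosen for illustration; it is not part of ImageMagick or fuzzyocr, and the count passed to it is only a tunable starting point.

```shell
# Hypothetical helper: print "-contrast" repeated N times, separated by spaces.
contrast_flags() {
  n=$1
  flags=""
  i=0
  while [ "$i" -lt "$n" ]; do
    flags="$flags -contrast"
    i=$((i + 1))
  done
  # Drop the leading space before printing.
  printf '%s\n' "${flags# }"
}

# Usage (requires ImageMagick; same argument order as the command above).
# The substitution is left unquoted on purpose so that each flag becomes
# its own argument:
#
#   N=14   # number of contrast passes; tune as needed
#   convert -modulate 100,0 $(contrast_flags "$N") -monochrome INPUT.GIF OUTPUT.GIF
```

This keeps the processing identical to the hand-written command and only changes how the argument list is built, so the number of contrast passes can be adjusted in one place.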

11.04.2007 15:55:03 changed by Lucky

Guys, I'm having the same problem here. It looks like these spams are bypassing my fuzzyocr. Is there any way to tweak fuzzyocr to scan all these mails?

11.04.2007 15:57:02 changed by lucky.tladi@gmail.com

Howzit guys, I'm struggling with the very same problem. I have gocr, ocrad and fuzzyocr, but all the spams seem to be bypassing my setup. Is there any way I can tweak fuzzyocr to scan all these spams with embedded images?

Any help would be appreciated.

Kindest Regards, Lucky

07.07.2007 03:26:35 changed by anonymous

15.08.2007 02:25:50 changed by anonymous

