Ticket #416 (closed defect: wontfix)

Opened 11 months ago

Last modified 11 months ago

yahoo images scored as wrong extension

Reported by: AnonymousDog
Assigned to: decoder
Priority: major
Milestone: Development Release Version 3.5
Component: Image Analysis
Version: 3.5.1
Keywords: yahoo image wrong extension
Cc:

Description

"Defect" seems harsh: I think Yahoo has changed the way it handles inline images. FOCR finds the images, but they are referenced as <img=foo src="cid:2213604777000000@web33509.mail.mud.yahoo.com"...>, so it identifies them as having the wrong extension, ".com". Can we eliminate this FP? Attached: message source and FOCR log snippet.
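To illustrate the suspected cause: the sketch below is not FuzzyOcr's actual code (the plugin is written in Perl, and its real extension logic is not shown in this ticket). It only demonstrates that taking the text after the last dot of a cid: reference yields ".com". The helper names naive_extension and looks_like_content_id are hypothetical.

```python
import os


def naive_extension(name: str) -> str:
    """Return the lowercased suffix after the last dot, including the dot.

    Hypothetical helper: stands in for any check that treats the
    reference string as if it were a file name.
    """
    return os.path.splitext(name)[1].lower()


def looks_like_content_id(src: str) -> bool:
    """True if the src attribute is a cid: reference rather than a file name."""
    return src.lower().startswith("cid:")


src = "cid:2213604777000000@web33509.mail.mud.yahoo.com"

# The "extension" is really the tail of the Content-ID's host part.
print(naive_extension(src))        # .com
print(looks_like_content_id(src))  # True
```

A check like looks_like_content_id is one way such references could be told apart from real file names before comparing extensions; whether that fits FOCR's design is for the maintainer to say.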

Attachments

FOCRlog.txt (1.0 kB) - added by anonymous on 09.10.2007 17:47:30.
messagesrc.txt (6.0 kB) - added by Anonymous Dog on 09.10.2007 17:48:12.
Message Source

Change History

09.10.2007 17:47:30 changed by anonymous

  • attachment FOCRlog.txt added.

09.10.2007 17:48:12 changed by Anonymous Dog

  • attachment messagesrc.txt added.

Message Source

09.10.2007 17:50:58 changed by Anonymous Dog

Also, it's scoring FUZZY_OCR_WRONG_EXTENSION once for every image in the email, so the rule hits multiple times per message. This seems like overkill; how can we configure it not to score multiple hits?

10.10.2007 22:15:21 changed by decoder

  • status changed from new to closed.
  • resolution set to wontfix.

Hi,

Currently, there is no way to exclude these FPs. What you can do is ignore wrong extensions in general by setting the score (focr_wrongext_score) to 0.0 (or something acceptably low) in the config file. There is also no way to score this once for all images, because the criterion is evaluated per image. However, lowering the score should be enough.

Best regards,

Chris
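
For reference, the workaround described above would be a one-line change in the config file. This is a sketch only: it assumes the plugin's config file is named FuzzyOcr.cf and that the option takes a plain numeric value; the option name itself is the one given in the reply.

```
# FuzzyOcr.cf (file name assumed)
# Disable the wrong-extension score entirely. A small positive value
# can be used instead of 0.0 to keep a weak signal.
focr_wrongext_score 0.0
```

Because the score is applied per image, a low non-zero value still adds up on messages with many inline images.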

10.10.2007 23:50:32 changed by Anonymous Dog

Thanks for your response, Chris.

