Would it be possible to somehow find out what SpamAssassin thought of the email as a whole, and then process the image hash accordingly?
For example, consider the following mail log snippet:
spamd[16638]: FuzzyOcr: _O 25_hFe1 __I.? ..,.
spamd[16638]: FuzzyOcr: 0iscover the _ew Miracle
spamd[16638]: FuzzyOcr: 5upplement for Satr, _'_ '' '_U
spamd[16638]: FuzzyOcr: * Effective Weight less!. i I
spamd[16638]: FuzzyOcr: ' ~?
spamd[16638]: FuzzyOcr: M'od,o!!R
spamd[16638]: FuzzyOcr: ,_._? _ _ i J _
spamd[16638]: FuzzyOcr:
spamd[16638]: FuzzyOcr: ___m O_p_ __m____
spamd[16638]: FuzzyOcr: <<=end
spamd[16638]: FuzzyOcr: Not enough OCR Hits without space stripping, doing second matching pass...
spamd[16638]: FuzzyOcr: Message is ham, saving...
spamd[16638]: FuzzyOcr: Adding Hash to "/etc/mail/spamassassin/FuzzyOcr/FuzzyOcr.safe.db" with score "0"
spamd[16638]: FuzzyOcr: Digest: <snip>
spamd[16638]: FuzzyOcr: FuzzyOcr ending successfully...
spamd[16638]: FuzzyOcr: Processed in 1.221773 sec.
spamd[16638]: spamd: identified spam (24.4/5.0) for jake:1001 in 4.2 seconds, 29387 bytes
Here, FuzzyOcr fails to tag the message as spam: none of the words found in the message were in its word file. Fair enough.
SpamAssassin, however, scored the message 24.4 against a threshold of 5.0.
Would it be possible to somehow reprocess the image hash and mark it as spam instead of ham?
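
To make the mismatch concrete, here is a rough sketch in Python of how such cases could be spotted from the log after the fact. It only looks at the two line formats shown in the snippet above ("Message is ham, saving..." and "identified spam (score/threshold)") and assumes it is handed the log lines for a single message. The function name `find_disagreement` is made up for illustration; it is not part of FuzzyOcr or SpamAssassin, and it does not touch the hash database itself.

```python
import re

# Matches the spamd verdict line from the log above, e.g.
# "spamd: identified spam (24.4/5.0) for jake:1001 in 4.2 seconds, 29387 bytes"
VERDICT_RE = re.compile(
    r"spamd: identified spam \((-?\d+(?:\.\d+)?)/(-?\d+(?:\.\d+)?)\)"
)

def find_disagreement(log_lines):
    """Return (score, threshold) if FuzzyOcr saved the image hash as ham
    but spamd went on to identify the message as spam; otherwise None.

    Assumption: log_lines holds the lines for one message only.
    """
    saved_as_ham = False
    for line in log_lines:
        if "FuzzyOcr: Message is ham, saving" in line:
            saved_as_ham = True
        match = VERDICT_RE.search(line)
        if match and saved_as_ham:
            return float(match.group(1)), float(match.group(2))
    return None

if __name__ == "__main__":
    sample = [
        "spamd[16638]: FuzzyOcr: Message is ham, saving...",
        "spamd[16638]: spamd: identified spam (24.4/5.0) for jake:1001 "
        "in 4.2 seconds, 29387 bytes",
    ]
    print(find_disagreement(sample))
```

A hit from this check is exactly the situation I would like FuzzyOcr to handle itself: the hash was stored as safe, yet the overall verdict was spam.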