root/tags/samples/README

Revision 3, 2.9 kB (checked in by decoder, 2 years ago)

Added current stable and testing release
Added samples
Added patches to external toolchain

These .eml files are sample spam emails for testing your FuzzyOCR installation.

To test, run spamassassin -t < samplefile.eml on each sample :)

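To run the samples from a script instead of by hand, a small wrapper along these lines may help. It is only a sketch: it assumes spamassassin is on your PATH and that the sample files sit in the current directory. The helper names run_sample and fuzzyocr_rules are made up for this example and are not part of FuzzyOCR or SpamAssassin.

```python
import re
import subprocess
from pathlib import Path


def run_sample(eml_path, spamassassin="spamassassin"):
    """Feed one .eml file to `spamassassin -t` on stdin and return its output as text."""
    with Path(eml_path).open("rb") as fh:
        result = subprocess.run(
            [spamassassin, "-t"],  # -t is the test mode used in this README
            stdin=fh,
            stdout=subprocess.PIPE,
        )
    return result.stdout.decode("utf-8", errors="replace")


def fuzzyocr_rules(report):
    """Return the FUZZY_OCR* rule names found in a report, in order of first appearance."""
    seen = []
    for name in re.findall(r"\bFUZZY_OCR\w*", report):
        if name not in seen:
            seen.append(name)
    return seen
```

With the plugin working, fuzzyocr_rules(run_sample("ocr-gif.eml")) should list the rule names shown for that sample in this README. The scores themselves depend on your configuration.
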
ocr-gif.eml: Contains a corrupted GIF image. I also changed its content-type to image/jpeg, so the output should show:

 1.5 FUZZY_OCR_WRONG_CTYPE  BODY: Mail contains an image with wrong
                            content-type set
                            Image has format "GIF" but content-type is
                            "image/jpeg"
 3.0 FUZZY_OCR_CORRUPT_IMG  BODY: Mail contains a corrupted image
                            Corrupt image: GIF-LIB error: Image is
                            defective, decoding aborted.

  10 FUZZY_OCR              BODY: Mail contains an image with common spam text inside
                            Words found:
                            "stock" with fuzz of 0.2
                            "price" with fuzz of 0.2
                            "price" with fuzz of 0.2
                            "stock" with fuzz of 0
                            "company" with fuzz of 0
                            "trade" with fuzz of 0.2
                            "service" with fuzz of 0.285714285714286
                            "investor" with fuzz of 0.25
                            (8 word occurrences found)

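To compare a report against the expected output without eyeballing it, the word list can be pulled out with a regular expression. This is a sketch that assumes the report keeps the exact wording shown in this README ("word" with fuzz of N, followed by a "(N word occurrences found)" line). parse_words is a hypothetical helper name, and only the first count line in the text is read.

```python
import re

# Matches lines such as:  "stock" with fuzz of 0.2
WORD_RE = re.compile(r'"([^"]+)" with fuzz of ([0-9.]+)')
# Matches the closing line:  (8 word occurrences found)
COUNT_RE = re.compile(r"\((\d+) word occurrences? found\)")


def parse_words(report):
    """Return ([(word, fuzz), ...], declared_count) parsed from a FUZZY_OCR report."""
    words = [(word, float(fuzz)) for word, fuzz in WORD_RE.findall(report)]
    match = COUNT_RE.search(report)
    declared = int(match.group(1)) if match else None
    return words, declared
```

A quick sanity check is that len(words) equals the declared count. For the ocr-gif.eml report above both should be 8.
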
ocr-jpg.eml: Contains a JPEG file. Output should show:

 6.0 FUZZY_OCR              BODY: Mail contains an image with common spam text inside
                            Words found:
                            "viagra" with fuzz of 0
                            "cialis" with fuzz of 0
                            "viagra" with fuzz of 0
                            "levitra" with fuzz of 0
                            (4 word occurrences found)

ocr-png.eml: Contains a PNG file. Output should show:

  20 FUZZY_OCR              BODY: Mail contains an image with common spam text inside
                            Words found:
                            "price" with fuzz of 0.2
                            "company" with fuzz of 0
                            "price" with fuzz of 0
                            "price" with fuzz of 0.2
                            "software" with fuzz of 0
                            "investor" with fuzz of 0
                            "trade" with fuzz of 0.2
                            "price" with fuzz of 0.2
                            "service" with fuzz of 0
                            "software" with fuzz of 0
                            "company" with fuzz of 0
                            "service" with fuzz of 0
                            "stock" with fuzz of 0
                            "trade" with fuzz of 0
                            "levitra" with fuzz of 0.285714285714286
                            "price" with fuzz of 0
                            "buy" with fuzz of 0
                            "price" with fuzz of 0.2
                            (18 word occurrences found)
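
A note on reading the fuzz values: the numbers in these reports are consistent with an edit distance divided by the length of the matched word (1/5 = 0.2 for "stock", 2/8 = 0.25 for "investor", 2/7 = 0.2857... for "service"). That reading is an inference from the numbers, not something this README states, so treat the sketch below as an illustration rather than FuzzyOCR's actual code. It uses plain Levenshtein distance, and the misread strings in the usage note are invented.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # substitute, free if the characters match
            ))
        prev = cur
    return prev[-1]


def relative_fuzz(word, seen):
    """Edit distance between the dictionary word and the OCR text, scaled by word length."""
    if not word:
        raise ValueError("word must not be empty")
    return levenshtein(word, seen) / len(word)
```

For instance, relative_fuzz("stock", "st0ck") gives 0.2 (one substitution in a five-letter word), while an exact match gives 0.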