Ticket #110 (closed defect: fixed)

Opened 2 years ago

Last modified 1 month ago

temp directory issue breaking pamthreshold

Reported by: bbggll Assigned to: decoder
Priority: critical Milestone: Development Release Version 3.5
Component: Image Analysis Version: 3.5.1
Keywords: Cc:

Description

Error running preprocessor(pamthreshold): pamthreshold -simple -threshold 0.5

Attachments

tutu.res (45.7 kB) - added by on on 01.11.2007 04:05:07.
Output of spamassassin --debug

Change History

(follow-up: ↓ 9 ) 21.03.2007 10:51:44 changed by jump

I have the same issue. Is it a bug?

2007-03-21 00:19:06 [29046] Skipping scanset because of errors, trying next...
2007-03-21 00:19:06 [29046] Error running preprocessor(pamthreshold): /usr/bin/pamthreshold -simple -threshold 0.5
2007-03-21 00:19:06 [29046] Errors in Scanset "ocrad-decolorize"
2007-03-21 00:19:06 [29046] Return code: 256, Error: pamthreshold: bad magic number - not a PAM, PPM, PGM, or PBM file

(follow-up: ↓ 4 ) 03.04.2007 22:06:14 changed by anonymous

I guess this describes the same problem: http://people.debian.org/~terpstra/thread/20070314.135229.92056e9a.en.html

Unfortunately I don't understand a word :-(

Maybe someone here knows Polish?

05.04.2007 16:04:15 changed by anonymous

Btw, a workaround: in the file "FuzzyOcr.scansets", replace all lines containing

preprocessors = ppmtopgm, pamthreshold, pamtopnm

with

preprocessors = ppmtopgm, pamthreshold, pamtopnm

There seems to be a license problem (?)..

cheers

(in reply to: ↑ 2 ) 25.04.2007 14:34:36 changed by anonymous

Replying to anonymous:

i guess this here describes the same problem: http://people.debian.org/~terpstra/thread/20070314.135229.92056e9a.en.html unfortunately i don't understand a word :-( maybe someone here with polish-knowledge?

I was the author of the mentioned thread ;) But it was a discussion about the well-known (http://www200.pair.com/mecham/spam/image_spam2.html) Debian problem with the netpbm package, which does not include pamthreshold. Right now I use a manually compiled netpbm and all the old pamthreshold problems are solved - Fuzzy works fine!

Unfortunately, I have noticed some new errors in my logs lately:

2007-04-25 09:59:16 [27106] Error running preprocessor pamthreshold): /usr/local/netpbm/bin/pamthreshold -simple -threshold 0.5
2007-04-25 09:59:16 [27106] Errors in Scanset "ocrad-decolorize-invert"
2007-04-25 09:59:16 [27106] Return code: 256, Error: pamthreshold: bad magic number - not a PAM, PPM, PGM, or PBM file
2007-04-25 09:59:16 [27106] Skipping scanset because of errors, trying next...

These logs are very similar to yours (but imho they have nothing to do with the Polish thread you mentioned above). I still have no idea what is going on... :(

07.07.2007 03:30:19 changed by anonymous

15.08.2007 02:26:08 changed by anonymous

01.11.2007 04:05:07 changed by on

  • attachment tutu.res added.

Output of spamassassin --debug

01.11.2007 04:16:22 changed by on

I have seen the same error and wanted to understand it, so I did some digging.

I ran the preprocessor chain by hand (ppmtopgm, then pamthreshold), and run that way there is no error.

Next I compared the file I created by hand with ppmtopgm against the one created by FuzzyOcr. The second one is longer than the first (422265 bytes versus 317215):

mail<on>: /usr/local/bin/ppmtopgm English-for-Teachers.jpg.pnm >toto
mail<on>: ll toto prep.ppmtopgm.out
804819 432 -rw-r-----  1 on  csimstaff  422265 Nov  1 08:51 prep.ppmtopgm.out
804840 336 -rw-r--r--  1 on  csimstaff  317215 Nov  1 09:34 toto

I suspect that the extra bytes are trailing garbage and that this is what makes pamthreshold fail. Both files display OK with xv.

After removing the garbage, I find that pamthreshold is working.
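
A quick way to check a file for this kind of trailing garbage, sketched in Python rather than taken from FuzzyOcr: for a raw PGM (magic `P5`) the expected size follows from the header, so any bytes beyond header plus raster are extra. The function names are made up for this illustration, and the sketch assumes a single-image raw PGM.

```python
def _read_token(data, pos):
    """Return (token, new_pos), skipping whitespace and '#' comments."""
    while pos < len(data):
        c = data[pos:pos + 1]
        if c == b"#":
            # A comment runs to the end of the line.
            while pos < len(data) and data[pos:pos + 1] not in (b"\n", b"\r"):
                pos += 1
        elif c.isspace():
            pos += 1
        else:
            break
    start = pos
    while pos < len(data) and not data[pos:pos + 1].isspace():
        pos += 1
    return data[start:pos], pos


def pgm_trailing_bytes(data):
    """Number of bytes that follow the raster of a raw (P5) PGM image."""
    magic, pos = _read_token(data, 0)
    if magic != b"P5":
        raise ValueError("not a raw PGM (P5) file")
    width, pos = _read_token(data, pos)
    height, pos = _read_token(data, pos)
    maxval, pos = _read_token(data, pos)
    pos += 1  # exactly one whitespace byte separates header and raster
    bytes_per_sample = 1 if int(maxval) < 256 else 2
    expected = pos + int(width) * int(height) * bytes_per_sample
    return len(data) - expected


if __name__ == "__main__":
    import sys
    with open(sys.argv[1], "rb") as fh:
        print(pgm_trailing_bytes(fh.read()), "trailing byte(s)")
```

Run against prep.ppmtopgm.out and toto, a non-zero result for only the first file would support the trailing-garbage theory.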

So where is the garbage coming from?

I went on to examine the various scans that take place. For that given image, the preprocessing with ppmtopgm and pamthreshold works fine the first time and fails the next time. I will attach the output of spamassassin --debug for that piece of email.

What I note:

1)

[98464] dbg: FuzzyOcr: Saved pid: 98504
[98504] dbg: FuzzyOcr: Exec : /usr/local/bin/ppmtopgm
[98504] dbg: FuzzyOcr: Stdin : </tmp/.spamassassin9846478Qc8Wtmp/Teaching-English.jpg.pnm
[98504] dbg: FuzzyOcr: Stdout: >/tmp/.spamassassin98464801k17tmp/prep.ppmtopgm.out
[98504] dbg: FuzzyOcr: Stderr: >/tmp/.spamassassin98464801k17tmp/prep.ppmtopgm.err

2)

[98553] dbg: FuzzyOcr: Exec : /usr/local/bin/ppmtopgm
[98553] dbg: FuzzyOcr: Stdin : </tmp/.spamassassin98464801k17tmp/English-for-Teachers.jpg.pnm
[98553] dbg: FuzzyOcr: Stdout: >/tmp/.spamassassin98464801k17tmp/prep.ppmtopgm.out
[98553] dbg: FuzzyOcr: Stderr: >/tmp/.spamassassin98464801k17tmp/prep.ppmtopgm.err

In 1) pamthreshold is working, not in 2)

Note that in 1) ppmtopgm writes its temporary output to the same directory as in 2), even though it is processing a different image whose input lives in another directory. Why is it reusing another image's directory?

Other images are also scanned between 1) and 2), and all of them use the same temporary directory as in 2)!

A wild guess is that Perl is mixing up the temporary directories when forking the various image scans. Would this happen with every message containing more than one image?

(in reply to: ↑ 1 ) 08.11.2007 05:11:49 changed by on

  • summary changed from error in log to temp directory issue breaking pamthreshold.

I think that may be the answer:

--- FuzzyOcr/Config.pm     Sun Jan  7 19:05:18 2007
+++ FuzzyOcr/Config.pm  Thu Nov  8 10:59:42 2007
@@ -109,6 +109,7 @@
 }
 
 sub get_tmpdir {
+    my $tmpdir=pop @tmpdirs;
     return $tmpdir;
 }

Reading through the code to understand how directory allocation works, I found that get_tmpdir returns an unrelated-looking variable ($tmpdir), while its counterpart set_tmpdir pushes the directory onto an array (@tmpdirs).

In one test, this patch solved the pamthreshold issue.
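
To make the effect of the patch concrete, here is a small Python analogue. It is not FuzzyOcr's Perl code: the class names and the `stale` argument are hypothetical. It only assumes what is described above, namely that set_tmpdir pushes onto a list, that the old get_tmpdir returned a variable set_tmpdir never updates, and that the patched one pops the most recent entry.

```python
class BuggyConfig:
    """Analogue of the unpatched code: getter ignores what the setter stored."""

    def __init__(self, stale):
        self._tmpdirs = []
        self._tmpdir = stale  # hypothetical value left over from elsewhere

    def set_tmpdir(self, path):
        self._tmpdirs.append(path)

    def get_tmpdir(self):
        # Returns the stale variable, so every caller sees the same directory.
        return self._tmpdir


class PatchedConfig:
    """Analogue of the patched code: getter pops the last pushed directory."""

    def __init__(self):
        self._tmpdirs = []

    def set_tmpdir(self, path):
        self._tmpdirs.append(path)

    def get_tmpdir(self):
        # Mirrors `my $tmpdir = pop @tmpdirs;` from the diff.
        return self._tmpdirs.pop()


if __name__ == "__main__":
    buggy = BuggyConfig(stale="/tmp/dir-of-first-image")
    buggy.set_tmpdir("/tmp/dir-of-second-image")
    print("buggy  :", buggy.get_tmpdir())

    patched = PatchedConfig()
    patched.set_tmpdir("/tmp/dir-of-second-image")
    print("patched:", patched.get_tmpdir())
```

Because pop removes the entry it returns, this approach relies on each set_tmpdir call being matched by exactly one get_tmpdir call.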

(in reply to: ↑ description ) 15.11.2007 07:38:44 changed by on

  • status changed from new to closed.
  • resolution set to fixed.

Corrected; see ticket #421.


