Ticket #34 (assigned enhancement)

Opened 2 years ago

Last modified 4 months ago

PostgreSQL support

Reported by: juan@xdracco.net Assigned to: decoder (accepted)
Priority: trivial Milestone: Development Release Version 3.5
Component: Hash Database Version: SVN
Keywords: Cc:

Description

I realize MySQL is still experimental. Is PostgreSQL support scheduled any time soon? I would be more than happy to test out any PostgreSQL implementations.

Attachments

fuzzyocr_pgsql.patch.tar.bz2 (21.6 kB) - added by Juan on 27.12.2006 09:07:48.
FuzzyOcr? w/ PostgreSQL support (patched files)
FuzzyOcr-3.5.0-rc1_PgSQL-patches.tar.gz (27.7 kB) - added by Juan on 28.12.2006 02:37:18.
newest PgSQL patchset

Change History

(in reply to: ↑ description ) 26.12.2006 09:33:20 changed by juan@xdracco.net

I currently have FuzzyOcr? using PostgreSQL 8.1.6. My hack is nothing more than changing a couple of lines in Config.pm:

Line 38 from: use constant HAS_DBD_MYSQL => eval { require DBD::mysql; };

TO

use constant HAS_DBD_MYSQL => eval { require DBD::pg };

Line 164 from: my $dsn = "dbi:mysql:database=".$conf->{focr_mysql_db};

TO

my $dsn = "dbi:pg:database=".$conf->{focr_mysql_db};

Last but not least, DBD-Pg is required from cpan for this to function. I am using DBD-Pg 1.43.

So, as you can see, it's a sure hack but it does work with PostgreSQL. The reason I didn't create a subroutine is because I was unsure of how to get something like this:

use constant HAS_DBD_PG => eval { require DBD::pg };

... and because I am not proficient in perl. =)

I would like to help out if I can so if you have any questions, please contact me at the address I posted with.

26.12.2006 10:35:30 changed by juan@xdracco.net

I had to make some changes to Hashing.pm and Config.pm...

I am pretty sure that even MySQL support doesn't work the way it is currently coded because it called the wrong db by calling $db.$table, $db.$dbfile and $dbfile.* which doesn't work.

To get it to work:

In Hashing.pm:

All instances of $db.$table must be simply $table All instances of $db.$dbfile must be simply $dbfile All instances of $dbfile.* must be simply * (ex: $dbfile.key='$key' should be key='$key')

In Config.pm:

All instances of $db.$tab must be simply $tab All instances of $tab.* must be simply * (ex: $tab.key='$key' should be key='$key')

Lastly, I need to apply a correction to a previous post: The corrected syntax for initiating PostgreSQL is: my $dsn = "dbi:Pg:database=".$conf->{focr_mysql_db};

AND NOT

my $dsn = "dbi:pg:database=".$conf->{focr_mysql_db};

After these changes, hashing is stored in PgSQL and incoming images are compared to previous hashes in PgSQL....

Great piece of work dude!!!!

26.12.2006 21:28:59 changed by decoder

  • status changed from new to assigned.

well it isnt scheduled yet (we are trying to bring out this version as stable first), but it shouldn't be a big deal. To my knowledge, Mysql and Postgresql aren't that different, and since we only use basic features of mysql, it should be easy to modify the code to use postgresql instead. I'll look at the code tomorrow and maybe I can implement this with a patch then :)

Best regards,

Chris

27.12.2006 03:34:41 changed by juan@xdracco.net

Howdy ho,

Well, I no longer wanted to use my production server to test Fuzzy so I set up a test box on my laptop to start playing with the code.

This is what I concluded as being the easiest but I could be wrong.

1. Change all focr_mysql_* settings in the cf file to focr_sql_* for generic purposes 2. Add focr_sql_type. Options mysql/pgsql

I currently modifying the code in 4 files but haven't gotten to the point to test yet.

I will continue with this patch unless you think I should halt. If anything, my perl skills are improving... =)

Thanks for your quick reply!

27.12.2006 03:55:27 changed by juan@xdracco.net

This post is just FYI's.....

Let me know if I can help out in any way... I know how it goes when you really want to get a project to stable and then some dork, like me, happens upon your project and makes a silly, yet probably easy, request. But seriously, no rush dude. My production server is working just fine with PgSQL with my hacked version of 3.5_rc1 + patches.

I am using ocrad and tesseract and no gocr.

I and some others are working on the Gentoo ebuild. So hopefully, when Fuzzy 3.5 goes stable, the ebuild won't be far behind.

I have a postgre schema that you can add with the source.

Juan

27.12.2006 08:01:32 changed by juan@xdracco.net

Hello again,

I was bored, so I added PostgreSQL support.

I have archived the 4 relevant files, the patch and the postgresql dump for your review. It works quite well as far as I can tell (with the execption of that one bug mentioned above)...

Here is the PgSQL patch file (6 files total): http://www.xdracco.net/~dracco/fuzzyocr_pgsql.patch.tar.bz2

I'm a little weary of the patch file (i've never patched in directory format, only single file). So before you enter code|mode, check out what I've done to see if it can be used.

PS: I can't verify that it works with MySQL as I do not use it but it should be seemless as I didn't modify any query statements in and of themselves, with the exception of correcting the variables contained within the query (see my 2nd post)...


I do believe I have found an SQL bug as it occurs with a fresh install as well as with my modifcations. The bug is in the check_image_hash_db subroutine in Hashing.pm. In a nutshell, it doesn't update the hash table. The debug error is:

[6339] warn: DBD::Pg::db do failed: ERROR: syntax error at or near "check" at character 27 [6339] dbg: FuzzyOcr?: update hash set match = '1', check = '1167200229' where key = '255:255:255:255:173820::0:0:0:0:14680'

The line in question is 136 from my modified Hashing.pm file: $sql = qq(update $dbfile set match = '$next', check = '$now' where key = '$key');

27.12.2006 09:07:48 changed by Juan

  • attachment fuzzyocr_pgsql.patch.tar.bz2 added.

FuzzyOcr? w/ PostgreSQL support (patched files)

27.12.2006 09:09:29 changed by Juan

Hehe. I didn't notice the 'attachment' link at the top. Sooooo, I've uploaded the file.

Juan

27.12.2006 10:03:09 changed by Juan

Nope. No bug on your part. Sorry for even bringing it up. It appears to pg related. i am trying to debug it. getting sleepy though... =)

27.12.2006 21:37:04 changed by Juan

Hello,

I installed my patches on one production server last night and after testing it all last night and most of this morning on my laptop, I believe the patch works as expected with the following exceptions:

1. syslog (spamd) reports being unable to find DBD::mysql. Since I am using Postgre, I have no traces of MySQL installed. I am unable to get rid of the message. But regardless of the syslog message, all functionality seems normal (as expected).

Dec 27 11:58:32 zeke spamd[12464]: Can't locate DBD/mysql.pm in @INC (@INC contains: ../etc/mail/spamassassin ../lib /usr/lib/perl5/vendor_perl/5.8.8/i686-linux /usr/lib/perl5/vendor_perl/5.8.8 /etc/perl /usr/lib/perl5/vendor_perl /usr/lib/perl5/site_perl/5.8.8/i686-linux /usr/lib/perl5/site_perl/5.8.8 /usr/lib/perl5/site_perl /usr/lib/perl5/5.8.8/i686-linux /usr/lib/perl5/5.8.8 /usr/local/lib/site_perl .) at /etc/mail/spamassassin/FuzzyOcr/Config.pm line 169.

2. When an exact match is detected, the database is not updated (might be Pg related by not sure why it would be since it is simple SQL syntax). I have verified that the image to be updated already exists in the database.

[24226] info: FuzzyOcr?: Found Score <28.5> for Exact Image Hash [24226] info: FuzzyOcr?: Matched [1] time(s). Prev match: 12 hrs. 48 sec. ago [24226] warn: DBD::Pg::db do failed: ERROR: syntax error at or near "check" at character 30 [24226] dbg: FuzzyOcr?: update hash set match = '1', check = '1167251106' where key = '255:255:255:255:173820::0:0:0:0:14680'

Regards,

Juan

28.12.2006 00:35:06 changed by Juan

More info.....

To test, I removed check = '$now' and the hash table successfully updates the match field. I can't figure out why it doesn't work when check = '$now' is the syntax. I've tried different formats including use of the sprintf function.

On the DBD-mysql error.. I've been reading up and it appears this is normal for the eval statement. I didn't realize it allowed the script to continue with non-fatal errors when requirements aren't found. Unless I misunderstood....

Juan

28.12.2006 02:37:18 changed by Juan

  • attachment FuzzyOcr-3.5.0-rc1_PgSQL-patches.tar.gz added.

newest PgSQL patchset

28.12.2006 02:39:16 changed by Juan

It appears 'check' is a reserved word for PostgreSQL. I changed 'check' to 'last_seen' and now the DB updates as expected. I have uploaded the newest files, patch and sql schema...

Juan

So, it looks like the only thing left is to attempt to get rid of the eval info message in syslog...

(follow-up: ↓ 13 ) 28.12.2006 03:06:03 changed by Juan

Documentation on 'check': http://www.postgresql.org/docs/8.1/interactive/ddl-constraints.html

Same applies to 7.x versions so FuzzyOcr? should be compatable with the Pg 7.x/8.x series.

(in reply to: ↑ 12 ) 20.05.2007 06:05:14 changed by anonymous

Replying to Juan:

Documentation on 'check': http://www.postgresql.org/docs/8.1/interactive/ddl-constraints.html Same applies to 7.x versions so FuzzyOcr? should be compatable with the Pg 7.x/8.x series.

Can anyone get these attachments to work? I cannot get either to open.

07.07.2007 03:16:36 changed by anonymous

07.07.2007 19:14:20 changed by anonymous

15.08.2007 02:26:10 changed by anonymous

01.04.2008 09:26:44 changed by anonymous

徐州辉煌钢结构工程有限公司是一家集网架钢结构设计、制作、安装及技术服务为一体的大型专业化企业。公司坐落于有网架之乡美誉的江苏省徐州市,这里是全国优质网架原材料供应基地,也是全国网架技术熟练工人培训基地,有着人才,技术和原材料的地域优势。企业创办多年来,本着"求实创新、开拓进取"的精神,不断引进吸收国内外先进技术经验,汇集来自全国各地从事专业管理,专业设计、制造、检测试验等高级优秀人才,配置了各种先进的成套生产和检测设备,能满足制造生产能力要求的流水生产线。公司始终本着“诚信为本、信守合同、用户至上”的理念,坚持贯彻实践三个“第一”——质量第一、信誉第一,服务第一,企业不断深化改革,深挖潜力,降低成本,以最低的价格吸引客户,以最好的质量服务客户,让辉煌网架钢构建设遍布全国各地,多年来深受广大客户及建设单位的一致好评。   竭诚欢迎各界新老朋友真诚合作、共创辉煌、共享绩效,公司将一如既往地为各界朋友提供优秀的服务!

09.04.2008 08:37:19 changed by anonymous

电子地磅解码器,吨位遥控器/本吨位遥控器引进日本先进技术研制而成,完全采用数字式集成电路技术,采用万能解码数据处理线路,适用于10----150吨以下吨位,无须对地磅作任何改动 具有防拦截,防扫描等优点,解码器安装于车上或离地磅8米以内,在电子称旁40米或60米以内,能控制电子称的数码数据,最小值20公斤,规格10 /15/20,此产品主要产生电子磁场干扰和控制,从而使吨位变大或变小,性能稳定可靠,体积小,遥控主机 解码处理器 如烟盒大小,遥控器配两种型号,隐蔽性强,附件含使用光盘一套,


Add/Change #34 (PostgreSQL support)