Bug with HTML messages?
Posted: Thu Jul 29, 2010 1:00 pm Subject: Bug with HTML messages? |
|
|
Denis Beauchemin
|
|
Hello,
I just deployed the latest MailScanner (MS) release, 4.80.10, on a fully
patched RHEL 5.5 system, and it is behaving strangely with HTML-only emails
sent from Thunderbird 3.1 (I didn't test other clients). I tested it with an
email with plenty of spam content, but it was delivered with a really low
score: 1.5. I then sent the same email with *both* HTML and text parts, and
it got a score of 13.
Could there be a problem with one of the Perl HTML modules?
SA scoring for HTML-only email: SpamAssassin (not cached,
score=1.542, requis 4.5, HTML_MESSAGE 0.00,
HTML_TAG_BALANCE_HEAD 0.82, MIME_HTML_ONLY 0.72,
TVD_SPACE_RATIO 0.00)
SA scoring for HTML+TXT email: (not cached,
score=13.313, requis 4.5, BAYES_00 -1.90, DEAR_SOMETHING 1.97,
DRUGS_ERECTILE 1.99, DRUGS_ERECTILE_OBFU 1.11, DRUG_ED_CAPS 0.94,
HTML_MESSAGE 0.00, KAM_VIAGRA1 3.00, KAM_VIAGRA5 3.10,
KAM_VIAGRA6 3.10)
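As a quick sanity check on the two reports above, here is a minimal sketch that parses the rule lists and adds them up. The rule names and scores are copied from the reports; the function name `parse_rules` is only illustrative. The sums come out slightly below the reported totals (1.542 and 13.313), presumably because the per-rule scores are displayed rounded to two decimals.

```python
import re

# Rule lists copied from the two MailScanner reports above (new server).
HTML_ONLY = ("HTML_MESSAGE 0.00, HTML_TAG_BALANCE_HEAD 0.82, "
             "MIME_HTML_ONLY 0.72, TVD_SPACE_RATIO 0.00")
HTML_AND_TEXT = ("BAYES_00 -1.90, DEAR_SOMETHING 1.97, DRUGS_ERECTILE 1.99, "
                 "DRUGS_ERECTILE_OBFU 1.11, DRUG_ED_CAPS 0.94, "
                 "HTML_MESSAGE 0.00, KAM_VIAGRA1 3.00, KAM_VIAGRA5 3.10, "
                 "KAM_VIAGRA6 3.10")

# A rule name in capitals, whitespace, then a signed decimal score.
RULE_RE = re.compile(r"([A-Z][A-Z0-9_]*)\s+(-?\d+\.\d+)")


def parse_rules(report: str) -> dict[str, float]:
    """Map each rule name in a report fragment to its score."""
    return {name: float(score) for name, score in RULE_RE.findall(report)}


html_only = parse_rules(HTML_ONLY)
html_and_text = parse_rules(HTML_AND_TEXT)

print("HTML-only total:", round(sum(html_only.values()), 2))
print("HTML+TXT total: ", round(sum(html_and_text.values()), 2))
# Rules that fired only on the HTML+TXT version:
print(sorted(set(html_and_text) - set(html_only)))
```

The only rule the two reports share is HTML_MESSAGE. None of the content rules (DRUGS_*, KAM_VIAGRA*) fired on the HTML-only version.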
Am I the only one with this problem? I was also able to replicate it on
a much older MS setup (v. 4.63.2):
SA scoring with HTML-only email: (not cached, score=12.639, requis 4.5,
autolearn=spam, DEAR_SOMETHING 1.60, DRUGS_ERECTILE 0.28,
DRUGS_ERECTILE_OBFU 1.23, DRUG_ED_CAPS 0.32, HTML_MESSAGE 0.00,
KAM_VIAGRA1 3.00, KAM_VIAGRA5 3.10, KAM_VIAGRA6 3.10)
SA scoring with HTML+TXT email: (not cached, score=7.181, requis 4.5,
DCC_CHECK 2.17, HTML_MESSAGE 0.00, HTML_TAG_BALANCE_HEAD 1.33,
MIME_HTML_ONLY 1.46, TVD_SPACE_RATIO 2.22)
The output from MailScanner --version on the newest server:
Running on
Linux smtpe3.usherbrooke.ca 2.6.18-194.8.1.el5PAE #1 SMP Wed Jun 23
11:16:22 EDT 2010 i686 i686 i386 GNU/Linux
This is Red Hat Enterprise Linux Server release 5.5 (Tikanga)
This is Perl version 5.008008 (5.8.8)
This is MailScanner version 4.80.10
Module versions are:
1.00 AnyDBM_File
1.30 Archive::Zip
0.23 bignum
1.04 Carp
1.42 Compress::Zlib
1.119 Convert::BinHex
0.17 Convert::TNEF
2.121_08 Data::Dumper
2.27 Date::Parse
1.00 DirHandle
1.05 Fcntl
2.74 File::Basename
2.09 File::Copy
2.01 FileHandle
1.08 File::Path
0.20 File::Temp
0.90 Filesys::Df
3.64 HTML::Entities
3.64 HTML::Parser
3.57 HTML::TokeParser
1.23 IO
1.14 IO::File
1.13 IO::Pipe
2.04 Mail::Header
1.89 Math::BigInt
0.22 Math::BigRat
3.05 MIME::Base64
5.427 MIME::Decoder
5.427 MIME::Decoder::UU
5.427 MIME::Head
5.427 MIME::Parser
3.03 MIME::QuotedPrint
5.427 MIME::Tools
0.13 Net::CIDR
1.25 Net::IP
0.16 OLE::Storage_Lite
1.04 Pod::Escapes
3.05 Pod::Simple
1.09 POSIX
1.21 Scalar::Util
1.78 Socket
2.16 Storable
1.4 Sys::Hostname::Long
0.27 Sys::Syslog
1.26 Test::Pod
0.86 Test::Simple
1.9717 Time::HiRes
1.02 Time::localtime
Optional module versions are:
1.39_01 Archive::Tar
0.23 bignum
1.82 Business::ISBN
1.10 Business::ISBN::Data
1.08 Data::Dump
1.814 DB_File
1.25 DBD::SQLite
1.607 DBI
1.15 Digest
1.01 Digest::HMAC
2.36 Digest::MD5
2.11 Digest::SHA1
1.00 Encode::Detect
0.17008 Error
0.18 ExtUtils::CBuilder
2.18 ExtUtils::ParseXS
2.38 Getopt::Long
0.44 Inline
1.08 IO::String
1.04 IO::Zlib
2.21 IP::Country
missing Mail::ClamAV
3.003001 Mail::SpamAssassin
v2.004 Mail::SPF
1.999001 Mail::SPF::Query
0.2808 Module::Build
0.20 Net::CIDR::Lite
0.65 Net::DNS
0.002.2 Net::DNS::Resolver::Programmable
0.33 Net::LDAP
4.004 NetAddr::IP
1.94 Parse::RecDescent
missing SAVI
2.64 Test::Harness
0.95 Test::Manifest
1.98 Text::Balanced
1.35 URI
0.7203 version
0.62 YAML
Thanks!
Denis
--
Denis Beauchemin, analyst
Université de Sherbrooke, S.T.I.
T: 819.821.8000x62252 F: 819.821.8045
Posted: Thu Jul 29, 2010 1:09 pm Subject: Bug with HTML messages? |
|
|
Denis Beauchemin
|
|
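Oops, I switched the labels on those two (the scores from the older server). What I wrote, followed by the corrected version: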
| Quote: | SA scoring with HTML-only email: (not cached, score=12.639, requis
4.5, autolearn=spam, DEAR_SOMETHING 1.60, DRUGS_ERECTILE 0.28,
DRUGS_ERECTILE_OBFU 1.23, DRUG_ED_CAPS 0.32, HTML_MESSAGE 0.00,
KAM_VIAGRA1 3.00, KAM_VIAGRA5 3.10, KAM_VIAGRA6 3.10)
SA scoring with HTML+TXT email: (not cached, score=7.181, requis 4.5,
DCC_CHECK 2.17, HTML_MESSAGE 0.00, HTML_TAG_BALANCE_HEAD 1.33,
MIME_HTML_ONLY 1.46, TVD_SPACE_RATIO 2.22)
|
SA scoring with HTML+TXT email: (not cached, score=12.639, requis 4.5,
autolearn=spam, DEAR_SOMETHING 1.60, DRUGS_ERECTILE 0.28,
DRUGS_ERECTILE_OBFU 1.23, DRUG_ED_CAPS 0.32, HTML_MESSAGE 0.00,
KAM_VIAGRA1 3.00, KAM_VIAGRA5 3.10, KAM_VIAGRA6 3.10)
SA scoring with HTML-only email: (not cached, score=7.181, requis 4.5,
DCC_CHECK 2.17, HTML_MESSAGE 0.00, HTML_TAG_BALANCE_HEAD 1.33,
MIME_HTML_ONLY 1.46, TVD_SPACE_RATIO 2.22)
Denis
Posted: Thu Jul 29, 2010 1:57 pm Subject: Bug with HTML messages? |
|
|
Alex Broens
|
|
You are not feeding it consistent content.
Where is the problem?
Get a real spam message which was tagged as spam and dumped in your quarantine.
Feed that to SpamAssassin directly (without MS). Is the result consistent?
If not, it's usually due to MS's message chunk settings.
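The check suggested above can be scripted. This is a rough sketch, assuming the `spamassassin` command-line script is on the PATH and that the default `X-Spam-Status` header template is in use (it can be customized, in which case the regex needs adjusting). The sample header at the bottom is made up for illustration; it is not output from the messages in this thread.

```python
import re
import subprocess


def run_spamassassin(path: str) -> str:
    """Pipe a saved message through `spamassassin -t` (test mode) and
    return the rewritten message, headers included."""
    with open(path, "rb") as fh:
        result = subprocess.run(["spamassassin", "-t"], stdin=fh,
                                capture_output=True, check=True)
    return result.stdout.decode("utf-8", errors="replace")


# Matches the start of a default-style header, e.g.
# "X-Spam-Status: Yes, score=12.6 required=4.5 tests=..."
STATUS_RE = re.compile(
    r"^X-Spam-Status:\s*(Yes|No),\s*score=(-?\d+(?:\.\d+)?)",
    re.MULTILINE)


def parse_status(output: str):
    """Return (is_spam, score), or None if no X-Spam-Status header is found."""
    match = STATUS_RE.search(output)
    if match is None:
        return None
    return match.group(1) == "Yes", float(match.group(2))


# Illustrative header only, not taken from a real run.
sample = "X-Spam-Status: Yes, score=12.6 required=4.5 tests=EXAMPLE_RULE\n"
print(parse_status(sample))
```

Calling `parse_status(run_spamassassin("quarantined.eml"))` (the file name is hypothetical) as the same user MS runs as gives a score to compare with the one in the MS report.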
Posted: Thu Jul 29, 2010 2:05 pm Subject: Bug with HTML messages? |
|
|
Denis Beauchemin
|
|
Alex,
When I feed the emails to SA directly, they get scored much higher than through MS.
My point is that an HTML-only email with quite common spam words is not
being scored, while an HTML+TXT email with the same spam words gets
scored. This looks quite suspicious to me.
Denis
Posted: Thu Jul 29, 2010 2:51 pm Subject: Bug with HTML messages? |
|
|
Martin Hepworth
|
|
Denis,
Make sure the MS "Run As User" can read all the rules, etc.
I presume you're testing the SA scores as the same user that MS is running as.
Martin
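One way to check the point above is to walk the rule directories as the user MS runs as and list anything that user cannot read. The sketch below is only that. The directory paths are common defaults and are an assumption, so adjust them to the actual installation. Note that `os.access` reports everything as readable when run as root.

```python
import os

# ASSUMPTION: typical SpamAssassin rule locations; not confirmed for this setup.
RULE_DIRS = [
    "/etc/mail/spamassassin",
    "/usr/share/spamassassin",
    "/var/lib/spamassassin",
]


def unreadable_paths(root: str) -> list[str]:
    """Return files and directories under root that the current user
    cannot read (directories also need the execute bit to be entered)."""
    if not os.access(root, os.R_OK | os.X_OK):
        return [root]
    problems = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames:
            full = os.path.join(dirpath, name)
            if not os.access(full, os.R_OK | os.X_OK):
                problems.append(full)
        for name in filenames:
            full = os.path.join(dirpath, name)
            if not os.access(full, os.R_OK):
                problems.append(full)
    return problems


if __name__ == "__main__":
    for directory in RULE_DIRS:
        if not os.path.isdir(directory):
            print(directory, "-> not present")
        else:
            print(directory, "->", unreadable_paths(directory) or "all readable")
```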
--
Martin Hepworth
Oxford, UK
Posted: Thu Jul 29, 2010 2:58 pm Subject: Bug with HTML messages? |
|
|
Denis Beauchemin
|
|
On 2010-07-29 15:51, Martin Hepworth wrote:
| Quote: | Denis
make sure MS "Run as User" can see all the rules etc.
I presume you're testing the SA scores with the same user as MS is
running as.
Martin
|
Hi Martin,
Yes, run as user = root and I am also testing as root.
Denis
Posted: Thu Jul 29, 2010 4:09 pm Subject: Bug with HTML messages? |
|
|
Alex Broens
|
|
What are your MS "chunk" settings?
Are you sure MS is sending the full message to SA?
Please post the sample message you're using on pastebin so people can try to
reproduce it.
Alex
--
MailScanner mailing list
mailscanner@lists.mailscanner.info
http://lists.mailscanner.info/mailman/listinfo/mailscanner
Before posting, read http://wiki.mailscanner.info/posting
Support MailScanner development - buy the book off the website!
Posted: Thu Jul 29, 2010 8:14 pm Subject: Bug with HTML messages? |
|
|
Denis Beauchemin
|
|
Alex,
I don't think my chunk settings really matter since the email is really
short: http://pastebin.com/sMf7rW6s for the HTML-only version and
http://pastebin.com/eiTTWuer for the HTML+TXT version.
Some MS settings that were changed from the default values:
Max Spam Check Size = 500000
Max SpamAssassin Size = 200k trackback
Thanks!
Denis
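To see why the size settings are unlikely to matter for a short message, here is a small sketch that parses the two lines quoted above and compares them with a message size. The `k` multiplier (1000 bytes) and the 2000-byte message size are assumptions for illustration; the sketch does not interpret the `trackback` keyword, it only reads the leading number.

```python
import re

# The two non-default lines quoted in the post above.
CONF_TEXT = """\
Max Spam Check Size = 500000
Max SpamAssassin Size = 200k trackback
"""


def parse_settings(text):
    """Parse 'Name = value' lines into a dict, skipping blanks and # comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        settings[name.strip()] = value.strip()
    return settings


def leading_size(value):
    """Return the leading size of a value in bytes.

    ASSUMPTION: 'k' means 1000 bytes and 'm' means 1000000; check the
    MailScanner documentation for the real multiplier. Trailing words
    such as 'trackback' are ignored here.
    """
    match = re.match(r"(\d+)([kKmM]?)", value)
    if match is None:
        raise ValueError(f"no leading size in {value!r}")
    factor = {"": 1, "k": 1000, "m": 1000000}[match.group(2).lower()]
    return int(match.group(1)) * factor


settings = parse_settings(CONF_TEXT)
message_size = 2000  # hypothetical size in bytes for a "really short" email
for name, value in settings.items():
    limit = leading_size(value)
    print(f"{name}: limit {limit} bytes, under limit: {message_size <= limit}")
```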
Posted: Fri Jul 30, 2010 2:10 am Subject: Bug with HTML messages? |
|
|
Alex Broens
|
|
To SA, these are two VERY different messages, and it scores them accordingly.
There is nothing wrong with that.
Alex
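For readers wondering how different the two variants are at the MIME level, here is a minimal sketch using Python's standard `email` package. The bodies are placeholders, not the pastebin content. An HTML-only message is a single text/html part, while an HTML+TXT message is a multipart/alternative container with a text/plain part next to the HTML one.

```python
from email.message import EmailMessage


def html_only(html: str) -> EmailMessage:
    """Build a message whose only body is text/html."""
    msg = EmailMessage()
    msg["Subject"] = "test"
    msg.set_content(html, subtype="html")
    return msg


def html_and_text(text: str, html: str) -> EmailMessage:
    """Build a multipart/alternative message with plain-text and HTML parts."""
    msg = EmailMessage()
    msg["Subject"] = "test"
    msg.set_content(text)                      # text/plain part
    msg.add_alternative(html, subtype="html")  # converts to multipart/alternative
    return msg


def part_types(msg: EmailMessage) -> list[str]:
    """List the content type of the message and each of its parts."""
    return [part.get_content_type() for part in msg.walk()]


# Placeholder bodies for illustration only.
print(part_types(html_only("<p>placeholder</p>")))
print(part_types(html_and_text("placeholder", "<p>placeholder</p>")))
```

The reports earlier in the thread match this difference: MIME_HTML_ONLY shows up only in the HTML-only results.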