Merge #2448

2448: Give a chance to rspamd's bayes classifier r=mergify[bot] a=nextgens

## What type of PR?

bug-fix + documentation

## What does this PR do?

As pointed out in #2442, rspamd's bayesian filter never gets a chance to run, because ``min_learns`` is set to 200 and we never teach it any HAM.

This PR enables rspamd's autolearn feature, which will "reinforce" good/bad by learning from the scoring of other modules. It ensures both that we will eventually reach the 200 mark and that the data stays fresh.

I've also taken this opportunity to update the documentation & FAQ accordingly, to ensure that users teach their HAM & SPAM to both the fuzzy and bayes classifiers.

Thank you to [woj-tek](https://github.com/woj-tek) for doing the ground work on this.

### Related issue(s)

- closes #2442

## Prerequisites

Before we can consider review and merge, please make sure the following list is done and checked.
If an entry is not applicable, you can check it or remove it from the list.

- [x] In case of feature or enhancement: documentation updated accordingly
- [x] Unless it's docs or a minor change: add [changelog](https://mailu.io/master/contributors/workflow.html#changelog) entry file.

Co-authored-by: Florent Daigniere <nextgens@freenetproject.org>
3 years ago · b5e7cad2d3
parent ba27cdb3a8 256fa5c90c
commit b5e7cad2d3
4 changed files with 11 additions and 0 deletions
--- a/core/rspamd/conf/classifier-bayes.conf
+++ b/core/rspamd/conf/classifier-bayes.conf
@@ -0,0 +1,6 @@
+autolearn {
+  spam_threshold = 6.0; # When to learn spam (score >= threshold)
+  ham_threshold = -0.5; # When to learn ham (score <= threshold)
+  check_balance = true; # Check spam and ham balance
+  min_balance = 0.9; # Keep diff for spam/ham learns for at least this value
+}
--- a/docs/antispam.rst
+++ b/docs/antispam.rst
@@ -59,6 +59,7 @@ If you already have an existing mailbox and want Mailu to learn them all as ham
 .. code-block:: bash

  rspamc -h antispam:11334 -P mailu -f 13 fuzzy_add /mail/user\@example.com/.Ham_Learn/cur/
+  rspamc -h antispam:11334 -P mailu learn_ham /mail/user\@example.com/.Ham_Learn/cur/

 This should learn every file located in the ``Ham_Learn`` folder from user@example.com 

@@ -67,6 +68,7 @@ Likewise, to learn all messages within the folder ``Spam_Learn`` as spam message
 .. code-block:: bash

  rspamc -h antispam:11334 -P mailu -f 11 fuzzy_add /mail/user\@example.com/.Spam_Learn/cur/
+  rspamc -h antispam:11334 -P mailu learn_spam /mail/user\@example.com/.Spam_Learn/cur/

 *Issue reference:* `1438`_.

--- a/docs/faq.rst
+++ b/docs/faq.rst
@@ -736,6 +736,7 @@ If you already have an existing mailbox and want Mailu to learn them all as ham
 .. code-block:: bash

  rspamc -h antispam:11334 -P mailu -f 13 fuzzy_add /mail/user\@example.com/.Ham_Learn/cur/
+  rspamc -h antispam:11334 -P mailu learn_ham /mail/user\@example.com/.Ham_Learn/cur/

 This should learn every file located in the ``Ham_Learn`` folder from user@example.com 

@@ -744,6 +745,7 @@ Likewise, to lean all messages within the folder ``Spam_Learn`` as spam messages
 .. code-block:: bash

  rspamc -h antispam:11334 -P mailu -f 11 fuzzy_add /mail/user\@example.com/.Spam_Learn/cur/
+  rspamc -h antispam:11334 -P mailu learn_spam /mail/user\@example.com/.Spam_Learn/cur/

 *Issue reference:* `1438`_.

--- a/towncrier/newsfragments/2447.bugfix
+++ b/towncrier/newsfragments/2447.bugfix
@@ -0,0 +1 @@
+Enable rspamd's autolearn feature to ensure that its bayes classifier has enough HAM to make it usable. Previously the bayes module would never work unless some HAM had been learnt manually.
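
To make the effect of the new ``autolearn`` block easier to picture, here is a minimal Python sketch of the decision it describes. It is not rspamd's code. The two thresholds and the ``check_balance`` / ``min_balance`` defaults are taken from ``classifier-bayes.conf`` in this PR. The function name, the ``AutolearnConfig`` class, and in particular the exact balance formula are assumptions made for illustration: a class stops being learned once it outnumbers the other by more than ``1 / min_balance``.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AutolearnConfig:
    # Values copied from the autolearn block added by this PR.
    spam_threshold: float = 6.0   # learn as spam when score >= threshold
    ham_threshold: float = -0.5   # learn as ham when score <= threshold
    check_balance: bool = True
    min_balance: float = 0.9


def autolearn_verdict(
    score: float,
    spam_learns: int,
    ham_learns: int,
    cfg: Optional[AutolearnConfig] = None,
) -> Optional[str]:
    """Return "spam", "ham", or None (do not learn) for one scanned message.

    Hypothetical helper: the balance rule below is an assumed simplification,
    not a transcription of rspamd's implementation.
    """
    cfg = cfg or AutolearnConfig()

    # Step 1: only messages with a clear score are candidates for learning.
    if score >= cfg.spam_threshold:
        verdict = "spam"
    elif score <= cfg.ham_threshold:
        verdict = "ham"
    else:
        return None

    # Step 2 (assumed rule): skip learning a class that already dominates.
    if cfg.check_balance and (spam_learns > 0 or ham_learns > 0):
        max_ratio = 1.0 / cfg.min_balance
        if verdict == "spam":
            ratio = spam_learns / (ham_learns + 1)
        else:
            ratio = ham_learns / (spam_learns + 1)
        if ratio > max_ratio:
            return None

    return verdict
```

Under these assumptions, a message scoring 7.2 on a fresh install would be learned as spam, while the same message would be skipped if 50 spam and only 10 ham messages had already been learned. This matches the stated goal of the PR: accumulate both HAM and SPAM learns so that the 200-message ``min_learns`` mark is eventually reached.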