2448: Give a chance to rspamd's bayes classifier r=mergify[bot] a=nextgens

## What type of PR?

bug-fix + documentation

## What does this PR do?

As pointed out in #2442, the bayesian filter of rspamd doesn't get any chance to run as ``min_learns`` is set to 200 and we never teach it any HAM.

This PR enables rspamd's autolearn feature, that will "reinforce" good/bad by learning from the scoring of other modules. It ensures both that we will eventually reach the 200 mark but also that the data stays fresh.

I've also taken this opportunity to update the documentation & FAQ accordingly, to ensure that users teach their HAM & SPAM to both the fuzzy and bayes classifiers.

Thank you to [woj-tek](https://github.com/woj-tek) for doing the ground work on this.

### Related issue(s)
- closes #2442

## Prerequisites
Before we can consider review and merge, please make sure the following list is done and checked.
If an entry in not applicable, you can check it or remove it from the list.

- [x] In case of feature or enhancement: documentation updated accordingly
- [x] Unless it's docs or a minor change: add [changelog](https://mailu.io/master/contributors/workflow.html#changelog) entry file.


Co-authored-by: Florent Daigniere <nextgens@freenetproject.org>
master
bors[bot] 2 years ago committed by GitHub
commit b5e7cad2d3
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23

@ -0,0 +1,6 @@
autolearn {
spam_threshold = 6.0; # When to learn spam (score >= threshold)
ham_threshold = -0.5; # When to learn ham (score <= threshold)
check_balance = true; # Check spam and ham balance
min_balance = 0.9; # Keep diff for spam/ham learns for at least this value
}

@ -59,6 +59,7 @@ If you already have an existing mailbox and want Mailu to learn them all as ham
.. code-block:: bash
rspamc -h antispam:11334 -P mailu -f 13 fuzzy_add /mail/user\@example.com/.Ham_Learn/cur/
rspamc -h antispam:11334 -P mailu learn_ham /mail/user\@example.com/.Ham_Learn/cur/
This should learn every file located in the ``Ham_Learn`` folder from user@example.com
@ -67,6 +68,7 @@ Likewise, to learn all messages within the folder ``Spam_Learn`` as spam message
.. code-block:: bash
rspamc -h antispam:11334 -P mailu -f 11 fuzzy_add /mail/user\@example.com/.Spam_Learn/cur/
rspamc -h antispam:11334 -P mailu learn_spam /mail/user\@example.com/.Spam_Learn/cur/
*Issue reference:* `1438`_.

@ -736,6 +736,7 @@ If you already have an existing mailbox and want Mailu to learn them all as ham
.. code-block:: bash
rspamc -h antispam:11334 -P mailu -f 13 fuzzy_add /mail/user\@example.com/.Ham_Learn/cur/
rspamc -h antispam:11334 -P mailu learn_ham /mail/user\@example.com/.Ham_Learn/cur/
This should learn every file located in the ``Ham_Learn`` folder from user@example.com
@ -744,6 +745,7 @@ Likewise, to lean all messages within the folder ``Spam_Learn`` as spam messages
.. code-block:: bash
rspamc -h antispam:11334 -P mailu -f 11 fuzzy_add /mail/user\@example.com/.Spam_Learn/cur/
rspamc -h antispam:11334 -P mailu learn_spam /mail/user\@example.com/.Spam_Learn/cur/
*Issue reference:* `1438`_.

@ -0,0 +1 @@
Enable rspamd's autolearn feature to ensure that its bayes classifier has enough HAM to make it usable. Previously the bayes module would never work unless some HAM had been learnt manually.
Loading…
Cancel
Save