I like the idea and would support it. My thoughts:
* False positives need a good way to be handled via the UI.
* Users should be able to turn the feature off.
* Your algorithmic approach could probably be improved on. I suspect the current approach could be easily circumvented, unlike more traditional approaches such as Naive Bayes or Fisher classifiers.
* Storing only the last 10 messages also makes it easier to circumvent than a larger history would. Perhaps make this configurable?
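
To make the classifier suggestion a bit more concrete, here is a minimal sketch of a naive Bayes text filter in Python. It is only an illustration of the general technique, not a proposal for the actual implementation: the class name `NaiveBayesSpamFilter`, the `"spam"`/`"ham"` labels, and the tokenizer are all made up for this example, and a real version would need persistence and much more training data.

```python
import math
import re
from collections import Counter


class NaiveBayesSpamFilter:
    """Tiny multinomial naive Bayes classifier (hypothetical, illustrative only)."""

    def __init__(self):
        # Per-label word frequencies and message counts.
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    @staticmethod
    def tokenize(text):
        # Lowercase and split into simple word tokens.
        return re.findall(r"[a-z0-9']+", text.lower())

    def train(self, text, label):
        # label must be "spam" or "ham".
        self.message_counts[label] += 1
        self.word_counts[label].update(self.tokenize(text))

    def _log_score(self, tokens, label, vocab_size):
        total_messages = sum(self.message_counts.values())
        # Log prior, smoothed so an unseen label never gets log(0).
        score = math.log((self.message_counts[label] + 1) / (total_messages + 2))
        total_words = sum(self.word_counts[label].values())
        for token in tokens:
            # Add-one (Laplace) smoothing for words unseen under this label.
            count = self.word_counts[label][token]
            score += math.log((count + 1) / (total_words + vocab_size))
        return score

    def is_spam(self, text):
        tokens = self.tokenize(text)
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        vocab_size = len(vocab) or 1
        spam_score = self._log_score(tokens, "spam", vocab_size)
        ham_score = self._log_score(tokens, "ham", vocab_size)
        return spam_score > ham_score


# Example usage with made-up training messages.
spam_filter = NaiveBayesSpamFilter()
spam_filter.train("buy cheap pills now", "spam")
spam_filter.train("see you at the meeting tomorrow", "ham")
print(spam_filter.is_spam("cheap pills"))
```

With something along these lines, a "not spam" button in the UI could simply call `train(text, "ham")` on the flagged message, which would cover the false-positive case as well.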