Support for non-binary classification


Add support for non-binary classification, as described in this white paper by Ben Kamens (http://www.fogcreek.com/FogBugz/Downloads/KamensPaper.pdf).


PatrickBurrows wrote Aug 30, 2010 at 10:07 PM

I've implemented a crude version of this feature using nBayes by creating spam and ham training indexes for each category. I then examine each category individually to determine whether a given block of text is spam or ham for that specific category.

In my app it is entirely possible for a given block of text to exist in multiple categories at once, so this is fine.
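For readers who want to see the shape of this workaround, here is a minimal sketch in Python. It does not use nBayes; `BinaryIndexPair`, `train_all`, and `categorize` are hypothetical names, and the scoring is a plain naive Bayes comparison with add-one smoothing and equal priors, which is an assumption rather than what nBayes does internally.

```python
import math
from collections import Counter


def tokenize(text):
    # Simplistic whitespace tokenizer, for illustration only.
    return text.lower().split()


class BinaryIndexPair:
    """Hypothetical stand-in for one category's pair of training indexes."""

    def __init__(self):
        self.member = Counter()      # tokens from texts in the category ("spam" index)
        self.non_member = Counter()  # tokens from texts outside it ("ham" index)

    def train(self, text, in_category):
        target = self.member if in_category else self.non_member
        target.update(tokenize(text))

    def _log_likelihood(self, tokens, index, vocab_size):
        # Sum of log token probabilities with add-one (Laplace) smoothing.
        total = sum(index.values())
        return sum(math.log((index[t] + 1) / (total + vocab_size)) for t in tokens)

    def matches(self, text):
        tokens = tokenize(text)
        vocab_size = len(set(self.member) | set(self.non_member)) or 1
        return (self._log_likelihood(tokens, self.member, vocab_size)
                > self._log_likelihood(tokens, self.non_member, vocab_size))


def train_all(pairs, text, categories):
    # Every category's pair must see every training text, as a member or a non-member.
    for name, pair in pairs.items():
        pair.train(text, name in categories)


def categorize(text, pairs):
    # Each category is tested independently, so a text can land in several at once.
    return {name for name, pair in pairs.items() if pair.matches(text)}


pairs = {"billing": BinaryIndexPair(), "shipping": BinaryIndexPair()}
train_all(pairs, "invoice payment refund charge", {"billing"})
train_all(pairs, "package delivery tracking courier", {"shipping"})
train_all(pairs, "refund for damaged package", {"billing", "shipping"})

print(categorize("invoice payment", pairs))
print(categorize("damaged package refund", pairs))
```

Because every category keeps two indexes and is evaluated separately, both storage and classification time grow with the number of categories.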

Of course the issues with my implementation are training and scaling. Speed is definitely sacrificed as the number of categories increases. Additionally, for each new category, I have to manually train that category.

If this functionality were built into nBayes, that would be much better (and presumably faster, if for no other reason than that there would be half as many index files).
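As a rough illustration of why a built-in version could get by with half the index files, the sketch below keeps a single token index per category and scores a text against all of them in one pass. This is an ordinary multinomial naive Bayes with add-one smoothing, written in Python with a hypothetical `MultiClassIndex` class; it is not nBayes's API, and it is not necessarily the method described in the white paper.

```python
import math
from collections import Counter, defaultdict


class MultiClassIndex:
    """Hypothetical multi-category classifier: one token index per category."""

    def __init__(self):
        self.indexes = defaultdict(Counter)  # category -> token counts
        self.doc_counts = Counter()          # category -> number of training texts

    def train(self, text, category):
        self.indexes[category].update(text.lower().split())
        self.doc_counts[category] += 1

    def scores(self, text):
        tokens = text.lower().split()
        vocab_size = len(set().union(*self.indexes.values()))
        total_docs = sum(self.doc_counts.values())
        result = {}
        for category, index in self.indexes.items():
            total = sum(index.values())
            # Log prior from training frequency, plus smoothed log likelihoods.
            log_p = math.log(self.doc_counts[category] / total_docs)
            for t in tokens:
                log_p += math.log((index[t] + 1) / (total + vocab_size))
            result[category] = log_p
        return result

    def classify(self, text):
        # Requires at least one trained category.
        scores = self.scores(text)
        return max(scores, key=scores.get)


clf = MultiClassIndex()
clf.train("invoice payment refund charge", "billing")
clf.train("package delivery tracking courier", "shipping")

print(clf.classify("refund my invoice"))
```

Note that `classify` returns only the single best category. An app where a text can belong to several categories at once would need a different decision rule on top of `scores`, for example accepting every category whose score is close to the best one.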

Feel free to contact me if you need help implementing this feature and there is some piece that can be broken out.
