Ordinal classifiers have gained considerable popularity in recent years, yet their sensitivity to noise has not been systematically tested. This paper reports the findings of an experiment that, for the first time, compares the accuracy of several well-known ordinal and non-ordinal classifiers in the presence of varying levels of non-monotone noise.
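To make the notion of non-monotone noise concrete, the sketch below injects monotonicity violations into an ordinal dataset by swapping the labels of comparable example pairs. This is an illustrative scheme under assumed semantics (the function names and the pair-swapping procedure are hypothetical; the paper's exact noise-generation method may differ):

```python
import numpy as np

def dominates(a, b):
    """True if a >= b componentwise, with at least one strict inequality."""
    return np.all(a >= b) and np.any(a > b)

def inject_non_monotone_noise(X, y, noise_level, seed=0):
    """Swap the labels of comparable pairs so that the dominating example
    ends up with the strictly lower label, i.e. an explicit violation of
    monotonicity. Illustrative sketch only; assumes non-monotone noise
    means such pairwise violations."""
    rng = np.random.default_rng(seed)
    y_noisy = y.copy()
    n = len(y)
    n_violations = int(noise_level * n)
    created = 0
    for i in rng.permutation(n):
        if created >= n_violations:
            break
        for j in rng.permutation(n):
            # Swapping strictly ordered labels of a dominating pair
            # turns a monotone pair into a non-monotone one.
            if dominates(X[i], X[j]) and y_noisy[i] > y_noisy[j]:
                y_noisy[i], y_noisy[j] = y_noisy[j], y_noisy[i]
                created += 1
                break
    return y_noisy

# e.g. corrupt roughly 10% of the examples:
# y_10 = inject_non_monotone_noise(X_train, y_train, noise_level=0.10)
```

Swapping (rather than resampling) labels has the side effect of preserving the overall label distribution, which keeps the noisy and noiseless datasets comparable.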
The findings clearly show that some models are more sensitive to non-monotone noise than others. Some classifiers that ranked high in the absence of noise performed poorly when the noise level increased even modestly, while others that ranked relatively low on noiseless datasets ranked much better as noise levels increased. Two classifiers that guarantee monotone classifications became practically useless at relatively low noise levels, whereas the accuracies of the other classifiers deteriorated at a much slower pace.
Three competing accuracy-related measures were used: Accuracy, Kappa, and the Gini index, and all were subjected to statistical tests. The lesson of this experiment is that it is important to measure and report, among other things, the level of noise present in the datasets used to evaluate classification models.
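As an illustration, the three measures could be computed as follows. Accuracy and Cohen's Kappa follow their standard definitions; the quadratic Kappa weighting is an assumption (a common choice for ordinal problems), and the Gini index is assumed here to be the 2·AUC − 1 form used in scoring applications, which may differ from the paper's definition:

```python
from sklearn.metrics import accuracy_score, cohen_kappa_score, roc_auc_score

def evaluate(y_true, y_pred, y_score):
    """Compute the three accuracy-related measures for one classifier.

    y_pred  -- predicted class labels
    y_score -- predicted class probabilities, shape (n_samples, n_classes)
    """
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        # Quadratic weights penalise distant ordinal errors more heavily
        # (an assumed choice, not stated in the text).
        "kappa": cohen_kappa_score(y_true, y_pred, weights="quadratic"),
        # Gini assumed as 2*AUC - 1, with one-vs-rest AUC for multi-class.
        "gini": 2 * roc_auc_score(y_true, y_score, multi_class="ovr") - 1,
    }
```

Evaluating each classifier with this function at every noise level would yield, per measure, the accuracy-versus-noise curves on which the reported rankings and statistical tests are based.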