Training the first classifier: setting a baseline with ZeroR

Created: November-22, 2018

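ZeroR is a simple classifier. It does not operate on individual instances; instead it operates on the overall distribution of the classes and always selects the class with the greatest prior probability. It is not a good classifier in the sense that it uses none of the information in the candidate instance, but it is often used as a baseline. Note: other baselines can be used as well, such as industry-standard classifiers or handcrafted rules.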
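To make the idea concrete, here is a minimal sketch of a majority-class baseline in plain Java. It is not Weka's implementation; the class name `MajorityBaseline` and its methods are hypothetical, and ties are broken by whichever label reaches the highest count first, which is an assumption of this sketch.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of the ZeroR idea, not Weka code
public class MajorityBaseline {

    /** Returns the most frequent label in the training labels. */
    public static String majorityClass(List<String> trainLabels) {
        Map<String, Integer> counts = new HashMap<>();
        String best = null;
        int bestCount = 0;
        for (String label : trainLabels) {
            // Increment the count for this label and read the new value
            int count = counts.merge(label, 1, Integer::sum);
            if (count > bestCount) {
                bestCount = count;
                best = label;
            }
        }
        return best;
    }

    /** Fraction of test labels that equal the single predicted label. */
    public static double accuracy(String predicted, List<String> testLabels) {
        int correct = 0;
        for (String label : testLabels) {
            if (label.equals(predicted)) {
                correct++;
            }
        }
        return (double) correct / testLabels.size();
    }

    public static void main(String[] args) {
        List<String> train = Arrays.asList("a", "b", "b", "c", "b");
        List<String> test = Arrays.asList("a", "a", "b", "c");
        // The attributes of the test instances are never inspected
        String prediction = majorityClass(train);
        System.out.println(prediction + " " + accuracy(prediction, test));
    }
}
```

Because the prediction is fixed once training is done, the accuracy on any test set is simply the share of that test set belonging to the predicted class.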

 import weka.classifiers.Evaluation;
 import weka.classifiers.rules.ZeroR;
 import weka.core.Instances;

 // data is assumed to be an already loaded Instances object.
 // First we tell the data that its class is stored in the last attribute
 data.setClassIndex(data.numAttributes() - 1);
 // Then we split the data into two sets.
 // Randomize first so that neither set simply inherits the original ordering
 data.randomize(new java.util.Random(0));
 // Instances(source, first, toCopy): the third argument is a count, not an end index
 Instances testset = new Instances(data, 0, 50);   // instances 0..49
 Instances trainset = new Instances(data, 50, 99); // instances 50..148
 
 // Now we build a classifier and train it with the trainset
 ZeroR classifier1 = new ZeroR();
 classifier1.buildClassifier(trainset);
 // Next we test it against the testset
 Evaluation eval = new Evaluation(trainset);
 eval.evaluateModel(classifier1, testset);
 System.out.println(eval.toSummaryString());

The largest class in the set gives you a 34% correct rate (50 out of 149).

StackOverflow Documentation

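Note: ZeroR performs at around 30%. This is because we split the data randomly into a training set and a test set. When the classes are roughly equal in size overall, the class that ends up largest in the training set is the one with the fewest instances left over for the test set, so the majority-class prediction scores below the overall class proportion. Crafting a good train/test split can be worth your while.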
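One common way to avoid this effect is a stratified split, in which each class contributes the same fraction of its instances to the test set. The following is a minimal sketch in plain Java, not part of the original example; the class `StratifiedSplit` and the method `testIndices` are hypothetical names. Weka itself provides `Instances.stratify(int)` together with `trainCV` and `testCV` for fold-based stratification, which may be the more practical route in a Weka project.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical illustration of a stratified train/test split, not Weka code
public class StratifiedSplit {

    /**
     * Returns the indices selected for the test set. All remaining indices
     * form the training set. From each class, round(testFraction * classSize)
     * indices are drawn at random.
     */
    public static List<Integer> testIndices(List<String> labels, double testFraction, long seed) {
        // Group instance indices by their class label
        Map<String, List<Integer>> byClass = new LinkedHashMap<>();
        for (int i = 0; i < labels.size(); i++) {
            byClass.computeIfAbsent(labels.get(i), k -> new ArrayList<>()).add(i);
        }
        Random random = new Random(seed);
        List<Integer> test = new ArrayList<>();
        for (List<Integer> indices : byClass.values()) {
            // Shuffle within the class, then take the same fraction from each class
            Collections.shuffle(indices, random);
            int take = (int) Math.round(testFraction * indices.size());
            test.addAll(indices.subList(0, take));
        }
        return test;
    }

    public static void main(String[] args) {
        List<String> labels = new ArrayList<>();
        for (int i = 0; i < 30; i++) labels.add("a");
        for (int i = 0; i < 30; i++) labels.add("b");
        for (int i = 0; i < 30; i++) labels.add("c");
        List<Integer> test = testIndices(labels, 1.0 / 3, 0L);
        System.out.println("test set size: " + test.size());
    }
}
```

With a split like this, the class proportions in the training and test sets match, so a majority-class baseline scores close to the overall share of the largest class.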