machine learning - Generating a Decision Tree that Perfectly Models the Training Set?

- May 15, 2010

I have a data set of rules, and I want to generate a decision tree that at least has 100% accuracy at classifying those rules, but I can never get to 100%. I set minNumObjs to 1 and made the tree unpruned, but only 84% of the instances are correctly classified.

My attributes are:

@attribute users numeric
@attribute bandwidth numeric
@attribute latency numeric
@attribute mode {c,h,dcf,mp,dc,ind}

Example data:

2,200000,0,c
2,200000,1000,c
2,200000,2000,mp
2,200000,5000,c
2,400000,0,c
2,400000,1000,dcf

Can someone help me understand why I can never get 100% of my instances classified correctly, and how I can get 100% of them classified (while still allowing my attributes to be numeric)?

Thanks

It is impossible to get 100% accuracy when identical feature vectors have different labels. I am guessing that in your case users, bandwidth, and latency are the features, while mode is the label you are trying to predict. If so, there may be identical values of {users, bandwidth, latency} that happen to have different mode labels.
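One way to check for this is to group the rows by their feature values and look for groups that carry more than one label. The following is only a minimal sketch: the function name `find_conflicts` is made up for illustration, and the last row is a hypothetical addition (it is not in the question's data) that creates a conflict so the check has something to report.

```python
from collections import defaultdict

def find_conflicts(rows):
    """Return feature tuples that appear with more than one label.

    Each row is (users, bandwidth, latency, mode); the last element
    is treated as the label and everything before it as the features.
    """
    labels_by_features = defaultdict(set)
    for *features, label in rows:
        labels_by_features[tuple(features)].add(label)
    # Keep only the feature vectors that map to two or more labels.
    return {f: labels for f, labels in labels_by_features.items() if len(labels) > 1}

rows = [
    (2, 200000, 0, "c"),       # from the question
    (2, 200000, 1000, "c"),    # from the question
    (2, 200000, 2000, "mp"),   # from the question
    (2, 200000, 2000, "c"),    # hypothetical row, added to illustrate a conflict
]

conflicts = find_conflicts(rows)
for features, labels in conflicts.items():
    print(features, sorted(labels))
```

If this reports nothing on the real data set, the features do separate the labels and the cause of the missing accuracy lies elsewhere.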

In general, having different labels for the same features may occur in one of several ways:

There is noise in the data due to bad readings of the data.
There is a source of randomness that is not captured.
There are more possible features that can distinguish between the different labels, but those features are not in your data set.

One thing you can do is run the training set through the decision tree and find the items that were misclassified. Try to determine why they are wrong and see whether those data instances exhibit what I wrote above (namely, that there are data instances with the same features but different labels).
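If the misclassified items do share features with differently labelled items, then the best any classifier can do on the training set is to predict the most common label for each distinct feature vector. The sketch below computes that ceiling. It is an illustration, not the output of any particular tree learner; the function name `training_accuracy_ceiling` and the rows are hypothetical.

```python
from collections import Counter, defaultdict

def training_accuracy_ceiling(rows):
    """Best achievable training accuracy when identical feature vectors
    may carry different labels: predict the majority label per vector."""
    if not rows:
        raise ValueError("rows must not be empty")
    label_counts = defaultdict(Counter)
    for *features, label in rows:
        label_counts[tuple(features)][label] += 1
    # For each distinct feature vector, only the majority label can be right.
    correct = sum(max(counts.values()) for counts in label_counts.values())
    return correct / len(rows)

# Hypothetical rows: three share the same features, and one of those
# three has a different label, so at most 3 of 4 rows can be correct.
rows = [
    (2, 200000, 0, "c"),
    (2, 200000, 0, "c"),
    (2, 200000, 0, "mp"),
    (2, 400000, 1000, "dcf"),
]

ceiling = training_accuracy_ceiling(rows)
print(ceiling)  # 0.75
```

A ceiling below 1.0 means no setting of the tree's parameters can reach 100% on that data; a ceiling of exactly 1.0 means the labels are consistent and a fully grown tree should in principle be able to fit them.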
