the data vectors stored as rows of a matrix
the class array, where y_i = class for row i of the matrix x
the string array holding the feature names
the boolean array indicating whether the corresponding feature is continuous
the value count array indicating number of distinct values per feature
the number of classes
Class that contains information for a tree node.
Given the next most distinguishing feature/attribute, extend the decision tree.
the optimal feature and its gain
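A minimal sketch of how such a selection step might look (illustrative only; `gainOf` is a hypothetical stand-in for the class's gain computation, not its actual API):

```scala
// Sketch: evaluate the gain of every feature index and return the best
// feature together with its gain. `gainOf` is a hypothetical helper.
def bestFeature(numFeatures: Int, gainOf: Int => Double): (Int, Double) = {
  val f = (0 until numFeatures).maxBy(gainOf)
  (f, gainOf(f))
}
```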
Given a continuous feature, adjust its threshold to improve gain.
the feature index to consider
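As a rough sketch of the idea (not the class's actual method), candidate thresholds can be taken as midpoints between consecutive sorted values, keeping the one whose induced binary split scores best; `gainOfSplit` is a hypothetical helper:

```scala
// Sketch of threshold adjustment for a continuous feature (illustrative):
// try midpoints between consecutive distinct sorted values and keep the one
// whose binary split (<= threshold vs. > threshold) yields the highest gain.
// Assumes at least two distinct values in the column.
def bestThreshold(col: Array[Double], gainOfSplit: Double => Double): Double = {
  val vals       = col.distinct.sorted
  val candidates = vals.sliding(2).map(pair => (pair(0) + pair(1)) / 2.0)
  candidates.maxBy(gainOfSplit)
}
```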
Given a data vector z, classify it returning the class number (0, ..., k-1) by following a decision path from the root to a leaf.
the data vector to classify (may include continuous features)
Given a data vector z, classify it returning the class number (0, ..., k-1) by following a decision path from the root to a leaf.
the data vector to classify (purely discrete features)
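For illustration only, a simplified node representation and path-following classifier might look like this (the class's actual Node type is not shown here, so these case classes are assumptions); it covers both the discrete and the continuous case:

```scala
// Hypothetical node representation (not the class's actual Node): an internal
// node either branches on a discrete value or compares a continuous feature
// against a threshold; a leaf stores the class number.
sealed trait DNode
case class Leaf(cls: Int) extends DNode
case class Discrete(f: Int, branch: Map[Int, DNode]) extends DNode
case class Continuous(f: Int, threshold: Double, le: DNode, gt: DNode) extends DNode

// Follow the decision path from the root down to a leaf for data vector z.
def classify(root: DNode, z: Array[Double]): Int = root match {
  case Leaf(cls)                  => cls
  case Discrete(f, branch)        => classify(branch(z(f).toInt), z)
  case Continuous(f, thr, le, gt) => classify(if (z(f) <= thr) le else gt, z)
}
```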
Given a k-dimensional probability vector, compute its entropy (a measure of disorder).
the probability vector (e.g., (0, 1) -> 0, (.5, .5) -> 1)
http://en.wikipedia.org/wiki/Entropy_%28information_theory%29
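A minimal sketch of the computation (assuming base-2 logarithms, so entropy is measured in bits; `log2` is an assumed helper, not part of the class):

```scala
// Sketch of entropy H(p) = -sum_i p_i * log2(p_i), with 0 * log2(0) taken as 0.
def log2(x: Double): Double = math.log(x) / math.log(2.0)

def entropy(p: Array[Double]): Double =
  -p.filter(_ > 0.0).map(pi => pi * log2(pi)).sum

entropy(Array(0.0, 1.0))   // = 0.0 (perfect order)
entropy(Array(0.5, 0.5))   // = 1.0 (maximal disorder)
```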
Show the flaw by printing the error message.
the method where the error occurred
the error message
Given a feature column (e.g., 2 (Humidity)) and a value (e.g., 1 (High)), use the frequency of occurrence of the value for each classification (e.g., 0 (no), 1 (yes)) to estimate k probabilities. Also, determine the fraction of training cases where the feature has this value (e.g., fraction where Humidity is High = 7/14).
a feature column to consider (e.g., Humidity)
one of the possible values for this feature (e.g., 1 (High))
flag indicating whether a continuous feature is being calculated
the threshold for the continuous feature
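A sketch of the probability estimation for the discrete case (the signature is an assumption, not the actual API):

```scala
// Among training cases where feature column `col` equals `value`, count each
// class to estimate k probabilities; also return the fraction of cases with
// this value (e.g., Humidity = High in 7 of 14 cases -> 0.5).
def frequency(col: Array[Int], y: Array[Int], value: Int, k: Int): (Double, Array[Double]) = {
  val rows  = col.indices.filter(i => col(i) == value)
  val count = Array.ofDim[Double](k)
  for (i <- rows) count(y(i)) += 1.0
  val prob  = if (rows.nonEmpty) count.map(_ / rows.size) else count
  (rows.size.toDouble / col.length, prob)
}
```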
Compute the information gain due to using the values of a feature/attribute to distinguish the training cases (e.g., how well does Humidity with its values Normal and High indicate whether one will play tennis).
the feature to consider (e.g., 2 (Humidity))
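Building on the entropy and frequency sketches above, the gain of a feature can be sketched as the drop in disorder it produces (names are illustrative, not the class's API):

```scala
// Sketch: gain(f) = H(y) - sum over values v of P(f = v) * H(y | f = v)
def gain(x: Array[Array[Int]], y: Array[Int], f: Int, vc: Int, k: Int): Double = {
  val col  = x.map(row => row(f))                                   // feature column f
  val base = entropy(Array.tabulate(k)(c => y.count(_ == c).toDouble / y.length))
  val cond = (0 until vc).map { v =>
    val (frac, prob) = frequency(col, y, v, k)
    frac * entropy(prob)                                            // weighted disorder
  }.sum
  base - cond
}
```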
Return new x matrix and y array for next step of constructing decision tree.
the feature index
one of the feature's values
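A sketch of forming the reduced training set (illustrative; the actual method may also drop column f for discrete features):

```scala
// Keep only the rows where feature f takes value v, for both x and y.
def nextXY(x: Array[Array[Int]], y: Array[Int], f: Int, v: Int): (Array[Array[Int]], Array[Int]) = {
  val idx = x.indices.filter(i => x(i)(f) == v)
  (idx.map(i => x(i)).toArray, idx.map(i => y(i)).toArray)
}
```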
Print out the decision tree using Breadth First Search (BFS).
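A BFS print over the hypothetical DNode sketch above could look like this (a sketch, not the class's actual traversal):

```scala
import scala.collection.mutable.Queue

// Visit nodes level by level using a FIFO queue, tracking each node's depth.
def printTree(root: DNode): Unit = {
  val q = Queue((root, 0))
  while (q.nonEmpty) {
    val (node, depth) = q.dequeue()
    node match {
      case Leaf(cls) => println(s"depth $depth: Leaf(class = $cls)")
      case Discrete(f, branch) =>
        println(s"depth $depth: Node(feature = $f)")
        for (child <- branch.values) q.enqueue((child, depth + 1))
      case Continuous(f, thr, le, gt) =>
        println(s"depth $depth: Node(feature = $f <= $thr)")
        q.enqueue((le, depth + 1)); q.enqueue((gt, depth + 1))
    }
  }
}
```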
Train the classifier, i.e., determine which feature provides the most information gain and select it as the root of the decision tree.
the data vectors stored as rows of a matrix
the class array, where y_i = class for row i of the matrix x
This class implements a Decision Tree classifier using the C4.5 algorithm. The classifier is trained using a data matrix x and a classification vector y. Each data vector in the matrix is classified into one of k classes numbered 0, ..., k-1. Each column in the matrix represents a feature (e.g., Humidity). The vc array gives the number of distinct values per feature (e.g., 2 for Humidity).
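A hypothetical usage sketch with a tiny play-tennis-style dataset; the constructor call is commented out because the actual signature and types (e.g., matrix/vector classes) may differ from these plain arrays:

```scala
// Hypothetical setup, assuming a constructor matching the parameters
// documented above (the actual signature and types may differ):
val x      = Array(Array(0.0, 1.0),        // rows: (Outlook, Humidity)
                   Array(1.0, 1.0),
                   Array(1.0, 0.0))
val y      = Array(0, 1, 1)                // class per row: 0 = no, 1 = yes
val fn     = Array("Outlook", "Humidity")  // feature names
val isCont = Array(false, false)           // both features are discrete here
val vc     = Array(2, 2)                   // distinct values per feature
val k      = 2                             // number of classes
// val tree = new DecisionTreeC45(x, y, fn, isCont, vc, k)
// tree.train(); tree.printTree(); println(tree.classify(Array(0.0, 1.0)))
```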