Package teamwork :: Package math :: Module id3

Module id3

This module contains functions for calculating the information gain of a dataset, as defined by the ID3 (information-theoretic) heuristic.

Functions
dict
frequency(data, attr)
Computes the histogram over values for the given attribute.
 
entropy(data, target_attr)
Calculates the entropy of the given data set for the target attribute.
 
gain(data, attr, target_attr)
Calculates the information gain (reduction in entropy) that would result from splitting the data on the chosen attribute (attr).
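
The two summaries above can be illustrated with a minimal sketch. It is not the module's actual source: it assumes that data is a list of dicts (as the frequency parameter documentation suggests), that entropy is measured in bits (log base 2), and that gain is the entropy of the whole set minus the size-weighted entropy of each subset produced by the split. The sample records and attribute names are hypothetical.

```python
import math
from collections import Counter


def entropy(data, target_attr):
    """Shannon entropy, in bits, of target_attr over a list of dict records."""
    counts = Counter(record[target_attr] for record in data)
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def gain(data, attr, target_attr):
    """Entropy of the full set minus the weighted entropy after splitting on attr."""
    total = len(data)
    remainder = 0.0
    for value in set(record[attr] for record in data):
        subset = [record for record in data if record[attr] == value]
        remainder += (len(subset) / total) * entropy(subset, target_attr)
    return entropy(data, target_attr) - remainder


# Hypothetical records: "outlook" separates the "play" labels perfectly.
records = [
    {"outlook": "sunny", "play": "no"},
    {"outlook": "sunny", "play": "no"},
    {"outlook": "rainy", "play": "yes"},
    {"outlook": "rainy", "play": "yes"},
]
print(entropy(records, "play"))
print(gain(records, "outlook", "play"))
```

Because the hypothetical split leaves every subset with a single label, the gain here equals the full entropy of the target attribute; a less informative attribute would yield a smaller value.
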
Function Details

frequency(data, attr)

Computes the histogram over values for the given attribute.

Parameters:
  • data (dict[]) - the data to be analyzed
  • attr - the attribute to be analyzed
Returns: dict
a dictionary of frequency counts, indexed by attribute values

Warning: assumes all fields have the same number of possible values
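
As an illustration of the documented behavior, here is a minimal sketch of a histogram over one attribute. It is an assumption-based example, not the module's implementation: it takes data to be a list of dicts that each contain attr, and it does not model the condition mentioned in the warning. The sample records are hypothetical.

```python
def frequency(data, attr):
    """Count how often each value of attr occurs in a list of dict records."""
    counts = {}
    for record in data:
        value = record[attr]
        counts[value] = counts.get(value, 0) + 1
    return counts


# Hypothetical records for illustration only.
records = [
    {"outlook": "sunny", "play": "no"},
    {"outlook": "sunny", "play": "yes"},
    {"outlook": "rainy", "play": "yes"},
]
histogram = frequency(records, "outlook")
print(histogram)
```

The returned dictionary is keyed by attribute value, matching the documented return type.
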