lifecycles.algorithms.purity

purity(labels: list) → Tuple[str, float]

Compute the purity of a list of labels. Purity is defined as the relative frequency of the most frequent attribute value, i.e. its count divided by the total number of labels.

Parameters:

labels – the list of labels (attribute values) to evaluate

Returns:

a tuple of the most frequent attribute value and its relative frequency

Example:

>>> purity(['a','a','b','b','b'])
('b', 0.6)
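
The definition above can be reproduced in a few lines. The following is only a minimal sketch, not necessarily how lifecycles implements purity: the name purity_sketch is hypothetical, counting is done with collections.Counter, and raising ValueError on an empty list is an assumption, since the documentation does not say how empty input is handled.

```python
from collections import Counter
from typing import Hashable, Sequence, Tuple


def purity_sketch(labels: Sequence[Hashable]) -> Tuple[Hashable, float]:
    """Return the most frequent label and its relative frequency."""
    if not labels:
        # Assumption: behaviour on empty input is not documented for purity().
        raise ValueError("labels must not be empty")
    # Count how many times each label occurs.
    counts = Counter(labels)
    # most_common(1) yields a one-element list of (label, count).
    value, count = counts.most_common(1)[0]
    # Relative frequency = occurrences of the top label / total labels.
    return value, count / len(labels)


print(purity_sketch(['a', 'a', 'b', 'b', 'b']))  # ('b', 0.6)
```

Here 'b' occurs 3 times out of 5 labels, so the relative frequency is 3 / 5 = 0.6, matching the example above.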