nltk.classify.decisiontree module
A classifier model that decides which label to assign to a token on the basis of a tree structure, where branches correspond to conditions on feature values, and leaves correspond to label assignments.
- class nltk.classify.decisiontree.DecisionTreeClassifier
Bases: ClassifierI
- __init__(label, feature_name=None, decisions=None, default=None)
- Parameters
label – The most likely label for tokens that reach this node in the decision tree. If this decision tree has no children, then this label will be assigned to any token that reaches this decision tree.
feature_name – The name of the feature that this decision tree selects for.
decisions – A dictionary mapping from feature values for the feature identified by feature_name to child decision trees.
default – The child that will be used if the value of feature feature_name does not match any of the keys in decisions. This is used when constructing binary decision trees.
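As a minimal sketch of how these constructor arguments fit together, the following builds a small tree by hand. The feature name "outlook" and the labels "play" and "stay_in" are hypothetical, chosen only for illustration.

```python
from nltk.classify.decisiontree import DecisionTreeClassifier

# Leaves: a node created without a feature_name just carries a label.
play_leaf = DecisionTreeClassifier("play")
stay_leaf = DecisionTreeClassifier("stay_in")

# Root node: branches on the hypothetical "outlook" feature.
# "play" is the root's own label, used when no branch applies.
tree = DecisionTreeClassifier(
    "play",
    feature_name="outlook",
    decisions={"sunny": play_leaf, "rainy": stay_leaf},
)

sunny_label = tree.classify({"outlook": "sunny"})
rainy_label = tree.classify({"outlook": "rainy"})
# "foggy" matches no key and there is no default child,
# so the root's own label is returned.
unseen_label = tree.classify({"outlook": "foggy"})
```

In practice trees are usually produced by train rather than assembled manually; building one by hand is mainly useful for seeing how label, decisions, and default interact.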
- static best_binary_stump(feature_names, labeled_featuresets, feature_values, verbose=False)
- classify(featureset)
- Returns
the most appropriate label for the given featureset.
- Return type
label
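The sketch below shows how classify might resolve a featureset when a default child is present, as in a binary tree. It assumes a featureset is a plain dict of feature names to values; the feature "contains_offer" and the labels "spam" and "ham" are hypothetical.

```python
from nltk.classify.decisiontree import DecisionTreeClassifier

# A hand-built binary node: one explicit branch plus a default child.
binary_tree = DecisionTreeClassifier(
    "ham",
    feature_name="contains_offer",
    decisions={True: DecisionTreeClassifier("spam")},
    default=DecisionTreeClassifier("ham"),
)

# The value True matches a key in decisions, so that child decides.
matched = binary_tree.classify({"contains_offer": True})
# False matches no key, so the default child is consulted.
fallback = binary_tree.classify({"contains_offer": False})
# A featureset without the feature also falls through to the default.
missing = binary_tree.classify({})
```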
- labels()
- Returns
the list of category labels used by this classifier.
- Return type
list of (immutable)
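A brief sketch of labels on a hand-built tree follows. The labels are hypothetical, and the result is sorted here because the order of the returned list should not be relied upon.

```python
from nltk.classify.decisiontree import DecisionTreeClassifier

# Hypothetical two-branch tree over a "size" feature.
size_tree = DecisionTreeClassifier(
    "small",
    feature_name="size",
    decisions={
        "s": DecisionTreeClassifier("small"),
        "l": DecisionTreeClassifier("large"),
    },
)

# Collect every label the tree can assign, without duplicates.
all_labels = sorted(size_tree.labels())
```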
- pretty_format(width=70, prefix='', depth=4)
Return a string containing a pretty-printed version of this decision tree. Each line in this string corresponds to a single decision tree node or leaf, and indentation is used to display the structure of the decision tree.
- pseudocode(prefix='', depth=4)
Return a string representation of this decision tree that expresses the decisions it makes as a nested set of pseudocode if statements.
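To illustrate the two string renderings, the sketch below applies pretty_format and pseudocode to a small hand-built tree. The feature and label names are hypothetical, and the exact spacing of the output may differ between NLTK versions, so no literal output is shown.

```python
from nltk.classify.decisiontree import DecisionTreeClassifier

# Hypothetical one-level tree with two leaves.
weather_tree = DecisionTreeClassifier(
    "play",
    feature_name="outlook",
    decisions={
        "sunny": DecisionTreeClassifier("play"),
        "rainy": DecisionTreeClassifier("stay_in"),
    },
)

# One line per branch, padded out towards the requested width.
pretty_text = weather_tree.pretty_format(width=40)
# Nested if-statements describing the same decisions.
pseudo_text = weather_tree.pseudocode()

print(pretty_text)
print(pseudo_text)
```

The depth argument on both methods limits how many levels of the tree are rendered, which can help keep the output readable for large trained trees.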
- refine(labeled_featuresets, entropy_cutoff, depth_cutoff, support_cutoff, binary=False, feature_values=None, verbose=False)
- static train(labeled_featuresets, entropy_cutoff=0.05, depth_cutoff=100, support_cutoff=10, binary=False, feature_values=None, verbose=False)
- Parameters
binary – If true, then treat all feature/value pairs as individual binary features, rather than using a single n-way branch for each feature.
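The following sketch trains on a tiny, made-up dataset to contrast the default n-way branching with binary=True. The features "outlook" and "windy" and the labels are hypothetical, and the data is constructed so that "outlook" alone determines the label.

```python
from nltk.classify.decisiontree import DecisionTreeClassifier

# Toy training data as (featureset, label) pairs; values are invented.
train_data = [
    ({"outlook": "sunny", "windy": False}, "play"),
    ({"outlook": "sunny", "windy": True}, "play"),
    ({"outlook": "overcast", "windy": False}, "play"),
    ({"outlook": "overcast", "windy": True}, "play"),
    ({"outlook": "rainy", "windy": False}, "stay_in"),
    ({"outlook": "rainy", "windy": True}, "stay_in"),
]

# Default: a single n-way branch per selected feature.
nway_tree = DecisionTreeClassifier.train(train_data)

# binary=True: each feature/value pair is tested as its own yes/no
# question, with non-matching values routed to a default child.
binary_tree = DecisionTreeClassifier.train(train_data, binary=True)

probe = {"outlook": "rainy", "windy": False}
nway_label = nway_tree.classify(probe)
binary_label = binary_tree.classify(probe)

print(nway_tree.pseudocode())
print(binary_tree.pseudocode())
```

With only six examples, the default support_cutoff of 10 means neither tree is refined beyond its first split, so both remain single-level here.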