nltk.parse.earleychart module
Data classes and parser implementations for incremental chart parsers, which use dynamic programming to efficiently parse a text. A “chart parser” derives parse trees for a text by iteratively adding “edges” to a “chart”. Each “edge” represents a hypothesis about the tree structure for a subsequence of the text. The “chart” is a “blackboard” for composing and combining these hypotheses.
A parser is “incremental” if it guarantees that, for all i, j where i < j, all edges ending at i are built before any edges ending at j. This property is appealing for applications such as speech recognizer hypothesis filtering.
The main parser class is EarleyChartParser, which implements a top-down algorithm originally formulated by Jay Earley (1970).
- class nltk.parse.earleychart.CompleteFundamentalRule
Bases: SingleEdgeFundamentalRule
- class nltk.parse.earleychart.CompleterRule
Bases: CompleteFundamentalRule
- class nltk.parse.earleychart.EarleyChartParser
Bases: IncrementalChartParser
- __init__(grammar, **parser_args)
Create a new Earley chart parser that uses grammar to parse texts.
- Parameters
grammar (CFG) – The grammar used to parse texts.
trace (int) – The level of tracing that should be used when parsing a text. 0 will generate no tracing output; higher numbers will produce more verbose tracing output.
trace_chart_width (int) – The default total width reserved for the chart in trace output. The remainder of each line will be used to display edges.
chart_class – The class that should be used to create the charts used by this parser.
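A minimal usage sketch for EarleyChartParser follows. The toy grammar and sentence are invented for illustration and are not part of NLTK; the example assumes NLTK is installed and relies only on CFG.fromstring and the parser's parse() method, which yields parse trees.

```python
from nltk import CFG
from nltk.parse.earleychart import EarleyChartParser

# Toy grammar, made up for this example.
grammar = CFG.fromstring("""
    S -> NP VP
    NP -> Det N
    VP -> V NP
    Det -> 'the'
    N -> 'dog' | 'cat'
    V -> 'chased'
""")

parser = EarleyChartParser(grammar)
tokens = "the dog chased the cat".split()

# parse() returns an iterator over trees; materialize it so it can be reused.
trees = list(parser.parse(tokens))
for tree in trees:
    print(tree)
```

Every token must be covered by the grammar; otherwise parsing is expected to fail with a ValueError before any edges are built.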
- class nltk.parse.earleychart.FeatureCompleterRule
Bases: CompleterRule
- class nltk.parse.earleychart.FeatureEarleyChartParser
Bases: FeatureIncrementalChartParser
- __init__(grammar, **parser_args)
Create a new Earley chart parser that uses grammar to parse texts.
- Parameters
grammar (CFG) – The grammar used to parse texts.
trace (int) – The level of tracing that should be used when parsing a text. 0 will generate no tracing output; higher numbers will produce more verbose tracing output.
trace_chart_width (int) – The default total width reserved for the chart in trace output. The remainder of each line will be used to display edges.
chart_class – The class that should be used to create the charts used by this parser.
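The feature-based variant can be sketched with a small feature grammar that enforces number agreement. The grammar below is an invented example (not shipped with NLTK), and the sketch assumes FeatureGrammar.fromstring from nltk.grammar is used to build it.

```python
from nltk.grammar import FeatureGrammar
from nltk.parse.earleychart import FeatureEarleyChartParser

# Toy agreement grammar, made up for this example: ?n must unify
# across the determiner, noun, and verb.
grammar = FeatureGrammar.fromstring("""
    % start S
    S -> NP[NUM=?n] VP[NUM=?n]
    NP[NUM=?n] -> Det[NUM=?n] N[NUM=?n]
    VP[NUM=?n] -> V[NUM=?n]
    Det[NUM=sg] -> 'this'
    Det[NUM=pl] -> 'these'
    N[NUM=sg] -> 'dog'
    N[NUM=pl] -> 'dogs'
    V[NUM=sg] -> 'barks'
    V[NUM=pl] -> 'bark'
""")

parser = FeatureEarleyChartParser(grammar)

# Agreement holds, so one parse is expected.
good = list(parser.parse("this dog barks".split()))
# 'this' (sg) and 'dogs' (pl) do not unify, so no parse is expected.
bad = list(parser.parse("this dogs barks".split()))

print(len(good), len(bad))
```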
- class nltk.parse.earleychart.FeatureIncrementalBottomUpChartParser
Bases: FeatureIncrementalChartParser
- __init__(grammar, **parser_args)
Create a new Earley chart parser that uses grammar to parse texts.
- Parameters
grammar (CFG) – The grammar used to parse texts.
trace (int) – The level of tracing that should be used when parsing a text. 0 will generate no tracing output; higher numbers will produce more verbose tracing output.
trace_chart_width (int) – The default total width reserved for the chart in trace output. The remainder of each line will be used to display edges.
chart_class – The class that should be used to create the charts used by this parser.
- class nltk.parse.earleychart.FeatureIncrementalBottomUpLeftCornerChartParser
Bases: FeatureIncrementalChartParser
- __init__(grammar, **parser_args)
Create a new Earley chart parser that uses grammar to parse texts.
- Parameters
grammar (CFG) – The grammar used to parse texts.
trace (int) – The level of tracing that should be used when parsing a text. 0 will generate no tracing output; higher numbers will produce more verbose tracing output.
trace_chart_width (int) – The default total width reserved for the chart in trace output. The remainder of each line will be used to display edges.
chart_class – The class that should be used to create the charts used by this parser.
- class nltk.parse.earleychart.FeatureIncrementalChart
Bases: IncrementalChart, FeatureChart
- class nltk.parse.earleychart.FeatureIncrementalChartParser
Bases: IncrementalChartParser, FeatureChartParser
- __init__(grammar, strategy=[<nltk.parse.chart.LeafInitRule object>, <nltk.parse.featurechart.FeatureEmptyPredictRule object>, <nltk.parse.featurechart.FeatureBottomUpPredictCombineRule object>, <nltk.parse.earleychart.FeatureCompleteFundamentalRule object>], trace_chart_width=20, chart_class=<class 'nltk.parse.earleychart.FeatureIncrementalChart'>, **parser_args)
Create a new Earley chart parser that uses grammar to parse texts.
- Parameters
grammar (CFG) – The grammar used to parse texts.
trace (int) – The level of tracing that should be used when parsing a text. 0 will generate no tracing output; higher numbers will produce more verbose tracing output.
trace_chart_width (int) – The default total width reserved for the chart in trace output. The remainder of each line will be used to display edges.
chart_class – The class that should be used to create the charts used by this parser.
- class nltk.parse.earleychart.FeatureIncrementalTopDownChartParser
Bases: FeatureIncrementalChartParser
- __init__(grammar, **parser_args)
Create a new Earley chart parser that uses grammar to parse texts.
- Parameters
grammar (CFG) – The grammar used to parse texts.
trace (int) – The level of tracing that should be used when parsing a text. 0 will generate no tracing output; higher numbers will produce more verbose tracing output.
trace_chart_width (int) – The default total width reserved for the chart in trace output. The remainder of each line will be used to display edges.
chart_class – The class that should be used to create the charts used by this parser.
- class nltk.parse.earleychart.FeaturePredictorRule
Bases: FeatureTopDownPredictRule
- class nltk.parse.earleychart.FeatureScannerRule
Bases: ScannerRule
- class nltk.parse.earleychart.FilteredCompleteFundamentalRule
- class nltk.parse.earleychart.IncrementalBottomUpChartParser
Bases: IncrementalChartParser
- __init__(grammar, **parser_args)
Create a new Earley chart parser that uses grammar to parse texts.
- Parameters
grammar (CFG) – The grammar used to parse texts.
trace (int) – The level of tracing that should be used when parsing a text. 0 will generate no tracing output; higher numbers will produce more verbose tracing output.
trace_chart_width (int) – The default total width reserved for the chart in trace output. The remainder of each line will be used to display edges.
chart_class – The class that should be used to create the charts used by this parser.
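The effect of the trace and trace_chart_width parameters can be observed by capturing standard output. This is a hedged sketch: the grammar is a toy example, and it assumes the trace is written with ordinary print calls, so contextlib.redirect_stdout can collect it. The exact layout of the trace lines is not asserted here.

```python
import contextlib
import io

from nltk import CFG
from nltk.parse.earleychart import IncrementalBottomUpChartParser

# Toy grammar, made up for this example.
grammar = CFG.fromstring("""
    S -> NP VP
    NP -> Det N
    VP -> V NP
    Det -> 'the'
    N -> 'dog' | 'cat'
    V -> 'chased'
""")
tokens = "the dog chased the cat".split()


def captured_trace(**parser_args):
    """Parse tokens and return whatever the parser printed while doing so."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        parser = IncrementalBottomUpChartParser(grammar, **parser_args)
        list(parser.parse(tokens))  # exhaust the iterator
    return buffer.getvalue()


quiet_output = captured_trace()  # trace defaults to 0
verbose_output = captured_trace(trace=1, trace_chart_width=40)

print(verbose_output)
```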
- class nltk.parse.earleychart.IncrementalBottomUpLeftCornerChartParser
Bases: IncrementalChartParser
- __init__(grammar, **parser_args)
Create a new Earley chart parser that uses grammar to parse texts.
- Parameters
grammar (CFG) – The grammar used to parse texts.
trace (int) – The level of tracing that should be used when parsing a text. 0 will generate no tracing output; higher numbers will produce more verbose tracing output.
trace_chart_width (int) – The default total width reserved for the chart in trace output. The remainder of each line will be used to display edges.
chart_class – The class that should be used to create the charts used by this parser.
- class nltk.parse.earleychart.IncrementalChart
Bases: Chart
- edges()
Return a list of all edges in this chart. New edges that are added to the chart after the call to edges() will not be contained in this list.
- Return type
list(EdgeI)
- See
iteredges, select
- iteredges()
Return an iterator over the edges in this chart. It is not guaranteed that new edges which are added to the chart before the iterator is exhausted will also be generated.
- Return type
iter(EdgeI)
- See
edges, select
- select(end, **restrictions)
Return an iterator over the edges in this chart. Any new edges that are added to the chart before the iterator is exhausted will also be generated. restrictions can be used to restrict the set of edges that will be generated.
- Parameters
span – Only generate edges e where e.span()==span
start – Only generate edges e where e.start()==start
end – Only generate edges e where e.end()==end
length – Only generate edges e where e.length()==length
lhs – Only generate edges e where e.lhs()==lhs
rhs – Only generate edges e where e.rhs()==rhs
nextsym – Only generate edges e where e.nextsym()==nextsym
dot – Only generate edges e where e.dot()==dot
is_complete – Only generate edges e where e.is_complete()==is_complete
is_incomplete – Only generate edges e where e.is_incomplete()==is_incomplete
- Return type
iter(EdgeI)
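As an illustration of querying an IncrementalChart, the sketch below builds a chart with chart_parse() and then filters it with select(). The grammar is a toy example invented here. Note that, per the signature above, end is a required argument of select() on this class, with the other restrictions passed as keywords.

```python
from nltk import CFG
from nltk.parse.earleychart import EarleyChartParser

# Toy grammar, made up for this example.
grammar = CFG.fromstring("""
    S -> NP VP
    NP -> Det N
    VP -> V NP
    Det -> 'the'
    N -> 'dog' | 'cat'
    V -> 'chased'
""")
tokens = "the dog chased the cat".split()
n = len(tokens)

# chart_parse() returns the chart itself rather than the parse trees.
chart = EarleyChartParser(grammar).chart_parse(tokens)

all_edges = chart.edges()                   # snapshot list of every edge
ending_at_n = list(chart.select(end=n))     # edges whose right boundary is n
complete_s = list(
    chart.select(end=n, start=0, lhs=grammar.start(), is_complete=True)
)                                           # complete S edges spanning the input

for edge in complete_s:
    print(edge)
```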
- class nltk.parse.earleychart.IncrementalChartParser
Bases: ChartParser
An incremental chart parser implementing Jay Earley’s parsing algorithm:
For each index end in [0, 1, …, N]:
    For each edge such that edge.end = end:
        If edge is incomplete and edge.next is not a part of speech:
            Apply PredictorRule to edge
        If edge is incomplete and edge.next is a part of speech:
            Apply ScannerRule to edge
        If edge is complete:
            Apply CompleterRule to edge
Return any complete parses in the chart
- __init__(grammar, strategy=[<nltk.parse.chart.LeafInitRule object>, <nltk.parse.chart.EmptyPredictRule object>, <nltk.parse.chart.BottomUpPredictCombineRule object>, <nltk.parse.earleychart.CompleteFundamentalRule object>], trace=0, trace_chart_width=50, chart_class=<class 'nltk.parse.earleychart.IncrementalChart'>)
Create a new Earley chart parser that uses grammar to parse texts.
- Parameters
grammar (CFG) – The grammar used to parse texts.
trace (int) – The level of tracing that should be used when parsing a text. 0 will generate no tracing output; higher numbers will produce more verbose tracing output.
trace_chart_width (int) – The default total width reserved for the chart in trace output. The remainder of each line will be used to display edges.
chart_class – The class that should be used to create the charts used by this parser.
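The predictor, scanner, and completer loop described for IncrementalChartParser can be sketched in plain Python. This is an independent, simplified recognizer written for illustration and is not NLTK's implementation: the names Item, earley_recognize, rules, and lexicon are hypothetical, the grammar is a toy example, and the sketch assumes a grammar without empty productions.

```python
from collections import namedtuple

# An Earley item: production lhs -> rhs, dot position, and start index.
Item = namedtuple("Item", "lhs rhs dot start")


def earley_recognize(tokens, rules, lexicon, start="S"):
    """Return True if tokens can be derived from the start symbol.

    rules:   nonterminal -> list of right-hand sides (tuples of symbols)
    lexicon: part of speech -> set of words
    Assumes no empty productions.
    """
    n = len(tokens)
    columns = [[] for _ in range(n + 1)]  # columns[i]: items ending at i

    def add(i, item):
        if item not in columns[i]:
            columns[i].append(item)

    for rhs in rules[start]:
        add(0, Item(start, rhs, 0, 0))

    for end in range(n + 1):
        k = 0
        while k < len(columns[end]):  # the column may grow while it is scanned
            item = columns[end][k]
            k += 1
            if item.dot < len(item.rhs):
                nxt = item.rhs[item.dot]
                if nxt in lexicon:
                    # Scanner: consume one token of the expected category.
                    if end < n and tokens[end] in lexicon[nxt]:
                        add(end + 1, item._replace(dot=item.dot + 1))
                else:
                    # Predictor: expand the expected nonterminal at this position.
                    for rhs in rules.get(nxt, ()):
                        add(end, Item(nxt, rhs, 0, end))
            else:
                # Completer: advance items that were waiting for item.lhs.
                for prev in list(columns[item.start]):
                    if prev.dot < len(prev.rhs) and prev.rhs[prev.dot] == item.lhs:
                        add(end, prev._replace(dot=prev.dot + 1))

    return any(
        it.lhs == start and it.start == 0 and it.dot == len(it.rhs)
        for it in columns[n]
    )


# Toy grammar, made up for this example; it includes left recursion.
rules = {
    "S": [("NP", "VP")],
    "NP": [("Det", "N"), ("NP", "PP")],
    "VP": [("V", "NP"), ("VP", "PP")],
    "PP": [("P", "NP")],
}
lexicon = {
    "Det": {"the", "a"},
    "N": {"dog", "cat", "park"},
    "V": {"chased"},
    "P": {"in"},
}

print(earley_recognize("the dog chased a cat in the park".split(), rules, lexicon))
print(earley_recognize("the dog chased".split(), rules, lexicon))
```

Because column end is finished before column end + 1 is touched, the sketch has the incremental property: all items ending at i are built before any item ending at j > i.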
- class nltk.parse.earleychart.IncrementalLeftCornerChartParser
Bases: IncrementalChartParser
- __init__(grammar, **parser_args)
Create a new Earley chart parser that uses grammar to parse texts.
- Parameters
grammar (CFG) – The grammar used to parse texts.
trace (int) – The level of tracing that should be used when parsing a text. 0 will generate no tracing output; higher numbers will produce more verbose tracing output.
trace_chart_width (int) – The default total width reserved for the chart in trace output. The remainder of each line will be used to display edges.
chart_class – The class that should be used to create the charts used by this parser.
- class nltk.parse.earleychart.IncrementalTopDownChartParser
Bases: IncrementalChartParser
- __init__(grammar, **parser_args)
Create a new Earley chart parser that uses grammar to parse texts.
- Parameters
grammar (CFG) – The grammar used to parse texts.
trace (int) – The level of tracing that should be used when parsing a text. 0 will generate no tracing output; higher numbers will produce more verbose tracing output.
trace_chart_width (int) – The default total width reserved for the chart in trace output. The remainder of each line will be used to display edges.
chart_class – The class that should be used to create the charts used by this parser.
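The incremental parser classes in this module differ in strategy but are expected to find the same parses for a given grammar and sentence. The sketch below checks this on an ambiguous toy grammar invented for this example. It assumes a grammar without empty productions, which the left-corner variant requires.

```python
from nltk import CFG
from nltk.parse.earleychart import (
    EarleyChartParser,
    IncrementalBottomUpChartParser,
    IncrementalBottomUpLeftCornerChartParser,
    IncrementalLeftCornerChartParser,
    IncrementalTopDownChartParser,
)

# Toy grammar with PP-attachment ambiguity, made up for this example.
grammar = CFG.fromstring("""
    S -> NP VP
    PP -> P NP
    NP -> Det N | NP PP | 'I'
    VP -> V NP | VP PP
    Det -> 'the' | 'a'
    N -> 'man' | 'telescope'
    V -> 'saw'
    P -> 'with'
""")
tokens = "I saw the man with a telescope".split()

parser_classes = [
    EarleyChartParser,
    IncrementalTopDownChartParser,
    IncrementalBottomUpChartParser,
    IncrementalBottomUpLeftCornerChartParser,
    IncrementalLeftCornerChartParser,
]

# Map each class name to the set of bracketed trees it produces.
results = {
    cls.__name__: {str(tree) for tree in cls(grammar).parse(tokens)}
    for cls in parser_classes
}

for name, trees in results.items():
    print(name, len(trees))
```

The prepositional phrase can attach either to the noun phrase or to the verb phrase, so two distinct trees are expected from each parser.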
- class nltk.parse.earleychart.PredictorRule
Bases: CachedTopDownPredictRule
- class nltk.parse.earleychart.ScannerRule
Bases: CompleteFundamentalRule