nltk.corpus.reader.propbank module¶
- class nltk.corpus.reader.propbank.PropbankChainTreePointer[source]¶
Bases:
PropbankPointer
- pieces¶
A list of the pieces that make up this chain. Elements may be either
PropbankSplitTreePointer
or PropbankTreePointer
pointers.
- class nltk.corpus.reader.propbank.PropbankCorpusReader[source]¶
Bases:
CorpusReader
Corpus reader for the propbank corpus, which augments the Penn Treebank with information about the predicate argument structure of every verb instance. The corpus consists of two parts: the predicate-argument annotations themselves, and a set of “frameset files” which define the argument labels used by the annotations, on a per-verb basis. Each “frameset file” contains one or more predicates, such as
'turn'
or 'turn_on'
, each of which is divided into coarse-grained word senses called “rolesets”. For each “roleset”, the frameset file provides descriptions of the argument roles, along with examples.
- __init__(root, propfile, framefiles='', verbsfile=None, parse_fileid_xform=None, parse_corpus=None, encoding='utf8')[source]¶
- Parameters
root – The root directory for this corpus.
propfile – The name of the file containing the predicate-argument annotations (relative to
root
).
framefiles – A list or regexp specifying the frameset fileids for this corpus.
parse_fileid_xform – A transform that should be applied to the fileids in this corpus. This should be a function of one argument (a fileid) that returns a string (the new fileid).
parse_corpus – The corpus containing the parse trees corresponding to this corpus. These parse trees are necessary to resolve the tree pointers used by propbank.
- instances(baseform=None)[source]¶
- Returns
a corpus view that acts as a list of
PropbankInstance
objects, one for each annotated verb instance in the corpus.
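As a hedged illustration of consuming the list-like view returned by instances(), the sketch below tallies instances per roleset. The helper name roleset_counts and the stand-in objects are hypothetical. With NLTK and the PropBank corpus data installed, the view from nltk.corpus.propbank.instances() could be passed in instead, assuming each element exposes the roleset attribute documented for PropbankInstance.

```python
from collections import Counter
from types import SimpleNamespace


def roleset_counts(instances):
    """Count how many instances use each roleset name.

    `instances` is any iterable of objects with a `roleset` attribute.
    """
    return Counter(inst.roleset for inst in instances)


# Stand-in objects (hypothetical data, not read from the real corpus).
fake_instances = [
    SimpleNamespace(roleset="turn.01"),
    SimpleNamespace(roleset="turn.01"),
    SimpleNamespace(roleset="join.01"),
]

counts = roleset_counts(fake_instances)
for name, n in counts.most_common():
    print(name, n)
```

Because the helper only iterates once and reads one attribute, it should work on a lazy corpus view without loading every instance into memory first.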
- class nltk.corpus.reader.propbank.PropbankInflection[source]¶
Bases:
object
- ACTIVE = 'a'¶
- FINITE = 'v'¶
- FUTURE = 'f'¶
- GERUND = 'g'¶
- INFINITIVE = 'i'¶
- NONE = '-'¶
- PARTICIPLE = 'p'¶
- PASSIVE = 'p'¶
- PAST = 'p'¶
- PERFECT = 'p'¶
- PERFECT_AND_PROGRESSIVE = 'b'¶
- PRESENT = 'n'¶
- PROGRESSIVE = 'o'¶
- THIRD_PERSON = '3'¶
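Several of the constants above share the letter 'p', so a code can only be interpreted once its position is known. The sketch below assumes the inflection is written as a five-character string whose slots are, in order, form, tense, aspect, person, and voice, with '-' marking an empty slot. That ordering is an assumption not stated on this page, and the names SLOTS and decode_inflection are hypothetical.

```python
# Each slot has its own code table, so 'p' can mean participle, past,
# perfect, or passive depending on where it appears.
# The slot order below is an assumption, not taken from this page.
SLOTS = [
    ("form", {"i": "infinitive", "g": "gerund", "p": "participle", "v": "finite"}),
    ("tense", {"f": "future", "p": "past", "n": "present"}),
    ("aspect", {"p": "perfect", "o": "progressive", "b": "perfect and progressive"}),
    ("person", {"3": "third person"}),
    ("voice", {"a": "active", "p": "passive"}),
]


def decode_inflection(code):
    """Map a five-character inflection string to a dict of slot names."""
    if len(code) != len(SLOTS):
        raise ValueError(f"expected {len(SLOTS)} characters, got {code!r}")
    decoded = {}
    for char, (slot, table) in zip(code, SLOTS):
        if char == "-":
            decoded[slot] = None  # '-' corresponds to NONE
        elif char in table:
            decoded[slot] = table[char]
        else:
            raise ValueError(f"unexpected {char!r} in slot {slot!r}")
    return decoded


print(decode_inflection("vp--a"))
```

Keeping one table per slot is what lets the four meanings of 'p' coexist without ambiguity.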
- class nltk.corpus.reader.propbank.PropbankInstance[source]¶
Bases:
object
- __init__(fileid, sentnum, wordnum, tagger, roleset, inflection, predicate, arguments, parse_corpus=None)[source]¶
- arguments¶
A list of tuples (argloc, argid), specifying the location and identifier for each of the predicate’s arguments in the containing sentence. Argument identifiers are strings such as
'ARG0'
or 'ARGM-TMP'
. This list does not contain the predicate.
- property baseform¶
The baseform of the predicate.
- fileid¶
The name of the file containing the parse tree for this instance’s sentence.
- inflection¶
A
PropbankInflection
object describing the inflection of this instance’s predicate.
- parse_corpus¶
A corpus reader for the parse trees corresponding to the instances in this propbank corpus.
- predicate¶
A
PropbankTreePointer
indicating the position of this instance’s predicate within its containing sentence.
- property predid¶
Identifier of the predicate.
- roleset¶
The name of the roleset used by this instance’s predicate. Use
propbank.roleset() (PropbankCorpusReader.roleset)
to look up information about the roleset.
- property sensenumber¶
The sense number of the predicate.
- sentnum¶
The sentence number of this sentence within
fileid
. Indexing starts from zero.
- tagger¶
An identifier for the tagger who tagged this instance; or
'gold'
if this is an adjudicated instance.
- property tree¶
The parse tree corresponding to this instance, or None if the corresponding tree is not available.
- wordnum¶
The word number of this instance’s predicate within its containing sentence. Word numbers are indexed starting from zero, and include traces and other empty parse elements.
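To make the relationship between these attributes more concrete, the following sketch works on plain Python values rather than real PropbankInstance objects. It assumes roleset names have the shape 'baseform.sensenumber' (for example 'turn.01'), which this page does not state explicitly, and it uses pointer strings as stand-ins for the argloc pointer objects. Both helper names are hypothetical.

```python
def split_roleset(roleset):
    """Split a roleset name such as 'turn.01' into (baseform, sensenumber).

    Assumes a single '.' separates the two parts.
    """
    baseform, _, sensenumber = roleset.partition(".")
    return baseform, sensenumber


def arguments_by_label(arguments):
    """Group (argloc, argid) tuples by argument identifier."""
    grouped = {}
    for argloc, argid in arguments:
        grouped.setdefault(argid, []).append(argloc)
    return grouped


# Hypothetical values shaped like the attributes described above.
roleset = "turn.01"
arguments = [("0:1", "ARG0"), ("3:1", "ARG1"), ("5:1", "ARGM-TMP")]

print(split_roleset(roleset))
print(arguments_by_label(arguments))
```

Grouping by label keeps a list per identifier, since nothing on this page rules out a label appearing more than once in a sentence.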
- class nltk.corpus.reader.propbank.PropbankPointer[source]¶
Bases:
object
A pointer used by propbank to identify one or more constituents in a parse tree.
PropbankPointer is an abstract base class with three concrete subclasses:
  - PropbankTreePointer is used to point to single constituents.
  - PropbankSplitTreePointer is used to point to ‘split’ constituents, which consist of a sequence of two or more PropbankTreePointer pointers.
  - PropbankChainTreePointer is used to point to entire trace chains in a tree. It consists of a sequence of pieces, which can be PropbankTreePointer or PropbankSplitTreePointer pointers.
- class nltk.corpus.reader.propbank.PropbankSplitTreePointer[source]¶
Bases:
PropbankPointer
- pieces¶
A list of the pieces that make up this split constituent. Elements are all
PropbankTreePointer
pointers.
- class nltk.corpus.reader.propbank.PropbankTreePointer[source]¶
Bases:
PropbankPointer
wordnum:height*wordnum:height*… wordnum:height,
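The pointer notation above can be read as a small grammar. The sketch below is a minimal parser written under two assumptions that this page does not spell out: '*' separates the pieces of a chain, and ',' separates the pieces of a split constituent. It returns plain tuples rather than the pointer classes, and parse_pointer is a hypothetical name, not the library's own parser.

```python
def parse_pointer(text):
    """Parse a pointer string into nested tuples.

    Returns ("tree", wordnum, height) for a single constituent,
    ("split", [...]) for comma-joined pieces, and
    ("chain", [...]) for star-joined pieces.
    """
    # Chains are the outermost level, so split on '*' first (assumption).
    pieces = text.split("*")
    if len(pieces) > 1:
        return ("chain", [parse_pointer(piece) for piece in pieces])

    # Within one chain piece, ',' joins the parts of a split constituent.
    pieces = text.split(",")
    if len(pieces) > 1:
        return ("split", [parse_pointer(piece) for piece in pieces])

    # A single constituent: word number, then height above that word.
    wordnum, height = text.split(":")
    return ("tree", int(wordnum), int(height))


print(parse_pointer("2:1"))
print(parse_pointer("1:0,3:0"))
print(parse_pointer("0:2*5:0,7:1"))
```

Splitting on '*' before ',' mirrors the class descriptions: a chain may contain split pointers, but a split pointer contains only single-constituent pointers.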