nltk.corpus.reader.senseval module
Read from the Senseval 2 Corpus.
SENSEVAL [http://www.senseval.org/]: evaluation exercises for Word Sense Disambiguation, organized by ACL-SIGLEX [https://www.siglex.org/].
Prepared by Ted Pedersen <tpederse@umn.edu>, University of Minnesota, https://www.d.umn.edu/~tpederse/data.html. Distributed with permission.
The NLTK version of the Senseval 2 files uses well-formed XML. Each instance of the ambiguous words “hard”, “interest”, “line”, and “serve” is tagged with a sense identifier, and supplied with context.
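The sketch below illustrates what "tagged with a sense identifier, and supplied with context" looks like in practice. It assumes that each instance exposes `word`, `position`, `context`, and `senses` attributes, and that the corpus has been fetched beforehand (for example with `nltk.download('senseval')`). The `Instance` namedtuple, the `sample` list, and the helpers `sense_counts` and `head_token` are hypothetical names introduced here for illustration; they are not part of NLTK.

```python
from collections import Counter, namedtuple

# Hypothetical stand-in with the same four attributes the reader's
# instances are assumed to expose: word, position, context, senses.
Instance = namedtuple("Instance", ["word", "position", "context", "senses"])

# Hand-written illustrative data, NOT taken from the corpus.
sample = [
    Instance("hard-a", 2, [("this", "DT"), ("is", "VBZ"), ("hard", "JJ")], ("HARD1",)),
    Instance("hard-a", 1, [("a", "DT"), ("hard", "JJ"), ("rock", "NN")], ("HARD3",)),
    Instance("hard-a", 3, [("it", "PRP"), ("was", "VBD"), ("very", "RB"), ("hard", "JJ")], ("HARD1",)),
]


def sense_counts(instances):
    """Count how often each sense identifier occurs across the instances."""
    counts = Counter()
    for inst in instances:
        counts.update(inst.senses)
    return counts


def head_token(inst):
    """Return the context token found at the instance's head position."""
    return inst.context[inst.position]


if __name__ == "__main__":
    try:
        # Use the real corpus when NLTK and the senseval data are available.
        from nltk.corpus import senseval
        instances = senseval.instances("hard.pos")
    except (ImportError, LookupError):
        # Otherwise fall back to the hand-written sample above.
        instances = sample
    print(sense_counts(instances).most_common(3))
```

The helpers only read attributes, so they work the same way on the stand-in tuples and on instances returned by the reader.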
- class nltk.corpus.reader.senseval.SensevalCorpusReader
Bases: CorpusReader
- class nltk.corpus.reader.senseval.SensevalCorpusView
Bases: StreamBackedCorpusView
- __init__(fileid, encoding)
Create a new corpus view, based on the file fileid, and read with block_reader. See the class documentation for more information.
Parameters
- fileid – The path to the file that is read by this corpus view. fileid can either be a string or a PathPointer.
- startpos – The file position at which the view will start reading. This can be used to skip over preface sections.
- encoding – The Unicode encoding that should be used to read the file’s contents. If no encoding is specified, then the file’s contents will be read as a non-unicode string (i.e., a str).