nltk.corpus.reader.lin module
- class nltk.corpus.reader.lin.LinThesaurusCorpusReader
Bases: CorpusReader
Wrapper for the LISP-formatted thesauruses distributed by Dekang Lin.
- __init__(root, badscore=0.0)
Initialize the thesaurus.
- Parameters
root (string) – root directory containing the thesaurus LISP files
badscore (float) – the score assigned to pairs of words that do not appear in each other’s sets of synonyms
- scored_synonyms(ngram, fileid=None)
Returns a list of scored synonyms (tuples of synonyms and scores) for the given ngram.
- Parameters
ngram (string) – ngram to look up
fileid (string) – thesaurus fileid to search in. If None, search all fileids.
- Returns
If fileid is specified, list of tuples of synonyms and scores; otherwise, list of tuples of fileids and lists, where inner lists consist of tuples of synonyms and scores.
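As an illustration of consuming the single-fileid return value, the sketch below ranks scored synonyms by score. It assumes each pair is ordered (synonym, score), as in the method summary. `StubReader`, its made-up data, and the `top_synonyms` helper are hypothetical stand-ins so the snippet runs without the corpus installed; the fileid `simN.lsp` is likewise only an example name.

```python
class StubReader:
    """Hypothetical stand-in that mimics the documented scored_synonyms() shapes."""

    def __init__(self, data):
        # data: {fileid: {ngram: {synonym: score}}}
        self._data = data

    def scored_synonyms(self, ngram, fileid=None):
        if fileid:
            # Single fileid: a flat list of (synonym, score) pairs.
            return list(self._data[fileid].get(ngram, {}).items())
        # All fileids: a list of (fileid, [(synonym, score), ...]) pairs.
        return [
            (fid, list(entries.get(ngram, {}).items()))
            for fid, entries in self._data.items()
        ]


def top_synonyms(reader, ngram, fileid, n=2):
    """Return the n highest-scoring (synonym, score) pairs for one fileid."""
    pairs = reader.scored_synonyms(ngram, fileid=fileid)
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)[:n]


# Made-up scores, for illustration only.
reader = StubReader(
    {"simN.lsp": {"business": {"industry": 0.26, "company": 0.29, "firm": 0.20}}}
)
best = top_synonyms(reader, "business", "simN.lsp")
print(best)
```

The helper only relies on the `scored_synonyms(ngram, fileid=...)` signature documented above, so it should also accept a real reader instance, provided the pair ordering assumption holds for it.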
- similarity(ngram1, ngram2, fileid=None)
Returns the similarity score for two ngrams.
- Parameters
ngram1 (string) – first ngram to compare
ngram2 (string) – second ngram to compare
fileid (string) – thesaurus fileid to search in. If None, search all fileids.
- Returns
If fileid is specified, just the score for the two ngrams; otherwise, list of tuples of fileids and scores.
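The two return shapes can be handled as in the sketch below, which picks the highest score across all fileids. `StubReader` is a hypothetical stand-in that models only the behaviour documented here (a stored score when one exists, `badscore` otherwise), and `best_similarity` is an illustrative helper, not part of NLTK. The fileids and scores are invented.

```python
class StubReader:
    """Hypothetical stand-in modelling the documented similarity() shapes."""

    def __init__(self, data, badscore=0.0):
        # data: {fileid: {ngram: {synonym: score}}}
        self._data = data
        self._badscore = badscore

    def similarity(self, ngram1, ngram2, fileid=None):
        def score(fid):
            # Fall back to badscore when ngram2 is not listed under ngram1.
            return self._data[fid].get(ngram1, {}).get(ngram2, self._badscore)

        if fileid:
            # Single fileid: just the score.
            return score(fileid)
        # All fileids: a list of (fileid, score) pairs.
        return [(fid, score(fid)) for fid in self._data]


def best_similarity(reader, ngram1, ngram2):
    """Return the (fileid, score) pair with the highest score across fileids."""
    return max(reader.similarity(ngram1, ngram2), key=lambda pair: pair[1])


# Made-up scores, for illustration only.
reader = StubReader(
    {
        "simN.lsp": {"business": {"company": 0.29}},
        "simV.lsp": {"business": {}},
    },
    badscore=-1.0,
)
single = reader.similarity("business", "company", fileid="simN.lsp")
across = reader.similarity("business", "company")
best = best_similarity(reader, "business", "company")
print(single, across, best)
```

Choosing a `badscore` outside the range of real scores, as the negative value here does, makes missing pairs easy to tell apart from genuinely low similarity.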
- synonyms(ngram, fileid=None)
Returns a list of synonyms for the given ngram.
- Parameters
ngram (string) – ngram to look up
fileid (string) – thesaurus fileid to search in. If None, search all fileids.
- Returns
If fileid is specified, list of synonyms; otherwise, list of tuples of fileids and lists, where inner lists contain synonyms.
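When no fileid is given, the per-fileid lists can be merged into a single set, as in this sketch. `StubReader` and `all_synonyms` are hypothetical names used for illustration, and the data is invented; only the `synonyms(ngram, fileid=None)` signature and its return shapes come from the description above.

```python
class StubReader:
    """Hypothetical stand-in that mimics the documented synonyms() shapes."""

    def __init__(self, data):
        # data: {fileid: {ngram: {synonym: score}}}
        self._data = data

    def synonyms(self, ngram, fileid=None):
        if fileid:
            # Single fileid: a flat list of synonyms.
            return list(self._data[fileid].get(ngram, {}))
        # All fileids: a list of (fileid, [synonym, ...]) pairs.
        return [
            (fid, list(entries.get(ngram, {})))
            for fid, entries in self._data.items()
        ]


def all_synonyms(reader, ngram):
    """Merge the synonyms found in every fileid into one set."""
    merged = set()
    for _fileid, words in reader.synonyms(ngram):
        merged.update(words)
    return merged


# Made-up entries, for illustration only.
reader = StubReader(
    {
        "simN.lsp": {"run": {"race": 0.15, "sprint": 0.12}},
        "simV.lsp": {"run": {"walk": 0.21, "sprint": 0.18}},
    }
)
merged = all_synonyms(reader, "run")
print(sorted(merged))
```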