nltk.corpus.reader.bcp47 module
- class nltk.corpus.reader.bcp47.BCP47CorpusReader
Bases: CorpusReader
Parse BCP-47 composite language tags.
Supports all the main subtags, and the ‘u-sd’ extension:
>>> from nltk.corpus import bcp47
>>> bcp47.name('oc-gascon-u-sd-fr64')
'Occitan (post 1500): Gascon: Pyrénées-Atlantiques'
Can load a conversion table to Wikidata Q-codes:
>>> bcp47.load_wiki_q()
>>> bcp47.wiki_q['en-GI-spanglis']
'Q79388'
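To make the structure of a composite tag concrete, the following is a minimal standalone sketch of how such a tag breaks down into subtags. It is not NLTK's implementation: the function name `split_tag` and the returned dictionary layout are hypothetical, the rules are a simplified reading of the BCP-47 subtag shapes (extlang, grandfathered and private-use tags are ignored), and nothing is checked against the language subtag registry.

```python
def split_tag(tag):
    """Split a BCP-47 tag into rough subtag groups.

    Hypothetical helper for illustration only; it classifies subtags
    by length and character class and does no registry validation.
    """
    parts = tag.split("-")
    result = {
        "language": parts[0].lower(),
        "script": None,
        "region": None,
        "variants": [],
        "extensions": {},
    }
    i = 1

    # Script: exactly four letters, e.g. 'Hant'.
    if i < len(parts) and len(parts[i]) == 4 and parts[i].isalpha():
        result["script"] = parts[i].title()
        i += 1

    # Region: two letters or three digits, e.g. 'GI' or '419'.
    if i < len(parts):
        p = parts[i]
        if (len(p) == 2 and p.isalpha()) or (len(p) == 3 and p.isdigit()):
            result["region"] = p.upper()
            i += 1

    # Variants: 5-8 alphanumerics, or 4 characters starting with a digit.
    while i < len(parts):
        p = parts[i]
        is_long = 5 <= len(p) <= 8 and p.isalnum()
        is_short = len(p) == 4 and p[0].isdigit() and p.isalnum()
        if not (is_long or is_short):
            break
        result["variants"].append(p.lower())
        i += 1

    # Extensions: a one-character singleton followed by its subtags,
    # e.g. 'u-sd-fr64' becomes {'u': ['sd', 'fr64']}.
    while i < len(parts) and len(parts[i]) == 1:
        singleton = parts[i].lower()
        i += 1
        values = []
        while i < len(parts) and len(parts[i]) > 1:
            values.append(parts[i].lower())
            i += 1
        result["extensions"][singleton] = values

    return result


print(split_tag("oc-gascon-u-sd-fr64"))
print(split_tag("en-GI-spanglis"))
```

For 'oc-gascon-u-sd-fr64' this sketch reports the language 'oc', the variant 'gascon', and a 'u' extension holding 'sd' and 'fr64'. Turning those pieces into display names such as 'Pyrénées-Atlantiques' requires the registry data that the corpus reader loads, which this sketch deliberately leaves out.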