Remember this step?
>>> from nltk.corpus import gutenberg
This picks the corpus to work with.
>>> gutenberg.fileids()
This lists the file names in the corpus:
['austen-emma.txt', 'austen-persuasion.txt', 'austen-sense.txt', ...]
>>> emma = gutenberg.words('austen-emma.txt')
After that:
>>> for fileid in gutenberg.fileids():
...     num_chars = len(gutenberg.raw(fileid))
...     num_words = len(gutenberg.words(fileid))
...     num_sents = len(gutenberg.sents(fileid))
...     num_vocab = len(set(w.lower() for w in gutenberg.words(fileid)))
...     print(int(num_chars/num_words), int(num_words/num_sents), int(num_words/num_vocab), fileid)
...
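This prints three statistics for each text (nothing is shown unless you call print inside the loop): average word length, average sentence length, and the number of times each vocabulary item appears in the text on average (our lexical diversity score). Note that num_chars counts spaces too, so the first figure slightly overstates the true average word length.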
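To make the arithmetic behind these three figures concrete, here is a minimal sketch on a tiny hand-made text, so it runs without downloading any NLTK data. The names `text_stats`, `raw`, `words`, and `sents` are made up for this illustration, and the toy tokenization is an assumption that only mimics what `gutenberg.raw()`, `gutenberg.words()`, and `gutenberg.sents()` return.

```python
def text_stats(raw, words, sents):
    """Return (avg word length, avg sentence length, lexical diversity score)."""
    num_chars = len(raw)                             # counts spaces too
    num_words = len(words)                           # punctuation counts as a token
    num_sents = len(sents)
    num_vocab = len(set(w.lower() for w in words))   # distinct lowercased tokens
    return (int(num_chars / num_words),
            int(num_words / num_sents),
            int(num_words / num_vocab))

# Hypothetical toy data, tokenized by hand (not taken from the corpus).
sents = [['The', 'cat', 'sat', '.'], ['The', 'cat', 'ran', '.']]
words = [w for s in sents for w in s]
raw = 'The cat sat. The cat ran.'

stats = text_stats(raw, words, sents)
print(stats)  # (3, 4, 1)
```

Here the raw text has 25 characters, 8 tokens, 2 sentences, and 5 distinct lowercased tokens, which gives 25/8, 8/2, and 8/5 before truncation.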
To display the words grouped by sentence, do the following:
>>> macbeth_sentences = gutenberg.sents('shakespeare-macbeth.txt')
>>> macbeth_sentences
This displays the sentences in Macbeth.
>>> macbeth_sentences[1037]
This displays the sentence at index 1037:
['Double', ',', 'double', ',', 'toile', 'and', 'trouble', ';',
'Fire', 'burne', ',', 'and', 'Cauldron', 'bubble']
>>> longest_len = max([len(s) for s in macbeth_sentences])
>>> [s for s in macbeth_sentences if len(s) == longest_len]
This displays the longest sentence(s) in Macbeth:
[['Doubtfull', 'it', 'stood', ',', 'As', 'two', 'spent', 'Swimmers', ',', 'that',
'doe', 'cling', 'together', ',', 'And', 'choake', 'their', 'Art', ':', 'The',
'mercilesse', 'Macdonwald', ...], ...]
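The result above is a list of sentences rather than a single sentence, because several sentences can share the maximum length. A minimal sketch of the same two steps on hand-made data (the `sentences` list is a hypothetical stand-in for what `gutenberg.sents()` returns, so no corpus download is needed):

```python
# Hypothetical tokenized sentences; each sentence is a list of tokens.
sentences = [
    ['Fire', 'burne', ',', 'and', 'Cauldron', 'bubble'],
    ['Double', ',', 'double'],
    ['When', 'shall', 'we', 'three', 'meet', 'againe'],
]

# Step 1: find the largest token count over all sentences.
longest_len = max(len(s) for s in sentences)

# Step 2: keep every sentence that reaches that count (ties are all kept).
longest = [s for s in sentences if len(s) == longest_len]

print(longest_len)   # 6
print(len(longest))  # 2
```

If only one longest sentence is needed, `max(sentences, key=len)` returns the first one found.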
>>> from nltk.corpus import webtext
The loop below displays the first 65 characters of each file in the webtext corpus:
>>> for fileid in webtext.fileids():
...     print(fileid, webtext.raw(fileid)[:65], '...')
...
firefox.txt Cookie Manager: "Don't allow sites that set removed cookies to se...
grail.txt SCENE 1: [wind] [clop clop clop] KING ARTHUR: Whoa there! [clop...
overheard.txt White guy: So, do you have any plans for this evening? Asian girl...
pirates.txt PIRATES OF THE CARRIBEAN: DEAD MAN'S CHEST, by Ted Elliott & Terr...
singles.txt 25 SEXY MALE, seeks attrac older single lady, for discreet encoun...
wine.txt Lovely delicate, fragrant Rhone wine. Polished leather and strawb...