Let's first define our input text. I will simply copy and paste the first paragraph of the Game of Thrones Wikipedia page:
input_text = "Game of Thrones is an American fantasy drama television series \
created by David Benioff and D. B. Weiss for HBO. It is an adaptation of A Song \
of Ice and Fire, George R. R. Martin's series of fantasy novels, the first of \
which is A Game of Thrones. The show was filmed in Belfast and elsewhere in the \
United Kingdom, Canada, Croatia, Iceland, Malta, Morocco, Spain, and the \
United States.[1] The series premiered on HBO in the United States on April \
17, 2011, and concluded on May 19, 2019, with 73 episodes broadcast over \
eight seasons. Set on the fictional continents of Westeros and Essos, Game of \
Thrones has several plots and a large ensemble cast, and follows several story \
arcs. One arc is about the Iron Throne of the Seven Kingdoms, and follows a web \
of alliances and conflicts among the noble dynasties either vying to claim the \
throne or fighting for independence from it. Another focuses on the last \
descendant of the realm's deposed ruling dynasty, who has been exiled and is \
plotting a return to the throne, while another story arc follows the Night's \
Watch, a brotherhood defending the realm against the fierce peoples and \
legendary creatures of the North."
To apply NLTK's text functions, we need to convert our text from type 'str' to 'nltk.text.Text':
import nltk
text = nltk.Text(input_text.split())
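One thing to keep in mind: splitting on whitespace leaves punctuation attached to the words ('HBO.' instead of 'HBO'), which can affect which words get matched later. Here is a minimal sketch of the difference using only the standard library. The `sample` string and the regular expression are my own illustrative choices, not something NLTK requires:

```python
import re

# A short excerpt of the input text, used here only for illustration.
sample = "created by David Benioff and D. B. Weiss for HBO. It is an adaptation"

# Plain whitespace splitting keeps trailing punctuation on the token.
whitespace_tokens = sample.split()

# A simple (assumed) pattern that keeps runs of letters/digits,
# optionally followed by an apostrophe part such as "'s".
word_tokens = re.findall(r"[A-Za-z0-9]+(?:'[A-Za-z]+)?", sample)

print(whitespace_tokens)
print(word_tokens)
```

If you want cleaner tokens, you could pass a list like `word_tokens` to nltk.Text instead of the result of split().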
text.similar()
Distributional similarity: find other words which appear in the same contexts as the specified word; list most similar words first.
Parameters:
- word (str) – The word used to seed the similarity search
- num (int) – The number of words to generate (default=20)
The similar() method takes a word and prints other words that appear in a similar range of contexts in the text.
For example, let's see which words are used in contexts similar to those of the word 'game' in our text:
text.similar('game') #output: song web
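To make the idea of a "context" more concrete, here is a rough pure-Python sketch of what is going on: each word is associated with the (left neighbour, right neighbour) pairs it occurs between, and two words count as similar when they share at least one such pair. This is a simplification of my own, not NLTK's actual implementation (which also ranks the results), and the names `context_index`, `words_sharing_context` and `sample_tokens` are hypothetical:

```python
from collections import defaultdict

def context_index(tokens):
    """Map each lowercased word to the set of (left, right) neighbour pairs around it."""
    words = [t.lower() for t in tokens]
    index = defaultdict(set)
    for i, word in enumerate(words):
        left = words[i - 1] if i > 0 else "*START*"
        right = words[i + 1] if i < len(words) - 1 else "*END*"
        index[word].add((left, right))
    return index

def words_sharing_context(tokens, word):
    """Return, alphabetically, the other words that share at least one context with `word`."""
    index = context_index(tokens)
    target = index.get(word.lower(), set())
    return sorted(
        other for other, contexts in index.items()
        if other != word.lower() and contexts & target
    )

# A tiny hand-made token list, just for illustration.
sample_tokens = "A Game of Thrones A Song of Ice and Fire a web of alliances".split()
print(words_sharing_context(sample_tokens, "game"))
```

In this small sample, 'game', 'song' and 'web' all sit between 'a' and 'of', which is why they end up grouped together.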
text.common_contexts()
Find contexts where the specified words appear; list most frequent common contexts first.
Parameters:
- words (list of str) – The words used to seed the similarity search
- num (int) – The number of words to generate (default=20)
The common_contexts() method lets you examine the contexts that are shared by two or more words. Let's see which contexts the words 'game' and 'web' have in common in the text:
text.common_contexts(['game', 'web']) #outputs a_of
The output a_of is a context written as left-word_right-word, so this means that in the text we'll find both 'a game of' and 'a web of'.
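The same result can be reproduced by hand. The sketch below collects the left_right context strings for each word and keeps only those that every word shares. It is an illustration under my own assumptions (the function name `shared_contexts` and the sample tokens are hypothetical), not how NLTK implements common_contexts():

```python
def shared_contexts(tokens, words):
    """Return the sorted 'left_right' contexts that all of `words` occur in."""
    lowered = [t.lower() for t in tokens]
    per_word = []
    for target in words:
        contexts = set()
        for i, token in enumerate(lowered):
            if token == target.lower():
                left = lowered[i - 1] if i > 0 else "*START*"
                right = lowered[i + 1] if i + 1 < len(lowered) else "*END*"
                contexts.add(f"{left}_{right}")
        per_word.append(contexts)
    if not per_word:
        return []
    # Keep only the contexts common to every requested word.
    return sorted(set.intersection(*per_word))

# A tiny hand-made token list, just for illustration.
sample_tokens = "A Game of Thrones A Song of Ice and Fire a web of alliances".split()
print(shared_contexts(sample_tokens, ["game", "web"]))
```

Words with no context in common give an empty list, for example shared_contexts(sample_tokens, ['game', 'ice']).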