Coding challenge 1
Copy the code below.
firstName = 'george'
lastName = 'eliot'
born = 1819
died = 1880
placeOfBirth = ''
placeOfDeath = 'London'
Use these variables to print the following sentence:
‘George Eliot was born in 1819 and passed away in 1880 in London.’
Joining pieces of text with the + operator requires every piece to be a string. The variables ‘born’ and ‘died’ are integers, however. To incorporate these numbers into the sentence, they must first be converted into strings. This can be done using the str() function.
str( born )
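As a minimal illustration of this conversion, the sketch below uses hypothetical variables (`word` and `count`, which are not part of the challenge) to show how an integer can be joined to text once it has been passed through str().

```python
# Hypothetical example values, not the variables from the challenge.
word = 'novel'
count = 3

# Joining a string and an integer with + raises a TypeError,
# so the integer is converted with str() first.
sentence = "The word '" + word + "' occurs " + str( count ) + " times."
print( sentence )
```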
Information about the place names (following the preposition “in”) must only be shown if the corresponding variables have actually been assigned a non-empty value. To test whether this is the case, use the len() function, as follows:
if len( placeOfBirth ) > 0:
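One possible way to apply this test is sketched below. The helper `place_phrase` and the variables `city` and `country` are hypothetical names chosen for illustration; the idea is simply that an empty string contributes nothing to the sentence.

```python
# Hypothetical example values, not the variables from the challenge.
city = ''
country = 'England'

def place_phrase( place ):
    # Return ' in <place>' when a value is present, otherwise an empty string.
    if len( place ) > 0:
        return ' in ' + place
    return ''

print( 'She lived' + place_phrase( city ) + '.' )
print( 'She lived' + place_phrase( country ) + '.' )
```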
Also use the year of death to provide a chronological category for the author that is described. If the year of death is 1960, for instance, the category must be “20th century literature”.
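The century can be derived from the year with integer division. The sketch below is one way to do this, under the assumption that a century runs from year 01 to year 00 (so 1900 would count as 19th century and 1901 as 20th); the function name `century_label` is hypothetical.

```python
def century_label( year ):
    # Subtract 1 before dividing so that, e.g., 1901-2000 all map to 20.
    century = ( year - 1 ) // 100 + 1

    # Pick the ordinal suffix: 11th, 12th and 13th are exceptions to st/nd/rd.
    if 10 <= century % 100 <= 20:
        suffix = 'th'
    else:
        suffix = { 1: 'st' , 2: 'nd' , 3: 'rd' }.get( century % 10 , 'th' )

    return str( century ) + suffix + ' century literature'

print( century_label( 1960 ) )
```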
Coding challenge 2
The type-token ratio of a text can be calculated by dividing the number of types (the unique words that occur in a text) by the number of tokens (the total number of words). This ratio gives an indication of the lexical variation of the text. Is the text very repetitive, or does it make use of a wide range of different words?
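As a small worked example, consider a made-up list of six words (not taken from the novel). It contains four distinct words, so the ratio is 4 / 6.

```python
# A made-up sample, used only to illustrate the arithmetic.
words = [ 'the' , 'cat' , 'saw' , 'the' , 'other' , 'cat' ]

tokens = len( words )          # total number of words
types = len( set( words ) )    # number of unique words

ttr = types / tokens
print( ttr )
```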
Create a program in Python that can calculate the type-token ratio of E.M. Forster’s novel A Room with a View.
Take a number of measures to ensure the accuracy of the counts:
- Leading and trailing punctuation must be removed from all the words.
- Words must be counted in a case-insensitive manner.
- Compound nouns containing a hyphen need to be counted as a unit.
- Digits can be disregarded.
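The four measures above could be combined in a single helper, as in the sketch below. The name `normalise` and the constant `PUNCT` are hypothetical, and the set of punctuation characters is an assumption: it takes Python's ASCII `string.punctuation` and adds curly quotes and the em dash, which may or may not occur in your copy of the text.

```python
import string

# ASCII punctuation plus curly quotes and the em dash (an assumed set).
PUNCT = string.punctuation + '\u201c\u201d\u2018\u2019\u2014'

def normalise( word ):
    # Strip punctuation from both ends only; interior hyphens are kept,
    # so a compound such as 'drawing-room' stays a single unit.
    word = word.strip( PUNCT )
    # Lower-case the word so that counting is case-insensitive.
    word = word.lower()
    # Signal that empty strings and all-digit tokens should be skipped.
    if word == '' or word.isdigit():
        return None
    return word

print( normalise( '"Drawing-room,' ) )
```

A caller would skip any word for which the helper returns None before updating the token count and the frequency dictionary.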
You can use the code below as a basis.
import re

tokens = 0
freq = dict()

novel = open( "ARoomWithAView.txt" , encoding = 'utf-8' )
for line in novel:
    words = re.split( r'\s+' , line )
    for w in words:
        #print(w)
        tokens += 1
        freq[w] = freq.get( w , 0) + 1

types = len(freq)
print(types)
ttr = types / tokens
print( str( ttr) )
Coding challenge 3
Calculate the type-token ratio of the 10 novels in the sample corpus that is provided in the file repository. To make sure that the ratios can be compared on an equal footing, consider only the first 3000 words of each text.
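Limiting each text to the same number of words can be done with a list slice. The sketch below assumes the full text has already been read into a string; the function names `first_words` and `type_token_ratio` are hypothetical, and no cleaning of the words is applied here.

```python
import re

def first_words( text , limit = 3000 ):
    # Split on whitespace and drop the empty strings that can appear at the edges.
    words = [ w for w in re.split( r'\s+' , text ) if w != '' ]
    # A slice never raises an error if the text is shorter than the limit.
    return words[ :limit ]

def type_token_ratio( words ):
    # Guard against division by zero for an empty text.
    if len( words ) == 0:
        return 0.0
    return len( set( words ) ) / len( words )

sample = first_words( 'a rose is a rose is a rose' , limit = 5 )
print( sample )
print( type_token_ratio( sample ) )
```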
Coding challenge 4
Using Python’s NLTK modules, create a program which can extract all the adverbs ending in “ly” that are used in P.B. Shelley’s Complete Poems. The list of adverbs does not need to be de-duplicated. It must be written to a separate file named ‘adverbs.txt’.
Information on how to install NLTK can be found here: https://www.nltk.org/install.html
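NLTK's default part-of-speech tagger labels words with Penn Treebank tags, in which adverbs carry the tags RB, RBR and RBS. The sketch below shows only the filtering and writing steps, applied to a hand-tagged sample so that it runs without NLTK; the function names `ly_adverbs` and `write_lines` are hypothetical. With NLTK installed and its tokenizer and tagger data downloaded, the tagged pairs would instead come from `nltk.pos_tag( nltk.word_tokenize( text ) )`.

```python
def ly_adverbs( tagged ):
    # tagged is a list of ( word , tag ) pairs, as returned by nltk.pos_tag().
    adverbs = []
    for word , tag in tagged:
        # RB, RBR and RBS all start with 'RB'.
        if tag.startswith( 'RB' ) and word.lower().endswith( 'ly' ):
            adverbs.append( word )
    return adverbs

def write_lines( path , items ):
    # Write one item per line to the given file.
    with open( path , 'w' , encoding = 'utf-8' ) as out:
        for item in items:
            out.write( item + '\n' )

# Hand-tagged sample; 'lonely' ends in 'ly' but is tagged as an adjective (JJ).
sample = [ ( 'The' , 'DT' ) , ( 'wind' , 'NN' ) , ( 'blew' , 'VBD' ) ,
           ( 'wildly' , 'RB' ) , ( 'over' , 'IN' ) , ( 'the' , 'DT' ) ,
           ( 'lonely' , 'JJ' ) , ( 'sea' , 'NN' ) ]

found = ly_adverbs( sample )
print( found )
```

Calling `write_lines( 'adverbs.txt' , found )` would then store the result in the file the challenge asks for. Checking the tag as well as the ending keeps adjectives such as "lonely" out of the list.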