Introduction to Programming
5. Bonus: Python Code Example

In this bonus lesson, we take a closer look at a Python program that counts how many times each word occurs in a text file. This is what the program looks like when you run it:

Here we find the number of times that the words "human", "brain" and "zenware" appear in the book '1984' by George Orwell.

Before we dive into the code itself, here is a brief description of the overall approach to the solution:

  1. Select which text file to analyze.
  2. Read through the text one line at a time and, for each line:
    • Break the line into words.
    • Increase the count for each word that is found.
  3. Once the text is analyzed, the user can look up words as they wish.
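
To make step 2 concrete, here is a minimal sketch that applies both sub-steps to a single line of text. The sample sentence is made up for illustration, and the separators are just one reasonable choice (spaces and some common punctuation marks).

```python
import re

# Hypothetical sample line; any line of text would do.
line = "The human brain, the human mind\n"

frequencyOfWords = {}

# Break the line into words at spaces and some common punctuation.
wordList = re.split(r'\; |\, |\. |\"|\! |\n| ', line)

for word in wordList:
    if word:  # skip empty strings left over from the split
        lowercaseWord = word.lower()
        # Increase the count for this word, starting from 0 if it is new.
        frequencyOfWords[lowercaseWord] = frequencyOfWords.get(lowercaseWord, 0) + 1

print(frequencyOfWords)
# prints {'the': 2, 'human': 2, 'brain': 1, 'mind': 1}
```

Note that "The" and "the" are counted as the same word because every word is converted to lowercase before it is counted.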

Converted to Python code, the program could look something like this:

import re

# Maps each lowercase word to the number of times it has been seen.
frequencyOfWords = {}

bookName = input("Enter the book you'd like to search through: ")
file = bookName + ".txt"
book = open(file, "r", encoding="utf8")

# Count the words one line at a time.
for line in book:
    wordList = re.split(r'\; |\, |\. |\"|\! |\n| ', line)
    for word in wordList:
        if word:  # skip the empty strings that re.split can produce
            lowercaseWord = word.lower()
            timesCountedSoFar = frequencyOfWords.get(lowercaseWord, 0)
            frequencyOfWords[lowercaseWord] = timesCountedSoFar + 1

book.close()

# Let the user look up words until the program is interrupted.
while True:
    searchedWord = input("Give me a word: ").lower()
    wordOccurrences = frequencyOfWords.get(searchedWord)
    if wordOccurrences:
        print('"{}" occurred {} times.'.format(searchedWord, wordOccurrences))
    else:
        print('The word "{}" was never used in the book: {}.'.format(searchedWord, bookName))
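
The program calls the dictionary method get in two different ways: with a default value of 0 while counting, and without a default while looking words up. The short sketch below isolates that difference. The sample counts are invented for illustration and are not taken from the book.

```python
# Hypothetical sample data, not real counts from any book.
frequencyOfWords = {"human": 3, "brain": 1}

# With a default: a missing word yields 0, so counting can start from zero.
timesCountedSoFar = frequencyOfWords.get("zenware", 0)
frequencyOfWords["zenware"] = timesCountedSoFar + 1

# Without a default: a missing word yields None.
missing = frequencyOfWords.get("orwell")
print(missing)
# prints None

# None is treated as false in an if test, so this takes the else branch.
if missing:
    message = "found"
else:
    message = "never used"
print(message)
# prints never used
```

Because every word stored in the dictionary has a count of at least 1, testing the result of get in an if statement is enough to tell a found word from a missing one.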

Below, the program is described in smaller parts, so you can get a better understanding of what the individual lines of code actually do: