data-science-puzzle

Want to guess the word?
You better use some objects,
So that you can play.

Introduction

In this project, you will create a retro-text version of the classic game show, Wheel of Fortune, which is a spin on the traditional word-guessing game.

The project combines functional and object-oriented programming paradigms, focusing on the material from Chapter 2.5 of Composing Programs. The project also involves understanding, extending, and testing a large program.

The wheel.zip archive contains several files. Your changes will be made to:

  • board.py
  • game.py
  • player.py
  • secret.py
  • wordset.py

You can obtain all the files needed for this project by downloading this zip archive.

Logistics

This is a 3-week project. You may work with one other partner. You should not share your code with students who are not your partner or copy from anyone else's solutions.

In the end, you will submit one project for both partners. The project is worth 20 points. 18 points are assigned based on correctness for your final submission, and 2 points are for the mid-project checkpoint.

You will turn in the following files:

Project Submission

  • board.py
  • game.py
  • player.py
  • secret.py
  • wordset.py
  • computerTest.py

You do not need to modify or turn in any other files to complete the project. To submit the project, run the following command:

python3 ok --submit

You will be able to view your submissions on the Ok dashboard.

For the functions that we ask you to complete, there may be some initial code that we provide. If you would rather not use that code, feel free to delete it and start from scratch. You may also add new function definitions as you see fit.

However, please do not modify any other functions. Doing so may result in your code failing our autograder tests. Also, please do not change any function signatures (names, argument order, or number of arguments).

Make sure you have added your partner on OK before submitting the project.

The assignment is due on Wednesday, 05/08, at 9pm. We will have a mid-project checkpoint due on Wednesday, 04/24 which is worth 2 points.

  • Complete Part 1: Problems 0-2 by Wednesday, 04/24, at 9pm for the mid-project checkpoint. Submit to okpy using python3 ok --submit
  • The entire project (including Part 1) is due on Wednesday, 05/08, at 9pm. Submit to okpy using python3 ok --submit

Extra Credit Opportunity: If you submit the entire project by Wednesday, 5/01, you can earn one point of EC towards your project grade.

Testing

Throughout this project, you should be testing the correctness of your code. It is good practice to test often, so that it is easy to isolate any problems.

We have provided an autograder called ok to help you with testing your code and tracking your progress.

The primary purpose of ok is to test your implementations, but there is a catch. At first, the test cases are locked. To unlock tests, run the following command from your terminal:

python3 ok -u --local

This command will start an interactive prompt that looks like:

=====================================================================
Assignment: Wheel of Fortune
OK, version ...
=====================================================================

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Unlocking tests

At each "? ", type what you would expect the output to be.
Type exit() to quit

---------------------------------------------------------------------
Question 0 > Suite 1 > Case 1
(cases remaining: 1)

>>> Code here
?

At the ?, you can type what you expect the output to be. If you are correct, then this test case will be available the next time you run the autograder.

The idea is to understand conceptually what your program should do first, before you start writing any code.

Once you have unlocked some tests and written some code, you can check the correctness of your program using the tests that you have unlocked:

python3 ok --local

Most of the time, you will want to focus on a particular question. Use the -q option as directed in the problems below.

The tests folder is used to store autograder tests, so make sure not to modify it. You may lose all your unlocking progress if you do. If you need to get a fresh copy, you can download the zip archive and copy it over, but you will need to start unlocking from scratch.

Core Concepts

This project will give you a chance to work with classes as a means of composing large software out of well-designed components. You get to build a computerized version of the popular American game show Wheel of Fortune using object oriented programming. Although the whole project is not very large, we want to give you the feeling of working with classes at the larger scale. The game is built from a collection of classes partitioned into several files. The idea is that you should not be looking inside a class when writing code that uses it. You should only pay attention to the interface, i.e., the attributes and methods defined on objects of the class and, to a lesser degree, the attributes and methods on the class itself.

The Game

The following shows a little example of playing the game where the computer picks the word and the human guesses.

['_', '_', 'a', 't']
Guessed chars:  ['e', 'a', 't', 'f']
guesser , please enter your next guess.
h

['-', 'h', 'a', 't']
Guessed chars:  ['e', 'a', 't', 'f', 'h']
guesser , please enter your next guess.
t
Please enter a single character not yet guessed
w
(['w', 'h', 'a', 't'], 6)
>>>

Of course, why should it be played that way? You might want to pick the word and have the computer do the guessing. Ah, that would require some Data 8 thinking. What letter would be most likely to appear in the word, given all that you know so far?

>>> inhuman()
picker , pick your secret word.
consecrate
['_', '_', '_', '_', '_', '_', '_', '_', '_', '_'] []

['_', '_', '_', '_', '_', '_', '_', '_', '_', '_']
Guessed chars:  []
30824  possible words.
Computer guesses:  e
['_', '_', '_', '_', 'e', '_', '_', '_', '_', 'e'] ['e']

['_', '_', '_', '_', 'e', '_', '_', '_', '_', 'e']
Guessed chars:  ['e']
813  possible words.
Computer guesses:  r
['_', '_', '_', '_', 'e', '_', 'r', '_', '_', 'e'] ['e', 'r']

['_', '_', '_', '_', 'e', '_', 'r', '_', '_', 'e']
Guessed chars:  ['e', 'r']
25  possible words.
Computer guesses:  a
['_', '_', '_', '_', 'e', '_', 'r', 'a', '_', 'e'] ['e', 'r', 'a']

['_', '_', '_', '_', 'e', '_', 'r', 'a', '_', 'e']
Guessed chars:  ['e', 'r', 'a']
11  possible words.
Computer guesses:  t
['_', '_', '_', '_', 'e', '_', 'r', 'a', 't', 'e'] ['e', 'r', 'a', 't']

['_', '_', '_', '_', 'e', '_', 'r', 'a', 't', 'e']
Guessed chars:  ['e', 'r', 'a', 't']
6  possible words.
['consecrate', 'palpebrate', 'perpetrate', 'subserrate', 'uniserrate', 'vertebrate']
Computer guesses:  p
['_', '_', '_', '_', 'e', '_', 'r', 'a', 't', 'e'] ['e', 'r', 'a', 't', 'p']

['_', '_', '_', '_', 'e', '_', 'r', 'a', 't', 'e']
Guessed chars:  ['e', 'r', 'a', 't', 'p']
6  possible words.
['consecrate', 'palpebrate', 'perpetrate', 'subserrate', 'uniserrate', 'vertebrate']
Computer guesses:  s
['_', '_', '_', 's', 'e', '_', 'r', 'a', 't', 'e'] ['e', 'r', 'a', 't', 'p', 's']

['_', '_', '_', 's', 'e', '_', 'r', 'a', 't', 'e']
Guessed chars:  ['e', 'r', 'a', 't', 'p', 's']
3  possible words.
['consecrate', 'subserrate', 'uniserrate']
Computer guesses:  c
['c', '_', '_', 's', 'e', 'c', 'r', 'a', 't', 'e'] ['e', 'r', 'a', 't', 'p', 's', 'c']

['c', '_', '_', 's', 'e', 'c', 'r', 'a', 't', 'e']
Guessed chars:  ['e', 'r', 'a', 't', 'p', 's', 'c']
1  possible words.
['consecrate']
Computer guesses:  n
['c', '_', 'n', 's', 'e', 'c', 'r', 'a', 't', 'e'] ['e', 'r', 'a', 't', 'p', 's', 'c', 'n']

['c', '_', 'n', 's', 'e', 'c', 'r', 'a', 't', 'e']
Guessed chars:  ['e', 'r', 'a', 't', 'p', 's', 'c', 'n']
1  possible words.
['consecrate']
Computer guesses:  o
(['c', 'o', 'n', 's', 'e', 'c', 'r', 'a', 't', 'e'], 9)

And why should we settle for just a two-player game? Let's get all Wheel of Fortune and allow multiple guessers.

The Code

Most concepts in the game have a corresponding class that encapsulates the logic for that concept. To make this project concrete, we have gone through a process similar to what we did together in lab to design a set of classes for this game. In the project you will implement these classes, but hopefully in the future, when you are designing software systems from scratch, this will help you think about that design process.

The classes go together to make a game flow such as the following 2-player game with a computer picking the word and the human guessing the letters. You can find the following snippet of code in human.py.

from wordset import Dictionary
from player import Player, ComputerPlayer, HumanPlayer
from game import Game

def human_plays():
    dictionary = Dictionary("assets/lincoln.txt")
    Player(dictionary)
    picker = ComputerPlayer()
    guesser = HumanPlayer(name="guesser")
    game = Game(picker, [guesser] )
    board =  game.play(True)
    print("Solved ", board.word()," in ",len(board.guesses()), "guesses")

if __name__ == "__main__":
    human_plays()

The Dictionary class takes a path for a file, reads in the file and extracts all the words in it to build a set of words to be used for the game. (Here we are using Lincoln's Gettysburg address.)

Player is a base player class defines the interface used by all type of players. In constructing it, a common dictionary is made available to all the player objects that may be involved in the game. Two player classes are derived from Player; they are HumanPlayer and ComputerPlayer. We will look at these in more detail below. One player picks the word. The other player (or players) take turns guessing. Here, like a conventional computer game, the ComputerPlayer object picks the word and the HumanPlayer object guesses - by allowing you to type to it.

We build a Game with these players. We then let the Game.play method carryout the whole process - ask the picker player for the secret word, hide that, create a Board, cycle through the guessers letting them guess until they miss.

When the game is all done, the board object is returned, conveying the entire history of play.

These classes are defined in several small files. We briefly describe the classes here. You will implement most of them, testing as you go.

To start a human player version of text-based game, run

python3 human.py

Note that if you are running this command before you implement Game.py, there will be an error. That's because you can't play until Game.py is implemented! You'll get to do that once you finish Problem 4.

Secret Words

The SecretWord class represents the secret word that the picker has in mind and the guesser(s) need to guess. It is implemented in secret.py. It has four special methods and match. Collectively, these provide the abstraction of hiding a secret word. An object is constructed with the word hidden inside. (You'll need to use an attribute for that.) The __repr__ and __str__ special methods are written for you (to suggest that the word is secret). The object can be asked for the length of the secret word (len of the object) and match can be given a character which it matches against the secret word and returns a list of indices of characters that match. Notice that the doctest is on the class. It shows how the methods work together. It does nor reveal the object attributes that might be used, nor the representation.

Word Sets

The WordSet class represents what the players keep in their heads - a bunch of words. It is implemented in the file wordset.py. A WordSet is constructed by providing it a text string or a list of words. In the first case, it splits the string into words. In either case, it strips out the punctuation (take a look at string.punctuation for a hint) and converts the word to lowercase. (A function is provided in utils.py to help with that.) Each word should appear only once in the WordSet, i.e., eliminate the duplicates.

The words method returns a sorted list of the words in the WordSet. This does not mean that they are represented internally as a sorted list, but that is certainly one option. The __contains__ special method tests whether a word is in the WordSet. This allows the in operator to be used in expressions with a WordSet as shown in the doctest. You are to implement the three methods. You can use whatever data structure you like to store the words, but you might want to look at set.

Dictionary

The Dictionary class represents the set of words that is being used in the game. Only words in the dictionary can be picked. It is also in wordset.py. It inherits from WordSet but redefines the __init__ method to obtain the words from a file. We have implemented this for you in order to show the preferred method of opening and reading a tile.

#
# Dictionary class
#
class Dictionary(WordSet):
    """
    Construct a dictionary from all the words in a text file.
    Subclass of WordSet with a file based initializer.

    >>> from wordset import Dictionary
    >>> Dictionary('assets/lincoln.txt').words()[55]
    'government'
    """
    def __init__(self, filename):
        with open(filename) as fp:
            text = fp.read()
            WordSet.__init__(self, text)

Notice also how it uses the superclass to invoke the initialization method in that class.

Board

The Board class represents the board that holds the state of the game. Logically, it holds the sequence of spaces that represent the characters that have been guessed so far. It also keeps track of all the characters that have been guess so far. When it is created it is provided with a SecretWord object that encapsulates the secret word. It never peeks inside, but it holds on to it. The Board object provides a set of methods that allow players to get information about the board - the length of the word that is being guessed, the list of characters that have been guessed so far, with the unguessed positions represented by '_', the guesses, hits, and misses so far.

It also has methods that are used to play on the board. The guess method is the main operation that moves the game forward. When a player guesses a character, it is applied to the board using this method. The done method determines if the entire word has been guessed, i.e., the game is done.

We have provided you with the display method, which presents the state of the board to the screen for the human user.

Player

The Player class is a base class that provides only an interface. You can find it in player.py. This interface is used by all types of players. Every player takes in a dictionary which is uses to populate possible_words.

HumanPlayer

The HumanPlayer class inherits from Player and implements that interface. It provides the interface to screen and keyboard to support the human player. We have provided this for you so that you do not need to worry about how to format the I/O (input/output).

class HumanPlayer(Player):
    """
    HumanPlayer is initialized with a name and implements the player interface
    such that:
    - guess requests a guess from a person, via the input device
    - pick_word requests a secret word and verifies that it is in the dictionary
    """
    def __init__(self, dict=Dictionary('assets/lincoln.txt'), name='Human'):
        """Creates a player with the name and word set"""
        super().__init__(name, dict)

    def guess(self, board):
        """Asks the user to guess a character."""

        print(self.name, ", please enter your next guess.")
        guess = input()
        while (len(guess) != 1) or (guess in board.guesses()):
            print('Please enter a single character not yet guessed')
            guess = input()
        return guess

    def pick_word(self):
        """
        Requests a word from user to use as a secret words.
        Only allows words from the dictionary.
        """
        print(self.name,", pick your secret word.")
        word = input()
        while not word in self.possible_words:
            print(word, " is not in the dictionary. Another:")
            word = input()
        return word

The pick_word method asks the human for a word and makes sure that it is in the dictionary. The guess method asks the human for a character and makes sure that it is a valid guess.

ComputerPlayer

The ComputerPlayer class is the fun part. It has the same methods as the HumanPlayer but it has to do inference in the characters that have been guessed so far, looking at the dictionary, in order to make a good guess. You get to have fun with how it might pick words for you to guess.

WordMunch

The WordMunch class is also in wordset.py. This is designed to help your ComputerPlayer. Think about how you are going to guess characters from looking at the board and looking at the dictionary. What are the tricks that you use when you guess? Generally you start with e because it is the most frequently used character in the English language.

WordMunch inherits from WordSet and is intended to hold smaller and smaller set of words, typically starting with the full dictionary, as the guesses narrow down the possible words that could be the secret word. The frequency method computes the number of times each lowercase character occurs in all the words contained in the WordSet. The filter method discards all the words that don't satisfy some predicate, as specified by the higher order function that is passed to it. It applies the function to every word, keeps the ones that come out Truth-y and discards the others, thereby narrowing down the possibilities.

Game

The Game class implements the taking turns among players. It is in game.py. The initialization receives the player objects - the picker and all the guessers. The play method carries out the logic of the game. It has the picker pick a word. Makes a secret out of it. Creates the initial board with this secret. It then cycles through the guessers, giving each a chance to guess repeatedly until it misses. The winner method returns the player that won the game after it is done.

Problems

Now that we have introduced you to the design of all the classes, it's time to get to work building the game. Before you begin (and in general with any project), make sure you take the time to read through the code of the project and pay attention to the comments we included. They are there for your benefit.

Problem 0 (1 pts)

Answer the following questions with your partner after you have read the entire set *.py files. If you cannot answer these questions, read the file again or ask a question on Piazza.

  • In SecretWord, you want the secret word to be private. How do you name object attributes to indicate that it is private?
  • What data structures could be used to hold the word set? What would make one choice better than another?
  • Notice that WordSet.words should return a sorted list of words. How can we take advantage of this sorted property to better check if a word is in the WordSet.
  • How does Dictionary invoke the initializer for its superclass?
  • Do Players ever access the SecretWord object? What does Board reveal about the secret word?
  • Compare these two approaches for providing a set of possible words to players:

    • Have each subclass of Player (ie. HumanPlayer, DummyPlayer, ComputerPlayer) take in a dictionary of words
    • Have a dictionary of words as a class attribute of Player.
  • Which of the above approaches will be implemented in the project?
  • Why does the Player base class define methods that don't do anything?

Put your answers in a text file titled answers.txt and be prepared to explain them to a TA at lab or during office hours.

Problem 1 (2 pts)

Before you begin: take the time to actually read the entire spec. Programming as a data scientist, software engineer, researcher or essentially in any professional capacity requires a deep understanding of the problem at hand. You will usually produce something similar to the spec provided above in a file called README.md. Understanding how we are modeling the game will save you time in the long run. Trust us:)

Complete the implementation of the __init__, __len__, and match methods in the SecretWord class in file secret.py.

Test your understanding with:

python3 ok -q 01 -u

Use OK to test your code:

python3 ok -q 01

Problem 2 (2 pt)

Complete the implementation of WordSet by filling in the __init__, words, and __contains__ methods in wordset.py.

Hint: Take a look at these string functions. split, strip, and lower are particularly useful! Hint: take a look at string.punctuation

Test your understanding with:

python3 ok -q 02 -u

Use OK to test your code:

python3 ok -q 02

Problem 3 (3 pt)

Complete the implementation of Board. Please note that this has several small methods that work together. The doctest tests them in combination. It does not depend on the representation that you choose to use for the board and for keeping track of the letters than have been guessed, hits, or misses. You will need to decide on the representation. You should write tests for your individual methods. It will solidify your understanding and make you a better programmer.

Test your understanding with:

python3 ok -q 03 -u

Use OK to test your code:

python3 ok -q 03

Problem 4 (3 pt)

Complete the implementation of Player. This is just the initialization method to set the class attribute.

Now you should pass the first test case

Use OK to test your code:

python3 ok -q 04

Complete the implementation of the pick_word method of ComputerPlayer. Hint: you will want to look at random.sample.

Since ComputerPlayer.pick_word needs to yield a random word, we provided a test using random.seed, a way to initialize the random number generator to get deterministic results. This method should not take many lines.

Now you should pass the first two test cases.

Use OK to test your code:

python3 ok -q 04

Complete the implementation of the Game class in game.py.

The init method is provided with a player object that is the picker and a list of players that are the guesses. The game engine is the play method. It gets the secret word from the picker, turns it into a SecretWord object and creates the board with this secret. Then it cycles through the guessers until the game is done. A guesser is presented with the board by invoking its guess method. The resulting guessed character is the played onto the board with the board's guess method. If the guess is successful, the player gets to keep playing. When it missed, the next player gets a turn. Finally, the winner method can be invoked when the game is over to obtain the winning player.

Testing the game is a little trickier, because it normally works with player object that take human input or have non-deterministic behavior. We've provided dummy players that you can test your Game against these before you connect your HumanPlayer and picker-only version of ComputerPlayer.

At this point you should have a complete human player game. Run:

python3 human.py

When you get tired of Lincoln, go find a more interesting book to use for words. The bigger the book, the better chance you'll have against the machine. You can grab Shakespeare from the first data 8 lecture. Or on Unix machines, including macs, you can use /usr/share/dict/words. Have fun!

Test your implementation before moving on: The provided ok test is pretty basic, as it only uses dummy players. Ensure that your code can run a full game with human player before moving on to Q5.

Use OK to test your code:

python3 ok -q 04

Problem 5 (4 pt)

Now you get to turn the tables and build the computer guesser. You will do a sequence of implementations of the guess method in ComputerPlayer corresponding to increasing skill levels of the player. When skill_level = 0 guess the most frequent letter that has not been guessed. Break ties alphabetically, key_of_max in utils.py does this automatically:) To do this, you must first complete the frequency method in wordset.py

Example: If your dictionary has ["so", "leap", "booth", "yea"], and your secret word is yea, your frequency dictionary will be {'o': 3, 'a': 2, 'e': 2, 'b': 1, 'h': 1, 'l': 1, 'p': 1, 's': 1, 't': 1, 'y': 1} your computer player should guess in that order, sorted alphabetically, then by frequency until you guess yea.

Try this out using:

python3 computer.py

You get to pick words and see what this algorithm has trouble with. Observe how many guesses it takes to guess words that you give it.

For skill_level = 1, filter out all the words in the dictionary that are not the correct length. You can use the frequencies of letters in the filtered list to make your guess. To do this, you must first complete the filter method in wordset.py. Note that the argument ffun in filter should be a predicate function that takes a word as its argument and returns True or False.

Example: If your dictionary has ["so", "leap", "booth", "yea"], and your secret word is yea, your frequency dictionary will be {'a': 1, 'e': 1, 'y': 1} your computer player should guess in that order, sorted alphabetically, then by frequency until you guess yea.

Test your understanding using:

python3 ok -q 05 -u

Hint: it may be easier to iterate through all of the letters of the alphabet instead of iterating through every letter of every word in your word set while you're building your dictionary. Take a look at the string documentation page for a value that might contain all lowercase letters.

Use OK to test your code:

python3 ok -q 05

The doctests for this question simply run the unlock tests, which imply you are on the right track, but might not be fully correct. Use computerTest.py to test your code more robustly.

The file generates dictionaries mapping all the words in the Gettysburg Address to the number of guesses your players took. You should use python -i computerTest.py to run the file. Implement the functions to fill in the provided dictionaries with your players guessses and see how you match up with the staff solution. You should pass the first two statements after q5.

Use OK to test your code:

python3 ok -q computer_plays

Problem 6 (3 pt)

Level up your ComputerPlayer by guessing the highest frequency character in the words that match up with your guess so far. For skill_level = 2, at each successive guess, use the correctly guessed characters to filter out words of the right length that do not match these characters.

For the previous skill levels, your guessing order was determined at the begining and didn't change as you kept guessing. This time, we want to adaptively filter out words that don't match the parts of the word that we have guessed correctly so far.

Example: If your dictionary has ["bee", "see", "sea", "tea"], and your secret word is see, guessing 'e' will narrow it down to bee and see since sea and tea do not match the secret word's last character.

Test your understanding using:

python3 ok -q 06 -u

Use OK to test your code:

python3 ok -q 06

The doctests for this question simply run the unlock tests, which imply you are on the right track, but might not be fully correct. Use computerTest.py to test your code more robustly.

You should use python -i computerTest.py to run the file. Add the player with skill_level = 2 and store the results in g2. You should pass all three statements after q6.

Use OK to test your code:

python3 ok -q computer_plays

Once you have completed all questions, you can run all the OK tests at once using the command below.

python3 ok

Congratulations on finishing the project! See the Project Submission tab for submission instructions.

Extra Practice

If enjoyed this project and want to take it a step further, play around with the following extensions. These extra features are only for practice and will not be graded.

  • Implement a multiplayer version of computer player.
  • Instead of character frequencies for all unused characters in the possible words, look at character frequencies for the specific unguessed slots. Build a frequency per slot and use that in making the guess.
  • While it might seem wise to make the most likely guess, a rather different choice is the guess that will put you on the shortest path (fewest misses) to a unique determination. Once you know the word you should not miss at all. This approach introduces a bit of a search tree.
  • Come up with better ways of guessing, such as looking at character combinations. Show how much better these perform than simply filtering on length and match.