HW7: Hinky Binky Hangywoohoo man

Assignment 7: The hinky binky hangywoohoo man

Due Date

Thursday, December 04, 2014

Collaboration Policy - Read Carefully

As with other assignments, you must work on this project individually, but you may discuss this assignment with other students in the class and ask for and provide help in useful ways, preferably over our email list so we can all benefit from your great ideas. You may consult (but not copy) any outside resources, including books, papers, web sites and people (but no penguins or sea urchins).

If you use resources other than the class materials, indicate what you used along with your answer.

Key Concepts (and Goals):

Convert your Java expertise and C skills into C++ expertise.

String processing.

File I/O.

Opportunity to learn about: Linear, and Arrays (associative arrays) depending on the algorithm that you choose.

The Assignment Definition:

Create a program that cheats by adaptively changing the word to the disadvantage of the human player (i.e., increasing the likelihood that the human will lose).

Hangman Background:

You will program a variation of the classic paper-and-pencil game of Hangman, but with a twist.

To review, the classic rules of hangman are the following:

One player thinks of a secret word. She does not reveal the word, but represents it by writing a row of dashes, one for each letter of the word.

The opposing player begins guessing letters. Whenever she guesses a letter contained in the hidden word, the first player reveals each instance of that letter in the word. Otherwise, the guess is wrong, and a diagram of 'hangman' progresses towards completion.

The game ends either when all the letters in the word have been revealed or when the guesser has run out of guesses (the hangman diagram is complete).

To review strategies and the history of hangman, see (here).

A Sample of the Sequence of Events (of the classic game and your game)

...

You have 5 guesses left.
Used letters: a d e f g h i j k l o p q r s u w y
Word: -la-ki-g
Enter guess: z
Sorry, there are no z's.

You have 4 guesses left.
Used letters: a d e f g h i j k l o p q r s t u w y z
Word: -la-ki-g
Enter guess: x
Sorry, there are no x's.

You have 3 guesses left.
Used letters: a d e f g h i j k l o p q r s t u w x y z
Word: -la-ki-g
Enter guess: c
Sorry, there are no c's.

You have 2 guesses left.
Used letters: a c d e f g h i j k l o p q r s t u w x y z
Word: -la-ki-g
Enter guess: v
Sorry, there are no v's.

You have 1 guesses left.
Used letters: a c d e f g h i j k l o p q r s t u v w x y z
Word: -la-ki-g
Enter guess: b
Yes, there is 1 copy of a b.

You have 1 guesses left.
Used letters: a b c d e f g h i j k l o p q r s t u v w x y z
Word: bla-ki-g
Enter guess: m
Sorry, there are no m's.
You lose! The word was: blanking
Play again?

Tutorial / References:

Interesting reading with hints on how to play hangman:
http://datagenetics.com/blog/april12012/index.html

Wordlist ( 500 words ) good for testing

Wordlist ( 1,000 words ) make it more challenging

Wordlist ( 2,000 words ) let's make it really challenging.

Dictionary.txt (zillions of words from Schwartz).

The Twist:

In this assignment you will write a variation of hangman. Here, your program will take the place of one of the players in the game. The twist is that the program will not select one particular word (represented by dashes). Instead, the program cheats by adaptively dodging the opposing player's guesses, keeping track of a set of possible words rather than one particular word.

The program will maintain a list of words in the English language, and as the opposing player guesses letters it will continuously pare down the word list to dodge the player's guesses as much as possible.

You may devise any algorithm, as long as your approach causes the human to make more guesses on average than if you picked one word and stuck with it.

A Simple Example Algorithm of Cheating.

The computer cheats by pretending it selected a single word from the dictionary, but in reality it is constantly maintaining a list of possible words and revising this list to minimize the human's chances of success.

Here is an example of how the computer may cheat:

Let's suppose you have the following dictionary of words.

BEER, PEAR, BEAR, PEER, PEEP, and BEET.

This list represents the current set of possible words that the computer could have chosen. We will update the list as appropriate as the human guesses letters.

Let's say the human picks the letter B. Because we have a number of words that do not contain the letter B, we remove the words that contain a B.

Now the remaining list is:

PEAR, PEER, and PEEP

Now the human selects the letter P. Every remaining word contains a P, so we are forced to give credit for a letter, but we want to give the human as few matches as possible. So, we choose the words with the fewest P's. That leaves:

PEAR and PEER.

Finally, the human selects the letter E. Again we want to give the human as few matches as possible, so we keep PEAR, which has only one E, and the computer has arrived at a specific word.
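The example above can be sketched in C++ roughly as follows. This is only a minimal illustration, not a required design: the function name dodgeGuess is made up for this sketch, and it counts occurrences only, ignoring letter positions (two surviving words could hold the letter in different positions, which the word-family approach described later handles).

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (name chosen for this sketch): keep only the words that
// contain the guessed letter the fewest number of times. If some words do not
// contain the letter at all, only those words survive.
std::vector<std::string> dodgeGuess(const std::vector<std::string>& words, char guess) {
    // Count how often the guessed letter occurs in one word.
    auto occurrences = [guess](const std::string& w) {
        return std::count(w.begin(), w.end(), guess);
    };

    // Find the smallest count across the current list (-1 means "none seen yet").
    std::ptrdiff_t fewest = -1;
    for (const std::string& w : words) {
        std::ptrdiff_t n = occurrences(w);
        if (fewest < 0 || n < fewest) fewest = n;
    }

    // Keep the words that give away exactly that many letters.
    std::vector<std::string> kept;
    for (const std::string& w : words) {
        if (occurrences(w) == fewest) kept.push_back(w);
    }
    return kept;
}
```

Calling dodgeGuess on the six-word list with 'B', then 'P', then 'E' reproduces the three steps of the example.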

An algorithm from Berkeley that you may use if you wish:

It is depicted by a flow chart:

A Fancier Assignment that you may use: (abbreviated from Stanford's nifty site) [cite]

This suggested algorithm is from Keith Schwartz's CS2 and CS1 classes.

He claims that most students in CS2 can complete the assignment in 5-7 hours.

Here is their approach:

Maintain a list of 'all' words in the English language of a particular length. Whenever the player guesses, the computer partitions the words into "word families" based on the positions of the guessed letters in the words. For example, if the word list is ECHO, HEAL, BEST, and LAZY and the player guesses the letter 'E', then there would be three word families:

E---, containing ECHO.

-E--, containing HEAL and BEST.

----, containing LAZY.

Once the words are partitioned into word families, or equivalence classes, the computer can pick the largest of these classes to use as its remaining word list. It then reveals the letters in the positions indicated by the word family. In this case, the computer would pick the family -E-- and would reveal an E in the second position of the word, because that family has the most words.
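One possible way to express this partitioning in C++ is sketched below. The names familyKey, partitionIntoFamilies and largestFamily are invented for this illustration, and using std::map as the table is an assumption, not a requirement of the assignment.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: the family key shows the guessed letter in place and a
// dash everywhere else, e.g. familyKey("HEAL", 'E') is "-E--".
std::string familyKey(const std::string& word, char guess) {
    std::string key(word.size(), '-');
    for (std::size_t i = 0; i < word.size(); ++i) {
        if (word[i] == guess) key[i] = guess;
    }
    return key;
}

// Group the words by family key. Only families that actually occur in the
// word list are ever created.
std::map<std::string, std::vector<std::string>>
partitionIntoFamilies(const std::vector<std::string>& words, char guess) {
    std::map<std::string, std::vector<std::string>> families;
    for (const std::string& w : words) {
        families[familyKey(w, guess)].push_back(w);
    }
    return families;
}

// Return the key of the family with the most words. On a tie, the family
// whose key comes first in the map's ordering wins.
std::string largestFamily(const std::map<std::string, std::vector<std::string>>& families) {
    std::string best;
    std::size_t bestSize = 0;
    for (const auto& entry : families) {
        if (entry.second.size() > bestSize) {
            best = entry.first;
            bestSize = entry.second.size();
        }
    }
    return best;
}
```

For the list ECHO, HEAL, BEST, LAZY and the guess 'E', this yields the three families listed above and selects -E--. Note that the key only records the current guess; letters revealed in earlier rounds would still have to be merged into the displayed word.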

Here is the algorithm more procedurally:

1) Construct a list of all words in the English language whose length matches the input length (you can of course simplify this step since you are only implementing a particular length).

2) Print out how many guesses the user has remaining, along with any letters the player has guessed and the current blanked-out version of the word. If the user chose earlier to see the number of words remaining, print that out too.

3) Prompt the user for a single letter guess, re-prompting until the user enters a letter that she hasn't guessed yet. Make sure that the input is exactly one character long and that it's a letter of the alphabet.

4) Partition the words in the dictionary into groups by word family.

5) Find the most common “word family” in the remaining words, remove all words from the word list that aren't in that family, and report the position of the letters (if any) to the user. If the word family doesn't contain any copies of the letter, subtract a remaining guess from the user.

6) If the player has run out of guesses, pick a word from the word list and display it as the word that the computer initially “chose.”

7) If the player correctly guesses the word, congratulate her.
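The input check in step 3 could look something like the sketch below. The function name isValidGuess is hypothetical, and the sketch assumes that previously guessed letters are stored in lowercase in a std::set<char>; your program may track them differently.

```cpp
#include <cctype>
#include <set>
#include <string>

// Hypothetical helper for step 3: accept the input only if it is exactly one
// alphabetic character that has not been guessed before.
// Assumption: `used` holds the earlier guesses in lowercase.
bool isValidGuess(const std::string& input, const std::set<char>& used) {
    if (input.size() != 1) return false;                  // exactly one character
    unsigned char c = static_cast<unsigned char>(input[0]);
    if (!std::isalpha(c)) return false;                   // must be a letter
    char lower = static_cast<char>(std::tolower(c));      // treat 'A' and 'a' alike
    return used.count(lower) == 0;                        // must be a new guess
}
```

A game loop would keep re-prompting until isValidGuess returns true, then insert the lowercased letter into the set.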

It's up to you to think about how you want to partition words into word families. Think about what data structures would be best for tracking word families and the master word list. Would an associative array work? How about a stack or queue? Thinking through the design before you start coding will save you a lot of time and headache.

Schwartz's Advice, Tips, and Tricks

There is no “right way” to go about writing this program, but some design decisions are much better than others. Here are some general tips and tricks that might be useful:

1. Letter position matters just as much as letter frequency. When computing word families, it's not enough to count the number of times a particular letter appears in a word; you also have to consider their positions. For example, “BEER” and “HERE” are in two different families even though they both have two E's in them. Consequently, representing word families as numbers representing the frequency of the letter in the word will get you into trouble.

2. Watch out for gaps in the dictionary. When the user specifies a word length, you will need to check that there are indeed words of that length in the dictionary. You might initially assume that if the requested word length is less than the length of the longest word in the dictionary, there must be some word of that length. Unfortunately, the dictionary contains a few “gaps.” The longest word in the dictionary has length 29, but there are no words of length 26 or 27. Be sure to take this into account when checking if a word length is valid.

3. Don't explicitly enumerate word families. If you are working with a word of length n, then there are 2^n possible word families for each letter. However, most of these families don't actually appear in the English language. For example, no English words contain three consecutive U's, and no word matches the pattern E-EE-EE--E. Rather than explicitly generating every word family whenever the user enters a guess, see if you can generate word families only for words that actually appear in the word list. One way to do this would be to scan over the word list, storing each word in a table mapping word families to words in that family.
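For tip 2, one simple way to avoid being caught by a gap is to filter the dictionary by length first and treat an empty result as an invalid length. The sketch below assumes the dictionary is already loaded into a std::vector<std::string>; the name wordsOfLength is made up for this illustration.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: collect the words whose length is exactly `length`.
// An empty result means the dictionary has no words of that length (a "gap"),
// so the caller should pick or request a different length.
std::vector<std::string> wordsOfLength(const std::vector<std::string>& dictionary,
                                       std::size_t length) {
    std::vector<std::string> matches;
    for (const std::string& w : dictionary) {
        if (w.size() == length) matches.push_back(w);
    }
    return matches;
}
```

This checks the actual contents of the dictionary instead of comparing the requested length against the longest word.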

=========================================================

The Assignment Definition, and Core Requirements:

Create a program that cheats by adaptively changing the word to the disadvantage of the human player (i.e., increasing the likelihood that the human will lose).

Here are the requirements:

The player is allowed 10 wrong guesses (missing the 10th guess is a loss).

The word length that the computer selects must vary over at least the range of 3 to 12 letters. You pick the length randomly.

The name of the word-list file that the algorithm uses must be a command-line argument (sample word lists are provided in the red resource box).

You may devise any algorithm, as long as your approach causes the human to make more guesses (on average) than if you picked one word and stuck with it.

You must print a running total of the number of words remaining in the word list to standard error. We will use it for testing (and grading!) your program, which is why it is 'secretly' printed to stderr, so that it does not distract from game play.

Your program must compile and run on nike. You may develop it in your own environment, but as a last step make sure it runs on nike.

There will be additional requirements regarding what the output should look like so we can grade your program efficiently.

Challenge: Try to make your Hinky Hangman game as Hinky as possible. Try to come up with an algorithm that will make the player guess as many letters as possible before letting them win.
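Several of the requirements above (word list from a command-line file name, random length from 3 to 12, running total on standard error) could be supported by small helpers like the ones sketched below. All three function names are hypothetical, lowercasing the words is an assumption, and the stderr message format is only a placeholder since the required output format is not specified here.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <fstream>
#include <iostream>
#include <random>
#include <string>
#include <vector>

// Hypothetical helper: read whitespace-separated words from the file at `path`
// (for example argv[1]) and lowercase them. Returns an empty list if the file
// cannot be opened.
std::vector<std::string> loadWordList(const std::string& path) {
    std::vector<std::string> words;
    std::ifstream in(path);
    std::string word;
    while (in >> word) {
        std::transform(word.begin(), word.end(), word.begin(),
                       [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
        words.push_back(word);
    }
    return words;
}

// Hypothetical helper: pick a word length uniformly from 3 to 12 inclusive.
int pickWordLength(std::mt19937& rng) {
    std::uniform_int_distribution<int> dist(3, 12);
    return dist(rng);
}

// Report the running total on standard error so it stays out of the game output.
// The message text is an assumption, not the required grading format.
void reportRemaining(std::size_t count) {
    std::cerr << "Words remaining: " << count << '\n';
}
```

In main, a program along these lines would check that argc is at least 2, call loadWordList(argv[1]), seed a std::mt19937 (for instance from std::random_device), and call reportRemaining after every guess.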

=========================================================

Submission:

You must submit the following files (i.e., all the files necessary to compile your program):
README.txt (describes how to compile and run your program)
Makefile (typing make by itself should generate the executable hinky)
hinky.cpp
hinky.h
Extra files if needed (these must be listed in README.txt)

To submit the files, you will need to use the submit program. Your files need to be under a common subdirectory called "1730_program7". If the 1730_program7 subdirectory is directly under your home directory, execute the command below while in your home directory:

submit 1730_program7 cs1730