Hangman Advisor

Hangman Advisor

Dr. Phillip M. Feldman


1. Introduction

The Hangman Advisor provides advice to someone who is trying to guess the hidden word in Hangman. It analyzes the template and recommends the next letter to be guessed, using four different algorithms described below. Note that each algorithm may produce multiple recommendations, and that the number that accompanies each recommendation may be interpreted as a level of confidence in that recommendation (higher numbers are better).

The program accepts the following inputs:

2. Run the Hangman Advisor Online

  Template:      Excluded letters:     

Small spelling dictionary Large spelling dictionary


3. Outputs

  1. The program finds all English words that (A) appear in the spelling dictionary, (B) match the template, and (C) do not contain any excluded characters. These words (up to the first 250) are reported by the program.

  2. The program generates a list of candidate letters for subsequent guesses. These are letters that appear in words that match the template but that do not appear in the template or in the list of excluded characters.

  3. If no letters have been guessed, the program will recommend the letter e.

  4. Otherwise, the following four algorithms are exercised:

Note:

4. Depth-1 Search

The following pseudo-code explains the operation of the depth-1 search, which generates the data for Algorithms 1 and 2:

   Find all words that match the template and that do not contain any excluded
   letters.  Store these as `words`.

   Find all letters that appear in `words` but not in the template.  Store these
   as `candidate_letters`.

   for letter in candidate_letters:

      for word in words:

         if the candidate letter appears in the word:

            Update the count of template positions filled (underscores replaced
            by letters) with this candidate letter.  Note that the same letter
            may appear in more than one position.

         else:

            Update the count of words on which a life was lost with this
            candidate letter.

   Report results.

5. Depth-2 Search

The following pseudo-code explains the operation of the depth-2 search, which generates the data for Algorithms 3 and 4:

   Find all words that match the template and that do not contain any excluded
   letters.  Store these as `words`.

   Find all letters that appear in `words` but not in the template.  Store these
   as `candidate_letters`.

   for letter1 in candidate_letters:

      candidate_letters2= candidate_letters with letter1 removed

      for word in words: # (all words that match the original template)

         if the candidate letter appears in the word:

            Update the count of template positions filled (underscores replaced
            by letters) with this candidate letter.  Note that the same letter
            may appear in more than one position.

         else:

            Update the count of words on which a life was lost with this
            candidate letter.

         for letter2 in candidate_letters2:

            Repeat above counting process for letter2.

         For letter1, record best results over all possible letter2s.

   Report results.

Expectations are taken over all words that match the original template.

The number of positions filled and the number of lives lost must count letter 1 and letter 2 together.

If a given letter 2 did not fill a position in any word, it represents a wasted guess that would not have been made by an intelligent player having knowledge of the dictionary against which he/she is playing.

6. Example and Conclusions

I ran the program with the template _a___, with the single excluded letter e, using the small spelling dictionary. The program reported 221 matching words. Results from algorithms 3 and 4 are moderately interesting.

Algorithm #3: Considering the expected number of template positions filled in
after two guesses, the following letter(s) are recommended: i (0.92), l (0.97),
n (0.93), r (1.00), s (1.00), t (0.99),

Algorithm #4: Considering the expected number of template positions filled in
divided by the expected number of lives lost, the following letter(s) are
recommended: l (0.85), r (0.93), s (0.93), t (0.89),

The seven English letters used with the greatest frequency are e, t, a, o, i, n, and s. e and a have already been guessed, so this leaves t, o, i, n, and s. Note that although o appears second in this list, algorithms 3 and 4 do not recommend it, which is consistent with the fact that o does not occur with high frequency in the words that match the template.

This example demonstrates that English letter frequencies are a reasonable starting point for deciding what letters to guess in Hangman, but that one can do better by performing a search against the dictionary of allowed words (assuming that such a dictionary exists and that the hidden word has been chosen at random from this dictionary).

Last update: 30 Mar., 2014