
Search

Support DFF
If you benefit from the website, in terms of
knowledge, entertainment value, or something otherwise useful,
consider making a donation via PayPal to help defray the
costs. (No PayPal account necessary to donate via credit
card.) Transaction is secure.
If you shop at Amazon anyway,
consider using this link. We receive a few cents from each purchase.
Thanks.

Contact
Feedback:
Send an e-mail with your
comments about this program (or anything else).

|
| |
Problem Description
This is the second installment of the word manipulation
series of programs. The first Wordstuff1 included
a basic set of dictionaries, a dictionary maintenance program and a word
completion program called CrosswordHelper.
Six programs are included:
Unscramble – an anagram solver that make one, two, or three-word
phrases from a set of scrambled letters.
Decrypt – a cryptogram solver. Cryptograms are messages encoded by
replacing each letter value in the message with a simple substitution code. In
this type of encrypting, a each letter is replaced with a fixed letter value.
I.e. all "a" with "f", all "b",
with
"k", etc.
Word Completion - (Crossword Helper). Give it a partial word with
unknown letters indicated by space or underscore characters, and it will return a
list of all words in it's dictionary which fit the "mask".
WordLadder - Change "dirt" to "gold" in four
steps. This program was originally
posted as a standalone program and is now also available as part of
Wordstuff.
Scrambled Pie: Supply the the missing letter to the four letter
groups in this "pie" and unscramble the resulting words.
WordStuff: A "wrapper" program that lets the user start any of the
other five.
Background & Techniques
Unscramble: The
trick to solving anagrams on the computer, once we have a dictionary
available, is to find letter combinations that match dictionary entries. Rather
that generate all possible letter combinations and see which are valid words, it’s
generally more efficient to check words in the dictionary and see which contain
the letters we’re matching. The complication here is that, in addition to
single words, we want to generate 2 and 3 word phrases that use all of the given
letters. The user provides minimum and maximum word lengths to search. Within
this range we generate a firstwords list of matching words and a
corresponding list of remainders, the unused letters. Then, for each of
the words in this list, we’ll repeat the search procedure with the remainder
list. If we’re looking for three word phrases, we repeat this one more time,
making a remainders2 list to check for 3rd
words.
A TWords object handles the actual checking of letters against the
dictionary. The length range is determined by the max and min lengths specified
by the user (except for the final search which uses a length equal to all
remaining letters). A "FirstLetters" string is built containing
the unique letters in the input letter string – used as starting points
for word retrieval from the dictionary.
Decrypt:
Cryptogram
solving is another example of
trial and error problem solving. There are, however, 26! (26 factorial =
26x25x24…x3x2x1 or about 4X1026) possible substitution codes. Thus
it is highly unlikely that we could decrypt the message by trying all codes. But
we can use the dictionary to prune the search space significantly. I sort the
encrypted words from longest to shortest on the (untested) theory that there are
fewer long words than short words in the dictionary, thus we should prune
invalid combinations quicker. The trystring string is a list of the
unique letters in order of appearance this sorted list. Rather than
generate a complete substitution code, we can start by creating one of the 26
single letter substitution codes. The order of the letters to translate to comes
from the tryorder
string. This list is currently defined as the frequency of occurrence of
letters in the English language.
Now if we substitute the first letter in tryorder for the first letter
in trystring everywhere it occurs and consider all other letters
unknown, we basically have the word completion problem. If there is at least one
possible dictionary word for each encrypted word, we can choose a second letter
from tryorder, try it, then a 3rd letter, etc. If it fails,
we just mark that letter as tried and go on to the next. By doing this
recursively in the FindNextLetters
function, we’ll eventually resolve all of the words or run out of letters
to try.
WordStuff:
This is an experimental example of creating a
"wrapper" program than manages several other programs. It takes care
of loading the dictionaries and calling one of three programs: CrosswordHelper,
Unscramble, or Decrypt, based on used selections. The code required to allow the
same unit to run as a main form or as a secondary form is fairly simple. Each of
the 3 programs has a test in its FormCreate method to se if it is the
main form (If applicaton.mainform=self). If it is the mainform,
the program knows that it is responsible for managing the dictionary itself.
Otherwise it can use the pubdic (and possibly privdic)
dictionaries defined in the Udict unit and preloaded by the calling
program.
Crossword Helper: This is a word complwtion program
implemented and described previously as
part of WordStuff1
Addendum Feb 8, 2003: I posted a new version
of Wordstuff today which incorporates a 4th program (WordLadder) under the Wordstuff wrapper as well as some
minor cleanup.
Addendum April 5, 2003:
A revised version of Decrypt was installed in Wordstuff
today. This version is a little smarter about deciphering encoded
messages. It builds a list of possible decipherments for long words in the
message and tests those letters first. Remaining letters are
tested in the order of frequency of occurrence in the message matched against
order of occurrence in the English language. Message words not
in the dictionary are a problem that still needs work. Manual intervention
has been the most effective solution so far, and I did add some "Hints and
Tips" to the Introduction page. A few
additions were made to the "Full.dic" dictionary based on decryption
testing. The dictionary contained "cryptogram" but
"cryptograms" for example. The dictionary structure
probably needs to be modified to better handle the whole problem of word
suffixes.
Addendum July 7, 2004:
In response to a user request, I've modified CrossWord Helper today to accept a
list of letters to be excluded from search results. This was to to help
him in deciphering phrases in a computer game called Flip Words.
Once some letters are known, the player may guess the phrase at any
time. Excluding letters known to be in the phrase (but not in
the word of interest) can reduce the number of potential
answers.
Addendum August 25, 2005:
A 5th program, Scrambled Pie,
previously posted, was added to Wordstuff today.
Addendum May 11,2006: An
update to Dicmaint was posted today with two changes:
- If the 'SaveAs' option was used to save the
dictionary as a text file, capitalization information was lost.
Capitalized words are now saved as capitalized and the characters
",C" are appended to match the ",A" and ",F" used to flag
abbreviations and foreign words.
- A user recent wrote about problems he was having trying to
create a dictionary from a file with 350,000 French words.
Dictmaint was taking a very long time to process the file. It
turned out that the file was in Unix format (no Carriage returns)
which made it look like a single record 3.7 million bytes in length
to any windows standard program. Dicmaint used a destructive
Getword routine - each word was deleted from the record in
memory as it was retrieved, a very slow process for a record of that
length. Getword now extracts words without modifying the
record. It turns out that using the
FileFix utility
to correct the input file was even more significant it getting run
time down to a few seconds. But I left the new Getword
routine in as an improvement anyway.
DFFLibV13
Addendum November 17, 2006: A few
enhancements to Decrypt, the decryption function of Wordstuff were posted today.
Users may now selectively use or omit words in the decrypted message during the
search for a solution. Also specified plaintext words may be ignored
and not counted as valid words in the deciphered message.
Addendum December 7, 2006: A small
fix to Dicmaint was made today : When saving a text version of a
dictionary, it should automatically have been saved as a text file. The
test for file extension was for '.ext' instead of '.txt' so a .'.txt'
dictionary would have been saved in compressed (.dic) format.
The change uncovered the fact that the dictionary unit, UDict,
which has been part of our DFF Library file some some time, was also included in
the source code zip files for Dicmaint and Wordstuff. As a
result, naturally, the UDict included with these programs was
not the latest version. As of today, UDict has been removed
from the program source files and users wishing to recompile these programs will
need the DFF library zip file, version DFFLibV08 or later.
Addendum July 8, 2007: A few small
minor program changes were posted today. Namely:
- Crossword Helper was enhanced to solved a variation of the word
completion problem. My Mensa Daily Puzzle calendar has a puzzle type
with a scrambled word that is missing one letter, kind of a combination of
the "Crossword Helper" and "Unscramble" programs. CrosswordHelper2
now uses some of the code from the Scrambled Pie solver to find the
word when there is a single letter missing but no solution is found
initially. For example, the previous version would have completed "b?by"
as "baby", but would have found no solution for "bby?".
The new version will ask if you want to search for anagrams and return
"baby" if you click the Yes button. It will also find the
solution for the "ARUI?YNGR" calendar puzzle that led to the
enhancement in the first place.
- Search logic used in Scrambled Pie was moved to a new unit "USearchAnagrams"
which is now used by both ScrambledPie and CrosswordHelper2 to
implement the enhancement described above.
- I add words to full.dic whenever it failed to contain the answer to some
word puzzle I'm working on. The Dictionaries.zip download file
was updated to contain the latest version of our full.dic dictionary
file.
Addendum August 10, 2007: Another small update to Crosswordhelper this
month. I'm not sure when it got broken, but a viewer recently pointed that
the "Excluded letters" option did not work. Crossword Helper displays a
list of valid words from the current dictionary when some of the letters
and their positions are known. The excluded letters list specifies
letters that are not to be used in completing the word, however the code
was checking upper case word letters against lower case excluded letters.
Not surprisingly, it never found a match and included all letters in the
resulting display. Now fixed
Addendum September 19, 2009: A recent laptop upgrade revealed some
screen layout problems with some of the DFF programs, include the WordStuff
modules described here. See the
DPI Scaling page for more
details. No significant changes otherwise.
Addendum April 25, 2010: Another minor
change today to fix a problem that has bugged me for a long time. When
entering a partial word to be analyzed in "Word Completion" or letters to be
analyzed in "Unscramble" it had been necessary to switch from keying to a mouse
click on the Search button to find solutions. Now, pressing the "Enter"
key also triggers the search. The simple code in the edit control's OnKeyPress exit looks like this (red text = comments):
begin
if ord(key)=13 then
{Enter key was pressed}
begin
key:=char(0);
{null out the key}
SolveBtnClick(sender);
{call the search button click routine}
end;
end;
Running/Exploring the Program
The Dictionaries and DicMaint downloads below duplicate the download on
the WordStuff 1 page.
The dictionaries were last updated on 7/3/07. If you have
downloaded dictionaries once since then, there is no need to download again. If
you have not previously downloaded the dictionaries, you will need to do
so, before running WordStuff 2.
Suggestions for Further Explorations
 |
Decrypt has
lots of room for study. Should trystring be built based on frequency
of letters in the message, or order of occurrence when words are sorted from
shortest to longest? Perhaps doubled letters should be tried first, or look
for digraphs (letter pairs that occur frequently), word endings, etc. The
first step would be to implement timing code so that the efficiency of various
techniques could be explored. |
 |
The current
version of Decrypt is "all or nothing", a
single encrypted word not in the dictionary will result in a "No solution
found" message. That's not a good thing. |
 |
Personally, I’m
going on to develop other word games using the dictionary. Next will be WordFinder –
find as many valid words as possible from a random array of letters by moving
horizontally, vertically, or diagonally from letter to letter. More path
searching for the auto-solve part - oh boy! |
| Created: February 4, 2001 |
Modified:
April 25, 2010
|
|