WordBox: Help

General Help

wordbox.zip contains all the source and gcc makefiles required to build wordbox.prc. In addition, there is a 129,000-word dictionary file, dict.pdb, that must be installed for wordbox.prc to function properly. I'm still researching better dictionaries than the one I used for dict.pdb, and in the future I will look for smaller versions that will run on Pilots with less memory.

Rules

WordBox is a word search game played within a box 5 letters wide and 5 letters tall. Words may be formed by starting at any letter and tracing to any adjacent letter, including diagonally. The minimum word length is 4. Each letter in the box may be used only once in each word. Scoring can be done in four different ways: Exponential, Linear, All words, and Unique words.

Games can be played for lengths of 2 to 5 minutes in full minute increments.

The top 10 scores are maintained for each combination of scoring method used and game length.
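
The tracing rule described above (adjacent letters, diagonals allowed, each letter used once, minimum length 4) can be sketched as a small backtracking search. This is only an illustration and is not taken from the WordBox source; the names can_trace and trace_from, and the choice to store the box as a 25-character row-major string, are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define BOX 5        /* the box is 5 letters wide and 5 letters tall */
#define MIN_WORD 4   /* minimum word length from the rules */

/* Try to match word[pos..] starting at cell (r, c). used[] marks cells
   already taken by this word, since each letter may be used only once. */
static bool trace_from(const char *box, bool *used, int r, int c,
                       const char *word, size_t pos)
{
    if (r < 0 || r >= BOX || c < 0 || c >= BOX) return false;
    int cell = r * BOX + c;
    if (used[cell] || box[cell] != word[pos]) return false;
    if (word[pos + 1] == '\0') return true;      /* last letter matched */

    used[cell] = true;
    bool found = false;
    for (int dr = -1; dr <= 1 && !found; dr++)   /* all 8 neighbours */
        for (int dc = -1; dc <= 1 && !found; dc++)
            if (dr != 0 || dc != 0)
                found = trace_from(box, used, r + dr, c + dc, word, pos + 1);
    used[cell] = false;                          /* backtrack */
    return found;
}

/* Returns true if word is long enough and can be traced in the box.
   box is assumed to hold exactly BOX * BOX letters, row by row. */
bool can_trace(const char *box, const char *word)
{
    bool used[BOX * BOX] = { false };

    assert(strlen(box) == (size_t)(BOX * BOX));
    if (strlen(word) < (size_t)MIN_WORD) return false;
    for (int r = 0; r < BOX; r++)
        for (int c = 0; c < BOX; c++)
            if (trace_from(box, used, r, c, word, 0)) return true;
    return false;
}
```

A caller would still need to confirm that a traced word actually appears in the dictionary; this sketch only checks the path through the box.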


Source code notes

wordbox.rcp is, unfortunately, NOT readable by PilotPro - yet. While developing WordBox I used PilotPro up to the point that I discovered several bugs in the .rcp parser, which I am still working on correcting. If you examine the source code closely you'll see that in the course of its 1500 or so lines I have included a grand total of maybe 10 comments. I am exceedingly bad about documentation; I hate to do it. I apologize in advance to anyone hoping to see a thoroughly documented example program... the answers you seek might be in the source, but you'll have to seek them out for yourself for now! :)

The format of the dict.pdb file is fairly straightforward. Record 0 is an index record containing the 3-letter prefixes of all words in the dictionary. For example, the first entry in record 0 is aar, for aardvark. The index position of each 3-letter prefix (i.e. its byte offset in the record divided by 3) is one less than the number of the record holding that section of the dictionary. So all words starting with aar are located in record 1 of the dictionary.
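
As a rough sketch of the index lookup just described, the function below scans a copy of record 0 for a word's 3-letter prefix and turns the matching position into a record number. The name find_record is hypothetical, and the code assumes the prefixes are stored as plain ASCII letters, which the description above does not actually state; if the index uses the 0-based letter codes instead, the comparison would need adjusting.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PREFIX_LEN 3   /* every index entry is a 3-letter prefix */

/* Returns the number of the record holding words that share the first
   three letters of word, or 0 if that prefix is not in the index.
   index/index_len are the raw bytes of record 0 (assumed ASCII here). */
unsigned find_record(const unsigned char *index, size_t index_len,
                     const char *word)
{
    assert(index != NULL && word != NULL);
    if (strlen(word) < PREFIX_LEN) return 0;

    for (size_t off = 0; off + PREFIX_LEN <= index_len; off += PREFIX_LEN) {
        if (memcmp(index + off, word, PREFIX_LEN) == 0)
            return (unsigned)(off / PREFIX_LEN) + 1;  /* index position + 1 */
    }
    return 0;   /* record 0 is the index itself, so 0 can mean "not found" */
}
```

A linear scan keeps the example short. If the prefixes are stored in sorted order, a binary search over the same 3-byte entries would be a natural refinement.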

Records 1 through the end of the pdb comprise the dictionary itself. Each record follows a rather bizarre encoding scheme that I designed to be optimally efficient for the on-the-fly decompression needed by the word lookups. First, each letter is lower case and is stored not as ASCII but as its ASCII code minus 'a', which makes the codes 0-based. This means each letter occupies only the lower 5 bits of a byte. The upper 3 bits of the first byte in each word indicate the compression length used from the previous word in the dictionary. This is best illustrated by example:

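(All numbers given are hex)

aardvark = 00,00,11,03,15,00,11,0a

aardvarks = d2

Here d2 is 110 10010 in binary: the upper 3 bits (6) mean copy 6 + 2 = 8 letters from aardvark, and the lower 5 bits (12 hex) are the letter s.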

The upper 3 bits of the first byte in a word are the number of chars to copy from the previous word minus 2. Values of 0 and 7 are reserved, 0 indicating that the byte is simply a letter in the word and 7 indicating that this word copies no letters from the previous word. That leaves a copy range of 3 to 8 letters, which yields values of 1 to 6 in the upper 3 bits. This technique allows the beginning of a word to be recognized by the fact that the upper 3 bits will have a non-zero value, while all remaining letters in a word will have a 0 value in the upper 3 bits. Each record (not word) is terminated by a ff.
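
The decoding rules above can be sketched in C roughly as follows. This is not the decoder from the WordBox source. The name decode_record, the MAX_WORD limit, and the newline-separated output buffer are all assumptions made for illustration. The sketch also assumes that the lower 5 bits of a word's first byte hold the first new letter after the copied prefix, which is inferred from the aardvarks = d2 example rather than stated outright.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_WORD 32          /* assumed upper bound on word length */
#define END_OF_RECORD 0xff   /* each record ends with ff */

/* Decode one dictionary record into out as newline-separated words.
   Returns the number of words decoded, or -1 on malformed input or
   if out is too small. */
int decode_record(const unsigned char *rec, size_t rec_len,
                  char *out, size_t out_size)
{
    char word[MAX_WORD];
    size_t len = 0;    /* letters currently in word[] */
    size_t used = 0;   /* bytes written to out so far */
    int count = 0;

    assert(rec != NULL && out != NULL && out_size > 0);
    out[0] = '\0';
    for (size_t i = 0; i < rec_len; i++) {
        unsigned char b = rec[i];
        unsigned hi = b >> 5;     /* copy code; 0 means a plain letter */
        unsigned lo = b & 0x1f;   /* letter code, with 'a' stored as 0 */

        if (b == END_OF_RECORD || hi != 0) {
            if (len > 0) {        /* the previous word is now complete */
                if (used + len + 2 > out_size) return -1;
                memcpy(out + used, word, len);
                used += len;
                out[used++] = '\n';
                out[used] = '\0';
                count++;
            }
            if (b == END_OF_RECORD) return count;
            size_t keep = (hi == 7) ? 0 : hi + 2;   /* codes 1..6 copy 3..8 */
            if (keep > len) return -1;  /* cannot copy more than we have */
            len = keep;                 /* reuse the prefix already in word[] */
        }
        if (lo > 25 || len >= MAX_WORD) return -1;
        word[len++] = (char)('a' + lo);
    }
    return -1;   /* ran off the end without finding the ff terminator */
}
```

Because the previous word is kept in the same buffer, copying its prefix costs nothing beyond resetting the length. Fed the bytes from the example above followed by ff, this sketch produces aardvark and then aardvarks.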

I know this compression technique can be improved upon but it yields a compression ratio of about 3.1 to 1, and with the help of record 0 the lookups are very fast.