                                MUSINGS
   Concerning relative vowel and consonant frequencies in the OSPD3, and 
       conclusions that may be drawn about rack balance therefrom.



This is not a formal research piece. It is easy enough to compile statistics 
on letter frequencies in the OSPD3 using the computer, but drawing useful 
conclusions from these statistics is another matter altogether. With that 
disclaimer out of the way, let us begin.

Let us call the proportion of vowels among all the letters in a given word 
list, expressed as a percentage, V%. For all the words in the OSPD3, V% = 
38.56; that is, vowels make up 38.56% of all the letters in all the words in 
the OSPD3. Similarly, V% for the 1254 words newly added to the OSPD3 is 
45.32. The newly added words have a markedly higher proportion of vowels 
than does the typical run of words already present there.
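
For the curious, here is a minimal sketch of how V% might be computed. It is 
not the program that produced the figures above. The function name 
vowel_percent is made up for illustration, and the choice to count only 
a, e, i, o, u as vowels (leaving out y) is an assumption; pass a different 
vowel set to change it.

```python
VOWELS = frozenset("aeiou")  # assumption: 'y' is NOT counted as a vowel


def vowel_percent(words, vowels=VOWELS):
    """Return V%: vowels as a percentage of all letters in `words`."""
    total_letters = 0
    total_vowels = 0
    for word in words:
        for ch in word.lower():
            if ch.isalpha():  # ignore spaces, hyphens, digits
                total_letters += 1
                if ch in vowels:
                    total_vowels += 1
    if total_letters == 0:
        raise ValueError("word list contains no letters")
    return 100.0 * total_vowels / total_letters


# Three words borrowed from the addendum, purely as sample input.
sample = ["aalii", "crwth", "zoogloeae"]
print(round(vowel_percent(sample), 2))
```

To run it over a real list, read one word per line from a file and pass the 
resulting list to vowel_percent.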

Taking V% for a supplementary list of 133,282 words longer than eight 
letters, we get 41.73. A tentative conclusion is that lists of very short 
words and very long words have a higher V% than lists of intermediate-length 
words. This can be tested.

Calculating V% for the OSPD3 according to words of given length, we get the 
following figures:

2-letter words:  53.57
3-letter words:  41.99
4-letter words:  38.92
5-letter words:  38.66
6-letter words:  39.14
7-letter words:  38.75
8-letter words:  38.91
9-letter words:  37.30
10-letter words: 38.82
11-letter and longer words: too few words in OSPD3 to analyze meaningfully.
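
A breakdown like the one above could be produced along the following lines. 
This is only a sketch: the function name vowel_percent_by_length is 
hypothetical, y is assumed to be a consonant, and the five sample words are 
stand-ins for a real word list.

```python
from collections import defaultdict

VOWELS = frozenset("aeiou")  # assumption: 'y' is counted as a consonant


def vowel_percent_by_length(words, vowels=VOWELS):
    """Map each word length to the V% of all words of that length."""
    letters = defaultdict(int)     # length -> total letters seen
    vowel_hits = defaultdict(int)  # length -> total vowels seen
    for word in words:
        w = word.strip().lower()
        if not w.isalpha():  # skip blank lines and non-alphabetic entries
            continue
        letters[len(w)] += len(w)
        vowel_hits[len(w)] += sum(ch in vowels for ch in w)
    return {n: 100.0 * vowel_hits[n] / letters[n] for n in sorted(letters)}


sample = ["aa", "ax", "cat", "crwth", "aalii"]  # stand-in word list
for length, pct in vowel_percent_by_length(sample).items():
    print(f"{length}-letter words: {pct:6.2f}")
```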

It appears that V% does indeed settle down to a figure in the range of 38 to 
39 for 4- to 10-letter words, apart from a dip to 37.30 for 9-letter words. 
What does this mean in terms of rack balance and playing strategy in a 
real-world Scrabble (tm) game?

It would appear that a balanced rack (7 letters) should have about 3 vowels 
and 4 consonants (3 out of 7 is about 43%, the closest a 7-tile rack can 
come to a V% of 38-39). Of course, this is scant consolation if you have a 
"balanced" rack of VWXZUUU. The trick is having the *right* consonants and 
vowels and, less critically, the right relative proportion.
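
The arithmetic behind the 3-and-4 split can be checked with a few lines of 
code. In this sketch the target of 38.5 (the midpoint of the 38-39 range) 
and the function names are my own choices for illustration, and blank tiles 
are not handled.

```python
TARGET_V_PERCENT = 38.5  # assumption: midpoint of the 38-39 range
RACK_SIZE = 7


def closest_vowel_count(target=TARGET_V_PERCENT, rack_size=RACK_SIZE):
    """Return the number of vowels whose share of the rack is nearest target."""
    return min(
        range(rack_size + 1),
        key=lambda k: abs(100.0 * k / rack_size - target),
    )


def rack_split(rack, vowels="AEIOU"):
    """Return (vowel count, consonant count) for a rack of letter tiles."""
    n_vowels = sum(tile in vowels for tile in rack.upper())
    return n_vowels, len(rack) - n_vowels


print(closest_vowel_count())   # 3 vowels: 3/7 is about 42.86%
print(rack_split("VWXZUUU"))   # (3, 4): "balanced" on paper only
```

Two vowels would give 2/7, about 28.57%, which is farther from the target 
than 3/7, so three vowels is the nearest a seven-tile rack can get.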

=============================================================================

Words in the English language, and in the OSPD, are "random"* in the sense 
defined by the mathematician John Casti. This means that words cannot be 
reliably constructed by a formula or algorithm. For example, given the set 
of consonants, C{ b, c, d, f, g, h ... } and the set of vowels, V{ a, e, i, 
o, u, y }, try to find a method of creating English words, say by taking 3 
from set C and 2 from set V. This gives 40% vowels, which approximates the 
V% found above. Most of the "words" formed by trial and error under this 
3-to-2 rule will be strings of letters found neither in any English language 
dictionary nor in the OSPD3: non-words, in other words.
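
The experiment is easy to try by machine. The sketch below draws 3 distinct 
letters from C and 2 distinct letters from V, shuffles them, and counts how 
often the result lands in a lexicon. The names random_candidate and 
hit_rate, the no-repeated-letters rule, and the five-word TINY_LEXICON are 
all assumptions made for illustration; substitute a real word list to get a 
meaningful hit rate.

```python
import random
import string

VOWELS = "aeiouy"  # set V as given in the text (includes y)
CONSONANTS = "".join(c for c in string.ascii_lowercase if c not in VOWELS)


def random_candidate(rng):
    """Build a 5-letter string from 3 distinct consonants and 2 distinct vowels."""
    letters = rng.sample(CONSONANTS, 3) + rng.sample(VOWELS, 2)
    rng.shuffle(letters)  # random order, not a fixed consonant/vowel pattern
    return "".join(letters)


def hit_rate(lexicon, trials=10000, seed=0):
    """Return the fraction of random candidates that appear in `lexicon`."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    hits = sum(random_candidate(rng) in lexicon for _ in range(trials))
    return hits / trials


# Hypothetical stand-in lexicon; load a real word list here instead.
TINY_LEXICON = {"crate", "plain", "stone", "table", "drive"}
print(hit_rate(TINY_LEXICON))
```

Under these rules there are 1140 x 15 x 120 = 2,052,000 possible strings 
(ways to pick the consonants, ways to pick the vowels, and orderings of the 
five letters), so real words are bound to be a small minority.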

Casti defines a "random" number as a real number whose shortest 
representation is itself. By the same token, I would say a "random"* word is 
likewise one whose simplest representation is itself. Therefore, =all= the 
words in the English language, and the OSPD, are "random". There is no 
mathematical formula for constructing words in any spoken / written 
language. This gives natural human languages their richness, complexity, 
diversity, and unpredictability.   Ain't language wonderful, hon! 

footnote:
--------
*You could also make a case that words are "chaotic" rather than "random", 
that is, falling into a pattern, but not one that is predictable or 
computable.

============================================================================
Addendum: A couple of interesting "imbalanced" vowel / consonant word lists.
         By increasing length, champion "imbalanced" words:

HIGH-CONSONANT LIST             
crwth
crwths
tsktsks
borschts
strengths
throttling
abstractest
backdropping
scratchbrush (not in OSPD3)

HIGH-VOWEL LIST
aalii
euouae (not in OSPD3)
yautia
ouguiya
aboideau
zoogloeae
homoiousia (not in OSPD3)
squeegeeing
housesitting
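
Lists like these can be hunted down by machine. The sketch below picks, for 
each word length, the word with the lowest V% and the word with the highest. 
It is not the method used to compile the lists above; the function names are 
hypothetical, y is assumed to be a consonant, and ties go to whichever word 
comes first in the input.

```python
VOWELS = frozenset("aeiou")  # assumption: 'y' is counted as a consonant


def vowel_percent(word, vowels=VOWELS):
    """Return the percentage of letters in a single word that are vowels."""
    return 100.0 * sum(ch in vowels for ch in word) / len(word)


def champions(words, vowels=VOWELS):
    """Map each length to (lowest-V% word, highest-V% word) of that length."""
    by_length = {}
    for word in words:
        w = word.strip().lower()
        if w:  # skip empty entries
            by_length.setdefault(len(w), []).append(w)
    return {
        n: (
            min(group, key=lambda w: vowel_percent(w, vowels)),
            max(group, key=lambda w: vowel_percent(w, vowels)),
        )
        for n, group in sorted(by_length.items())
    }


# A few words from the lists above, used as sample input.
sample = ["crwth", "aalii", "tsktsks", "ouguiya", "borschts", "aboideau"]
for n, (low, high) in champions(sample).items():
    print(n, low, high)
```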



Scrabble and OSPD are trademarks of the Milton Bradley Co., Inc.


The above musings are the product of the demented mind of the author of the 
SCRABLST, WAK, and WORDY packages.

                               M\Cooper
                              PO Box 237
                       St. David, AZ 85630-0237
           ------------------------------------------------
                    E-mail: thegrendel@theriver.com
           Web: http://personal.riverusers.com/~thegrendel/
