A command-line "Countdown" (UK) companion
A UK reader recently wrote to me about a coding project:
There is a TV program titled COUNTDOWN on channel 4 here in UK. It is very popular and has been going on for over thirty years.
Nine letters are drawn randomly from two stacks of letter tiles. One stack contains vowels and the other consonants. The contestant who can construct the longest word (within 30 seconds) wins points. The word must be in the approved dictionary.
I wrote a python program that scans the dictionary and lists the longest words (using the nine letters) in the descending order. It does the job within a couple of seconds. So I am always drumming my fingers on the coffee table while the contestants are still sweating!
I bet this can be done by awk without a need for the python program.
Sure enough, within a few weeks the reader had a working alternative based on AWK and grep, and I started thinking about how the script could be improved.
Then I remembered a command-line utility that does most of the word-hunting and listing in one step, and returns longest words first. The utility is called an and was written by Paul Martin with help from work by Richard Jones and Julian Assange (name sound familiar?). If you don't already have an on your system, you can read the man page here.
an is very fast and uses /usr/share/dict/words as its default dictionary. Unfortunately, this dictionary includes many words that "Countdown" doesn't allow, according to the game show's Wikipedia page:
(1) Capitalised words, including proper nouns (e.g. "Jane" or "London")
(2) Hyphenated terms
(3) Words that are never used alone (e.g. "gefilte"; only used as part of "gefilte fish")
(4) American spellings of words (e.g. "flavour" and "signalled" are allowed, but "flavor" and "signaled" are not) even though they were allowed in earlier series.
The /usr/share/dict/words list also includes words containing an apostrophe. I couldn't do much about (3) and (4), but the an output can be filtered with AWK to include only lower-case letters and to prefix results with word length. Here's the full command written as the function "countdown", showing it at work on the string "golpralty" using a 4-letter minimum length:
countdown() { an -w "$1" -m 4 | awk '/^[a-z]*$/ {print length($0), $0}' | column; }
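To see what the AWK stage removes, here is a minimal sketch that runs the same filter on a made-up list of words standing in for an output. The function name "lowercase_only" and the sample words are my own illustrations, not part of an.

```shell
#!/bin/bash
# Hypothetical helper: the AWK stage of "countdown" on its own.
# It keeps only lines made entirely of lower-case letters a-z
# and prefixes each kept line with its length.
lowercase_only() {
    awk '/^[a-z]*$/ {print length($0), $0}'
}

# Made-up sample input (not real "an" output): one capitalised word,
# one with an apostrophe, one hyphenated term and two plain words.
printf '%s\n' portal Troy "part's" all-pro gallop | lowercase_only
```

Only "portal" and "gallop" survive, each printed with the prefix 6. The capitalised, apostrophised and hyphenated entries are dropped.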
The "Countdown" TV show uses the Oxford English Dictionary as its word validator. I don't know if the OED has a simple wordlist online, but there's an English wordlist on GitHub with more than 370,000 entries. You can replace /usr/share/dict/words with the GitHub wordlist by adding a "-d" option for an: -d /path/to/wordlist.
"Countdown" also requires that the starting string of 9 letters includes at least 3 vowels and 4 consonants. Valid starting strings can be generated programmatically with this fairly ugly command:
cat <(for i in {1..3}; do echo "aeiou" | fold -w1 | shuf -n1; done) \
<(for i in {1..4}; do echo "bcdfghjklmnpqrstvwxyz" | fold -w1 | shuf -n1; done) \
<(for i in {1..2}; do echo {a..z} | tr -d " " | fold -w1 | shuf -n1; done) \
| shuf | paste -d"\0" -s
The command concatenates 3 vertical lists of letters, then shuffles the concatenated list (of 9 characters) and pastes the list into a single line with no separator between the letters.
Each vertical list is generated by echoing a group of letters, folding the group into a vertical list (with the option "-w1"), then shuffling the list and selecting the 1 letter at the top of the shuffled list (with the option "-n1"). This action is repeated 3 times for the group of vowels, 4 times for the group of consonants and 2 times for the group of all lowercase letters. The result is 3 vowels, 4 consonants and 2 vowels-or-consonants.
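The same steps can be sketched a little less repetitively by wrapping the echo-fold-shuf loop in a function. The names "pick" and "start_string" are hypothetical, chosen only for this sketch. The logic is meant to match the command above: 3 vowels, 4 consonants and 2 letters from the whole alphabet, each drawn with replacement.

```shell
#!/bin/bash
# pick LETTERS N: print N letters, one per line, each drawn at random
# from the string LETTERS. The full group is offered on every draw,
# so a letter can come up more than once.
pick() {
    local letters=$1 n=$2 i
    for ((i = 0; i < n; i++)); do
        echo "$letters" | fold -w1 | shuf -n1
    done
}

# start_string: stack the three vertical lists, shuffle the 9 lines,
# then paste them into one line with no separator.
start_string() {
    {
        pick "aeiou" 3
        pick "bcdfghjklmnpqrstvwxyz" 4
        pick "abcdefghijklmnopqrstuvwxyz" 2
    } | shuf | paste -d"\0" -s
}

start_string
```

Grouping the three pick calls in braces does the same job as cat with three process substitutions: the lists arrive on one stream, ready for shuf.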
In the TV game, vowels and consonants are selected from piles with letter frequencies weighted according to their frequencies in "natural English", and letter tiles are not replaced between selections. The command above generates 9-letter strings from unweighted letter groups and the selection happens with replacement (groups are constant). This makes for a tougher version of the "Countdown" letters game, which I'll call "autocountdown":
#!/bin/bash
# Build a 9-letter starting string: 3 vowels, 4 consonants, 2 of either
string=$(cat <(for i in {1..3}; do echo "aeiou" | fold -w1 | shuf -n1; done) \
<(for i in {1..4}; do echo "bcdfghjklmnpqrstvwxyz" | fold -w1 | shuf -n1; done) \
<(for i in {1..2}; do echo {a..z} | tr -d " " | fold -w1 | shuf -n1; done) \
| shuf | paste -d"\0" -s)
echo "The starting string is \"$string\"."
# Store the answer in "foo" so that the case statement can test it
read -r -p "Want to see the permutations? (y/n) " foo
echo
# On "y", list lower-case words of 4+ letters, each prefixed with its length
case $foo in
n) exit 0 ;;
y) an -w "$string" -m 4 | awk '/^[a-z]*$/ {print length($0), $0}' | column;;
esac
exit
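For a starting string closer to the TV game, letters could be drawn from weighted piles without replacement. The sketch below is only an illustration of that idea. The two piles are invented for the example and are not the show's real tile counts, the fixed split of 4 vowels and 5 consonants is my assumption, and the names "draw" and "weighted_string" are hypothetical.

```shell
#!/bin/bash
# Illustrative piles only: repeating a letter gives it more weight.
# These are NOT the real "Countdown" tile counts.
vowel_pile="aaaaeeeeeeiiiiooouu"
consonant_pile="bbccdddffgghhjkllllmmnnnnppqrrrrsssstttttvwxyz"

# draw PILE N: print N tiles, one per line, taken from PILE.
# "shuf -n N" outputs each input line at most once, so a tile
# that has been drawn is not available for the next draw.
draw() {
    echo "$1" | fold -w1 | shuf -n "$2"
}

# weighted_string: 4 vowels + 5 consonants (an assumed split that meets
# the "at least 3 vowels and 4 consonants" rule), shuffled and joined.
weighted_string() {
    { draw "$vowel_pile" 4; draw "$consonant_pile" 5; } | shuf | paste -d"\0" -s
}

weighted_string
```

Because each pile is folded once and sampled once, a letter can only appear as many times as it occurs in its pile. In the original command the full group is offered again on every draw.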
Last update: 2019-09-27
The blog posts on this website are licensed under a
Creative Commons Attribution-NonCommercial 4.0 International License