In Arthur Conan Doyle's story "The Adventure of the Dancing Men," master detective Sherlock Holmes solves a mystery and brings a murderer to justice by reverse-engineering a classic substitution cipher. The dancing men are rows of little stick figures doing ordinary things, such as walking and running, but also doing unusual things, such as standing on their heads, waving their arms and holding flags. Although they look like random graffiti -- a second means of concealment -- Holmes sees that they may represent an encoded message. The detailed descriptions of Holmes's analysis of these messages show how the cipher was constructed.
An alphabet is an ordered set of symbols. The English alphabet consists of the written letters A through Z, the Morse code alphabet uses written or audible dots and dashes, and the Braille alphabet uses raised dots in tactile patterns; the Cyrillic, Hebrew and Arabic alphabets have their own sets of written symbols. Music has alphabets, including the familiar Western seven-note scale (eight notes counting the octave) as well as many other scales; even colors could be used to form an alphabet. The dancing men in the story are an alphabet of symbols used in a substitution cipher.
A substitution cipher is a means of concealing a message by substituting one letter or symbol for another. The key is consistency -- once a letter or symbol is used in place of the original, it must be used throughout the message or nonsense results. Holmes deciphered the dancing men messages by using frequency analysis; he began by counting the number of times each different dancing man was used in a message, and assigned it a letter of the English alphabet based on the frequency of that letter's use in English. The most-used letter in English is E, followed roughly by T, A, O, I, N, S, H and R, while the least-used are J, K, Q, X and Z. Using this technique, he was able to begin to make sense of the messages and infer more letters as possible words became clear.
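The counting step can be sketched in a few lines of Python. This is only an illustration, not Holmes's full method: the shuffled key, the function names and the frequency ordering string are assumptions made for the example, and letter rankings vary somewhat from one body of text to another.

```python
from collections import Counter
import string

# Approximate ordering of English letters, most common first (an assumption;
# exact ranks depend on the text sample used).
ENGLISH_BY_FREQUENCY = "ETAOINSHRDLUCMFWYPVBGKQJXZ"

# Hypothetical key: a shuffled alphabet chosen only for illustration.
KEY = "QWERTYUIOPASDFGHJKLZXCVBNM"


def encrypt(plaintext, key):
    """Replace each letter with its fixed substitute; other characters pass through."""
    table = str.maketrans(string.ascii_uppercase, key)
    return plaintext.upper().translate(table)


def decrypt(ciphertext, key):
    """Reverse the substitution when the key is known."""
    table = str.maketrans(key, string.ascii_uppercase)
    return ciphertext.translate(table)


def symbol_counts(ciphertext):
    """Count how often each cipher symbol appears, most common first."""
    letters = [ch for ch in ciphertext if ch.isalpha()]
    return Counter(letters).most_common()


def first_guess(ciphertext):
    """Pair the n-th most common cipher symbol with the n-th most common English letter."""
    ranked = [symbol for symbol, _ in symbol_counts(ciphertext)]
    return dict(zip(ranked, ENGLISH_BY_FREQUENCY))


secret = encrypt("ELSIE PREPARE TO MEET THY GOD", KEY)
print(symbol_counts(secret)[:3])
print(first_guess(secret))
```

On a message this short, only the most common symbols are likely to be guessed correctly; the rest of the mapping has to be refined by testing likely words, which is how Holmes proceeds in the story.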
Orthography means "correct writing"; it is the set of rules that determine how the words of a language are spelled, separated and punctuated in writing. The orthography of Doyle's dancing men is very simple: there are no spaces between words, and the only punctuation is a flag in the hand of a stick figure, which Holmes concludes marks the last letter of a word.
A closely related technique not used in the story is the "Caesar shift" -- an arbitrary but consistent alteration in the selection of letters or symbols known only to those authorized to read the coded message. Once the basic substitution technique was known to all his generals, Caesar would send a separate message indicating how it should be changed for a specific time period. Like a password to a sentry, this told the general that the next message, or those for the next day (or week, or month), should be decoded by counting so many symbols ahead of the usual symbol. Thus, if the written letter in a message was Q and this normally stood for A, a shift of four would change it to E. Unless the shift was known, attempts to decipher messages by the basic substitution technique would result in gibberish.
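A shift of this kind can be sketched in Python as follows. The function names are hypothetical, and the example assumes the plain 26-letter English alphabet rather than whatever symbol set a real correspondent might use.

```python
import string

ALPHABET = string.ascii_uppercase


def shift_letter(letter, shift):
    """Count `shift` places ahead in the alphabet, wrapping around after Z."""
    index = (ALPHABET.index(letter) + shift) % len(ALPHABET)
    return ALPHABET[index]


def caesar(text, shift):
    """Shift every letter of the text; spaces and punctuation are left alone."""
    return "".join(
        shift_letter(ch, shift) if ch in ALPHABET else ch
        for ch in text.upper()
    )


# A letter that normally reads as A becomes E under a shift of four.
print(shift_letter("A", 4))

shifted = caesar("COME HERE AT ONCE", 4)
print(shifted)
# Shifting back by the same amount recovers the original text.
print(caesar(shifted, -4))
```

Because only 25 shifts give a different result, anyone who suspects a simple shift can try them all; the protection lies in the shift being combined with a substitution that is itself secret.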