Chapter Two
Morse code was invented around 1837 by Samuel Finley Breese Morse (1791–1872), whom we shall meet more properly later in this book. It was further developed by others, most notably Alfred Vail (1807–1859), and it evolved into a couple of different versions. The system described in this book is more formally known as International Morse code.
The invention of Morse code goes hand in hand with the invention of the telegraph, which I’ll also examine in more detail later in this book. Just as Morse code provides a good introduction to the nature of codes, the telegraph includes hardware that can mimic the workings of a computer.
Most people find Morse code easier to send than to receive. Even if you don’t have Morse code memorized, you can simply use this table, which you saw in the previous chapter, conveniently arranged in alphabetical order:
Receiving Morse code and translating it back into words is considerably harder and more time consuming than sending because you must work backward to figure out the letter that corresponds to a particular coded sequence of dots and dashes. If you don’t have the codes memorized and you receive a dash-dot-dash-dash, you have to scan through the table letter by letter before you finally discover that it’s the letter Y.
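The difference between sending and receiving can be sketched in a few lines of Python. This is only an illustration: the dictionary name MORSE and the function decode_by_scanning are made up for this example, and the table covers just the 26 unaccented letters of International Morse code. Sending is a single lookup by letter, while receiving means checking entries one at a time until a code matches.

```python
# International Morse code for the 26 unaccented letters, in alphabetical order.
# (MORSE and decode_by_scanning are illustrative names, not part of any library.)
MORSE = {
    "A": ".-",   "B": "-...", "C": "-.-.", "D": "-..",  "E": ".",
    "F": "..-.", "G": "--.",  "H": "....", "I": "..",   "J": ".---",
    "K": "-.-",  "L": ".-..", "M": "--",   "N": "-.",   "O": "---",
    "P": ".--.", "Q": "--.-", "R": ".-.",  "S": "...",  "T": "-",
    "U": "..-",  "V": "...-", "W": ".--",  "X": "-..-", "Y": "-.--",
    "Z": "--..",
}


def decode_by_scanning(code):
    """Scan the alphabetical table entry by entry, as a beginner would.

    Returns (letter, entries_checked); letter is None if no entry matches.
    """
    checked = 0
    for letter, candidate in MORSE.items():
        checked += 1
        if candidate == code:
            return letter, checked
    return None, checked


# Sending: one direct lookup by letter.
print(MORSE["Y"])                  # -.--

# Receiving: Y is the 25th entry, so 25 entries are checked before a match.
print(decode_by_scanning("-.--"))  # ('Y', 25)
```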
The problem is that we have a table that provides this translation:
Alphabetical letter → Morse code dots and dashes
But we don’t have a table that lets us go backward:
Morse code dots and dashes → Letter of the alphabet
In the early stages of learning Morse code, such a table would certainly be convenient. But it’s not at all obvious how we could construct it. There’s nothing in those dots and dashes that we can put into alphabetical order.
So let’s forget about alphabetical order. Perhaps a better approach to organizing the codes might be to group them based on how many dots and dashes they have. For example, a Morse code sequence that contains just one dot or one dash can represent only two letters, which are E and T:
A combination of exactly two dots or dashes provides four more letters—I, A, N, and M:
A pattern of three dots or dashes gives us eight more letters:
And finally (if we want to stop this exercise before dealing with numbers and punctuation marks), sequences of four dots and dashes allow 16 more characters:
Taken together, these four tables contain 2 plus 4 plus 8 plus 16 codes for a total of 30 letters, 4 more than are needed for the 26 letters of the Latin alphabet. For this reason, you’ll notice that 4 of the codes in the last table are for accented letters: three with umlauts and one with a cedilla.
These four tables can certainly help when someone is sending you Morse code. After you receive a code for a particular letter, you know how many dots and dashes it has, and you can at least go to the right table to look it up. Each table is organized methodically starting with the all-dots code in the upper left and ending with the all-dashes code in the lower right.
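Here is a minimal Python sketch of the same grouping. The names MORSE, dots_first, and group_by_length are assumptions made for this example. It includes only the 26 unaccented letters, so the group of four-symbol codes comes out with 12 entries rather than 16. Within each group, a dot is treated as 0 and a dash as 1, so the all-dots code sorts first and the all-dashes code sorts last.

```python
from collections import defaultdict

# International Morse code for the 26 unaccented letters (illustrative table).
MORSE = {
    "A": ".-",   "B": "-...", "C": "-.-.", "D": "-..",  "E": ".",
    "F": "..-.", "G": "--.",  "H": "....", "I": "..",   "J": ".---",
    "K": "-.-",  "L": ".-..", "M": "--",   "N": "-.",   "O": "---",
    "P": ".--.", "Q": "--.-", "R": ".-.",  "S": "...",  "T": "-",
    "U": "..-",  "V": "...-", "W": ".--",  "X": "-..-", "Y": "-.--",
    "Z": "--..",
}


def dots_first(code):
    """Sort key: a dot counts as 0 and a dash as 1, so dots come before dashes."""
    return code.replace(".", "0").replace("-", "1")


def group_by_length(table):
    """Return {length: [(code, letter), ...]} with each group in dots-first order."""
    groups = defaultdict(list)
    for letter, code in table.items():
        groups[len(code)].append((code, letter))
    return {
        length: sorted(entries, key=lambda pair: dots_first(pair[0]))
        for length, entries in sorted(groups.items())
    }


TABLES = group_by_length(MORSE)
for length, entries in TABLES.items():
    letters = " ".join(letter for _, letter in entries)
    print(length, len(entries), letters)
```

To look up a received code, you would count its dots and dashes and then search only the matching group.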
Can you see a pattern in the size of the four tables? Each table has twice as many codes as the table before it. This makes sense: Each table has all the codes in the previous table followed by a dot, and all the codes in the previous table followed by a dash.
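That doubling rule is easy to try out. The following Python sketch (the function name next_table is invented for this example) builds each table from the one before it by appending a dot to every code and then a dash to every code.

```python
def next_table(codes):
    """All the previous codes followed by a dot, then all of them followed by a dash."""
    return [code + "." for code in codes] + [code + "-" for code in codes]


table = [".", "-"]        # the first table: one dot or one dash
tables = [table]
for _ in range(3):        # build the tables for two, three, and four symbols
    table = next_table(table)
    tables.append(table)

for codes in tables:
    print(len(codes), codes)
```

Each pass through the loop doubles the length of the list, so the four tables hold 2, 4, 8, and 16 codes.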
We can summarize this interesting trend this way:
Each of the four tables has twice as many codes as the table before it, so if the first table has 2 codes, the second table has 2 × 2 codes, and the third table has 2 × 2 × 2 codes. Here’s another way to show that:
Once we’re dealing with a number multiplied by itself, we can start using exponents to show powers. For example, 2 × 2 × 2 × 2 can be written as 2⁴ (2 to the 4th power). The numbers 2, 4, 8, and 16 are all powers of 2 because you can calculate them by multiplying 2 by itself. The summary can also be shown like this:
This table has become very simple. The number of codes is simply 2 to the power of the number of dots and dashes:
number of codes = 2^(number of dots and dashes)
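One way to convince yourself of this rule is to compare it with a brute-force count. The Python sketch below uses itertools.product from the standard library to list every possible sequence of a given length. The two function names are assumptions made for this example.

```python
from itertools import product


def count_by_formula(length):
    """Number of codes predicted by the rule: 2 to the power of the length."""
    return 2 ** length


def count_by_listing(length):
    """Number of codes found by actually listing every sequence of dots and dashes."""
    return len(list(product(".-", repeat=length)))


for length in range(1, 5):
    print(length, count_by_formula(length), count_by_listing(length))
```

For lengths 1 through 4, both columns come out as 2, 4, 8, and 16.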
Powers of 2 tend to show up a lot in codes, and particularly in this book. You’ll see another example in the next chapter.
To make the process of decoding Morse code even easier, you might want to draw something like the big treelike diagram shown here.
This diagram shows the letters that result from each particular consecutive sequence of dots and dashes. To decode a particular sequence, follow the arrows from left to right. For example, suppose you want to know which letter corresponds to the code dot-dash-dot. Begin at the left and choose the dot; then continue moving right along the arrows and choose the dash and then another dot. The letter is R, shown next to the third dot.
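The same idea can be sketched in Python as a tree of nested dictionaries. Everything here is illustrative: the names MORSE, build_tree, and decode are assumptions, and only the 26 unaccented letters are included. Each node has a branch for a dot and a branch for a dash, and decoding follows one branch per symbol, just as you would follow the arrows in the diagram.

```python
# International Morse code for the 26 unaccented letters (illustrative table).
MORSE = {
    "A": ".-",   "B": "-...", "C": "-.-.", "D": "-..",  "E": ".",
    "F": "..-.", "G": "--.",  "H": "....", "I": "..",   "J": ".---",
    "K": "-.-",  "L": ".-..", "M": "--",   "N": "-.",   "O": "---",
    "P": ".--.", "Q": "--.-", "R": ".-.",  "S": "...",  "T": "-",
    "U": "..-",  "V": "...-", "W": ".--",  "X": "-..-", "Y": "-.--",
    "Z": "--..",
}


def new_node():
    """A tree node: an optional letter plus one branch each for dot and dash."""
    return {"letter": None, ".": None, "-": None}


def build_tree(table):
    """Build the decoding tree by walking each code from the root."""
    root = new_node()
    for letter, code in table.items():
        node = root
        for symbol in code:
            if node[symbol] is None:
                node[symbol] = new_node()
            node = node[symbol]
        node["letter"] = letter
    return root


def decode(tree, code):
    """Follow one branch per symbol; return None for an undefined code."""
    node = tree
    for symbol in code:
        node = node.get(symbol)
        if node is None:
            return None
    return node["letter"]


TREE = build_tree(MORSE)
print(decode(TREE, ".-."))   # R
print(decode(TREE, "-.--"))  # Y
print(decode(TREE, "----"))  # None (not one of the 26 letters in this table)
```

Decoding now takes one step per dot or dash rather than a scan through the whole alphabet.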
If you think about it, constructing such a table was probably necessary for defining Morse code in the first place. First, it ensures that you don’t make the silly mistake of using the same code for two different letters! Second, you’re assured of using all the possible codes without making the sequences of dots and dashes unnecessarily long.
At the risk of extending this table beyond the limits of the printed page, we could continue it for codes of five dots and dashes. A sequence of exactly five dots and dashes gives us 32 (2 × 2 × 2 × 2 × 2, or 2⁵) additional codes. Normally that would be enough for the ten numbers and 16 punctuation symbols defined in Morse code, and indeed, the numbers are encoded with five dots and dashes. But many of the other codes that use a sequence of five dots and dashes represent accented letters rather than punctuation marks.
To include all the punctuation marks, the system must be expanded to six dots and dashes, which gives us 64 (2 × 2 × 2 × 2 × 2 × 2, or 2⁶) additional codes for a grand total of 2 + 4 + 8 + 16 + 32 + 64, or 126, characters. That’s overkill for Morse code, which leaves many of these longer codes undefined. Used in this context, undefined refers to a code that doesn’t stand for anything. If you were receiving Morse code and you got an undefined code, you could be pretty sure that somebody made a mistake.
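The grand total is simple enough to check with a short Python sketch. The function name total_codes is an assumption made for this example. It adds up the number of codes for every length from one symbol up to a chosen maximum.

```python
def total_codes(max_length):
    """Total number of codes that use between 1 and max_length dots and dashes."""
    return sum(2 ** length for length in range(1, max_length + 1))


print(total_codes(4))  # 30: the four tables of letters
print(total_codes(5))  # 62: adding the five-symbol codes
print(total_codes(6))  # 126: adding the six-symbol codes
```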
Because we were clever enough to develop this little formula,
number of codes = 2^(number of dots and dashes)
we could continue figuring out how many codes we get from using longer sequences:
Fortunately, we don’t have to actually write out all the possible codes to determine how many there would be. All we have to do is multiply 2 by itself over and over again.
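Multiplying 2 by itself over and over is exactly the kind of chore a short loop handles well. In this Python sketch, the function name code_counts and the choice to stop at ten symbols are assumptions made for the example.

```python
def code_counts(max_length):
    """Return (length, number of codes) pairs, doubling the count at each step."""
    counts = []
    count = 1
    for length in range(1, max_length + 1):
        count *= 2  # one more dot or dash doubles the number of possible codes
        counts.append((length, count))
    return counts


for length, count in code_counts(10):
    print(length, count)
```

The last line printed is 10 1024, meaning that sequences of exactly ten dots and dashes allow 1024 different codes.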
Morse code is said to be binary (literally meaning two by two) because the components of the code consist of only two things—a dot and a dash. That’s similar to a coin, which can land only on the head side or the tail side. Coins that are flipped ten times can have 1024 different sequences of heads and tails.
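The coin-flip figure can be checked by brute force. This Python sketch uses itertools.product from the standard library to list every sequence of heads (H) and tails (T). The function name coin_sequences is an assumption made for this example.

```python
from itertools import product


def coin_sequences(flips):
    """Every possible sequence of heads (H) and tails (T) for the given number of flips."""
    return ["".join(sequence) for sequence in product("HT", repeat=flips)]


sequences = coin_sequences(10)
print(len(sequences))  # 1024
print(sequences[0])    # HHHHHHHHHH
print(sequences[-1])   # TTTTTTTTTT
```

Replace H and T with a dot and a dash, and the same function lists every possible Morse-style code of a given length.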
Combinations of binary objects (such as coins) and binary codes (such as Morse code) are always described by powers of two. Two is a very important number in this book.