Pages

Tuesday 3 February 2015

A look at the second Feynman cipher - fractionated morse [part 4]

The Feynman ciphers are a set of 3 ciphers given to Richard Feynman, the first of which has been solved, but the second 2 remain unsolved. This is part 4 of a discussion about the second cipher.

I have talked about this cipher three times previously in Part 1, Part 2 and Part 3. In part 1 I was playing around with vigenere type ciphers, in part 2 I decided to step back a little bit a figure out what the ciphertext is not i.e. identify the cipher algorithms it can't be. Part 3 looked at the 3x3 Hill cipher and some of its variants.

So far I have been working my way through the cipher types I think it may be. In part 2 I identified fractionated morse as a candidate cipher. It turns out breaking fractionated morse is not too difficult, the procedure is similar to breaking standard substitution ciphers. I will start with a short description of Fractionated morse, then we will look at how to break it.

Description of Fractionated Morse

There are many descriptions of the algorithm around, the ACA has a good description, I will reproduce a part of G. worley's description. It assumes you know morse code.


If Alice wanted to send Bob the
message "I love you." she would tap over the telegraph:

Pt: I love you.
Et: ..xx.-..x---x...-x.xx-.--x---x..-xx.-.-.-

After encoding the plaintext in Morse Code, the encoded text is broken into
n-graphs, usually trigraphs because there are 26 Morse Code trigraphs (you never
have three dividers in a row, so you don't count that one). Here's a sample
trigraph mapping (using a keyword):

Keyword: LOOKINGGLASS

L O K I N G A S B C D E F H J M P Q R T U V W X Y Z
. . . . . . . . . - - - - - - - - - x x x x x x x x
. . . - - - x x x . . . - - - x x x . . . - - - x x
. - x . - x . - x . - x . - x . - x . - x . - x . -

So, breaking our message into trigraphs we get:

. x . - x . . - - - . x - .
. . . - . - x . x - . x . -
x - x - . x x - - x - . - x
~~~~~~~~~~~~~~~~~~~~~~~~~~~
Ct: K T K H R G B D P J O Y D G

Cracking this is much harder because the dividers are obfuscated. The only way
to attack this is to do a statistical analysis for common Morse Code trigraphs.

The cipher assumes 'x' is placed between each letter and 'xx' is placed between each word.

Cryptanalysis of Fractionated Morse

Lets look at the following Fractionated Morse table:


A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
. . . . . . . . . - - - - - - - - - x x x x x x x x
. . . - - - x x x . . . - - - x x x . . . - - - x x
. - x . - x . - x . - x . - x . - x . - x . - x . -

If you look at the table above, you can see that certain ciphertext letter combinations are not possible e.g. RS, RT, ... RZ cannot occur because we can't have xxx in the plaintext Morse. Also IS, IT, ... IZ can't happen for the same reason. Other impossible combinations: CY, CZ, FY, FZ, OY, OZ etc. We also can't have e.g. ABD because there are no Morse letters that go '.....-.-.'. You should be able to come up with many other impossible combinations. We can also use frequency information, since THE occurs very often in English, we expect 'xx-x....x.xx'='ZSCI' to be quite common in the ciphertext (with this particular key, if you look at the counts provided down below, we see ZSCI is actually the most common quadgram). This means certain keys will be very unlikely if they lead to impossible transitions, others more likely if they lead to likely transitions.

The trick to breaking Fractionated Morse is that finding the key can be done completely independently of the English->Morse translation. Knowing the statistics of a particular Fractionated Morse key means we can just find the key that translates to it, without ever checking to see if the putative plain text even decodes properly. The Idea is to start with a random key, then continuously swap characters in the key trying to make the ciphertext look like it has been enciphered with the key "ABCDEFGHIJKLMNOPQRSTUVWXYZ". Once this is done, we can just decrypt it. To do this we need quadgram statistics for a lot of English text that has been enciphered with the key "ABCDEFGHIJKLMNOPQRSTUVWXYZ". I did this for 40 million sentences, or around 1 billion characters. The results are linked below.

Code for breaking Fractionated Morse ciphers is here, with the required statistics here. I have run the code on many possible routes through the second Feynman cipher, and nothing resulted in any good looking plaintexts. I would expect it to work, since breaking length 100 Fractionated Morse ciphertexts is not too difficult using the code above, and F2 is 261 characters. This leads me to believe that F2 is not a Fractionated Morse cipher.

From here I'll try to write an efficient running key solver for vigenere, beaufort, variant and porta rules and see how that goes.