So, here’s a thought.
The Z408 was solved very quickly, and the newspapers all explained that the double L in kill was the key to solving it. If Zodiac saw this in the papers (and I can’t imagine he didn’t), he would have felt the need to make a superior cipher. The Z340 is obviously his answer to the papers calling his Z408 a simple puzzle solved by some hobbyists.
So he did the easiest thing to make a homophonic cipher more secure and added more symbols to the mix.
He also did the one thing that will make any kind of cipher more secure. He shortened the message.
So far, these are all standard improvements he could find in just about any book on ciphers.
But what if Zodiac did this and then looked at his cipher, holding it next to his plaintext? And what do you know: even after the extra improvements, he had produced a cipher where some double letters are encoded with the same bigrams. Heck, even the two instances of “kill” end with the same bigram.
If this is a plausible scenario, the best way to defeat this weakness is to do a columnar transposition. This way you can separate the double letters that created the bigram, and create new “false” bigrams to throw people off.
I know this is a lot of what ifs and guessing, but this is where my mind is at the moment.
So my question is whether anyone has tested this hypothesis. For instance if we do columnar transposition on Z408, is there a reasonable method of retrieving the correct ordering of the columns? I mean, there are over 355 trillion ways to order the columns. And that’s assuming that he used 17 columns (God help us if he didn’t).
If no hypothesis testing has been done for a method like this, I would like to dig into it. But if some work has already been done, I would rather not waste my time repeating it.
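To make the transposition idea concrete, here is a minimal sketch of a keyed columnar transposition on plain text, before any homophonic substitution. The function names, the short sample phrase, and the 4-column key are made up for illustration; the convention (write row by row, read columns in key order) is one common choice and not necessarily what Zodiac or any solver uses. It also checks the 17-column count mentioned above.

```python
import math


def columnar_transpose(text, key):
    """Write text row by row into len(key) columns, then read the
    columns top to bottom in the order given by key."""
    width = len(key)
    columns = [text[i::width] for i in range(width)]
    return "".join(columns[k] for k in key)


def columnar_untranspose(text, key):
    """Invert columnar_transpose for the same key."""
    width = len(key)
    full_rows, extra = divmod(len(text), width)
    # The first `extra` columns hold one more character than the rest.
    lengths = [full_rows + (1 if i < extra else 0) for i in range(width)]
    columns = [""] * width
    pos = 0
    for k in key:
        columns[k] = text[pos:pos + lengths[k]]
        pos += lengths[k]
    rows = full_rows + (1 if extra else 0)
    return "".join(
        columns[c][r]
        for r in range(rows)
        for c in range(width)
        if r < len(columns[c])
    )


plain = "ILIKEKILLINGPEOPLE"   # illustrative phrase only
key = [2, 0, 3, 1]             # hypothetical 4-column key
scrambled = columnar_transpose(plain, key)

print(scrambled)                              # the real "LL" is split up
print(columnar_untranspose(scrambled, key))   # round trip back to plain
print(math.factorial(17))                     # orderings of 17 columns
```

In this toy case the genuine double L is separated, while new adjacent pairs such as "II" and "EE" appear that were never doubles in the plaintext, which is the "false bigram" effect described above.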
When I read your post I had to wonder if you were reading my mind. This has been precisely my hypothesis for several years. Only in the past couple of months have I been able to perform some realistic experimentation on it.
I built this project which successfully solves the z408 and also supports keyed columnar transposition. There is a README which explains how to use it, but I am also happy to help if you are interested in using it.
https://github.com/beldenge/zenith
I also created the module zenith-mutation-search to do something similar to what you are describing. It performs hill climbing on random column transposition keys to find column orderings which seem likely. I have found some interesting results with keys of length 23 or more — but I have not found any valid solutions yet. I listed some of the more interesting keys in my recent post here:
viewtopic.php?f=81&t=4469#p73317
I intend to do more experimentation myself, but I think it’s a promising area for other people to research as well.
AZdecrypt has a solver for keyed columnar transposition and columnar rearrangement and can solve the 408 with a 17 column keyed columnar transposition (use bigrams turned on, search depth 5) in a long run. It is a different story on the 340, however: the increased multiplicity here is a big deal and bigrams cannot be assumed. The hypothesis is on my to-do list, but more compute power is needed.
For instance if we do columnar transposition on Z408, is there a reasonable method of retrieving the correct ordering of the columns?
Yes, bigram assisted hill climbing.
I am in the process of counting the distance between copies of the same symbol.
The left column is the positions in the cipher, middle column is the distances between those positions and in the last column I wrote what those distances can be divided by.
So far I have not found much use for this. Only the > symbol shows some promise, having 4 positions whose distances from each other are all divisible by 5. How and if we can use this still remains to be seen, but for now I am interested in the probability of this happening in a 340 character long cipher with 63 different symbols.
My probability skills are not so good, so I hope somebody can explain it in easy terms. I was guessing it would be around 1/125 probability so not extremely probable, but also not so rare that you would not expect it in Z340 at all.
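As a rough check on that guess, here is a small sketch that computes the chance exactly for a single symbol. It rests on assumptions that are not established in this thread: the symbol occurs exactly 4 times, and its positions are spread uniformly at random over the 340 positions. "All distances divisible by 5" is the same as "all 4 positions have the same remainder modulo 5", which is what the hypothetical function below counts.

```python
from math import comb


def same_residue_probability(length=340, occurrences=4, modulus=5):
    """Chance that `occurrences` distinct positions, drawn uniformly at
    random from `length` positions, all share one remainder mod `modulus`."""
    favourable = 0
    for r in range(modulus):
        # Number of positions in 0..length-1 with remainder r.
        size = len(range(r, length, modulus))
        favourable += comb(size, occurrences)
    return favourable / comb(length, occurrences)


p = same_residue_probability()
print(f"about 1 in {1 / p:.0f}")   # about 1 in 134
```

The quick estimate (1/5)^3 = 1/125 is close to this. Note that this is the chance for one particular symbol and one particular divisor; with 63 symbols and several possible divisors to look at, seeing such a pattern somewhere in the cipher is considerably more likely than 1 in 134.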
So far I have not found much use for this. Only the > symbol shows some promise, having 4 positions whose distances from each other are all divisible by 5.
Also potentially interesting is that the difference between row numbers for the first 2 occurrences of ">" is 5, and between row numbers for the last 2 occurrences of ">" is also 5.
Do you mean that AZdecrypt hill climbs the homophonic substitution key for each transposition or that it can search through transpositions without hill climbing the homophonic substitution key?
The latter. It does a bigram depth search to find a good candidate for the homophonic substitution solver. That is done for each CPU thread individually, so it is efficient. The bigram depth search can be replaced by the homophonic substitution solver itself by changing the setting "(Substitution + columnar transposition & rearrangement) Use bigrams: 1" to 0. Then you need a NASA computer.
If you want to try the solver in 1.15, just set the cipher dimensions to the number of columns you suspect (of the columnar transposition, 17 columns etc). Also, if your CPU can handle it, a higher "(Substitution + columnar transposition & rearrangement) Search depth: 3" setting is better.
Jarlve,
Would you be able to explain how the bigram depth search works or point me to some information on that? I have recently been using bigram repeats along with the cycle score to try and find good columnar transposition keys but have not had much luck with that. I am also not finding any information on bigram depth search online. Any help is much appreciated.
Hey beldenge, with depth search I mean the following (example is depth 2):
1. Random change to key
==> 1A. Random change to key on top of change 1
==> 1B. Random change to key on top of change 1
2. Random change to key
==> 2A. Random change to key on top of change 2
==> 2B. Random change to key on top of change 2
Take best bigram result from these 4 (1A, 1B, 2A and 2B).
With something like columnar transposition it is crucial that the search can look ahead because the hill to climb is very spiky (telephone pole problem).
How are you using the cycle score? Do you mean to find improved cycle scores after transposition? We have done a lot of work on this and it seems likely that the transposition was done prior to the homophonic layer.
Jarlve,
Thank you very much for your reply. I must admit I am still a little confused on a few things.
First of all I am unclear what 1A, 1B, 2A, 2B represent. I think you are trying to illustrate the search depth, but I am not sure I understand the approach. For depth of 2, is it essentially holding one of the key values constant and then searching the remaining columns?
For example with key length = 4 and depth = 2:
First keep the first column constant:
[0, 1, 2, 3]
[0, 1, 3, 2]
[0, 2, 1, 3]
[0, 2, 3, 1]
[0, 3, 1, 2]
[0, 3, 2, 1]
Then move to the second column and keep it constant:
[1, 0, 2, 3]
[1, 0, 3, 2]
[2, 0, 1, 3]
[2, 0, 3, 1]
[3, 0, 1, 2]
[3, 0, 2, 1]
And so on with the 3rd and 4th column? Is this close at all or am I way off?
When you say "best bigram result" are you referring to counts of bigram repeats in the ciphertext? Or do you mean substituting plaintext and finding the best score of plaintext bigram probabilities?
And to attempt to answer your question, I am using the cycle score after transposition to evaluate the column key. I have mutated the z408 with a columnar transposition and am trying to use a hill climber to find the correct column key. Of course, when applying this method to the z340, if the transposition was done prior to homophonic substitution then this method would not be that useful. But I would like to try to get a working implementation if possible so that I can rule it out.
I really appreciate your time helping me understand this approach. My motivation is that for the z340 I am seeing outliers in cycle scores for columnar transposition keys of length 26 using double transposition, but I am not able to converge on a single key. And since my approach doesn’t work on the z408 I can’t be sure of my findings just yet.
These changes are random.
Start or previous best key: [0,1,2,3]
1. Random change to key: [0,1,3,2] (swap 2 and 3)
==> 1A. Random change to key on top of change 1: [3,1,0,2] (swap 0 and 3)
==> 1B. Random change to key on top of change 1: [0,2,3,1] (swap 1 and 2)
2. Random change to key: [0,3,2,1] (swap 1 and 3)
==> 2A. Random change to key on top of change 2: [1,3,2,0] (swap 0 and 1)
==> 2B. Random change to key on top of change 2: [0,2,3,1] (swap 3 and 2)
Scan the cipher text with transformations 1A, 1B, 2A and 2B for bigram counts and take the highest one and process it through your solver and/or more expensive measurements.
Search depth 3 would look like this:
1. Random change to key
==> 1A. Random change to key on top of change 1
====> 1AA. Random change to key on top of change 1A
====> 1AB. Random change to key on top of change 1A
====> 1AC. Random change to key on top of change 1A
==> 1B. Random change to key on top of change 1
====> 1BA. Random change to key on top of change 1B
====> 1BB. Random change to key on top of change 1B
====> 1BC. Random change to key on top of change 1B
==> 1C. Random change to key on top of change 1
====> 1CA. Random change to key on top of change 1C
====> 1CB. Random change to key on top of change 1C
====> 1CC. Random change to key on top of change 1C
2. Random change to key, etc…
3. Random change to key, etc…
Scan all 27 variations (1AA, 1AB, 1AC, 1BA, 1BB, etc.) for bigrams.
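The look-ahead described above could be sketched roughly as follows. This is not AZdecrypt's code: the function names, the choice of a random swap as the "random change", the transposition convention, and the bigram count (every occurrence of a bigram beyond its first) are all assumptions made for illustration. In the examples above the number of changes tried per level equals the depth (2 gives 4 candidates, 3 gives 27), so the sketch exposes both as parameters.

```python
import random
from collections import Counter


def transpose(text, key):
    """Read the columns of a row-wise grid in key order (one convention)."""
    width = len(key)
    return "".join(text[k::width] for k in key)


def repeated_bigrams(text):
    """Count bigram repeats: each occurrence of a bigram after its first."""
    counts = Counter(text[i:i + 2] for i in range(len(text) - 1))
    return sum(c - 1 for c in counts.values())


def random_swap(key, rng):
    """Return a copy of key with two randomly chosen entries swapped."""
    i, j = rng.sample(range(len(key)), 2)
    changed = list(key)
    changed[i], changed[j] = changed[j], changed[i]
    return changed


def depth_search(key, depth, branches, rng):
    """Stack `depth` levels of random changes, `branches` per level, and
    return the branches ** depth keys at the deepest level."""
    frontier = [list(key)]
    for _level in range(depth):
        frontier = [
            random_swap(k, rng) for k in frontier for _b in range(branches)
        ]
    return frontier


def best_candidate(cipher, key, depth=2, branches=2, seed=None):
    """Pick the deepest-level key whose transposed text repeats the most
    bigrams; that key would then go to the more expensive solver."""
    rng = random.Random(seed)
    leaves = depth_search(key, depth, branches, rng)
    return max(leaves, key=lambda k: repeated_bigrams(transpose(cipher, k)))
```

A full hill climber would call `best_candidate` repeatedly, starting each round from the previous best key, and only hand the winner of each round to the substitution solver.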
Okay, I see. No doubt you will learn a lot from your project. What you are attempting is very difficult!
Jarlve,
Thank you again for taking the time to explain this to me. Just one last question — when you say bigram counts, are you referring to repeated bigrams in the ciphertext? Or is it the number of unique bigrams in the ciphertext? Or something else entirely?
I ask because when I count repeated bigrams in the ciphertext, the hill climber finds many scores which are actually higher than the correct column key.
Repeated bigrams. For validation, I get 62 for the standard 408 and 25 for the standard 340.
I am not sure what the odds are for 62 bigrams in the 408, but say it is 1 in 1000000. If the search space is 1000000000000 (many times larger), then yes, the hill climber will find more significant bigram peaks, because the search space is so large that these can and will occur: at those numbers, about 1000000 keys would be expected to score that high by chance. But the bigrams can still act as a filter that discards the uninteresting results.
Gotcha, that makes sense. I’ll see what I can conjure out of the ciphertext with that approach.