I have an interesting new theory about how Zodiac did the encoding for the 340, based on new observations I made with unigram repeats. The measurement sums the unigram repeats within fixed-length parts of a cipher. If the length is 17, the repeats within the first 17 characters of the cipher are summed, then those within the next 17, and so on. So positions 1 to 17, 18 to 34, etc.
What I found is that both ciphers score best at a length of 17, indicating that Zodiac checked, line by line (visually), that he was not repeating symbols.
And because the effect is stronger for the 340, I came up with a new hypothesis about what is going on with the 340 encoding in relation to the 408. It's amazingly simple: I believe Zodiac just didn't care much about cycling in the 340 and checked line by line not to repeat anything (if possible according to the key). And that alone created the randomness in the encoding we observe. This is also backed up by a sharp peak in my non-repeats measurement at 17, which then drops rather sharply (finally I came up with an explanation for that!). It makes so much sense and also explains the lack of 3, 4+ symbol cycles. Occam's razor?
Here is the data. Note that these numbers are percentages: the horizontal repeats checked against different directions (verticals, diagonals). With a random string they should be about 25%, because the horizontal directions make up 1/4th of the total directions checked. A lower number is more indicative of cyclic encoding. Note that the shorter lengths have small counts and are therefore more prone to outliers.
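For anyone who wants to try this, here is a minimal sketch of the measurement as I read the description above. The function name is made up for illustration, and it assumes a symbol seen k times inside one part counts as k - 1 repeats; it only does the plain horizontal count, not the comparison against other directions.

```python
def unigram_repeats_by_parts(cipher, length):
    """Sum the symbol repeats found inside consecutive parts of `length`."""
    total = 0
    for start in range(0, len(cipher), length):
        part = cipher[start:start + length]  # the last part may be shorter
        # Assumption: a symbol occurring k times in a part adds k - 1 repeats.
        total += len(part) - len(set(part))
    return total


# Toy cipher, not real Zodiac data: parts [1, 2, 1] and [3, 4, 4].
toy = [1, 2, 1, 3, 4, 4]
print(unigram_repeats_by_parts(toy, 3))
```

With a real cipher you would pass the numeric transcription as a list and loop `length` over 2 to 50.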
408:
Length 2: 24.5
Length 3: 20.8
Length 4: 16.4
Length 5: 14.3
Length 6: 17.7
Length 7: 16.2
Length 8: 16.2
Length 9: 16.1
Length 10: 16.8
Length 11: 17.9
Length 12: 16.7
Length 13: 16.6
Length 14: 17.9
Length 15: 15.8
Length 16: 16.5
Length 17: 15.4 <—
Length 18: 17.6
Length 19: 16.8
Length 20: 16.8
340:
Length 2: 22.8
Length 3: 18.0
Length 4: 14.9
Length 5: 11.8
Length 6: 11.8
Length 7: 13.9
Length 8: 12.7
Length 9: 12.7
Length 10: 12.9
Length 11: 14.8
Length 12: 12.4
Length 13: 13.3
Length 14: 15.7
Length 15: 13.9
Length 16: 14.9
Length 17: 11.8 <—
Length 18: 15.0
Length 19: 14.1
Length 20: 15.0
Lengths up to 34 were also checked and no significant drops were noticed (wanted to save myself some input work).
It's really strange though. I thought the directions would act as randomization, but I just did another test with proper randomization and got different results that are not in line with my post above. With the new test there is a peak at 5 and 10 (low unigram repeats) for the 340, and the 408 looks fairly normal.
340:
2 1.69
3 2.16
4 2.40
5 4.12 <---
6 2.55
7 2.98
8 3.00
9 2.91
10 3.86 <---
11 2.18
12 3.14
13 2.57
14 2.25
15 2.45
16 2.50
17 2.45
18 2.02
19 2.04
20 2.12
21 2.03
22 1.88
23 1.67
24 1.83
25 1.51
26 1.67
27 1.71
28 1.70
29 1.61
30 1.65
31 1.70
32 1.60
33 1.46
34 1.45
35 1.47
36 1.52
37 1.32
38 1.42
39 1.49
40 1.51
41 1.54
42 1.45
43 1.45
44 1.39
45 1.24
46 1.22
47 1.17
48 1.14
49 1.15
50 1.12
408:
2 1.35
3 1.52
4 1.54
5 2.55
6 2.06
7 1.92
8 1.83
9 1.82
10 1.93
11 1.80
12 1.68
13 1.96
14 1.97
15 2.00
16 1.97
17 1.89
18 1.89
19 1.86
20 1.86
21 1.79
22 1.73
23 1.77
24 1.94
25 1.60
26 1.58
27 1.73
28 1.57
29 1.72
30 1.69
31 1.60
32 1.61
33 1.57
34 1.50
35 1.48
36 1.50
37 1.53
38 1.52
39 1.41
40 1.29
41 1.39
42 1.45
43 1.47
44 1.43
45 1.32
46 1.28
47 1.26
48 1.26
49 1.33
50 1.31
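A rough sketch of what a randomized baseline could look like, in case someone wants to check the numbers independently. This is only one way to do it: the function names, the number of trials, and the fixed seed are my own assumptions, and it is not claimed to reproduce the scale of the figures above.

```python
import random


def repeats_by_parts(cipher, length):
    """Within-part repeats, summed over consecutive parts of `length`."""
    return sum(
        len(cipher[i:i + length]) - len(set(cipher[i:i + length]))
        for i in range(0, len(cipher), length)
    )


def shuffled_average(cipher, length, trials=1000, seed=0):
    """Average repeats_by_parts over `trials` random shuffles of the cipher."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    work = list(cipher)        # shuffle a copy, leave the input alone
    total = 0
    for _ in range(trials):
        rng.shuffle(work)
        total += repeats_by_parts(work, length)
    return total / trials


# Toy data only: compare the observed count with the shuffled average.
toy = [1, 2, 3, 1, 2, 3, 1, 2, 3, 4, 4, 4]
print(repeats_by_parts(toy, 3), shuffled_average(toy, 3))
```

An observed count well below the shuffled average would point toward symbols being kept apart within each part.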
The unigram repeats seem really interesting. The mirrored 340 has a dip at 6, instead of at 5 as in the normal 340 (see post above). I generated some images and at least one of them is interesting in relation to the pivots.
340 by parts of 5 (here is where there is a dip in the unigram repeats):
340 by parts of 10 (also a dip here):
340 mirrored (dip at 6):
I find this one (image below) interesting in relation to the pivots because you can see that the pivots tend to follow the pattern. I can't explain it though. The repeating symbols of the pivots have the same position in the parts of 6! EDIT: that is always the case in that scheme, but repeats do seem to be alternating between white and gray.
EDIT
I was seeing things, diagonal repeats always share the same positions in that scheme, how dumb! But they do seem to be alternating between white and gray, not sure if that is significant.
Here is the data relating to the mirrored 340. I switched to percentages; lower means fewer unigram repeats (versus the randomized average). I also included a continuous test, which advances the position in the cipher by 1 at a time, and it does not show these dips.
340 Unigram test:
Parts:
-----------------
2 55.0%
3 46.3%
4 41.0%
5 48.8%
6 26.2% <---
7 55.2%
8 50.5%
9 43.3%
10 55.4%
11 69.1%
12 40.6%
13 53.8%
14 70.3%
15 52.2%
16 54.6%
17 40.5%
18 52.4%
19 56.8%
20 61.1%
21 61.3%
22 71.9%
23 70.0%
24 61.5%
25 69.7%
26 70.7%
27 68.3%
28 73.6%
29 70.9%
30 66.6%
31 67.1%
32 68.1%
33 75.0%
34 68.7%
35 73.7%
36 71.0%
37 74.8%
38 71.0%
39 62.5%
40 70.7%
41 71.9%
42 77.8%
43 83.9%
44 83.7%
45 82.5%
46 81.1%
47 79.2%
48 82.4%
49 88.0%
50 87.9%
Continuous:
-----------------
2 59.5%
3 45.0%
4 44.2%
5 42.6%
6 43.7%
7 45.1%
8 48.6%
9 49.9%
10 51.7%
11 52.5%
12 53.2%
13 53.9%
14 55.2%
15 57.0%
16 58.4%
17 59.5%
18 61.0%
19 64.1%
20 64.7%
21 65.4%
22 66.9%
23 68.7%
24 70.1%
25 70.5%
26 71.8%
27 73.7%
28 74.9%
29 74.6%
30 76.5%
31 76.3%
32 76.2%
33 78.0%
34 78.0%
35 78.0%
36 79.8%
37 79.9%
38 81.1%
39 80.6%
40 81.4%
41 82.1%
42 83.0%
43 83.2%
44 83.5%
45 84.4%
46 84.7%
47 85.1%
48 86.0%
49 85.4%
50 85.8%
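To make the difference between the two tests concrete, here is a small sketch of a continuous (sliding window) count, as opposed to cutting the cipher into fixed parts. The function name is hypothetical and the k - 1 repeat counting is an assumption; converting the count to a percentage of a randomized average is left out.

```python
def repeats_continuous(cipher, length):
    """Sum within-window repeats, moving the window one position at a time."""
    total = 0
    for start in range(len(cipher) - length + 1):
        window = cipher[start:start + length]
        # Assumption: a symbol occurring k times in a window adds k - 1 repeats.
        total += length - len(set(window))
    return total


# Toy data: windows of 3 are [1,2,1], [2,1,3], [1,3,4], [3,4,4].
toy = [1, 2, 1, 3, 4, 4]
print(repeats_continuous(toy, 3))
```

Because every offset is covered, this version cannot single out one alignment of the parts, which may be why it comes out smooth where the parts test shows dips.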
Damn pivots. I spent all day trying to find a scheme for them to fit into.
I was looking at the 340 yesterday, just visually, trying to come up with an explanation for the pivots. And thought it may have something to do with unigram repeats. Well, you can see the pivots as repeating trigrams or as unigram repeats. There might be something going on with the unigram repeats though (according to my data), but seems I was wrong about a clear connection with the pivots. And I’m not sure what these dips in the unigram repeats could mean.
I will do a little more experimenting soon. I am just wondering if false cycles and true cycles have slightly different characteristics with regards to the spacing between ciphertext. Is there a way to tell the difference?
Great analysis – thanks for posting it. And an interesting question. If you discover a way to distinguish between true and false cycles, it will be very valuable!
Thanks. I have been taking it easy these last few days. But I have more ideas that I am working on.
So the idea that Zodiac treated the top half and the bottom half differently has been discussed for a long time. And I don’t know if he did, but was thinking about the general concept of two different keys. One for the top half and one for the bottom half. Some detection would need to be done, and I have been thinking about how to do that. But here is how it would look in a transposition context.
Blue shaded is Key 1 and tan is Key 2, before untransposition. The A and B are there to show period 19, and how they will be connected after untransposition.
I untransposed it by re-drafting a message into 19 columns, so that the repeat symbols line up vertically. Then rotated the message 90 degrees and re-drafted it again into 17 columns. The same as scytale. The Key 1 symbols and Key 2 symbols have a different arrangement. Trying to solve the message with only one key would be impossible, but the reason why is very simple.
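A minimal sketch of that redrafting, under some assumptions: the message is written row by row into 19 columns, read off column by column from top to bottom, and the result is written row by row into 17 columns. The rotation direction and the handling of the short last row are guesses, the function names are made up, and this is not claimed to be an exact scytale.

```python
def read_by_columns(message, width):
    """Write row by row into `width` columns, then read column by column."""
    return [
        message[i]
        for col in range(width)
        for i in range(col, len(message), width)  # skips cells missing from a short last row
    ]


def redraft(message, width):
    """Split a flat message into rows of `width` symbols."""
    return [message[i:i + width] for i in range(0, len(message), width)]


# Positions 0..339 stand in for the 340 symbols.
positions = list(range(340))
untransposed = read_by_columns(positions, 19)
rows = redraft(untransposed, 17)
print(rows[0])
```

Symbols that were 19 positions apart in the original end up next to each other in the output, which is the point of the period 19 redraft.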
To solve you would have to correctly guess where, if anywhere, Zodiac changed his key. Then make new symbols to reflect the new key. Symbol 1 for Key 1 could become symbol 101 for Key 2, or whatever. Then guess at how to untranspose it correctly and use the solver program.
The basic idea is that he may have used multiple keys, and that is what is disrupting the cycles. And untransposing mixes up the multiple key ciphertext.
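The relabeling step could be sketched like this. The split position and the offset of 100 are pure guesses that a solver would have to try many values for, and `relabel_after` is just an illustrative name.

```python
def relabel_after(cipher, split, offset=100):
    """Give every symbol from index `split` onward a new label (symbol + offset)."""
    # Assumption: symbols are integers and offset exceeds the largest symbol,
    # so Key 2 labels never collide with Key 1 labels.
    return [s if i < split else s + offset for i, s in enumerate(cipher)]


# Toy example: symbols 1 and 2 under Key 1 become 101 and 102 under Key 2.
print(relabel_after([1, 2, 1, 2], split=2))
```

For a top half and bottom half guess on the 340, `split` would be 170, applied before untransposing.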
Doranchak, what about L=3 for the first 170 ciphertext of the 408 versus the first 17 ciphertext of the 340?
I like the concept of the two halves. Possibly it's two scytales, and that would make it a different cipher again. I will divide it and make up two on the weekend. Smokie, when you say "same as scytale", I am not sure that what you are making is an exact scytale. Remember, a scytale is read by columns when it is unraveled, but it is written by rows. It's a strange egg of a transposition method.
Does anyone have a numerical version of the 408?
Here you go —
01 02 03 04 05 04 06 07 02 08 09 10 11 12 13 11 07 14 15 16 17 18 19 20 21 01 22 03 23 24 25 26 19 17 27 28 19 29 06 30 08 31 26 32 33 34 35 19 36 37 38 39 40 04 01 41 07 03 09 10 42 06 02 43 10 44 26 45 08 29 46 27 05 28 47 48 49 12 20 22 15 14 17 50 19 23 16 26 18 36 01 24 30 38 21 26 13 50 37 51 39 40 10 34 33 30 19 45 44 09 50 26 18 07 32 35 39 41 07 46 47 04 03 41 07 23 13 26 45 22 27 06 29 10 10 08 52 05 24 26 12 30 38 14 26 25 50 37 46 27 48 01 41 07 03 36 10 16 53 11 21 49 34 40 17 45 06 22 08 20 05 52 12 09 15 14 30 37 16 33 46 38 44 29 10 21 22 30 01 36 10 54 32 19 48 49 47 17 04 23 13 28 35 42 03 37 27 50 10 06 33 02 46 38 34 15 45 24 22 11 18 48 30 25 28 08 37 01 50 46 27 44 34 42 38 05 40 03 51 06 12 08 42 01 41 07 15 14 49 16 15 32 33 09 03 29 11 39 48 44 43 06 17 21 31 36 51 18 02 02 30 27 34 08 38 39 52 45 04 01 02 02 05 43 42 03 41 07 15 12 17 13 26 14 26 54 20 41 50 52 16 23 01 42 01 07 02 09 32 37 10 06 52 16 54 47 19 26 54 29 39 26 14 15 05 17 18 19 24 45 54 32 19 42 01 02 41 46 33 54 22 25 20 07 13 01 51 13 42 36 47 49 31 46 25 11 26 54 17 47 41 41 21 17 37 03 09 10 13 35 20 02 18 52 05 23 28 32 33 26 54 50 28 30 16 48 07 03 35 14 21 15 45 13 48 01 14 30 21 26 45 22 27 38 11 06 30 08
Thanks a lot. I am taking another look at the high scoring 340 L=2 cycles and where they start and where they stop to see if there is any pattern.
I have wondered before if Zodiac may have cycled his key somehow. Not just cycling his symbols, but some other cycle as well.
I made my cycle spreadsheet find the start and end positions of all of the cycles, and below is a chart of the 340 top 33 L=2 cycles, all with at least 8 consecutive alternations. The left two columns are the symbols. Column A is the number of consecutive alternations. Column B is the start position, C is the end position, and D is the total number of positions between B and C. I sorted them according to B, the start position. The chart has 340 positions, which are not labeled because they are too small to show. But I eyeballed the halfway and quarter-way points and drew rough red lines. You can see that a lot of the cycles start in the second quarter of the message, and there are a few, but not many, that start in the third quarter of the message.
Below is the 408, top 33 L=2 cycles for comparison. Most of them start in the first quarter, but there are a handful that start in the second quarter.
So I wonder if Zodiac cycled his cycles, and that is why there is no solution yet. Some of the longer cycles start in low position and carry through to high positions, such as 6-37, which has 13 consecutive alternations. But do they map to the same plaintext throughout, or do they map to two different plaintext, with the change somewhere in the message?
What about an analysis of where the cycles end? I wonder about the period 19 repeats. Can a message with a cycled key produce so many repeats? Or, since the repeats also have start and end positions similar to the cycles, is there any correlation between the cycle symbols, total number of positions that the cycle covers, and period repeat symbols, and total number of positions that the period repeat covers?
Shuffle tests show that most of these cycles are true. So, since he started some of the cycles in the second quarter of the message, did he cycle his key?
EDIT: I did not include any symbols that were not part of a series of consecutive alternations. In other words, with ABBAABABABAB, I did not include the first four symbols.
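For readers without the spreadsheet, here is a rough sketch of finding the longest run of consecutive alternations for one pair of symbols, with its start and end positions. It is not the spreadsheet's actual logic: the function name is made up, positions are 1-based, and the run length is counted in symbols rather than in transitions, which is an assumption.

```python
def longest_alternation(cipher, a, b):
    """Return (run length, start position, end position) of the longest a/b alternation."""
    # Keep only the occurrences of the two symbols, with 1-based positions.
    hits = [(pos, s) for pos, s in enumerate(cipher, start=1) if s in (a, b)]
    best = (0, None, None)
    run_start = 0
    for i in range(len(hits)):
        if i > 0 and hits[i][1] == hits[i - 1][1]:
            run_start = i  # same symbol twice in a row breaks the alternation
        run_len = i - run_start + 1
        if run_len > best[0]:
            best = (run_len, hits[run_start][0], hits[i][0])
    return best


# The example from the EDIT: the first four symbols are left out of the run.
print(longest_alternation("ABBAABABABAB", "A", "B"))  # (8, 5, 12)
```

Running this for every symbol pair and sorting by start position would give a table along the lines of columns A, B and C; column D would follow from the start and end positions.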