Nevertheless, there are possible transpositions that are compatible with the observations. Let’s say z340 would be a simple homophonic substitution in which the lower half was rotated 180 degrees. Let’s also say that P19 has nothing to do with transposition, but comes from a strongly repeating plain text. All this just to give an example. In this case, rotating the lower half would neither affect P19 nor the "rows without repeated symbols" statistics. What would become apparent in this case, however, would be a cycle sequence interrupted in the middle. Sure, if z340 were built that way, it would have been solved long ago. I just want to show that simple transpositions are quite possible.
Let me illustrate something. The 340 has 26 length 17 unique sequences/strings. And after mirroring the lower half as you suggest this observation drops to 15 length 17 unique sequences. doranchak calculated the odds of 26 length 17 appearing randomly to about 1 in 600,000,000. It narrows down the transposition options even further.
Here is a picture. The red line shows the unique sequence length frequencies. The other colors are different directions and green is the 340 fully mirrored. To interpret it differently, the red "mountain" is right shifted from the green "mountain". The further a mountain is shifted to the right, the more likely that direction is the actual direction of the sequential homophonic substitution. Mirroring the lower half left shifts the mountain. This sharp peak at 17 with a steep drop off is actually a bit weird. Though it would be weird if it were normal, it is the 340 after all.
Hi,
I implemented the scoring shown by Jarlve and did some experiments. I remembered that sometimes shifting rows or the whole cipher leads to better period n peaks. By shifts of rows I mean the following:
original row: HERabcdVPeIfLTGgh
row shifted by 3: GghHERabcdVPeIfLT
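The row shift in this example behaves like a cyclic rotation to the right. Here is a minimal sketch of that operation; the function names `shift_row` and `shift_rows` are hypothetical and are not taken from the program actually used for the experiments.

```python
def shift_row(row: str, k: int) -> str:
    """Rotate a row cyclically to the right by k positions."""
    if not row:
        return row
    cut = len(row) - (k % len(row))
    return row[cut:] + row[:cut]


def shift_rows(rows: list[str], shifts: list[int]) -> list[str]:
    """Apply one shift amount per row (hypothetical helper)."""
    if len(rows) != len(shifts):
        raise ValueError("need exactly one shift amount per row")
    return [shift_row(row, k) for row, k in zip(rows, shifts)]


print(shift_row("HERabcdVPeIfLTGgh", 3))  # GghHERabcdVPeIfLT
```

A shift of 0 leaves the row unchanged, and a shift of 17 on a 17-symbol row is the same as no shift at all.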
So why not test whether the 2-cycle-score also increases significantly? I wrote a little function that tries different row shifts. Here is the best result so far:
Shifting amounts (rows 1-20):
3, 15, 15, 1, 11, 4, 11, 7, 16, 0, 16, 5, 15, 13, 2, 0, 2, 16, 0, 6
Resulting Cipher:
GghHERabcdVPeIfLT
+BjkOlDWYmnoKpqNb
stM+UZGWjqLkuHJBr
gSbbvdcwoVxbO++RK
2hI7FP+34e5yzM+u1
agDjbwRdFcO-ohCeF
8gUtXGVmuLIk7+KQl
NYu+9LzjGgJp2kO+y
nM+0+ZRgFBtrA#4Kh
-ucUV+dJ+ObvnFBr-
+R571EIDYBb0TMKOU
m+3BFgntcRJIo7T4M
zSrk+NI7FBtj8wRu#
dp7g40mtV41++cGFN
-+rBXfos4zCEaVUZ7
ItmxuBKjObdmpMQGg
qLRtT+Lf#Cn+FcWBI
+qWCuWtPOSHT5jqb+
IFehWnv1ByYOBo-Ct
wAIK8+aMDHNbeSuZO
The 2-Cycle score is 2433, an increase of 13.9% over the original z340. Is that significant? I honestly have no idea.
Compared to z340, the perfect n-cycles have also increased (obviously).
However, I have not yet made a comparison with other ciphers. Perhaps these observations are based on quite normal behavior.
I’ll try to include the unique sequences length stuff shown in the last chart, so I can make such a comparison too.
Perfect 4-symbol cycles:
PlsxPlsxP (30)
fsxQfsxQf (30)
lsxQlsxQ (20)
l23Xl23X (20)
l23Ql23Q (20)
l2XQl2XQ (20)
l3XQl3XQ (20)
sxQAsxQA (20)
v3XQv3XQv (30)
v8XQv8XQv8 (42)
23XQ23XQ (20)
3XQA3XQA (20)
58XQ58XQ58 (42)
y8XQy8XQy8 (42)
8XQA8XQA8 (30)
Perfect 5-symbol cycles:
l23XQl23XQ (30)
Each row has 17 possible offsets, which over 20 rows gives 17^20 = 4,064,231,406,647,572,522,401,601 possibilities. I do not think a score of 2433 is significant in such a space. When comparing to other ciphers, make sure to test some that are about as cyclic as the 340, since it is easier to improve upon cycles that are more randomized.
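The size of that search space follows from 17 independent offsets for each of the 20 rows. A quick check of the arithmetic (the constant names are only illustrative):

```python
ROW_WIDTH = 17  # symbols per row, so 17 possible cyclic offsets per row
ROW_COUNT = 20  # rows in the 340

# Every row can be shifted independently of the others.
search_space = ROW_WIDTH ** ROW_COUNT
print(f"{search_space:,}")  # 4,064,231,406,647,572,522,401,601
```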
You can test the 340 versus itself for such a scheme (row offset shifts). Make random row shifts and count the number of times the 2-symbol cycle score improved. If the improvement ratio is like 20% or greater then that may be something. This improvement ratio idea (borrowed from smokie) only works when the space is sufficiently large since it is prone to outliers if for instance there are only 10 possibilities.
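The improvement ratio test described above could be sketched roughly like this. It is only a sketch under assumptions: the 2-symbol cycle scoring itself is not defined in this thread, so it is passed in as a `score` callable, and the names `rotate_right` and `improvement_ratio` are hypothetical.

```python
import random
from typing import Callable, Sequence


def rotate_right(row: str, k: int) -> str:
    """Rotate a non-empty row cyclically to the right by k positions."""
    cut = len(row) - (k % len(row))
    return row[cut:] + row[:cut]


def improvement_ratio(
    rows: Sequence[str],
    score: Callable[[Sequence[str]], float],
    trials: int = 10000,
    seed: int = 0,
) -> float:
    """Fraction of random row-shift trials that score above the unshifted cipher.

    `score` stands in for the 2-symbol cycle score (an assumption here);
    any function that maps a list of rows to a number will work.
    """
    rng = random.Random(seed)
    baseline = score(rows)
    improved = 0
    for _ in range(trials):
        # Pick an independent random offset for every row.
        shifted = [rotate_right(row, rng.randrange(len(row))) for row in rows]
        if score(shifted) > baseline:
            improved += 1
    return improved / trials
```

Running the same function on test ciphers that are about as cyclic as the 340 would give the comparison baseline mentioned above.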
I have the impression that many operations will produce large increases in cycle scores.
Jarlve once found that deleting row 14 dramatically increases the 4-cycle score: viewtopic.php?p=54990#p54990
I looked for row swaps that improve cycle scores too. If you allow arbitrary re-arrangement of rows, you can get a lot of cycle improvements. I found a simpler rearrangement that improves cycles but I can’t remember how much improved they are:
Thank you Jarlve and Doranchak for the explanations.
My idea to search for better cycles with shifts was probably not well planned and a bit thoughtless. But you never know…
I did the shifting with some test ciphers (25% random encoding). The result is similar to z340: The cycles are improved quite quickly. Nothing special indeed.
And now we are back to the question: what is the nature/cause of this randomization in the 340?
I really can’t think… this heat wave is abnormal and seems to have no end. It’s not a record heat anymore, but it’s quite humid. From Friday there is improvement in sight! (I’m just looking for an excuse why I can’t solve z340 right away.)
Jarlve, I just reviewed your chart. Am I getting this right? There are 26 different strings, each consisting of 17 characters without repetition? Or am I misinterpreting the unique sequences? If I’m right, can the sequences overlap in this case? Must be…
The whole observation of cycles and sequences without repeated letters is only important if Zodiac aimed for a perfect homophonic encryption and the cipher is not transposed after substitution, isn’t it? An ideal encryption is perfectly cyclic and the letter distribution is absolutely smooth. However, this was not at all the case with z408. Quite the opposite: in the last third of the cipher he broke many cycles, either intentionally or through sloppiness. Apparently Zodiac had used the general letter distribution in English texts. Ideally he should have used the letter distribution of the z408 plaintext for the key. But there were enough cycles that made z408 vulnerable. The Hardens had started right where bigrams repeated themselves. Zodiac sure as hell read about it in the paper. So it is only logical for him to avoid such cycles in the future. The easiest way to do this is to be careful when substituting and to deviate from the cycle at the appropriate points. Actually a plausible explanation, especially since 25% is quite compatible with it. Let’s say for fun, that’s exactly what happened (regardless of whether there was a transposition underneath). How can you tell if arbitrary cycles have been avoided? In my opinion, we can only use falsification here. However, this is only possible if z340 either is solved or at least a pattern is detected that interrupts the cycles regularly. Be it through nulls, routes or whatever.
An example:
Take a plaintext consisting of 330 letters. Now transpose it. Now you draw a grid with 17×20 and leave 10 squares free, which form a pattern. The letter Z, a crosshair or whatever. Now substitute the 330 characters and fill them into the grid, leaving out the 10 fields of the pattern. Then a short message with 10 characters is filled into the free space. (Cyclically and linearly). As long as such a pattern is not too invasive, it can lead to the behavior we observe: Period n because of transposition + simultaneous slightly interrupted cyclical substitution. However, a transposition solver will not succeed because the 10 characters will cause too much interference (e.g. with diagonal transposition). Only after removing the 10 characters, you get a solution.
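The grid construction in this example can be sketched as follows. This is only an illustration under assumptions: the pattern cells are given as (row, column) pairs, both messages are written in normal reading order, and the function names `embed_with_pattern` and `strip_pattern` are hypothetical.

```python
def embed_with_pattern(main, filler, pattern, width=17, height=20):
    """Write `filler` onto the pattern cells and `main` onto all other cells.

    `main` is the already substituted (and possibly transposed) long message,
    `filler` is the short message, `pattern` is a collection of (row, col) cells.
    """
    cells = set(pattern)
    if any(not (0 <= r < height and 0 <= c < width) for r, c in cells):
        raise ValueError("pattern cell outside the grid")
    if len(filler) != len(cells) or len(main) + len(filler) != width * height:
        raise ValueError("message lengths do not match the grid and pattern")
    main_it, filler_it = iter(main), iter(filler)
    rows = []
    for r in range(height):
        rows.append("".join(
            next(filler_it) if (r, c) in cells else next(main_it)
            for c in range(width)
        ))
    return rows


def strip_pattern(rows, pattern):
    """Remove the pattern cells again, returning the long message as one string."""
    cells = set(pattern)
    return "".join(
        ch
        for r, row in enumerate(rows)
        for c, ch in enumerate(row)
        if (r, c) not in cells
    )
```

With 330 symbols for `main`, 10 for `filler` and 10 pattern cells this fills the 17×20 grid exactly. A solver would only see the undisturbed transposition after something like `strip_pattern` has been applied with the right pattern.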
In short: To determine what the interrupted cycles are all about, you need to know which symbols represent the same plaintext character and/or determine to 100% where cycles are interrupted. I don’t know…it just seems to me that there are too many possibilities to search effectively for the reason of the interrupted cycles. I gladly let myself be convinced of the opposite, maybe I just do not see clearly enough.
The steep drop on the chart can have several causes. One possibility is a highly repetitive plaintext. This is also possible during transposition if the plaintext length correlates with the transposition and the key. At a certain point, there are inevitably no more choices.
Translated with http://www.DeepL.com/Translator
Jarlve, I just reviewed your chart. Am I getting this right? There are 26 different strings, each consisting of 17 characters without repetition? Or am I misinterpreting the unique sequences? If I’m right, can the sequences overlap in this case? Must be…
Yes, sequences can overlap since they are counted from every position in the cipher.
The whole observation of cycles and sequences without repeated letters is only important if Zodiac aimed for a perfect homophonic encryption and the cipher is not transposed after substitution, isn’t it? An ideal encryption is perfectly cyclic and the letter distribution is absolutely smooth. However, this was not at all the case with z408. Quite the opposite: in the last third of the cipher he broke many cycles, either intentionally or through sloppiness. Apparently Zodiac had used the general letter distribution in English texts. Ideally he should have used the letter distribution of the z408 plaintext for the key. But there were enough cycles that made z408 vulnerable. The Hardens had started right where bigrams repeated themselves. Zodiac sure as hell read about it in the paper. So it is only logical for him to avoid such cycles in the future. The easiest way to do this is to be careful when substituting and to deviate from the cycle at the appropriate points. Actually a plausible explanation, especially since 25% is quite compatible with it. Let’s say for fun, that’s exactly what happened (regardless of whether there was a transposition underneath). How can you tell if arbitrary cycles have been avoided? In my opinion, we can only use falsification here. However, this is only possible if z340 either is solved or at least a pattern is detected that interrupts the cycles regularly. Be it through nulls, routes or whatever.
Good reasoning. This gives me some ideas.
In short: To determine what the interrupted cycles are all about, you need to know which symbols represent the same plaintext character and/or determine to 100% where cycles are interrupted. I don’t know…it just seems to me that there are too many possibilities to search effectively for the reason of the interrupted cycles. I gladly let myself be convinced of the opposite, maybe I just do not see clearly enough.
Sometimes you need to dream a bit and try anyway. The potential reward is worth the shot.
The steep drop on the chart can have several causes. One possibility is a highly repetitive plaintext.
Unless the plaintext is really odd I do not think that it could affect the unique sequences in such a manner. I found one possible correlation: if during the encoding process one tries not to repeat symbols, to some extent, within a certain view window, the cipher generally ends up with some unusually high peaks in the unique sequences. Such as 26 length 17 sequences in the 340. Which is about a 1 in a 500,000,000 observation. I consider this correlation as a possible element.
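One way to picture such an encoding process is an encoder that prefers homophones not seen in the last few ciphertext symbols. The sketch below is purely hypothetical: the key format, the window size and the name `encode_avoiding_repeats` are assumptions for illustration, not the procedure used for the experiments mentioned here.

```python
import random


def encode_avoiding_repeats(plaintext, key, window=17, seed=0):
    """Homophonic encoding that tries not to repeat symbols within `window`.

    `key` maps each plaintext letter to a string of its homophone symbols
    (one character per symbol). When every homophone of a letter already
    occurs in the window, a repeat is unavoidable and one is picked anyway.
    """
    rng = random.Random(seed)
    out = []
    for letter in plaintext:
        homophones = key[letter]
        # Symbols written within the last `window` positions.
        recent = set(out[-window:]) if window > 0 else set()
        fresh = [s for s in homophones if s not in recent]
        out.append(rng.choice(fresh or list(homophones)))
    return "".join(out)
```

Letters with only one homophone still force repeats, which is why the avoidance only works "to some extent".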
Such as 26 length 17 sequences in the 340. Which is about a 1 in a 500,000,000 observation. I consider this correlation as a possible element.
Did you ever plot the start position of all 26 sequences? Just to see if there are any interesting patterns?
Yeah. These images are from December 2014. But I have not tried to interpret any patterns:
And from that same time period, the average "mountains" from 100 ciphers. Bright red graph is the normal direction and dark red is the mirrored direction, the other colors are by columns and diagonal. It shows how the system can be used to determine the direction further. The mountain most offset to the right is generally the true direction of the sequential homophonic substitution.
To give context, it relates back to the "mountains" of the 340. Where the red graph is the normal direction and the green is the mirrored direction.
Sorry if I ask again:
By "unique string" you mean that a string does not contain repeated symbols, right?
Example:
Cipher = "ABCDEFEG"
length unique string = 3
Tested strings:
ABC = unique
BCD = unique
CDE = unique
DEF = unique
EFE = not unique
FEG = unique
frequency = 5
If this is correct, how can it be that z340 only has 6 unique strings of length 2? I guess I didn’t get it somehow.
This is what I get:
My bad. Only count the longest unique string from every position.
So "ABCDEFG" = 7, 6, 5, 4, 3, 2, 1.
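The corrected counting rule can be sketched like this: from every position, take the longest run that contains no repeated symbol, then tally the run lengths. This is a minimal sketch, assuming the cipher is read as one string in the direction being tested; the function names are hypothetical.

```python
from collections import Counter


def longest_unique_lengths(cipher: str) -> list[int]:
    """Length of the longest repeat-free run starting at each position."""
    lengths = []
    for start in range(len(cipher)):
        seen = set()
        for symbol in cipher[start:]:
            if symbol in seen:
                break  # first repeated symbol ends the run
            seen.add(symbol)
        lengths.append(len(seen))
    return lengths


def unique_length_histogram(cipher: str) -> Counter:
    """How often each run length occurs (the 'mountain' in the charts)."""
    return Counter(longest_unique_lengths(cipher))


print(longest_unique_lengths("ABCDEFG"))   # [7, 6, 5, 4, 3, 2, 1]
print(longest_unique_lengths("ABCDEFEG"))  # [6, 5, 4, 3, 2, 3, 2, 1]
```

Because a run is started at every position, overlapping runs are counted, and runs near the end are cut off by the end of the cipher.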
Thanks for the hint, now it works. I have tested this method a few times, but not enough to draw further conclusions. It’s definitely interesting. The steep peak at 17 seems to come from the upper half, but I can’t say exactly yet. Here is a simple comparison:
Upper half of 340:
Lower half of 340:
Nevertheless, there are possible transpositions that are compatible with the observations. Let’s say z340 would be a simple homophonic substitution in which the lower half was rotated 180 degrees. Let’s also say that P19 has nothing to do with transposition, but comes from a strongly repeating plain text. All this just to give an example. In this case, rotating the lower half would neither affect P19 nor the "rows without repeated symbols" statistics. What would become apparent in this case, however, would be a cycle sequence interrupted in the middle. Sure, if z340 were built that way, it would have been solved long ago. I just want to show that simple transpositions are quite possible.
Let me illustrate something. The 340 has 26 length 17 unique sequences/strings. And after mirroring the lower half as you suggest this observation drops to 15 length 17 unique sequences
I think you misunderstood me on this one. I didn’t mean to mirror the lower half, but to rotate 180 degrees. In other words Flip+Mirror, or just reverse. In this case, the curve hardly changes at all. But the perfect-4 cycles have disappeared, whereas the periodic 2-cycles score rises to 2180. Even the P19 spike is preserved. Look:
340 with reversed lower half:
I wanted to demonstrate that there are certain transpositions that can hide very well from the statistics. Sure…z340 must have more to offer than just a rotated lower half. The transposition solver of AZDecrypt would surely have landed a hit a long time ago. Nevertheless, you can see that simple routes or transpositions are also possible during substitution and are not directly uncovered by statistics. Maybe it’s actually easier than we thought and we are just overthinking it…
Lower half reversed:
HERabcdVPeIfLTGgh
Nb+BjkOlDWYmnoKpq
BrstM+UZGWjqLkuHJ
SbbvdcwoVxbO++RKg
yzM+u12hI7FP+34e5
bwRdFcO-ohCeFagDj
k7+KQl8gUtXGVmuLI
jGgJp2kO+yNYu+9Lz
hnM+0+ZRgFBtrA#4K
-ucUV+dJ+ObvnFBr-
+8KIAwOZuSebNHDMa
tC-oBOYyB1vnWheFI
bqj5THSOPtWuCWq++
LqIBWcF+nC#fL+TtR
gGQMpmdbOjKBuxmtI
+-7ZUVaECz4sofXBr
++14Vtm04g7pdNFGc
Rw8jtBF7IN+krSz#u
FB3+mM4T7oIJRctng
OKMT0bBYDIE175R+U
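Rotating the lower half by 180 degrees amounts to reversing the order of its rows and reversing each of those rows. A minimal sketch of the operation, assuming the cipher is given as a list of equally long rows and the split is made at the middle row; the name `reverse_lower_half` is hypothetical.

```python
def reverse_lower_half(rows: list[str]) -> list[str]:
    """Keep the upper half and rotate the lower half by 180 degrees."""
    half = len(rows) // 2
    upper = list(rows[:half])
    # Reverse the row order and each row: flip plus mirror.
    lower = [row[::-1] for row in reversed(rows[half:])]
    return upper + lower


print(reverse_lower_half(["ab", "cd", "ef", "gh"]))  # ['ab', 'cd', 'hg', 'fe']
```

Applying the function twice returns the original grid, so the same routine can be used to undo the operation.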
Glad you got it working Largo. The steepness seems to be there for the upper half as well if you ask me. I did a per 10 row slide somewhere and the peak at 17 is pretty much consistent throughout the cipher. Still not sure what to think of it but will try to find out how the ioc affects the mountain.
I think you misunderstood me on this one. I didn’t mean to mirror the lower half, but to rotate 180 degrees. In other words Flip+Mirror, or just reverse. In this case, the curve hardly changes at all. But the perfect-4 cycles have disappeared, whereas the periodic 2-cycles score rises to 2180. Even the P19 spike is preserved
Oh I see. Well, in general the operation does not improve things on my end.
I wanted to demonstrate that there are certain transpositions that can hide very well from the statistics.
Yes. But I think we would then be looking for transpositions that do not transpose as much. Sorry to be so persistent on the matter.