Some thoughts on Z340 (dual-alphabet?)

darrenc · 2018-07-29T13:30:04Z

Hi forum! I have been thinking about how the Zodiac may have enciphered Z340 after the quick break of Z408. I think by now everyone agrees it is not a vanilla homophonic cipher, but I also don't believe it is going to be overly complicated. He would obviously want the cipher to be stronger but would presumably be limited to pen and paper.

I have seen some discussion of the odd/even properties of Z340, but I don't think anyone has (yet) suggested that Z340 was enciphered with a dual-alphabet system: one alphabet for the odd positions and one for the even positions. This would have been relatively straightforward for the Zodiac to implement, as he would just have to double up his efforts from Z408; the overall system would remain the same. It could also help explain some of the odd properties of Z340. I believe some of the patterns we can see may actually be intentional red herrings on the part of Z. I also believe the "ZODAIK" signature is legitimate ciphertext, misspelled because that was the closest he could get with the symbols he had to choose from.

Anyway, I decided to test the dual-alphabet idea by passing Z340 through a remapping process that reassigned every second symbol to a new alphabet, effectively creating a new version of Z340 which contains 112 unique symbols. I could then process this in zkdecrypto as normal. The modified message is at the bottom of this post.

Splitting the symbols does flatten the frequency curve a little (14, 10, 8, 7, ...). It can be flattened further at the high end by assuming that the set (B, - and mirrored-D) are homophones (I have good reasons for suspecting this, but that is a subject for another day). Unfortunately it also creates 22 single-occurrence symbols. Running it through zkdecrypto produces the usual English-looking rubbish (ELR), but with far fewer "breaks" in the flow than I get with vanilla Z340. This may simply be due to having more symbols available to assign.
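The remapping step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the script actually used: positions are counted linearly through the whole cipher (not restarted on each row), the second alphabet is represented by tagging a symbol with its parity instead of drawing new glyphs, and the names `dual_alphabet_remap` and `frequency_profile` are made up for this example.

```python
from collections import Counter


def dual_alphabet_remap(cipher):
    """Tag each symbol with the parity of its position (0 = even, 1 = odd).

    Assumption: positions run linearly through the whole cipher text.
    The same glyph at an even and at an odd position becomes two
    different symbols, which is the dual-alphabet idea.
    """
    return [(symbol, index % 2) for index, symbol in enumerate(cipher)]


def frequency_profile(tokens):
    """Return the symbol counts sorted from most to least frequent."""
    return sorted(Counter(tokens).values(), reverse=True)


if __name__ == "__main__":
    sample = "ABABAABB"  # toy text, not Z340
    remapped = dual_alphabet_remap(sample)
    print(len(set(sample)), "symbols before,", len(set(remapped)), "after")
    print(frequency_profile(remapped))
```

Feeding the real transcription through `dual_alphabet_remap` and comparing `frequency_profile` before and after would show how much the split flattens the curve and how many single-occurrence symbols it creates.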
I decided to test further by creating my own known plaintext cipher of a similar length using the same dual-alphabet approach. I also found this message produced nothing but ELR in zkdecrypto - only one word was correct (YOUR). If I had not known the plaintext I would have had no hope of breaking it. I am not sure where to proceed from here. I am just putting this out here for any interested parties. Here is the modified Z340: H!R"Ð#^$P%I&L'G(Ä )Ð*B,·.º0W1•2»3Æ4 B5„6M*u7G8¢4L9¤:J ;Ð=½?Ì@»$´=O*+AK( ¸QM*¤UÊYI[F+]³%/ =¾A^_Ì.-`ÄaË_>(D, ·[+3Ñbƒ(u6XcVd¤eI ,G(JfÊ9O*¸)yg+hLQ Ä2M*Â*ZA±_B6Ÿi°jK k¤#u$+?J*O=½2FlŸk u*RmµUEnD1B=Â'M3O (<6ÌAJn»[TjMd+]B_ ¤o¼;Ÿ9+)I[FlÃ,ƒ@R #G_N?Æ[±jÂdÃ$³U+* ŸlX&»p³QC!>$u7µk+ nÃd´gB3¢.Ð?•fMqG( R6T*L&°a<*F#WlI4L *+4Wa¤8ÃO;H'/,£= I_ËYW2½UBry.B`-aÃ "M0H)Ð%SgZ.¾iI3ƒ*

Jarlve

(@jarlve)

Posts: 2547

Famed Member

> Nevertheless, there are possible transpositions that are compatible with the observations. Let’s say z340 would be a simple homophonic substitution in which the lower half was rotated 180 degrees. Let’s also say that P19 has nothing to do with transposition, but comes from a strongly repeating plain text. All this just to give an example. In this case, rotating the lower half would neither affect P19 nor the "rows without repeated symbols" statistics. What would become apparent in this case, however, would be a cycle sequence interrupted in the middle. Sure, if z340 were built that way, it would have been solved long ago. I just want to show that simple transpositions are quite possible.

Let me illustrate something. The 340 has 26 unique sequences (strings) of length 17. After mirroring the lower half as you suggest, this drops to 15 unique sequences of length 17. doranchak calculated the odds of 26 length-17 sequences appearing randomly at about 1 in 600,000,000. That narrows down the transposition options even further.

Here is a picture. The red line shows the unique sequence length frequencies. The other colors are different directions, and green is the 340 fully mirrored. To put it differently, the red "mountain" is shifted to the right of the green "mountain". The further right the mountain is shifted, the more plausible that direction is as the actual direction of the sequential homophonic substitution. Mirroring the lower half shifts the mountain to the left. The sharp peak at 17 with a steep drop-off is actually a bit weird. Then again, it would be weird if it were normal; it is the 340, after all.

AZdecrypt

Posted : August 1, 2018 8:37 pm

Largo

(@largo)

Posts: 454

Honorable Member

Hi,

I implemented the scoring shown by Jarlve and did some experiments. I remembered that sometimes shifting rows or the whole cipher leads to better period n peaks. By shifts of rows I mean the following:

original row:
HERabcdVPeIfLTGgh

row shifted by 3:
GghHERabcdVPeIfLT
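A row shift of this kind is a rotation to the right with wrap-around. The following is a minimal sketch, assuming rows are plain strings; the names `shift_row` and `shift_rows` are illustrative and are not taken from the actual implementation.

```python
def shift_row(row, amount):
    """Rotate a row to the right by `amount` positions, wrapping around."""
    if not row:
        return row
    k = amount % len(row)
    return row[len(row) - k:] + row[:len(row) - k]


def shift_rows(rows, amounts):
    """Apply one shift amount per row."""
    return [shift_row(row, amount) for row, amount in zip(rows, amounts)]


if __name__ == "__main__":
    print(shift_row("HERabcdVPeIfLTGgh", 3))
```

With the example row and a shift of 3, the last three symbols move to the front, as in the shifted row shown above.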

So why not test whether the 2-cycle-score also increases significantly? I wrote a little function that tries different row shifts. Here is the best result so far:

Shifting amounts (rows 1-20):
3, 15, 15, 1, 11, 4, 11, 7, 16, 0, 16, 5, 15, 13, 2, 0, 2, 16, 0, 6

Resulting Cipher:

GghHERabcdVPeIfLT
+BjkOlDWYmnoKpqNb
stM+UZGWjqLkuHJBr
gSbbvdcwoVxbO++RK
2hI7FP+34e5yzM+u1
agDjbwRdFcO-ohCeF
8gUtXGVmuLIk7+KQl
NYu+9LzjGgJp2kO+y
nM+0+ZRgFBtrA#4Kh
-ucUV+dJ+ObvnFBr-
+R571EIDYBb0TMKOU
m+3BFgntcRJIo7T4M
zSrk+NI7FBtj8wRu#
dp7g40mtV41++cGFN
-+rBXfos4zCEaVUZ7
ItmxuBKjObdmpMQGg
qLRtT+Lf#Cn+FcWBI
+qWCuWtPOSHT5jqb+
IFehWnv1ByYOBo-Ct
wAIK8+aMDHNbeSuZO

The 2-Cycle score is 2433, an increase of 13.9% over the original z340. Is that significant? I honestly have no idea.
Compared to z340, the perfect n-cycles have also increased (obviously).

However, I have not yet made a comparison with other ciphers. Perhaps these observations are based on quite normal behavior.
I’ll try to include the unique sequences length stuff shown in the last chart, so I can make such a comparison too.

Perfect 4-symbol cycles:
PlsxPlsxP (30)
fsxQfsxQf (30)
lsxQlsxQ (20)
l23Xl23X (20)
l23Ql23Q (20)
l2XQl2XQ (20)
l3XQl3XQ (20)
sxQAsxQA (20)
v3XQv3XQv (30)
v8XQv8XQv8 (42)
23XQ23XQ (20)
3XQA3XQA (20)
58XQ58XQ58 (42)
y8XQy8XQy8 (42)
8XQA8XQA8 (30)

Perfect 5-symbol cycles:
l23XQl23XQ (30)

Posted : August 4, 2018 7:47 pm

Jarlve

> The 2-Cycle score is 2433, an increase of 13.9% over the original z340. Is that significant? I honestly have no idea.
> Compared to z340, the perfect n-cycles have also increased (obviously).
>
> However, I have not yet made a comparison with other ciphers. Perhaps these observations are based on quite normal behavior.
> I’ll try to include the unique sequences length stuff shown in the last chart, so I can make such a comparison too.

Each row has 17 possible offsets, which gives 17^20 = 4,064,231,406,647,572,522,401,601 possibilities for the 20 rows. I do not think 2433 is significant in such a space. When comparing to other ciphers, make sure to test some that are about as cyclic as the 340, since it is easier to improve upon cycles that are more randomized.

You can test the 340 versus itself for such a scheme (row offset shifts). Make random row shifts and count the number of times the 2-symbol cycle score improved. If the improvement ratio is like 20% or greater then that may be something. This improvement ratio idea (borrowed from smokie) only works when the space is sufficiently large since it is prone to outliers if for instance there are only 10 possibilities.
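The improvement-ratio test can be sketched as follows. This is a hedged outline, not anyone's actual tool: the 2-symbol cycle score is not defined in this thread, so `score` is passed in as a parameter, and `improvement_ratio`, `search_space_size`, `trials` and `seed` are names invented for the example.

```python
import random


def search_space_size(row_length=17, row_count=20):
    """Number of ways to choose one offset per row."""
    return row_length ** row_count


def shift_row(row, amount):
    """Rotate a row to the right by `amount` positions, wrapping around."""
    k = amount % len(row)
    return row[len(row) - k:] + row[:len(row) - k]


def improvement_ratio(rows, score, trials=1000, seed=0):
    """Fraction of random row-shift trials that score above the original.

    `score` maps the cipher (rows joined into one string) to a number.
    It stands in for the 2-symbol cycle score, which is assumed to be
    supplied by the caller.
    """
    rng = random.Random(seed)
    baseline = score("".join(rows))
    improved = 0
    for _ in range(trials):
        shifted = [shift_row(row, rng.randrange(len(row))) for row in rows]
        if score("".join(shifted)) > baseline:
            improved += 1
    return improved / trials
```

Running the same function on test ciphers that are about as cyclic as the 340 gives the comparison baseline; a ratio of about 20% or more on the 340 alone would be the interesting case.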

Posted : August 5, 2018 11:16 am

doranchak

(@doranchak)

Posts: 2614

Member Admin

I have the impression that many operations will produce large increases in cycle scores.

Jarlve once found that deleting row 14 dramatically increases the 4-cycle score: viewtopic.php?p=54990#p54990

I looked for row swaps that improve cycle scores too. If you allow arbitrary re-arrangement of rows, you can get a lot of cycle improvements. I found a simpler rearrangement that improves cycles but I can’t remember how much improved they are:

http://zodiackillerciphers.com

Posted : August 6, 2018 6:29 pm

Largo

Thank you Jarlve and Doranchak for the explanations.

My idea to search for better cycles with shifts was probably not well planned and a bit thoughtless. But you never know…
I did the shifting with some test ciphers (25% random encoding). The result is similar to z340: The cycles are improved quite quickly. Nothing special indeed.

Posted : August 7, 2018 7:50 pm

Jarlve

> Thank you Jarlve and Doranchak for the explanations.
>
> My idea to search for better cycles with shifts was probably not well planned and a bit thoughtless. But you never know…
> I did the shifting with some test ciphers (25% random encoding). The result is similar to z340: The cycles are improved quite quickly. Nothing special indeed.

And now we are back to the question: what is the nature/cause of this randomization in the 340?

Posted : August 8, 2018 12:10 am

Largo

I really can’t think… this heat wave is abnormal and seems to have no end. It’s not a record heat anymore, but it’s quite humid. From Friday there is improvement in sight! (I’m just looking for an excuse why I can’t solve z340 right away :lol: )

Jarlve, I just reviewed your chart. Am I getting this right? There are 26 different strings, each consisting of 17 characters without repetition? Or am I misinterpreting the unique sequences? If I’m right, can the sequences overlap in this case? Must be…

The whole observation of cycles and sequences without repeated letters is only important if Zodiac aimed for a perfect homophonic encryption and the cipher was not transposed after substitution, isn’t it? An ideal encryption is perfectly cyclic and the letter distribution is absolutely smooth. However, this was not at all the case with z408. Quite the opposite: in the last third of the cipher he broke many cycles, either intentionally or through sloppiness. Apparently Zodiac had used the general letter distribution of English texts. Ideally he should have used the letter distribution of the z408 plaintext itself for the key. But there were enough cycles to make z408 vulnerable. The Hardens had started right where bigrams repeated themselves. Zodiac sure as hell read about it in the paper. So it is only logical for him to avoid such cycles in the future. The easiest way to do this is to be careful when substituting and to deviate from the cycle at the appropriate points. This is actually a plausible explanation, especially since the 25% figure is quite compatible with it. Let’s say for fun that this is exactly what happened (regardless of whether there was a transposition underneath). How can you tell if arbitrary cycles have been avoided? In my opinion, we can only use falsification here. However, this is only possible if z340 is either solved or at least a pattern is detected that interrupts the cycles regularly, be it through nulls, routes or whatever.

An example:
Take a plaintext consisting of 330 letters. Now transpose it. Now you draw a grid with 17×20 and leave 10 squares free, which form a pattern. The letter Z, a crosshair or whatever. Now substitute the 330 characters and fill them into the grid, leaving out the 10 fields of the pattern. Then a short message with 10 characters is filled into the free space. (Cyclically and linearly). As long as such a pattern is not too invasive, it can lead to the behavior we observe: Period n because of transposition + simultaneous slightly interrupted cyclical substitution. However, a transposition solver will not succeed because the 10 characters will cause too much interference (e.g. with diagonal transposition). Only after removing the 10 characters, you get a solution.
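The grid construction in this example can be sketched as below. It is a toy model under stated assumptions: the transposition and homophonic substitution are taken as already applied to `body`, reserved cells are given as (row, column) pairs, and both the body and the short hidden message are written in normal reading order. The names `fill_grid` and `strip_reserved` are hypothetical.

```python
def fill_grid(body, hidden, reserved, width=17, height=20):
    """Write `body` row by row, skipping `reserved` cells, and put
    `hidden` into the reserved cells in reading order."""
    reserved = set(reserved)
    if any(not (0 <= r < height and 0 <= c < width) for r, c in reserved):
        raise ValueError("reserved cell outside the grid")
    if len(hidden) != len(reserved) or len(body) + len(hidden) != width * height:
        raise ValueError("lengths do not match the grid and the pattern")
    body_iter, hidden_iter = iter(body), iter(hidden)
    rows = []
    for r in range(height):
        row = []
        for c in range(width):
            source = hidden_iter if (r, c) in reserved else body_iter
            row.append(next(source))
        rows.append("".join(row))
    return rows


def strip_reserved(rows, reserved):
    """Recover the body by dropping the symbols in the reserved cells."""
    reserved = set(reserved)
    return "".join(
        symbol
        for r, row in enumerate(rows)
        for c, symbol in enumerate(row)
        if (r, c) not in reserved
    )
```

For the case described, `body` would hold 330 symbols and `reserved` the 10 cells of the pattern. A solver would only see clean cycles and periods after something like `strip_reserved` has removed the right 10 cells, which is the point of the example.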

In short: To determine what the interrupted cycles are all about, you need to know which symbols represent the same plaintext character and/or determine to 100% where cycles are interrupted. I don’t know…it just seems to me that there are too many possibilities to search effectively for the reason of the interrupted cycles. I gladly let myself be convinced of the opposite, maybe I just do not see clearly enough.

The steep drop on the chart can have several causes. One possibility is a highly repetitive plaintext. This is also possible during transposition if the plaintext length correlates with the transposition and the key. At a certain point, there are inevitably no more choices.

Translated with http://www.DeepL.com/Translator

Posted : August 8, 2018 9:25 pm

Jarlve

> Jarlve, I just reviewed your chart. Am I getting this right? There are 26 different strings, each consisting of 17 characters without repetition? Or am I misinterpreting the unique sequences? If I’m right, can the sequences overlap in this case? Must be…

Yes, sequences can overlap since they are counted from every position in the cipher.

> The whole observation of cycles and sequences without repeated letters is only important if Zodiac aimed for a perfect homophonic encryption and the cipher was not transposed after substitution, isn’t it? An ideal encryption is perfectly cyclic and the letter distribution is absolutely smooth. However, this was not at all the case with z408. Quite the opposite: in the last third of the cipher he broke many cycles, either intentionally or through sloppiness. Apparently Zodiac had used the general letter distribution of English texts. Ideally he should have used the letter distribution of the z408 plaintext itself for the key. But there were enough cycles to make z408 vulnerable. The Hardens had started right where bigrams repeated themselves. Zodiac sure as hell read about it in the paper. So it is only logical for him to avoid such cycles in the future. The easiest way to do this is to be careful when substituting and to deviate from the cycle at the appropriate points. This is actually a plausible explanation, especially since the 25% figure is quite compatible with it. Let’s say for fun that this is exactly what happened (regardless of whether there was a transposition underneath). How can you tell if arbitrary cycles have been avoided? In my opinion, we can only use falsification here. However, this is only possible if z340 is either solved or at least a pattern is detected that interrupts the cycles regularly, be it through nulls, routes or whatever.

Good reasoning. This gives me some ideas.

> In short: To determine what the interrupted cycles are all about, you need to know which symbols represent the same plaintext character and/or determine to 100% where cycles are interrupted. I don’t know…it just seems to me that there are too many possibilities to search effectively for the reason of the interrupted cycles. I gladly let myself be convinced of the opposite, maybe I just do not see clearly enough.

Sometimes you need to dream a bit and try anyway. The potential reward is worth the shot.

> The steep drop on the chart can have several causes. One possibility is a highly repetitive plaintext.

Unless the plaintext is really odd, I do not think that it could affect the unique sequences in such a manner. I found one possible correlation: if, during the encoding process, one tries to some extent not to repeat symbols within a certain view window, the cipher generally ends up with some unusually high peaks in the unique sequences, such as the 26 length-17 sequences in the 340, which is about a 1 in 500,000,000 observation. I consider this correlation a possible element.
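One way to picture this correlation is an encoder that prefers homophones not seen in the last few positions. The sketch below is an assumption about what such a process could look like, not a reconstruction of how the 340 was made; `encode_avoiding_repeats`, `key`, `window` and `seed` are hypothetical names, and every symbol is assumed to be a single character.

```python
import random


def encode_avoiding_repeats(plaintext, key, window, seed=0):
    """Homophonic substitution that avoids recently used symbols.

    `key` maps each plaintext letter to a list of one-character symbols.
    A symbol is picked at random from those not used in the previous
    `window` positions; if every homophone was used recently, any
    homophone of the letter is allowed.
    """
    rng = random.Random(seed)
    output = []
    for letter in plaintext:
        recent = set(output[-window:]) if window > 0 else set()
        fresh = [symbol for symbol in key[letter] if symbol not in recent]
        output.append(rng.choice(fresh or key[letter]))
    return "".join(output)
```

Encoding the same plaintext with several values of `window` and comparing the unique-sequence statistics of the results would be one way to test whether the peak at 17 can arise this way.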

Posted : August 9, 2018 12:57 am

doranchak

> Such as 26 length 17 sequences in the 340. Which is about a 1 in 500,000,000 observation. I consider this correlation as a possible element.

Did you ever plot the start position of all 26 sequences? Just to see if there are any interesting patterns?

Posted : August 9, 2018 1:06 am

Jarlve

> > Such as 26 length 17 sequences in the 340. Which is about a 1 in 500,000,000 observation. I consider this correlation as a possible element.
>
> Did you ever plot the start position of all 26 sequences? Just to see if there are any interesting patterns?

Yeah. These images are from December 2014, but I have not tried to interpret the patterns:

Posted : August 9, 2018 10:51 am

Jarlve

And from that same time period, the average "mountains" from 100 ciphers. Bright red graph is the normal direction and dark red is the mirrored direction, the other colors are by columns and diagonal. It shows how the system can be used to determine the direction further. The mountain most offset to the right is generally the true direction of the sequential homophonic substitution.

To give context, it relates back to the "mountains" of the 340. Where the red graph is the normal direction and the green is the mirrored direction.

Posted : August 9, 2018 11:02 am

Largo

Sorry if I ask again:

By "unique string" you mean that a string does not contain repeated symbols, right?

Example:

Cipher = "ABCDEFEG"
length unique string = 3

Tested strings:
ABC = unique
BCD = unique
CDE = unique
DEF = unique
EFE = not unique
FEG = unique

frequency = 5
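The counting in this example, where every window of a fixed length is checked for repeated symbols, can be written as a short function. This is a minimal sketch of that reading of the definition only; the name `count_unique_windows` is illustrative.

```python
def count_unique_windows(cipher, length):
    """Count the windows of `length` symbols that contain no repeated symbol."""
    count = 0
    for start in range(len(cipher) - length + 1):
        window = cipher[start:start + length]
        if len(set(window)) == length:
            count += 1
    return count


if __name__ == "__main__":
    print(count_unique_windows("ABCDEFEG", 3))
```

For "ABCDEFEG" and length 3, only the window EFE is rejected, which gives the frequency of 5 from the example.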

If this is correct, how can it be that z340 only has 6 unique strings of length 2? I guess I didn’t get it somehow.

This is what I get:

Posted : August 9, 2018 10:36 pm

Jarlve

My bad. Only count the longest unique string from every position.

So "ABCDEFG" = 7, 6, 5, 4, 3, 2, 1.
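Counting only the longest repeat-free string from each position can be sketched like this. It is an illustration based on the one-line example above, assuming a run simply stops at the end of the cipher; the names `longest_unique_runs` and `run_length_histogram` are made up here.

```python
from collections import Counter


def longest_unique_runs(cipher):
    """For each start position, the length of the longest run of
    symbols without a repeat that begins there."""
    runs = []
    for start in range(len(cipher)):
        seen = set()
        for symbol in cipher[start:]:
            if symbol in seen:
                break
            seen.add(symbol)
        runs.append(len(seen))
    return runs


def run_length_histogram(cipher):
    """How many start positions produce each run length."""
    return Counter(longest_unique_runs(cipher))


if __name__ == "__main__":
    print(longest_unique_runs("ABCDEFG"))
```

Plotting `run_length_histogram` by run length gives a "mountain" of this kind; reading the cipher in a different direction first (mirrored, by columns, by diagonals) gives the other curves.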

Posted : August 10, 2018 12:04 am

Largo

Thanks for the hint, now it works. I have tested this method a few times, but not enough to draw further conclusions. It’s definitely interesting. The steep peak at 17 seems to come from the upper half, but I can’t say for certain yet. Here is a simple comparison:

Upper half of 340:

Lower half of 340:

> > Nevertheless, there are possible transpositions that are compatible with the observations. Let’s say z340 would be a simple homophonic substitution in which the lower half was rotated 180 degrees. Let’s also say that P19 has nothing to do with transposition, but comes from a strongly repeating plain text. All this just to give an example. In this case, rotating the lower half would neither affect P19 nor the "rows without repeated symbols" statistics. What would become apparent in this case, however, would be a cycle sequence interrupted in the middle. Sure, if z340 were built that way, it would have been solved long ago. I just want to show that simple transpositions are quite possible.
>
> Let me illustrate something. The 340 has 26 length 17 unique sequences/strings. And after mirroring the lower half as you suggest this observation drops to 15 length 17 unique sequences

I think you misunderstood me on this one. I didn’t mean to mirror the lower half, but to rotate 180 degrees. In other words Flip+Mirror, or just reverse. In this case, the curve hardly changes at all. But the perfect-4 cycles have disappeared, whereas the periodic 2-cycles score rises to 2180. Even the P19 spike is preserved. Look:

340 with reversed lower half:

I wanted to demonstrate that there are certain transpositions that can hide very well from the statistics. Sure… z340 must have more to offer than just a rotated lower half, otherwise the transposition solver of AZdecrypt would surely have landed a hit a long time ago. Nevertheless, you can see that simple routes or transpositions are also possible during substitution and are not directly uncovered by statistics. Maybe it’s actually easier than we thought and we are just overthinking it…

Lower half reversed:

HERabcdVPeIfLTGgh
Nb+BjkOlDWYmnoKpq
BrstM+UZGWjqLkuHJ
SbbvdcwoVxbO++RKg
yzM+u12hI7FP+34e5
bwRdFcO-ohCeFagDj
k7+KQl8gUtXGVmuLI
jGgJp2kO+yNYu+9Lz
hnM+0+ZRgFBtrA#4K
-ucUV+dJ+ObvnFBr-
+8KIAwOZuSebNHDMa
tC-oBOYyB1vnWheFI
bqj5THSOPtWuCWq++
LqIBWcF+nC#fL+TtR
gGQMpmdbOjKBuxmtI
+-7ZUVaECz4sofXBr
++14Vtm04g7pdNFGc
Rw8jtBF7IN+krSz#u
FB3+mM4T7oIJRctng
OKMT0bBYDIE175R+U
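Rotating the lower half by 180 degrees is the same as reading its symbols in reverse order and writing them back row by row. Here is a minimal sketch, assuming the cipher is a list of equal-length row strings with an even number of rows; `reverse_lower_half` is an illustrative name.

```python
def reverse_lower_half(rows):
    """Keep the upper half and rotate the lower half by 180 degrees
    (reverse the row order and reverse each row)."""
    if not rows:
        return []
    half = len(rows) // 2
    width = len(rows[0])
    lower = "".join(rows[half:])[::-1]
    return rows[:half] + [lower[i:i + width] for i in range(0, len(lower), width)]


if __name__ == "__main__":
    print(reverse_lower_half(["AB", "CD", "EF", "GH"]))
```

Applying the function twice restores the original grid, so the same routine undoes the operation when testing whether a cipher was built this way.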

Posted : August 10, 2018 8:14 pm

Jarlve

Glad you got it working Largo. The steepness seems to be there for the upper half as well if you ask me. I did a per 10 row slide somewhere and the peak at 17 is pretty much consistent throughout the cipher. Still not sure what to think of it but will try to find out how the ioc affects the mountain.

> I think you misunderstood me on this one. I didn’t mean to mirror the lower half, but to rotate 180 degrees. In other words Flip+Mirror, or just reverse. In this case, the curve hardly changes at all. But the perfect-4 cycles have disappeared, whereas the periodic 2-cycles score rises to 2180. Even the P19 spike is preserved

Oh I see. Well, in general the operation does not improve things on my end.

> I wanted to demonstrate that there are certain transpositions that can hide very well from the statistics.

Yes. But I think we would then be looking for transpositions that do not transpose as much. Sorry to be so persistent on the matter.

Posted : August 10, 2018 11:04 pm

Zodiac Discussion Forum