Oh I see. Well, in general the operation does not improve things on my end.
That’s not the goal either. A demonstration, nothing more. But I have some ideas based on "interlocked" ciphers. Let’s see how much time I can invest over the next few days.
Sorry to be so persistent on the matter.
That’s exactly what we need…different points of view! If we all had the same view, that would not be an efficient way to look for a solution. Perhaps I am completely barking up the wrong tree, but I may also find a new clue tomorrow. Who knows?
Thinking up possibilities and discussing them is what takes us forward. And that’s exactly why I appreciate this forum.
Indeed. I really hope that you find something new!
I took a look at the effects of the IoC and encoding randomization on the unique sequences. Each graph/mountain is the average of 10,000 sequential homophonic substitution ciphers. None of the options listed is a good match for the 340. Either something funny is going on in the 340 or it is a significant outlier.
1) A higher IoC compresses the mountain horizontally:
2) More encoding randomization also compresses the mountain horizontally:
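As a point of reference, here is a minimal sketch of how such test ciphers might be generated. Treating "encoding randomization" as the probability of picking a random homophone instead of the next one in the cycle is my assumption, not necessarily how the plots above were produced.

```python
import random

def encode(plaintext, homophones, randomization=0.0, rng=random):
    """Sequential homophonic substitution with optional randomization.

    homophones: dict mapping each plaintext letter to its list of symbols.
    randomization: probability of picking a random homophone instead of
    the next symbol in the cycle (assumed meaning of the parameter).
    """
    pointer = {letter: 0 for letter in homophones}  # per-letter cycle position
    ciphertext = []
    for letter in plaintext:
        group = homophones[letter]
        if rng.random() < randomization:
            ciphertext.append(rng.choice(group))        # break the cycle
        else:
            ciphertext.append(group[pointer[letter]])   # next in the cycle
            pointer[letter] = (pointer[letter] + 1) % len(group)
    return ciphertext
```

With randomization=0 the cycles are perfect: encoding "ABCABC" with A→{1,2}, B→{3}, C→{4,5,6} yields 1 3 4 2 3 5. Raising the parameter toward 1 approaches fully random homophone choice.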
Yeah. These images are from December 2014, but I have not tried to interpret the patterns:
Nice – have you done a similar plot of Z408?
The mountain most offset to the right is generally the true direction of the sequential homophonic substitution.
And this can be generalized to this aspect of the substitution: When encoding, symbols undergo a "spreading out" effect (in the normal reading direction) due to cycling of symbols within groups of homophones. Is that correct?
Does the effect still occur for homophonic substitution that is completely randomized (i.e., no cycles)? You probably already answered that long ago.
EDIT: Just studied your "average mountain" plots and I would interpret your results like this:
1) A cipher with more repetition of symbols (higher IoC) will have a harder time maintaining longer non-repeating sequences.
2) A cipher with more randomization in homophone groups will also have a harder time maintaining longer non-repeating sequences, since members of the groups are allowed to repeat sooner than if they were fully sequential.
So this makes some intuitive sense. Does that match your interpretations?
Here are some new plots.
340:
18 17 16 15 14 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 22 21 20 19 18 17 18 17 16 15 14 13 12 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 8 10 9 8 7 6 5 4 3 2 1 7 15 14 13 12 11 10 9 17 16 15 14 13 12 11 17 16 15 14 13 13 12 11 10 9 8 19 18 17 16 15 14 13 12 11 10 19 18 18 17 16 15 14 13 12 11 10 9 8 7 17 16 15 15 14 13 12 11 10 9 8 7 6 5 11 10 9 8 7 8 7 6 5 4 3 2 18 17 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 12 11 10 18 17 16 15 14 13 23 22 21 20 19 20 19 18 17 16 18 17 16 15 14 13 14 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 13 12 11 19 18 17 16 15 14 13 12 11 10 9 17 16 17 16 15 14 13 12 11 10 9 8 7 6 5 7 6 5 4 3 2 1 17 22 21 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 12 12 11 10 9 8 8 7 6 5 4 3 2 1 5 4 3 17 16 15 21 20 19 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 18 17 16 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1
408:
5 4 3 2 11 10 9 8 7 6 5 4 3 19 18 17 16 15 14 13 12 11 10 13 12 11 10 9 8 7 6 5 4 14 13 12 11 24 23 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 27 26 25 24 23 22 21 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 17 16 15 14 13 17 16 15 14 13 18 17 16 17 16 15 14 13 12 11 10 9 8 7 6 5 9 8 7 6 6 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 10 9 8 7 6 5 32 34 33 32 33 32 31 30 30 29 28 27 26 25 24 23 22 21 27 26 25 25 24 23 22 21 20 19 19 18 17 16 15 14 13 12 11 13 12 11 10 9 8 7 6 23 22 21 20 19 18 30 29 28 27 29 28 27 26 25 24 23 24 23 22 21 22 21 21 20 19 18 17 19 18 26 25 24 23 22 21 20 19 18 17 16 16 15 14 13 12 11 10 9 16 15 14 13 12 11 10 9 8 7 6 5 4 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 11 11 10 9 8 7 6 5 4 3 2 1 13 12 11 10 9 8 7 6 5 4 3 2 11 10 9 8 7 6 5 4 3 2 15 14 13 12 11 10 9 8 7 6 5 4 6 5 4 12 13 12 11 10 9 8 7 6 5 11 10 9 14 13 12 11 12 11 10 9 8 7 6 5 4 3 14 13 12 11 10 9 10 9 8 7 6 5 4 3 2 1 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 17 16 15 14 13 12 11 10 9 10 9 8 7 8 9 8 13 12 11 10 9 10 9 8 7 6 5 4 3 2 1
The mountain most offset to the right is generally the true direction of the sequential homophonic substitution.
And this can be generalized to this aspect of the substitution: When encoding, symbols undergo a "spreading out" effect (in the normal reading direction) due to cycling of symbols within groups of homophones. Is that correct?
Yes. Symbols spread out evenly throughout and around the middle point of the cipher. It is very easy to see and determine the properties of sequential homophonic substitution by scaling down the problem, for instance "ABCABCABC" versus "CAABACBCB". Of course, all these properties are emergent from one single phenomenon. But there seems to be no single measurement/property that perfectly captures the phenomenon in practice, because of the randomness of the underlying plaintext.
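The scaled-down example can be quantified with a few lines of code. The average gap between repeats of the same symbol is just one simple stand-in for the "spreading out" effect, not a measure used in the thread.

```python
def mean_gap(text):
    """Average distance between consecutive occurrences of each symbol."""
    last, gaps = {}, []
    for i, symbol in enumerate(text):
        if symbol in last:
            gaps.append(i - last[symbol])
        last[symbol] = i
    return sum(gaps) / len(gaps)

print(mean_gap("ABCABCABC"))  # 3.0 -- perfectly even spacing from cycling
print(mean_gap("CAABACBCB"))  # 2.5 -- repeats are allowed to bunch together
```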
Does the effect still occur for homophonic substitution that is completely randomized (i.e., no cycles)? You probably already answered that long ago.
No. These are quite random.
EDIT: Just studied your "average mountain" plots and I would interpret your results like this:
1) A cipher with more repetition of symbols (higher IoC) will have a harder time maintaining longer non-repeating sequences.
2) A cipher with more randomization in homophone groups will also have a harder time maintaining longer non-repeating sequences, since members of the groups are allowed to repeat sooner than if they were fully sequential.
So this makes some intuitive sense. Does that match your interpretations?
Yes. That is a very good interpretation and it is indeed as intuitive and simple as that.
Unique sequences, 340 versus 408:
408 without the last 8 rows; this graph/mountain looks a lot more natural:
408 with and without the last 8 rows superimposed on top of each other; the area colored red is the difference between the two. This area is more left-shifted and less significant: the last 8 rows do not add very long sequences to the result. Just a simple example of the system:
It is odd that the 340 has such a high spike that is so much shifted to the right.
Under the hypothesis of a sequential homophonic substitution with 25% encoding randomization, the length-17 spike on its own is a ~2.5 sigma observation. Not strong, but I still attach some value to it.
But there seems to be no single measurement/property that perfectly captures the phenomenon in practice, because of the randomness of the underlying plaintext.
What about unigram distance? http://zodiackillersite.com/viewtopic.p … 902#p53902
It seems to capture two phenomena: Anomalous gaps between symbols, and the overall spreading out of symbols.
My own implementation of the measurement in randomization tests shows 2.6 sigma for the Z408 and 4.4 sigma for the Z340.
Cycles have many properties. Unigram distance does not capture the goodness of the cycles, nor the increase in bigrams. If unigram distance is considered as a single sum, then there is the problem that the ratio of 1) anomalous gaps and 2) overall symbol spread is unknown. We know that in the case of the 340 the high unigram distance is caused by the anomalously large gaps of some symbols plus sequential homophonic substitution. But a high unigram distance is possible without anomalously large gaps.
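For illustration, here is one plausible reading of the measurement together with a randomization test of the kind described. The exact definition of unigram distance is in the linked post; the version below (total distance between consecutive occurrences of each symbol) is an assumption.

```python
import random
import statistics

def unigram_distance(text):
    """Assumed form: total gap between consecutive occurrences of each symbol."""
    last, total = {}, 0
    for i, symbol in enumerate(text):
        if symbol in last:
            total += i - last[symbol]
        last[symbol] = i
    return total

def sigma_vs_shuffles(text, trials=1000, seed=0):
    """How many standard deviations the observed value sits above the mean
    of the same statistic over random shuffles of the text."""
    observed = unigram_distance(text)
    symbols = list(text)
    rng = random.Random(seed)
    samples = []
    for _ in range(trials):
        rng.shuffle(symbols)
        samples.append(unigram_distance(symbols))
    return (observed - statistics.mean(samples)) / statistics.stdev(samples)
```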
Just for my general understanding:
The following pseudo-code is used to generate these statistics:
for i = 0 to 339:
    length = getLongestUniqueSequence(i)
    table[length] += 1
We determine the longest sequence for each position in the cipher and put the result in a table. The further the position progresses, i.e. the larger the variable "i" becomes, the shorter the maximum achievable sequence lengths. With 63 different symbols there can theoretically be a maximum unique sequence length of 63. However, this length can only be reached at positions 0 to 277. From position 278 only a maximum unique sequence length of 62 is possible, from position 279 only 61, and so on.
This in turn means that towards the end of the cipher the short sequences also increase. Theoretically, therefore, most of the "left half of the mountain" would represent the end of the cipher.
Shouldn’t this be taken into account in the calculation by means of a weighting?
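The pseudo-code above might be fleshed out as follows; this is a sketch based on my reading of the description, with the helper name kept from the pseudo-code:

```python
def getLongestUniqueSequence(cipher, i):
    """Length of the longest run of distinct symbols starting at position i."""
    seen = set()
    for symbol in cipher[i:]:
        if symbol in seen:
            break
        seen.add(symbol)
    return len(seen)

def mountain(cipher):
    """Histogram: how often each longest-unique-sequence length occurs."""
    table = {}
    for i in range(len(cipher)):
        length = getLongestUniqueSequence(cipher, i)
        table[length] = table.get(length, 0) + 1
    return table
```

On a toy cipher such as "ABCA" this gives {3: 2, 2: 1, 1: 1}: positions 0 and 1 each start a run of 3 distinct symbols, position 2 a run of 2, and position 3 a run of 1.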
You could exclude sequences that terminate at the end of the cipher but I do not think it is an issue.
I am considering excluding sequences that terminate at the end of the cipher because their length is uncertain. Good point Largo. Here is the difference:
What about normalizing the ones at the end by the fraction of the max possible length?
For example, Z340 has a max possible unique length of 63 due to its alphabet.
Towards the beginning, a unique string of length 10, out of a possible 63, would score: 63/63 = 1. Add 1 to the "10" column in the histogram.
But let’s say towards the end, a unique string of length 10 is found, but a max length of only 40 is possible. So it would score: 40/63 ≈ 0.63. Add 0.63 to the "10" column in the histogram.
Not sure how much it would matter.
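A sketch of that weighting, assuming each position's contribution is scaled by max_possible/alphabet_size exactly as in the example above:

```python
def weighted_mountain(cipher, alphabet_size):
    """Mountain histogram with end-of-cipher positions down-weighted by the
    fraction of the maximum unique length still reachable from there."""
    table = {}
    n = len(cipher)
    for i in range(n):
        seen = set()                      # longest run of distinct symbols at i
        for symbol in cipher[i:]:
            if symbol in seen:
                break
            seen.add(symbol)
        length = len(seen)
        max_possible = min(alphabet_size, n - i)
        weight = max_possible / alphabet_size
        table[length] = table.get(length, 0) + weight
    return table
```

For the Z340 this would be called as weighted_mountain(cipher, 63).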