Hi forum!
I have been having some thoughts on how the Zodiac may have enciphered Z340 after the quick break of Z408. I think by now everyone agrees it is not a vanilla homophone cipher – but I also don’t believe it is going to be overly complicated. He would obviously want the cipher to be stronger but would presumably be limited to pen and paper.
I have seen some discussion on the odd/even properties of Z340 but I don’t think anyone has (yet) suggested that perhaps Z340 was enciphered with a dual-alphabet system – one for the odds and one for the evens. This would have been relatively straightforward for the Zodiac to implement as he would just have to double up his efforts from the Z408 – the overall system would remain the same. This could also help explain some of the odd properties of Z340. I believe some of the patterns we can see may actually be intentional red herrings on the part of Z. I also believe the "ZODAIK" signature is legitimate ciphertext but misspelled because that was the closest he could get with the symbols he had to choose from.
Anyway, I decided to test the dual-alphabet idea by passing Z340 through a remapping process which reassigned every second symbol to a new alphabet, effectively creating a new version of Z340 which contains 112 unique symbols. I could then process this in zkdecrypto as normal. The modified message is at the bottom of this post. Splitting the symbols does flatten the frequency curve a little (14, 10, 8, 7, …). It can be further flattened at the high end by assuming that the set (B, – and mirrored-D) are homophones (I have good reasons for suspecting this but that is a subject for another day). Unfortunately it also creates 22 single-occurrence symbols.
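For anyone who wants to repeat the remapping, here is a minimal sketch of the idea in Python. It is not the exact script used for the message below: the function name `split_odd_even`, the choice to leave even indices untouched, and the use of Unicode private-use code points as stand-in symbols are all assumptions made for illustration. It expects a transcription with one character per symbol and no spaces or line breaks.

```python
def split_odd_even(cipher: str, first_alias: int = 0xE000) -> str:
    """Give every second symbol its own alphabet.

    Symbols at even indices (0, 2, 4, ...) are kept as they are. Symbols at
    odd indices are replaced by stand-ins taken from the Unicode private-use
    area (an arbitrary choice), so the two position classes never share a
    symbol.
    """
    alias = {}  # original symbol -> stand-in used at odd indices
    out = []
    for i, sym in enumerate(cipher):
        if i % 2 == 0:
            out.append(sym)
        else:
            if sym not in alias:
                alias[sym] = chr(first_alias + len(alias))
            out.append(alias[sym])
    return "".join(out)


# Toy input: A and B each appear at both an even and an odd index,
# so the remapped text ends up with four distinct symbols.
remapped = split_odd_even("ABBA")
print(len(set(remapped)))
```

Counting `len(set(...))` on the remapped text gives the size of the combined alphabet, which is the figure quoted above for the modified Z340.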
Running it through zkdecrypto produces the usual English-looking rubbish (ELR), but with far fewer "breaks" in the flow than I get with vanilla Z340. This may simply be due to having more symbols available to assign. I decided to test further by creating my own known-plaintext cipher of a similar length using the same dual-alphabet approach. This message also produced nothing but ELR in zkdecrypto – only one word was correct (YOUR). If I had not known the plaintext I would have had no hope of breaking it.
I am not sure where to proceed from here. I am just putting this out here for any interested parties.
Here is the modified Z340:
H!R"Ð#^$P%I&L'G(Ä )Ð*B,·.º0W1•2»3Æ4 B5„6M*u7G8¢4L9¤:J ;Ð=½?Ì@»$´=O*+AK( ¸QM*¤UÊYI[F+]³%/ =¾A^_Ì.-`ÄaË_>(D, ·[+3Ñbƒ(u6XcVd¤eI ,G(JfÊ9O*¸)yg+hLQ Ä2M*Â*ZA±_B6Ÿi°jK k¤#u$+?J*O=½2FlŸk u*RmµUEnD1B=Â'M3O (<6ÌAJn»[TjMd+]B_ ¤o¼;Ÿ9+)I[FlÃ,ƒ@R #G_N?Æ[±jÂdÃ$³U+* ŸlX&»p³QC!>$u7µk+ nÃd´gB3¢.Ð?•fMqG( R6T*L&°a<*F#WlI4L *+4Wa¤8ÃO;H'/,£= I_ËYW2½UBry.B`-aà "M0H)Ð%SgZ.¾iI3ƒ*
Welcome darrenc,
About different keys for odds and evens:
Smokie and I looked at this back in 2016. Our finding was that such a scheme is statistically detectable. One way to detect it is to look at sequential homophones, or other stats, for the odd and even positions separately.
Though, with periodic schemes, as always, a null or a skip interrupting the pattern can cause significant complications.
Cipher smokie6: https://drive.google.com/file/d/0B5r0rD … x0TGs/view
Cipher smokie8: https://drive.google.com/file/d/0B5r0rD … Q1VDg/view
I do not remember managing to solve any of these. It is not just the multiplicity but also the interlacing of 2 different keys that makes such ciphers incredibly difficult to solve.
Hi darrenc,
Some time ago I made such a cipher too. Jarlve quickly figured out that two alphabets were used for odd/even positions:
Jarlve, Largo, thanks for the welcome! Very interesting thread you linked. You weren’t kidding, Largo, about how quickly Jarlve worked it out!
Hmm… back to the drawing board for now. I can’t help but think Z340 must be a "fair" cipher (no nasty tricks to make it virtually crack proof) as Z wouldn’t have got satisfaction out of that. I thought this idea was promising. Oh well!
The 408 is a sequential homophonic substitution cipher. That seems likely for the 340 as well, though there is about 25% randomization within the sequential homophones. How does this supposed randomization exist in the cipher? That for me is the one million dollar question. It could be nulls, skips, polyphones, transposition, random homophone selection or something else entirely. I do not think that anyone can outright solve the 340 without answering this question first.
A bit of background on how this 25% randomization figure is estimated. Sequential homophonic substitution can be measured very well with statistics such as 2-symbol cycles. Without going into depth on how the 2-symbol cycle measurement works, consider that the 340 has a 2-symbol score of 2236. That score is high enough that it is unlikely to be a chance feature of the 340. To illustrate: across 1,000,000 full randomizations of the 340 the best score was 1896. When simulating a 340-like cipher, assuming a normal plaintext and perfect sequential homophonic substitution, the average 2-symbol score is 3378. Adding 24% random homophone selection brings this score down to 2208, very close to that of the 340.
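The comparison against randomized copies of the cipher can be sketched roughly as follows. This is only an outline of the general idea, not the code behind the numbers above: `shuffle_baseline` and `toy_score` are hypothetical names, and `toy_score` is a deliberately simple stand-in statistic, not the 2-symbol cycle measurement. Any scoring function that takes the cipher as a string can be passed in.

```python
import random
import statistics
from typing import Callable


def shuffle_baseline(cipher: str, score_fn: Callable[[str], float],
                     trials: int = 1000, seed: int = 0) -> dict:
    """Score the cipher and compare it with scores of shuffled copies."""
    rng = random.Random(seed)          # fixed seed so runs are repeatable
    symbols = list(cipher)
    scores = []
    for _ in range(trials):
        rng.shuffle(symbols)           # full randomization of symbol order
        scores.append(score_fn("".join(symbols)))
    mean = statistics.fmean(scores)
    stdev = statistics.pstdev(scores)
    actual = score_fn(cipher)
    return {
        "actual": actual,
        "best_random": max(scores),
        "mean_random": mean,
        # distance from the shuffled mean in standard deviations
        "sigma": (actual - mean) / stdev if stdev > 0 else float("nan"),
    }


def toy_score(text: str) -> int:
    """Stand-in statistic: count positions where text[i] == text[i + 2]."""
    return sum(1 for i in range(len(text) - 2) if text[i] == text[i + 2])


result = shuffle_baseline("ABABABABABABCDCDCDCD", toy_score, trials=200, seed=1)
print(result)
```

If the actual score sits well above the best score seen in the shuffles, as described for the 340, the feature being measured is unlikely to be an accident of symbol frequencies alone.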
Is there a thread here in the forum with a more detailed description of how this calculation works? Sorry, I couldn’t find anything when I searched. Somewhere there was a linked paper from you describing this. Unfortunately, I can’t find that either.
If there is no description of the calculation method yet, could you please explain it briefly? How do you know, for example, whether two perfectly alternating symbols really represent the same plain text letter? There can only be guesswork here, right?
The only thing I did in the direction of pattern-searching was calculating the Levenshtein distance.
It is somewhere but I’ll repeat it. Here is my derivative of smokie’s 2-symbol cycle measurement.
1) Extract all unique 2-symbol cycles from the cipher excluding symbols that occur only once.
2) Score each cycle and sum the scores.
– My current cycle scoring formula: score+=(cycle_length-1)*((alternations/(cycle_length-1))^5)
– The 5 here acts as a weight. Setting it higher will give more importance to more perfect cycles.
Example cycle: ABABABBBABBABAA (15) Alternations: 011111001101110 (10)
The cycle_length is 15 and it has 10 alternations. Given the scoring formula this cycle is worth about 2.60.
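The two steps and the scoring formula above can be sketched in Python like this. It is a reading of the description, not AZdecrypt’s actual code: the function names are hypothetical, and I am assuming that a "2-symbol cycle" is the sequence formed by every occurrence of an unordered pair of distinct symbols, read in cipher order.

```python
from collections import Counter
from itertools import combinations


def cycle_score(cycle: str, weight: float = 5) -> float:
    """Score one cycle string such as 'ABABABBBABBABAA'."""
    steps = len(cycle) - 1                      # cycle_length - 1
    if steps < 1:
        return 0.0
    # an alternation is any step where the symbol changes
    alternations = sum(1 for a, b in zip(cycle, cycle[1:]) if a != b)
    return steps * (alternations / steps) ** weight


def two_symbol_cycle_score(cipher: str, weight: float = 5) -> float:
    """Sum the cycle scores over all pairs of symbols occurring more than once."""
    counts = Counter(cipher)
    symbols = sorted(s for s, n in counts.items() if n > 1)
    total = 0.0
    for a, b in combinations(symbols, 2):
        cycle = "".join(c for c in cipher if c == a or c == b)
        total += cycle_score(cycle, weight)
    return total


print(round(cycle_score("ABABABBBABBABAA"), 2))  # 14 * (10/14)**5, about 2.60
```

Because details such as the transcription and the exact extraction rules may differ, this sketch should not be expected to reproduce the 2236 figure exactly.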
How do you know, for example, whether two perfectly alternating symbols really represent the same plain text letter? There can only be guesswork here, right?
Good question. It is a problem I had to solve for AZdecrypt’s Merge sequential homophones function. In short, the best fit of all cycles to a given number of target letters will contain the most likely cycles. It works very well when there is little encoding randomization, but from about 25% onwards it gets uncertain. It appears that the 340 has just the right amount of encoding randomization.
Here is the result of the Merge homophone function on the 340 using a non-aggressive reduction to 36 target symbols/letters:
Input to output key:
--------------------------------------------------------
H: HHHH (12)
E: E1:E1:E1 (90)
R: RKRKRKRKRKRRKRK (312)
>: >S>S>S>S (84)
p: ppppppppppp (110)
l: lMlMlMlMlMlMlM (312)
^: ^*^*^*^*^*^* (220)
V: VVVVVV (30)
P: P73P73P7 (90)
k: kddkdkdkdk (112)
|: |c|c|c|c||cc|cc|c|c| (480)
L: LLLLLL (30)
T: TTTTT (20)
G: G(G((G(G(G(G( (220)
2: 2z2z22z2z2z2z2z2zz (420)
N: NNNNN (20)
+: ++++++++++++++++++++++++ (552)
B: BBBBBBBBBBBB (132)
#: ##### (20)
O: OOOOOOOOOO (90)
j: %j&q%j&q (80)
/: DYZ/DYZ/DYZ/YDZ (440)
W: WWWWWW (30)
.: ...... (30)
-: <-<-<-<-<<- (144)
t: f9tf9t9ft9ft (216)
): ))))) (20)
4: y4y4y4y44y4 (144)
U: UJUJUJUJU (112)
8: 8_8_8_8 (60)
5: 5555555 (42)
F: FFFFFFFFFF (90)
C: CCCCC (20)
;: ;XA;XA; (60)
@: @ (0)
b: b6b6b6 (40)
--------------------
Cycle score: 4884
Thank you for the description! I’ll implement that as soon as I find some more time.
Have you ever examined the "good" but not perfect cycles? Say, all 2-symbol cycles that have a randomness between >0 and <= 25? Then look at these to see if the positions of the breaks have anything in common? To put it naively, for example, the following would be a great result: All 25% cycles break by x% of their length or something like that.
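A rough sketch of how the break positions of a single 2-symbol cycle could be collected, so that they can be compared across cycles. The function name `cycle_breaks` is hypothetical, and a "break" is taken here to mean any occurrence where the same symbol of the pair appears twice in a row within the cycle.

```python
def cycle_breaks(cipher: str, a: str, b: str) -> list[int]:
    """Return the cipher positions where the a/b cycle fails to alternate."""
    breaks = []
    previous = None
    for pos, sym in enumerate(cipher):
        if sym == a or sym == b:
            if sym == previous:        # same symbol twice in a row: a break
                breaks.append(pos)
            previous = sym
    return breaks


# Toy example: the A/B cycle reads A B A A B, with the break at index 6.
print(cycle_breaks("AxBxAxAxB", "A", "B"))
```

Dividing each position by `len(cipher)` gives the relative location of every break, which is what would be needed to check whether the breaks of many cycles cluster at some fraction of the message.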
Hey Largo,
As far as cycle examination goes I was able to identify the use of most cycle types in the thread: http://zodiackillersite.com/viewtopic.php?f=81&t=3616
Here’s a good summation. The 340 seems to exhibit increasingly random cycles just as the 408:
Talking about cycle types, I have done a lot of work on it in the last week. It seems that the 340, just like the 408, suffers from increasingly random cycles. My routine has detection for random, offset, palindromic, shortened, lengthened, anti and pattern cycles, and these do not look interesting. This detection works on all the test ciphers in the main post. No detection yet for regional cycles. It is still a question for me where the extra randomness in the 340 comes from, but could it be as simple as the 340 just starting out more randomly?
340:

2-symbol cycles:
Cycles: 6.37 sigma
Increasingly random cycles: 2.17 sigma
Decreasingly random cycles: -1.36 sigma

3-symbol cycles:
Cycles: 6.28 sigma
Increasingly random cycles: 4.40 sigma
Decreasingly random cycles: 0.02 sigma

4-symbol cycles:
Cycles: 7.03 sigma
Increasingly random cycles: 6.21 sigma
Decreasingly random cycles: 0.77 sigma

5-symbol cycles:
Cycles: 7.43 sigma
Increasingly random cycles: 6.95 sigma
Decreasingly random cycles: 1.13 sigma

408:

2-symbol cycles:
Cycles: 9.74 sigma
Increasingly random cycles: 4.13 sigma
Decreasingly random cycles: -1.79 sigma

3-symbol cycles:
Cycles: 13.30 sigma
Increasingly random cycles: 9.40 sigma
Decreasingly random cycles: -0.44 sigma

4-symbol cycles:
Cycles: 18.98 sigma
Increasingly random cycles: 16.74 sigma
Decreasingly random cycles: 0.59 sigma

5-symbol cycles:
Cycles: 26.70 sigma
Increasingly random cycles: 27.12 sigma
Decreasingly random cycles: 1.33 sigma

Example of an increasingly random 5-symbol cycle in the 340: MUJ_9MUJ_9MUJUMJM99UM_M
I long thought that the strange symbol he does in the Halloween card could be two keys, they kind of look like keys of two different types to me. There is also that photo that was sent in years later of two keys, though I am not sure that is a confirmed Zodiac correspondence?
Then there are four dots, perhaps suggesting that the 340 is in four parts of 5 lines each, only jumbled up in the current wrong order. The first cipher he did in 3 parts so it’s possible he had a similar line of thinking.
So to make the 340 harder than the 408 he:
– Uses more symbols
– Perhaps uses a key(s) somehow to change cipher encryption type
– Splits the message into four parts instead of 3, and arranges them on paper out of order
I could see him doing something relatively simple like this to make the cipher harder to crack than the last one, and evolve his method. Picking up ideas out of his ‘code breaking for noobs’ book he got from the library.
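If someone wants to try the "four parts out of order" idea, the reorderings are easy to enumerate. Below is a small sketch, assuming the usual 17-column by 20-row layout of the 340 and a transcription with one character per symbol; the function name `block_orders` and its parameters are made up for illustration.

```python
from itertools import permutations


def block_orders(cipher: str, width: int = 17, rows_per_block: int = 5):
    """Yield (order, text) for every ordering of equal-height row blocks."""
    block_len = width * rows_per_block
    if len(cipher) % block_len:
        raise ValueError("cipher length is not a whole number of blocks")
    blocks = [cipher[i:i + block_len] for i in range(0, len(cipher), block_len)]
    for order in permutations(range(len(blocks))):
        yield order, "".join(blocks[i] for i in order)


# Toy grid: 4 blocks of one 2-symbol row each gives 4! = 24 orderings.
candidates = list(block_orders("AABBCCDD", width=2, rows_per_block=1))
print(len(candidates))
```

With four blocks there are only 24 candidate texts, so each one could simply be fed to a solver in turn.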
As far as cycle examination goes I was able to identify the use of most cycle types in the thread:
It’s hard to find an idea you or smokie didn’t have before.
I can still think of two things:
– Have you ever examined the cycles separately for the upper and lower half? The MUJ_9MUJ_9MUJUMJM99UM_M example looks as if it were interrupted approximately in the middle. Some time ago I experimented with the perfect-cycle feature of AZDecrypt, but I had the feeling that 170 characters (one half of the cipher) are simply not enough analysis material.
– Would it make sense to use cycles instead of ngram scores to evaluate your transposition solver, so that it’s looking for transpositions that result in better cycles? In this case, however, only minimally invasive transpositions should be used, i.e. simple region-based flips and mirrors.
However, both ideas would probably come into conflict with the following:
– Uses more symbols
– Perhaps uses a key(s) somehow to change cipher encryption type
– Splits the message into four parts instead of 3, and arranges them on paper out of order
I could see him doing something relatively simple like this to make the cipher harder to crack than the last one, and evolve his method. Picking up ideas out of his ‘code breaking for noobs’ book he got from the library.
That would be a realistic possibility and something similar has already been discussed frequently. I agree with you that this could be a good explanation for the "next step" from z408 to z340. However, there is a problem: z340 has statistically very significant peaks at period 19 and at mirror/period 15.
This is the biggest indicator that z340 is based on transposition. I think that these peaks also occur when a plain text is transposed and then encrypted with several keys. That’s something to test. In this case, however, it is extremely difficult to find a solution, as the key and plain text do not correlate.
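For readers who have not seen how a period statistic can be computed, here is one simple variant as a sketch. It counts repeated bigrams made of symbols that sit `period` positions apart. Whether this is exactly the measurement behind the period 19 peak mentioned above is an assumption on my part, and the function name is hypothetical.

```python
from collections import Counter


def periodic_bigram_repeats(cipher: str, period: int) -> int:
    """Count repeated bigrams formed by symbols `period` positions apart."""
    pairs = Counter(
        (cipher[i], cipher[i + period]) for i in range(len(cipher) - period)
    )
    # a bigram seen n times contributes n - 1 repeats
    return sum(n - 1 for n in pairs.values() if n > 1)


# Toy example: at period 1 the bigrams AB and BC each occur twice.
print(periodic_bigram_repeats("ABCABC", 1))
```

Running this for every period from 1 up to about half the cipher length and comparing against shuffled copies would show whether any period stands out.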
I really can’t think right now. The heat wave here is just annoying. In the evening it is still 38 degrees. Just now two Hobbits have thrown a ring through my window and muttered something about the "cracks of doom".
1) Have you ever examined the cycles separately for the upper and lower half? The MUJ_9MUJ_9MUJUMJM99UM_M example looks as if it were interrupted approximately in the middle. Some time ago I experimented with the perfect-cycle feature of AZDecrypt, but I had the feeling that 170 characters (one half of the cipher) are simply not enough analysis material.
2) Would it make sense to use cycles instead of ngram scores to evaluate your transposition solver, so that it’s looking for transpositions that result in better cycles? In this case, however, only minimally invasive transpositions should be used, i.e. simple region-based flips and mirrors.
1) Yeah, but nothing too extensive. That’s how I discovered the MUJ_9 cycle initially. It is indeed interrupted in the middle and it goes along really well with period 19! Perfect cycles are more prone to outliers so it is better to use the regular n-symbol cycles measurements instead.
2) In that case a 2-symbol cycles measurement would be better suited. So far I have found only a few anomalies: the 340 has a fairly high period 2 transposed cycle score and some of the diagonal readings are also fairly high; I think these are correlated. It is a 2 sigma observation, so it is nothing to write home about. A high period 2 transposition cycle score can indicate 2 different substitution keys for the top and bottom half, though I consider that hypothesis unlikely. But please, when you get the n-symbol cycles score system up and running, consider doing your own analysis. I am very skeptical towards transposition after or during substitution but maybe you can find something I have not.
Good luck with the heat. We had 38 degrees here in Belgium some days ago but since then it has gotten a lot better. I recommend eating lightly.
Yeah, we have explored the cycles quite extensively. From all four corners reading in all directions, top half versus bottom half, rearranging rows to increase cycles, detecting areas of the message and positions where cycles are disrupted and comparing those with the pivots, comparing transposition + homophonic cycles versus homophonic cycles + transposition, isomorphic cycle patterns, etc. etc. The top half is somewhat more cyclic than the bottom half. We have discussed the mechanics of cyclic encoding with pen and paper, and how a person could start with cycles and gradually become more random. As far as we can tell the message was transposed before encoding, if it was transposed, and encoded LRTB. L=2 cycles are easier to detect, and we found that there are alternation cycles like ABABABAB and there is also a spike at ABAABAABA which does not appear in test messages with random cycling or with alternation cycles. Oh, and we explored palindromic cycles. Doranchak did a lot of work with us on most of the above, and Moonrock gave us some ideas about cycle patterns. I probably left something out, as I do not have the memory that Jarlve has.
I am very skeptical towards transposition after or during substitution but maybe you can find something I have not.
I agree with you in principle, after all we have enough indications against transposition after / during substitution. For example, the fact that there are 9 rows without repeated symbols. Nevertheless, there are possible transpositions that are compatible with the observations. Let’s say z340 would be a simple homophonic substitution in which the lower half was rotated 180 degrees. Let’s also say that P19 has nothing to do with transposition, but comes from a strongly repeating plain text. All this just to give an example. In this case, rotating the lower half would neither affect P19 nor the "rows without repeated symbols" statistics. What would become apparent in this case, however, would be a cycle sequence interrupted in the middle. Sure, if z340 were built that way, it would have been solved long ago. I just want to show that simple transpositions are quite possible. However, this cannot be the only reason why z340 has not yet been solved.
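To make the example concrete, here is a small sketch of the two operations involved: rotating the lower half of the grid by 180 degrees and counting rows without repeated symbols. The function names are hypothetical, and the default width of 17 assumes the usual 17-column layout of the 340 with one character per symbol.

```python
def rows(cipher: str, width: int = 17) -> list[str]:
    """Cut the cipher into rows of `width` symbols."""
    return [cipher[i:i + width] for i in range(0, len(cipher), width)]


def rows_without_repeats(cipher: str, width: int = 17) -> int:
    """Count rows in which every symbol occurs only once."""
    return sum(1 for row in rows(cipher, width) if len(set(row)) == len(row))


def rotate_lower_half(cipher: str, width: int = 17) -> str:
    """Rotate the lower half of the grid by 180 degrees."""
    grid = rows(cipher, width)
    half = len(grid) // 2
    # 180 degree rotation: reverse the row order and reverse each row
    lower = [row[::-1] for row in reversed(grid[half:])]
    return "".join(grid[:half] + lower)


toy = "ABCA" + "EFGH" + "IJKK" + "MNOP"       # 4 rows of width 4
rotated = rotate_lower_half(toy, width=4)
print(rotated)
print(rows_without_repeats(toy, 4), rows_without_repeats(rotated, 4))
```

The rotation only reverses rows and their order, so each row keeps the same set of symbols and the "rows without repeated symbols" count cannot change, which is the point made above.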
At the moment I am unfortunately very busy, and with the current heat wave I don’t really want to program in the evening. But I will do some cycle analyses soon; maybe I will notice something new.
Yeah we have explored the cycles quite extensively.
…
We have discussed the mechanics of cyclic encoding with pen and paper, and how a person could start with cycles and gradually become more random.
…
I probably left something out, as I do not have the memory that Jarlve has.
Yes, I remember that. At that time, I had the following idea:
viewtopic.php?f=81&t=3337
It may sound arrogant, but at that time it was for me the most logical explanation for the interrupted cycles in z408, and thus also in z340, so I hadn’t studied this topic intensively. In the meantime I see it a little differently and would like to take another look at the strange cycles of z340. Maybe it’s a good thing that I haven’t read all the threads on the topic yet, so I can approach the matter in an unbiased way.