Zodiac Discussion Forum


Homophonic substitution

1,434 Posts
21 Users
0 Reactions
305.4 K Views
doranchak
(@doranchak)
Posts: 2614
Member Admin
 

Interesting analysis and questions, smokie. I think they will be difficult to answer without producing test ciphers that implement the key cycling you are describing.

I also wonder how the other bigram biases (even and odd positions, top and bottom half) fit in with all this.

We can’t even fully rule out simple columnar transposition + substitution yet, since we don’t yet have a sufficiently strong solver to conquer all the test ciphers made with that scheme. At least, I don’t think we can, unless you know of some analysis that really rules it out.

http://zodiackillerciphers.com

 
Posted : May 12, 2016 5:30 am
smokie treats
(@smokie-treats)
Posts: 1626
Noble Member
 

Interesting analysis and questions, smokie. I think they will be difficult to answer without producing test ciphers that implement the key cycling you are describing.

Yes, some test messages would be a good idea. I have been thinking about that. Making test messages is a good way to figure out what would have been easy or difficult to do.

I also wonder how the other bigram biases (even and odd positions, top and bottom half) fit in with all this.

Perhaps changing the cycles could help explain some of the top and bottom half biases.

We can’t even fully rule out simple columnar transposition + substitution yet, since we don’t yet have a sufficiently strong solver to conquer all the test ciphers made with that scheme. At least, I don’t think we can, unless you know of some analysis that really rules it out.

I don’t.

 
Posted : May 12, 2016 5:51 am
bmichelle
(@bmichelle)
Posts: 273
Reputable Member
 

Does anyone have a numerical version of the 408?

Can someone please briefly explain this numbered system on the 408?
Please.

The Best Mystery Is An Unsolved Mystery….

 
Posted : May 12, 2016 6:42 am
smokie treats
(@smokie-treats)
Posts: 1626
Noble Member
 

Does anyone have a numerical version of the 408?

Can someone please briefly explain this numbered system on the 408?
Please.

He numbered the symbols for me so that I could work with the message a lot more easily in my spreadsheet.

 
Posted : May 12, 2016 1:48 pm
doranchak
(@doranchak)
Posts: 2614
Member Admin
 

I have an interesting new theory about how Zodiac did the encoding for the 340 based on new observations I made with unigram repeats. The measurement sums the unigram repeats of fixed length parts of a cipher. If the length is 17 then the repeats of the first 17 characters of the cipher will be summed and then the next 17 and so on. So from 1 to 17, 18 to 34, etc.

I’m trying to reproduce your measurement. For Z340 with length 17, I have this output:

position  #uniques  #repeats
       0        17         0
      17        17         0
      34        17         0
      51        14         3
      68        16         1
      85        16         1
     102        17         0
     119        16         1
     136        16         1
     153        15         2
     170        17         0
     187        17         0
     204        17         0
     221        15         2
     238        17         0
     255        16         1
     272        15         2
     289        14         3
     306        16         1
     323        17         0

Sum of repeats: 18.0. When I normalize that by dividing by 340, it comes out to 5.3%. How are you getting 11.8?
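For anyone who wants to reproduce the table above, here is a minimal Python sketch of the measurement as I read it from the description: split the cipher into consecutive chunks, count the distinct symbols in each chunk, and treat every symbol beyond the first occurrence as a repeat. The function names chunk_repeats and repeat_rate are made up for this illustration, and the toy cipher is not Z340.

```python
def chunk_repeats(cipher, length=17):
    """Return (position, uniques, repeats) for each consecutive chunk.

    `cipher` is any sequence of symbols (a string or a list of symbol numbers).
    Repeats are counted as chunk size minus distinct symbols, which matches
    the rows of the table above (e.g. 14 uniques -> 3 repeats).
    """
    rows = []
    for start in range(0, len(cipher), length):
        chunk = cipher[start:start + length]
        uniques = len(set(chunk))
        rows.append((start, uniques, len(chunk) - uniques))
    return rows


def repeat_rate(cipher, length=17):
    """Sum of repeats over all chunks, normalized by the cipher length."""
    total = sum(repeats for _, _, repeats in chunk_repeats(cipher, length))
    return total / len(cipher)


# Toy example (not a real cipher): two chunks of length 5.
toy = "ABCAB" + "CDEFG"
rows = chunk_repeats(toy, length=5)
rate = repeat_rate(toy, length=5)
```

With the figures quoted above, the same normalization is simply 18 / 340, which is about 5.3%.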

http://zodiackillerciphers.com

 
Posted : May 12, 2016 2:32 pm
ophion1031
(@ophion1031)
Posts: 1798
Noble Member
 

Every time I open this thread I get the strangest deja vu and I have no idea why. I wonder if I have posted these same words in this thread before.

A few minutes ago on a toilet not very far, far away….

 
Posted : May 12, 2016 2:40 pm
smokie treats
(@smokie-treats)
Posts: 1626
Noble Member
 

Every time I open this thread I get the strangest deja vu and I have no idea why. I wonder if I have posted these same words in this thread before.

Yeah, me too.

Here are seven consecutive alternations in the bottom quarter of the 340. That many alternations can be created with shuffles, not too difficult, but I wonder how probable this is when taking distance between first position and last position into account.

 
Posted : May 12, 2016 4:01 pm
Jarlve
(@jarlve)
Posts: 2547
Famed Member
Topic starter
 

@doranchak, please ignore these posts.

@smokie, great work, let me know if I can help you with anything.

AZdecrypt

 
Posted : May 12, 2016 4:47 pm
smokie treats
(@smokie-treats)
Posts: 1626
Noble Member
 

@doranchak, please ignore these posts.

@smokie, great work, let me know if I can help you with anything.

I will. See the 10th post on page 120 of this thread. At some point I may make a message with transposed plaintext and then different keys for different parts of the message. The solver would have to work with fragments of the un-transposed parts of the message, but we have done similar things before with some success.

 
Posted : May 12, 2016 5:00 pm
Jarlve
(@jarlve)
Posts: 2547
Famed Member
Topic starter
 

So I wonder if Zodiac cycled his cycles, and that is why there is no solution yet.

I have given it some thought and find it an excellent hypothesis. I have seen references to "second order homophonic substitution" and wonder if that is the same thing. It would be very valuable to prove or disprove it to some degree. I wonder if the difference versus the 408 in your test could come from regular cycle randomization, and I guess you do as well.

This cycling of cycles would accomplish a few things: additional suppression of frequencies, suppression of cycles, and probably making the cipher unsolvable when it is treated as regular homophonic substitution. But how exactly would it have been implemented, and could we then find alternation of cycles?

AZdecrypt

 
Posted : May 12, 2016 5:18 pm
doranchak
(@doranchak)
Posts: 2614
Member Admin
 

Doranchak, what about L=3 for the first 170 ciphertext of the 408 versus the first 17 ciphertext of the 340?

Not sure what you are asking here. And do you mean first 170 of the 340 instead of first 17?

Do you want me to produce a comparison of the statistical significance of L=3 cycles found for the first 170 of each cipher?

http://zodiackillerciphers.com

 
Posted : May 13, 2016 1:05 am
smokie treats
(@smokie-treats)
Posts: 1626
Noble Member
 

Doranchak, what about L=3 for the first 170 ciphertext of the 408 versus the first 17 ciphertext of the 340?

Not sure what you are asking here. And do you mean first 170 of the 340 instead of first 17?

Do you want me to produce a comparison of the statistical significance of L=3 cycles found for the first 170 of each cipher?

Yes, I meant the first 170 of the 340. I am wondering if Zodiac may have changed his key at certain position(s) during encoding. But there are several high scoring L=2 cycles that start about a quarter of the way through the message, and some that start at the halfway point. So I don’t know if the first 170 is appropriate now, given my findings. I am just wondering if the 340 and 408 would have more comparable L=3 stats if analyzed in smaller chunks.

Looking at the 340 as a whole, we can find many examples of where A cycles with B, and B cycles with C, but A doesn’t cycle with C. So I am thinking about looking at regional biases for the cycles to reconcile cycle conflicts.

I have to take a closer look at things, but here are a couple of interesting examples. See the orange box. Cycle 7-47 starts at position 56; cycle 30-48 starts at position 59; cycle 30-47 starts at position 59, and 8-47 starts at position 60. All have eight consecutive alternations.

Now check out the dark brown box. 23-51 starts at position 90; 23-55 starts at position 92; 23-38 starts at position 92. All have eight consecutive alternations.

In both examples, there are shared symbols, and a lot of them (maybe all; I still have to take a closer look) have already appeared before the start positions of the cycles. And they all start at about the same position. With the 408, the top 33 scoring cycles mostly start closer to the beginning of the message. But with the 340, there are cycles that start not so close to the beginning of the message. Note the shift in positions, a step, between the orange and dark brown rectangles.

In the picture, the two symbols for the cycle are on the left, column A is the count of consecutive alternations, column B is the start position for the cycle, column C is the end position for the cycle, and column D is the total count of positions that the cycle covers.

Do some cycles begin where others end? I am not sure. This picture shows only the top 33 scoring L=2 cycles. If some cycles begin where other cycles end, then it would be closer to the end of the message, and those new cycles may have lower scores.

Could he have changed his entire key at certain positions, or could he have cycled parts of his key at certain positions? Is there a simple way to detect either?

 
Posted : May 13, 2016 4:14 am
doranchak
(@doranchak)
Posts: 2614
Member Admin
 

Seems like a difficult question to answer, especially since nice looking cycles are often false positives. Here’s a spreadsheet of all cycles detected in Z408 for L=2, L=3 and L=4, with true and false positives highlighted:

https://docs.google.com/spreadsheets/d/ … sp=sharing

Excerpts:
L=2

L=3

L=4

Still, strong false cycles are probably co-indicators of real cycling, since they probably appear more often when strong true cycles are present. I’m not sure what you could conclude from regional biases without having some good examples of how they might be related to specific encoding schemes. Your identification of "clustered" cycles (in the same regions, involving shared symbols, and having similar numbers of consecutive alternations) might be a good basis for a "confirmation" measurement that can group together low-order cycles into higher-order cycles (such as L=3, L=4, etc).

You came up with a good example of how a "two key" scenario might have taken place, where the cipher key changes halfway through, but then the cipher text is transposed, mixing up the two keys. I think this would make it even more difficult to detect key-specific cycles, since you’d have to take all the different possibilities into account. All the more reason to come up with very targeted test ciphers to see what happens, IMO.

http://zodiackillerciphers.com

 
Posted : May 13, 2016 3:53 pm
doranchak
(@doranchak)
Posts: 2614
Member Admin
 

OK, I did a quick regional analysis based on your region size of 170 for Z340 and Z408, for L=2, 3, and 4. But I used my "perfect cycle score" measurement, which I have to explain first.

Consider L=2 for Z408. First, the measurement looks for all cycles, and discards any that have runs shorter than 3. I use "run" to mean "consecutive alternations". Example: "AB AB AB BA AB" has a run of 3.

The measurement comes up with a ratio that indicates how "perfect" the run is. In Z408, the cycle "PU" has a run of 9 but only covers 86% of the full sequence ([PU] [PU] [PU] [PU] [PU] [PU] [PU] [PU] [PU] P [PU]). Then, the measurement exaggerates imperfections by raising the ratio to the power of 3. So, 86% ^ 3 = 63%.
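The run and ratio described above can be sketched in Python roughly as follows. This is only one reading of the description, not the actual implementation: the function name perfect_run is hypothetical, and I am assuming that the ratio is the number of symbols covered by the longest run divided by the number of times the cycle's symbols occur in the cipher.

```python
def perfect_run(seq, unit):
    """Return (run, ratio, penalized) for a cycle unit such as "PU".

    run       - longest count of back-to-back repeats of `unit`
    ratio     - share of the cycle's symbol occurrences covered by that run
    penalized - ratio raised to the power of 3, to exaggerate imperfections
    """
    members = set(unit)
    target = list(unit)
    n = len(target)
    # Keep only the symbols that belong to the cycle, in order of appearance.
    sub = [s for s in seq if s in members]
    best = 0
    for i in range(len(sub)):
        k = 0
        # Count how many consecutive copies of the unit start at position i.
        while sub[i + k * n:i + (k + 1) * n] == target:
            k += 1
        best = max(best, k)
    ratio = best * n / len(sub) if sub else 0.0
    return best, ratio, ratio ** 3


# The "PU" pattern quoted above: nine clean repeats, a stray P, one more PU.
pu_run, pu_ratio, pu_penalized = perfect_run("PU" * 9 + "P" + "PU", "PU")

# The "AB AB AB BA AB" example, written without spaces.
ab_run, _, _ = perfect_run("ABABABBAAB", "AB")
```

For the "PU" pattern this gives a run of 9 covering 18 of 21 occurrences, which rounds to 86%, and 63% after cubing.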

The measurement then tracks sums by run length. For example, the percentages for all the cycles with run=9 are added to a slot marked "9". The percentages for all the cycles with run=8 are added to a slot marked "8", and so on.

For Z408, the slots look like this after all the cycles are detected:

run 3: 22.8
run 4: 25.1
run 5: 16.6
run 6: 11.3
run 7: 3.04
run 8: 3.13
run 9: 1.00

The values in each slot are combined together by giving more weight to longer runs, like this:

Score = (run3)*2^1 + (run4)*2^2 + (run5)*2^3 + … + (run9)*2^7

In this way, the score gives more reward to longer runs. Here are the final scores for Z340 and Z408 for L=2,3,4:

z340 L=2: 247.8
z408 L=2: 884.9

z340 L=3: 62.4
z408 L=3: 418.9

z340 L=4: 9.0
z408 L=4: 187.6

You can see how much higher Z408 scores than Z340 at every cycle length.
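The weighting step can be checked with a few lines of Python. This sketch only covers combining the slots, using the Z408 L=2 slot values listed above; the function name combine_slots is made up for the illustration.

```python
def combine_slots(slots, min_run=3):
    """Weighted sum of slot values: run 3 gets 2^1, run 4 gets 2^2, and so on."""
    return sum(value * 2 ** (run - min_run + 1) for run, value in slots.items())


# Slot values for Z408, L=2, copied from the list above.
z408_l2_slots = {3: 22.8, 4: 25.1, 5: 16.6, 6: 11.3, 7: 3.04, 8: 3.13, 9: 1.00}
z408_l2_score = combine_slots(z408_l2_slots)
```

With the rounded slot values this comes out at about 885.2, slightly off the 884.9 reported above, presumably because the slots shown here are rounded.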

So, to explore your regional bias question, I looped through each cipher, positioned a length=170 "window" at every offset, and computed perfect cycle scores for just the cipher text covered by the window. Here are plots of the results:

Z408:

Z340:

The X-axis is the start position of the length=170 window. The Y axis is the perfect cycle score. The L=2 line is red, the L=3 line is green, and the L=4 line is blue.
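The windowing itself is simple to sketch in Python. The scoring function is passed in as a parameter because the perfect cycle score is not reproduced here; distinct_symbols below is just a stand-in so the example runs. I am also assuming that only full windows are scored, so the window never runs off the end of the cipher.

```python
def windowed_scores(cipher, score_fn, window=170):
    """Return (start, score) for every full window of the given length."""
    last_start = len(cipher) - window
    return [
        (start, score_fn(cipher[start:start + window]))
        for start in range(last_start + 1)
    ]


def distinct_symbols(part):
    """Stand-in score for the demo: number of distinct symbols in the window."""
    return len(set(part))


# Toy example: 10 distinct symbols, window of 4 -> 7 window positions.
demo = windowed_scores(list(range(10)), distinct_symbols, window=4)
```

Under that assumption a 340-symbol cipher gives 171 window positions (offsets 0 through 170), and each (start, score) pair is one point on a plot like the ones described above.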

Notice how in the graph for Z408, each line peaks dramatically in the very beginning of the cipher. The L=2 line is fairly smooth throughout but gradually drops. L=3 and L=4 drop off more suddenly, indicating loss of regularity earlier in the cipher text.

But in the graph for Z340, the peaks occur a bit later in the cipher text. I think this is consistent with what you’ve noticed.

The other unusual thing is that the lines in the Z408 graph cross each other, but the ones in the Z340 graph do not. I don’t know what (if anything) that signifies.

http://zodiackillerciphers.com

 
Posted : May 13, 2016 4:36 pm
smokie treats
(@smokie-treats)
Posts: 1626
Noble Member
 

I like those graphs.

408: The L=3 and L=4 lines have similar shapes. L=2 has a different shape.

340: L=2 and L=3 have similar shapes. A lot of the peaks and valleys coincide with each other.

So far I am not convinced that Zodiac changed his key throughout the message. I am thinking about making some test messages. I want to look at this some more. I started shuffling the 340 this morning just to see how my chart changes, and I am making new charts. I am wondering about shuffle tests, start position, and score. And about high scoring cycles that start toward the end of the message. For me, ABABAB is five consecutive alternations, just as when you flip a coin six times and don’t count the first flip as an alternation. Nevertheless, it is the same concept.

I am wondering if there is a way to detect changes in a key by letting X run from 2 to 339, then taking a measurement of cycles to the left of X and to the right of X, and finding a place where there is a maximum score of some sort.
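A rough Python sketch of that scan, under a few assumptions: the left part is taken to be the first X symbols, the score is passed in as a function, and the alternation count follows the convention above where ABABAB is five alternations. The names longest_alternation and split_scan are made up for this illustration, and how to combine the left and right scores into a single maximum is left open.

```python
def longest_alternation(seq, a, b):
    """Longest count of consecutive alternations between a and b (ABABAB -> 5)."""
    pair = [s for s in seq if s in (a, b)]
    best = current = 0
    for prev, cur in zip(pair, pair[1:]):
        # A change of symbol extends the run; a repeat resets it.
        current = current + 1 if prev != cur else 0
        best = max(best, current)
    return best


def split_scan(seq, score_fn):
    """Return (x, left_score, right_score) for x = 2 .. len(seq) - 1."""
    return [(x, score_fn(seq[:x]), score_fn(seq[x:])) for x in range(2, len(seq))]


def ab_score(part):
    """Demo score: longest A/B alternation in this part of the message."""
    return longest_alternation(part, "A", "B")


# Toy message: A cycles with B in the first half and with C in the second half.
toy = "ABABABAB" + "ACACACAC"
scan = split_scan(toy, ab_score)
```

In the toy message, the split at X = 8 gives an A/B score of 7 on the left and 0 on the right, which is the kind of contrast a key change at that position would produce.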

 
Posted : May 13, 2016 5:00 pm