Evolve the order of rearranged columns

Largo

(@largo)

Posts: 454

Honorable Member

Topic starter

Yesterday I dug out some old source code for genetic algorithms and experimented a bit with it. If you are not familiar with genetic algorithms you may have a look at this link:

https://en.wikipedia.org/wiki/Genetic_algorithm

Some people think that Z may simply have rearranged the columns to make the cipher harder to crack. We all know that it is not possible to brute force all 17! (355,687,428,096,000) possibilities. If Z rearranged the columns *after* he applied a cyclic homophonic substitution to the plaintext, it should be possible to recover the original order of the columns. So I tried to evolve the original order by using a fitness function which calculates the "cyclic score". To determine how cyclic a specific solution is, I used Jarlve’s method, which he has described in the following document:

https://docs.google.com/document/d/1LnQy1fiyxHs95jK-25GerTDuoQqqZt70reYw1zSYfhg/mobilebasic?pli=1

Obviously I have not found the solution to z340, but I don’t want to give up on this idea. I am not sure if Jarlve’s method works in every case, since I have created some test cases with z408: when I rearrange some columns in z408 and calculate the cyclic score, I sometimes get a higher score than with the original version. Maybe there is a bug in my implementation, or maybe Jarlve’s method does not work in every case.
Does anyone know other methods to determine how cyclic a specific homophonic substitution is? Has anyone encountered the same problems with the method described in the link above? I hope that my code can "break" test ciphers with rearranged columns soon.
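To make the test setup concrete, here is a minimal Python sketch of how such a test cipher could be built: scramble the columns of a text written in rows of 17, and undo a known scramble. The function names and the convention that `order[i]` names the source column are my own assumptions for illustration, not taken from any existing tool.

```python
import random


def permute_columns(text, order, width=17):
    """Rearrange the columns of a text laid out row by row in `width` columns.

    Assumed convention: order[i] is the index of the source column that
    ends up at position i in every row.
    """
    if sorted(order) != list(range(width)):
        raise ValueError("order must be a permutation of range(width)")
    if len(text) % width != 0:
        raise ValueError("text length must be a multiple of width")
    rows = [text[i:i + width] for i in range(0, len(text), width)]
    return "".join("".join(row[c] for c in order) for row in rows)


def invert_order(order):
    """Return the column order that undoes `order`."""
    inverse = [0] * len(order)
    for new_pos, old_pos in enumerate(order):
        inverse[old_pos] = new_pos
    return inverse


if __name__ == "__main__":
    # Placeholder text of 2 rows x 17 columns, not a real cipher.
    text = "".join(chr(65 + i % 26) for i in range(34))
    order = list(range(17))
    random.Random(1).shuffle(order)
    scrambled = permute_columns(text, order)
    print(permute_columns(scrambled, invert_order(order)) == text)
```

A genetic algorithm would then search over `order` and use the cyclic score of `permute_columns(cipher, order)` as its fitness.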

Posted : October 3, 2016 9:50 pm

doranchak

(@doranchak)

Posts: 2614

Member Admin

I think "false positives" are a common problem with cycle measurement scores, because many seemingly regular cycles will appear spontaneously even in random shuffles of the cipher text. For instance, the very best 2-symbol cycle sequence in the unmodified Z340, which repeats perfectly 7 times, occurs spontaneously in 7 out of 10,000 shuffles of the Z340 cipher text. So, Z340 has signs of homophonic cycles, but they are much weaker than in Z408. Perhaps the columnar transposition you propose will restore much stronger cycles.
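As a rough illustration of what a 2-symbol cycle measurement can look like, here is a small Python sketch that finds the longest strictly alternating run of two symbols, ignoring everything else in the text. This is a simplified stand-in I made up for illustration; it is not Jarlve’s score or my "perfect cycle" score, and the function name is hypothetical.

```python
def longest_alternation(text, a, b):
    """Length of the longest strictly alternating run of symbols a and b.

    All other symbols are skipped. A run is broken whenever the same
    symbol (a or b) occurs twice in a row in the filtered stream.
    """
    best = 0
    run = 0
    prev = None
    for ch in text:
        if ch != a and ch != b:
            continue  # ignore symbols outside the candidate pair
        if ch != prev:
            run += 1  # alternation continues
        else:
            run = 1   # repeated symbol: start a new run here
        prev = ch
        best = max(best, run)
    return best


if __name__ == "__main__":
    # Filtered stream is A B A B B A, so the longest alternation is 4.
    print(longest_alternation("AxBxAxBxBxA", "A", "B"))
```

A full cycle score would evaluate something like this over all symbol pairs (or larger groups), which is exactly where the spontaneous "best" cycles in shuffled text come from.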

To subtract out the "false positives" when measuring cycles, I think you should do significance testing to compare your actual measurements against the expected ones from random shuffle tests. Here is some analysis along those lines, where I compare cycle significance between Z408, Z340 and a test cipher by smokie: http://zodiackillersite.com/viewtopic.p … 686#p47686 The big takeaway from there is how high the significance (sigma) is for Z408 compared to the other ciphers. Especially for cycles of 4 symbols! When you play around with transpositions, I think you should look for cycle scores that correspond to high sigmas (i.e., a lot of standard deviations from the mean score of random shuffles).
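The significance test described above can be sketched in Python roughly like this: score the actual text, score many random shuffles of the same symbols, and report how many standard deviations the actual score lies above the shuffle mean. The `toy_score` function below is only a placeholder assumption so the example runs; any real cycle score would be plugged in instead.

```python
import random
import statistics


def sigma_vs_shuffles(text, score, trials=1000, seed=0):
    """Standard deviations between score(text) and the mean shuffled score."""
    rng = random.Random(seed)
    symbols = list(text)
    samples = []
    for _ in range(trials):
        rng.shuffle(symbols)  # same symbol counts, random order
        samples.append(score("".join(symbols)))
    mean = statistics.mean(samples)
    stdev = statistics.pstdev(samples)
    if stdev == 0:
        return 0.0  # shuffles never vary, so no meaningful sigma exists
    return (score(text) - mean) / stdev


def toy_score(text):
    """Placeholder score (an assumption): count adjacent unequal symbols."""
    return sum(1 for x, y in zip(text, text[1:]) if x != y)


if __name__ == "__main__":
    # A perfectly alternating text should sit well above its shuffles.
    print(sigma_vs_shuffles("AB" * 10, toy_score, trials=200))
```

With a setup like this, a transposition candidate would only be interesting if its sigma is high, not merely its raw score.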

That post also mentions a few different ways to measure the cycles – here are some other posts which describe them in more detail:

Jarlve’s non-repeat measurement (2nd method): viewtopic.php?p=42120#p42120
Jarlve’s m_2s_cycles score: viewtopic.php?p=43770#p43770
My "perfect cycle" score: viewtopic.php?p=47926#p47926

Your columnar transpositions may indeed reveal a more significant amount of cycling for some column arrangement. A while ago, I did a similar experiment, using genetic algorithms to explore a space of transpositions. Here is a big list of results: https://docs.google.com/spreadsheets/d/ … sp=sharing I have sorted it so the results having higher cycle scores appear at the top. Many of them exceed every kind of cycle measurement. But most (if not all) of them are certainly false positives. I need to repeat the experiment with a more refined approach that incorporates significance testing.

Also, I think if you permute a subset of the 17 columns, you may still be able to find peaks in cycles and repeating ngrams. If we assume that a correct reordering of columns will increase the repeating ngram counts, then the same should be true if we find a partially correct reordering. 13! is only about 6.2 billion possibilities, which can be easily brute forced. 14! is about 87 billion, which is still somewhat reasonable (about 24 hours if you can process 1,000,000 cycle computations per second). With brute force, you’d be guaranteed to explore the correct orderings of 14 columns (which have 4 different ways to appear within the 17 columns). And hopefully, your measurements would peak once they are found. But, I think there will be many false positives. A known cipher with a random scramble of columns would be a good test for this approach.
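The numbers above are easy to check with a few lines of Python. The rate of 1,000,000 computations per second is the figure assumed in this post; reading the "4 different ways" as 4 contiguous windows of 14 columns within 17 is my interpretation for this sketch, and the helper names are made up.

```python
import math

RATE = 1_000_000      # cycle computations per second (assumed above)
TOTAL_COLUMNS = 17


def brute_force_hours(k, rate=RATE):
    """Hours needed to score all k! orderings of k columns at `rate` per second."""
    return math.factorial(k) / rate / 3600


def window_positions(k, total=TOTAL_COLUMNS):
    """Number of contiguous blocks of k columns inside `total` columns."""
    return total - k + 1


if __name__ == "__main__":
    for k in (13, 14, 17):
        print(k, math.factorial(k), round(brute_force_hours(k), 1),
              window_positions(k))
```

For 14 columns this gives 87,178,291,200 orderings and roughly 24.2 hours, while the full 17! stays far out of reach at that rate.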

http://zodiackillerciphers.com

Posted : October 4, 2016 8:15 pm

masootz

(@masootz)

Posts: 415

Reputable Member

apologies if this has been covered. i tend to read all of these decryption threads but a bit of it is over my head. here goes –

when you talk about columnar transposition is it possible to just work on one row to determine column order? take whichever row has the most repeated characters and try to find the order that creates the most/best words. the idea being that you should at least get part of a sentence. once you determine the best fit expand it to another row.

i realize as i type this that it’s probably more complex than that but i keep coming back to the idea of not analyzing the entire thing but smaller parts that may reveal something that could be extended instead of brute forcing trillions of combinations.

Posted : October 5, 2016 7:33 pm

smokie treats

(@smokie-treats)

Posts: 1626

Noble Member

I agree with your general idea. It is possible that the message was transposed in smaller chunks of plaintext rather than the entire plaintext all at once. Maybe not by rows of 17, but possibly by another chunk size, like 25. The message can be re-drafted into any size or shape, and could be re-drafted, for example, into 25 columns. Then rearrange the columns to try to get better cycle scores or solutions. Or it could be 19 columns, 26 columns, 39 columns, or horizontally mirror the message and try 15 or 29 columns, or whatever. I just picked 25 because the old cryptography books described route transposition with pictures of 5 x 5 plaintext. I think that what you are talking about is a good idea.
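Here is a small Python sketch of the re-drafting step, in case it helps anyone try this: split the message into rows of some other width, then rearrange the columns. Since 340 is not a multiple of 25, the last row comes out short; skipping the missing columns in that row is just one possible choice I assumed here, and the function names are hypothetical.

```python
def redraft(text, width):
    """Split text into rows of `width` symbols; the last row may be shorter."""
    return [text[i:i + width] for i in range(0, len(text), width)]


def rearrange(text, order):
    """Re-draft text into len(order) columns and reorder the columns.

    order[i] is the source column placed at position i. In a short last
    row, columns that do not exist are skipped (an assumption).
    """
    rows = redraft(text, len(order))
    return "".join(
        "".join(row[c] for c in order if c < len(row)) for row in rows
    )


if __name__ == "__main__":
    message = "x" * 340  # placeholder with the same length as Z340
    rows = redraft(message, 25)
    print(len(rows), len(rows[-1]))  # 14 rows, the last one has 15 symbols
    print(rearrange("abcdefgh", [2, 0, 1]))
```

The same two functions work for 19, 26, 39 or any other width, so the column search can be repeated per width and the cycle scores compared.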

Posted : October 6, 2016 3:30 am
