And here is the 408.
ABABAB: 777
BABABA: 643
ABAABA: 495
ABABBA: 491
AABABA: 461
ABABAA: 455
ABBABA: 415
BABBAB: 389
BABAAB: 374
BABABB: 352
Interesting that consecutive alternations show up in spots 1 and 2, which is exactly what he did. The test worked here.
Okay,
Here are the 2-symbol cycle ngram frequencies from 2 to 7 for the 340. I could have posted more but prefer to keep things manageable and investigate the ABAABA repeats. Will now work to get the sigma of each repeat versus randomizations.
AAAAAA: 398
ABAABA: 381
ABAAAA: 377
AABAAA: 374
AAABAA: 325
ABABAB: 316
ABABAA: 307
AAAABA: 292
ABAAAB: 279
AABAAB: 277
Here the top spot has only one symbol, so ABAABA actually comes out on top. Interesting that there is no BABBAB near the top. You guys are really tearing this up, hope you are having fun.
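For anyone who wants to reproduce counts like these, here is a minimal sketch of how 2-symbol cycle ngrams could be tallied. It is not the AZdecrypt routine. It assumes every pair of distinct symbols is considered, that A is whichever symbol of the pair appears first in the cipher, and that ngrams are taken as overlapping windows. The function name `cycle_ngram_counts` is made up for the illustration.

```python
from collections import Counter
from itertools import combinations

def cycle_ngram_counts(cipher, n=6):
    """Count A/B ngrams over every pair of distinct symbols in `cipher`."""
    counts = Counter()
    for s1, s2 in combinations(sorted(set(cipher)), 2):
        # Keep only the occurrences of the two symbols, in reading order.
        seq = [c for c in cipher if c in (s1, s2)]
        # Assumption: 'A' is whichever of the two symbols appears first.
        labels = "".join("A" if c == seq[0] else "B" for c in seq)
        # Overlapping windows of length n.
        for i in range(len(labels) - n + 1):
            counts[labels[i:i + n]] += 1
    return counts

# Toy cipher with two perfectly alternating symbols.
print(cycle_ngram_counts("1212121", n=3))
```

Because the labels are assigned once per symbol pair and not once per window, windows that start on the second symbol come out as BAB, BABABA and so on, which is why both forms appear in the lists.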
Will now work to get the sigma of each repeat versus randomizations.
Here they are for the 340.
To clarify, the ngrams are compared to 5000 full cipher randomizations. And as smokie mentioned, A and B are determined by order of appearance. A positive sigma means that the ngram is observed more often than in the randomizations, and a negative sigma means it is observed less often.
ABABAB: 7.22
ABAABA: 7.01
ABABBB: 4.34
ABABAA: 4.17
AAABBA: -3.95
BABABA: 3.45
ABABBA: 3.34
ABAAAB: 3.33
BBABAA: -3.29
AAAABB: -3.25
AZdecrypt cycle ngram stats for: 340.txt
--------------------------------------------------------
2-symbol cycles, 3-grams, sigma:
--------------------------------------------------------
ABA: 6.79
AAA: -5.93
BAB: 3.19
BBA: -2.19
BAA: -1.51
ABB: 1.07
BBB: -0.52
AAB: -0.41
2-symbol cycles, 4-grams, sigma:
--------------------------------------------------------
ABAB: 6.78
ABAA: 4.62
AAAA: -4.08
AAAB: -3.76
AABB: -3.53
BAAA: -3.11
BBAA: -2.82
BABA: 2.52
AABA: 2.27
BAAB: 1.40
ABBA: 1.37
BABB: 1.36
ABBB: 1.33
BBBA: -1.03
BBAB: 0.98
BBBB: 0.26
2-symbol cycles, 5-grams, sigma:
--------------------------------------------------------
ABABA: 6.23
ABAAB: 5.77
AAABB: -4.47
ABABB: 4.19
BABAB: 3.94
BBAAA: -3.59
AAAAA: -3.51
ABBAB: 3.07
AABAA: 2.98
AABBA: -2.70
ABAAA: 2.50
BABBB: 2.47
AAAAB: -2.44
BBBAA: -2.20
AABBB: -1.93
BAABA: 1.86
BAAAB: -1.65
BAAAA: -1.63
ABBBB: 1.56
BAABB: -1.39
BBBAB: 1.04
ABBBA: 0.79
BABAA: -0.75
BBAAB: -0.65
BBBBB: 0.64
BBABB: 0.51
AAABA: -0.45
BABBA: 0.41
AABAB: 0.15
BBABA: 0.11
ABBAA: 0.03
BBBBA: -0
2-symbol cycles, 6-grams, sigma:
--------------------------------------------------------
ABABAB: 7.22
ABAABA: 7.01
ABABBB: 4.34
ABABAA: 4.17
AAABBA: -3.95
BABABA: 3.45
ABABBA: 3.34
ABAAAB: 3.33
BBABAA: -3.29
AAAABB: -3.25
BBAAAB: -3.25
ABBBAB: 3.12
AABAAB: 3.04
ABBAAB: 2.84
AAAAAA: -2.84
BAAABB: -2.78
BAABAB: 2.73
ABBABB: 2.72
BBBAAA: -2.68
AAABBB: -2.43
AAAAAB: -2.31
ABAAAA: 2.27
BBAAAA: -2.26
AABAAA: 2.20
AAABAB: -2.19
BAAAAA: -2.14
ABBABA: 2.02
BBABAB: 2
BBAABA: -1.93
BBBAAB: -1.91
BABBAB: 1.90
AABBAB: -1.77
AABBAA: -1.75
BABBAA: -1.74
BABAAA: -1.74
BABBBB: 1.72
BBABBB: 1.56
BABBBA: 1.53
AABABA: 1.52
BABABB: 1.51
AABBBA: -1.47
ABBAAA: -1.46
AABABB: -1.40
ABBBAA: -1.37
ABBBBB: 1.23
ABBBBA: 1.20
BAABBB: -1.17
BAABBA: -1.07
BABAAB: 1.07
BBBBBB: 0.94
AABBBB: -0.91
BAABAA: 0.75
BBBBAA: -0.74
AAABAA: 0.67
BBBABA: 0.67
BBBBAB: 0.64
BBBBBA: 0.46
AAAABA: -0.35
ABAABB: -0.31
BBABBA: -0.17
BBBABB: 0.13
BBAABB: 0.09
BAAABA: 0.07
BAAAAB: 0.04
2-symbol cycles, 7-grams, sigma:
--------------------------------------------------------
ABAABAB: 6.98
ABABABA: 6.63
ABAABAA: 5.57
ABAAABA: 5.32
ABABAAB: 5.15
BABABAB: 4.71
ABBABBB: 4.30
ABABABB: 4.22
ABABBAB: 3.72
ABAAAAB: 3.70
ABABBBA: 3.60
ABBABAB: 3.32
BBABAAA: -3.26
AAAABBA: -3.17
AAABBAA: -3.09
AABAABA: 3.07
BBAAABA: -3.06
BBBAAAB: -3
BABBBAB: 2.89
AABBABB: -2.86
ABABBBB: 2.77
BABBAAA: -2.76
BBBAABA: -2.70
BBBABAA: -2.69
BBABBAA: -2.68
ABBAABB: 2.65
AAABBAB: -2.63
AAAAABB: -2.60
AABABAB: 2.47
AAAAAAA: -2.45
AABAAAB: 2.45
BAABABA: 2.38
BBAABAA: -2.36
ABABAAA: 2.34
BAAAABB: -2.33
BAAABBA: -2.32
AABAAAA: 2.32
BBAAAAA: -2.32
BAABAAB: 2.19
ABBBABB: 2.16
ABBBABA: 2.15
ABBBAAA: -2.14
BBBABAB: 2.09
AAABABB: -2.03
AAABBBA: -2
AAAABAB: -2
BABBABB: 1.86
AAAAAAB: -1.86
AABBAAA: -1.79
BBAAABB: -1.77
AAABBBB: -1.74
ABABBAA: 1.70
BABBBBB: 1.63
BAAABBB: -1.62
AAAABBB: -1.61
BBBAAAA: -1.58
BABABBB: 1.55
BAAAAAA: -1.53
BBABBAB: 1.50
BBABAAB: -1.50
ABBBBAB: 1.47
BBBBAAA: -1.43
ABBAABA: 1.41
ABBBBBB: 1.40
BBAAAAB: -1.36
AABBBAB: -1.31
BBABBBB: 1.31
BABABBA: 1.29
ABAAABB: -1.27
BABAAAB: -1.27
AABBBBB: -1.25
BBBBBBB: 1.24
BBBBAAB: -1.23
BBABABA: 1.21
AAABAAA: 1.17
BABAABA: 1.16
AABABAA: 1.12
BAABBAA: -1.11
BAAAABA: 1.06
ABBBBBA: 1.05
BBBBBAB: 1.05
BABAABB: -1.04
ABBABAA: -1.03
AABBBAA: -1
BAABBBA: -1
BBBBBBA: 0.98
BABAAAA: -0.97
BABBBAA: -0.96
BAABABB: 0.95
BBABBBA: 0.91
BAAAAAB: -0.86
ABBAAAB: -0.86
BBBAABB: -0.85
BBBBABA: 0.85
AAABABA: -0.85
ABAAAAA: 0.83
ABBABBA: 0.82
ABBAAAA: -0.82
AAAABAA: 0.79
BABBBBA: 0.74
AABABBB: -0.73
BABABAA: 0.73
BBBABBA: 0.66
BAABAAA: -0.64
BABBABA: 0.59
AABBAAB: -0.58
BABBAAB: 0.51
BBAABAB: -0.49
AABAABB: -0.44
BBBBABB: -0.41
AAAAABA: -0.39
AABABBA: -0.38
AABBABA: 0.37
ABAABBB: -0.30
ABBBAAB: -0.29
BAAABAB: -0.25
ABBBBAA: 0.25
BBABABB: 0.22
BBBBBAA: -0.19
AABBBBA: -0.18
BBBABBB: -0.16
ABAABBA: 0.14
AAABAAB: 0.11
BBAABBB: -0.11
BAAABAA: -0.10
BBAABBA: -0.09
BAABBBB: -0.06
BAABBAB: -0.06
And for the 408.
ABABAB: 15.93
BABABA: 11.97
ABABBA: 9.94
BABBAB: 8.94
BABABB: 7.89
ABBABA: 7
ABAABA: 6.65
AABABA: 6.19
ABABAA: 5.93
BABAAB: 5.18
AZdecrypt cycle ngram stats for: 408.txt
--------------------------------------------------------
2-symbol cycles, 3-grams, sigma:
--------------------------------------------------------
ABA: 9.89
BAB: 9.51
AAA: -6.91
BAA: -4.77
BBB: -4.01
BBA: -3.04
AAB: -2.94
ABB: 0.60
2-symbol cycles, 4-grams, sigma:
--------------------------------------------------------
ABAB: 12.76
BABA: 9.30
BAAA: -6.56
AAAB: -6.02
BABB: 5.48
BBAA: -5.38
AAAA: -4.79
BBBA: -4.44
AABB: -3.87
ABAA: 3.17
ABBB: -3.07
ABBA: 2.89
BBAB: 2.82
BBBB: -2.54
AABA: 1.75
BAAB: 0.32
2-symbol cycles, 5-grams, sigma:
--------------------------------------------------------
ABABA: 13.04
BABAB: 11.77
ABABB: 8.23
BABBA: 6.70
ABBAB: 6.07
BBAAA: -5.28
ABAAB: 5.18
BAAAA: -5.13
AAAAB: -5
BBBAA: -4.93
AABAB: 4.61
AAABB: -4.04
AABBB: -3.82
AAAAA: -3.55
BBBBA: -3.27
BBAAB: -3.12
BBABA: 3.07
BABAA: 2.92
BAAAB: -2.70
ABBBB: -2.59
AAABA: -2.42
BAABA: 2.15
AABBA: -2.11
BBABB: 1.54
ABBAA: -1.51
BAABB: -1.50
BBBBB: -1.44
ABBBA: -1.39
AABAA: -0.89
BBBAB: -0.60
ABAAA: -0.45
BABBB: 0.07
2-symbol cycles, 6-grams, sigma:
--------------------------------------------------------
ABABAB: 15.93
BABABA: 11.97
ABABBA: 9.94
BABBAB: 8.94
BABABB: 7.89
ABBABA: 7
ABAABA: 6.65
AABABA: 6.19
ABABAA: 5.93
BABAAB: 5.18
BBABAB: 4.65
BBAAAA: -4.44
BBBAAB: -4.30
BAAAAB: -3.92
BBBAAA: -3.83
BAABAB: 3.80
AAAABB: -3.71
BBAAAB: -3.61
AAAAAB: -3.59
BBBBAA: -3.49
BAAAAA: -3.37
AAABBB: -3.25
ABBAAA: -3.17
ABBBAA: -3.13
AAAABA: -2.85
AABBBB: -2.81
ABBABB: 2.74
AABBBA: -2.68
AAABBA: -2.67
AAAAAA: -2.67
ABBBBA: -2.66
BBAABA: -2.66
AABBAA: -2.57
BAABBB: -2.50
AAABAA: -2.33
BAAABB: -2.27
BBAABB: -2.15
BBABBA: 1.89
BABBBA: 1.80
ABBBAB: 1.74
BBBBBA: -1.72
ABAAAA: -1.62
BABBBB: -1.54
ABAAAB: 1.44
BAAABA: -1.44
BBBABA: -1.37
ABAABB: 1.36
BBBBAB: -1.24
ABBBBB: -1.21
AABBAB: -1.14
ABABBB: 1.04
BABAAA: -1.02
BBBABB: 0.94
AABAAA: -0.81
BBBBBB: -0.80
BAABBA: -0.75
AAABAB: -0.65
BABBAA: 0.48
BBABBB: 0.42
AABABB: 0.37
AABAAB: -0.20
BBABAA: -0.19
BAABAA: 0.18
ABBAAB: 0.06
2-symbol cycles, 7-grams, sigma:
--------------------------------------------------------
ABABABA: 18.01
BABABAB: 15.54
ABABBAB: 11.69
ABABABB: 9.93
BABABBA: 9.14
BABBABA: 8.96
ABABAAB: 8.69
ABAABAB: 7.79
ABBABAB: 7.69
BABAABA: 6.54
BABBABB: 5.84
AABABAB: 5.49
BABABAA: 5.31
BAABABA: 5.28
BBABBAB: 4.06
BBBAABA: -4.05
BBAAAAB: -3.91
BBAAABA: -3.69
ABAABAA: 3.64
BBABABB: 3.62
BBBBAAB: -3.36
ABABBAA: 3.32
BABBBAB: 3.20
BBBAAAA: -3.18
AAAABBB: -3.10
ABBAAAA: -3.08
AABBBAA: -3.06
BAAAABA: -3.05
AAAAABB: -3.05
BBBAAAB: -2.97
AABBAAA: -2.97
ABABBBA: 2.96
ABBBBAA: -2.94
BBAAAAA: -2.93
ABBABAA: 2.93
ABBBAAA: -2.81
ABBABBA: 2.78
BBBBAAA: -2.77
BAAAAAB: -2.76
AAABBAB: -2.69
AABABAA: 2.65
AAABBBA: -2.64
AAAAAAB: -2.62
BBABABA: 2.61
BAAAAAA: -2.60
ABAAABA: 2.58
AAAABAA: -2.45
BBAABAA: -2.39
ABBBABB: 2.30
BBBAABB: -2.30
BABBAAB: 2.29
AAAABBA: -2.28
AAABBBB: -2.28
ABBBAAB: -2.27
AABBBBA: -2.25
AAAAAAA: -2.25
BBAABBB: -2.21
AABBAAB: -2.21
AAAAABA: -2.20
AAABBAA: -2.19
ABAABBA: 2.18
BABBBBA: -2.17
BAAAABB: -2.13
BAAABAA: -2.13
BAABBBB: -2.13
BBBBBAA: -2.13
BAAABBA: -2.03
BBAABBA: -2.03
BBAAABB: -2
BBBABAA: -1.98
BABAAAA: -1.96
BAAABBB: -1.83
AABBBBB: -1.79
AAABABB: -1.78
AABBABB: -1.75
ABBAAAB: -1.70
BAABBAA: -1.70
AAABAAA: -1.69
BAABBBA: -1.68
BABABBB: 1.68
AAABAAB: -1.64
BBBBABA: -1.60
AABABBA: 1.48
ABAAAAA: -1.42
ABBABBB: 1.42
AAAABAB: -1.39
ABABBBB: -1.38
BABBAAA: -1.32
BBABBAA: -1.30
BBBABBA: 1.29
ABBBBBA: -1.26
BABAABB: 1.23
AABAABA: 1.18
AABAAAA: -1.11
BBBBBBA: -1.06
BBABBBA: 0.97
AABAABB: -0.93
BBAABAB: -0.88
BABBBAA: -0.86
AABABBB: -0.86
AAABABA: 0.80
ABBAABA: 0.79
AABBBAB: -0.77
BABAAAB: 0.76
BAABABB: 0.71
ABBBABA: 0.65
ABAAAAB: -0.60
ABAABBB: -0.60
ABBBBAB: -0.59
AABBABA: 0.59
ABBAABB: -0.58
BBBBBBB: -0.58
BBBBBAB: -0.57
BABBBBB: -0.56
BBABAAB: 0.55
BBABAAA: -0.50
BAABBAB: 0.49
BAABAAB: 0.47
ABBBBBB: -0.45
ABAAABB: -0.41
ABABAAA: 0.38
BAAABAB: 0.26
AABAAAB: 0.25
BBBABAB: 0.24
BBBABBB: 0.17
BBBBABB: 0.16
BBABBBB: -0.16
BAABAAA: 0.11
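For readers following along, the sigma values above can be approximated with a sketch like the one below. It is not the AZdecrypt code. It assumes a "full cipher randomization" means shuffling all symbol positions, and that sigma is (observed count minus mean shuffled count) divided by the population standard deviation of the shuffled counts. The names `cycle_ngram_counts` and `ngram_sigmas` are made up for the illustration, and the seed is only there to make runs repeatable.

```python
import random
from collections import Counter
from itertools import combinations
from statistics import mean, pstdev

def cycle_ngram_counts(cipher, n):
    """A/B ngram counts over all symbol pairs (A = first of the pair to appear)."""
    counts = Counter()
    for s1, s2 in combinations(sorted(set(cipher)), 2):
        seq = [c for c in cipher if c in (s1, s2)]
        labels = "".join("A" if c == seq[0] else "B" for c in seq)
        for i in range(len(labels) - n + 1):
            counts[labels[i:i + n]] += 1
    return counts

def ngram_sigmas(cipher, n=6, trials=5000, seed=0):
    """Sigma of each ngram count versus `trials` shuffles of the whole cipher."""
    rng = random.Random(seed)
    observed = cycle_ngram_counts(cipher, n)
    shuffled = list(cipher)
    samples = []
    for _ in range(trials):
        rng.shuffle(shuffled)  # assumption: randomization = full shuffle
        samples.append(cycle_ngram_counts(shuffled, n))
    # Include ngrams that only show up in the shuffles (observed count 0).
    grams = set(observed).union(*samples)
    sigmas = {}
    for gram in grams:
        values = [s[gram] for s in samples]
        sd = pstdev(values)
        if sd > 0:  # skip ngrams whose shuffled count never varies
            sigmas[gram] = (observed[gram] - mean(values)) / sd
    return sigmas

# Toy run: a perfectly alternating two-symbol cipher, fewer trials for speed.
for gram, sigma in sorted(ngram_sigmas("12" * 30, n=3, trials=200).items()):
    print(gram, round(sigma, 2))
```

With real ciphers the 5000 shuffles take a while in pure Python, so a smaller `trials` value is handy for quick checks.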
One thing that I realized this morning. I assigned A or B depending on the symbol’s order of appearance. But ABAABA for 121121 is the same thing as BABBAB for 323323.
I do this too. Would it be a problem? Perhaps each cycle string could be inverted also.
Simple example.
Let’s say the message has two of ABAABA and only one of BABBAB. You would have to flip both of them, then add them all up, and then remove duplicate stats.
ABAABA
ABAABA
BABBAB
Then flip.
BABBAB
BABBAB
ABAABA
Add them all up.
ABAABA 3
BABBAB 3
Since the numbers equal each other, and one is the flip of the other, then just remove one.
ABAABA 3
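The flip, add and remove-duplicates steps above can be sketched in a few lines. This is only an illustration: `flip` and `merge_flips` are made-up names, and keeping the form that sorts first (the one starting with A) is an arbitrary choice for deciding which of the two duplicates survives.

```python
from collections import Counter

def flip(gram):
    """Swap the A and B labels: 'ABAABA' -> 'BABBAB'."""
    return gram.translate(str.maketrans("AB", "BA"))

def merge_flips(counts):
    """Add each ngram's count to its flipped twin and keep one entry per pair."""
    merged = Counter()
    for gram, count in counts.items():
        # Keep whichever of the two forms sorts first (arbitrary choice).
        merged[min(gram, flip(gram))] += count
    return merged

print(merge_flips(Counter({"ABAABA": 2, "BABBAB": 1})))
```

Folding each ngram onto one form gives the same total as flipping everything, adding it all up and then deleting one entry of each pair.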
I kicked off an L=3 shuffle test for isomorphic sequences, limited to those of lengths between 6 and 15. It will take a few hours. My test has a variation which is to consider all substrings of a given sequence. For example, if it finds a sequence AAAAAAAAAABCABCAAAAAAAAAA, it will consider sequences AAAAAA as well as ABCABC, in addition to all other substrings of lengths between 6 and 15. I did that because I figured Z408’s cycles have a lot of imperfections.
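A rough sketch of the substring variation, for illustration only and not the code used in the test: it enumerates every substring with a length between 6 and 15 and relabels each one by order of first appearance. The relabeling step and the names `pattern` and `substring_patterns` are assumptions made for this example.

```python
def pattern(seq):
    """Relabel symbols by order of first appearance: 'XYZXYZ' -> 'ABCABC'."""
    mapping = {}
    out = []
    for symbol in seq:
        if symbol not in mapping:
            mapping[symbol] = chr(ord("A") + len(mapping))
        out.append(mapping[symbol])
    return "".join(out)

def substring_patterns(seq, min_len=6, max_len=15):
    """Collect the isomorphic pattern of every substring within the length limits."""
    found = set()
    for start in range(len(seq)):
        for length in range(min_len, max_len + 1):
            if start + length > len(seq):
                break
            found.add(pattern(seq[start:start + length]))
    return found

patterns = substring_patterns("AAAAAAAAAABCABCAAAAAAAAAA")
print("AAAAAA" in patterns, "ABCABC" in patterns)
```

Note that with relabeling a substring such as BCABCA also maps to ABCABC, so shifted copies of the same cycle get pooled together.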
I think we are still faced with the problem of separating true cycles from false ones. I’m wondering what other input is needed from the ciphertext stats which could help make the distinction, if any such input actually exists. Perhaps it is a problem that could be solved with machine learning (i.e., give an algorithm many examples of true and false cycles to train with). Such algorithms are capable of deriving the inputs needed to make the decision.
Let’s say the message has two of ABAABA and only one of BABBAB. You would have to flip both of them, then add them all up, and then remove duplicate stats.
Need to think about it.
I think we are still faced with the problem of separating true cycles from false ones. I’m wondering what other input is needed from the ciphertext stats which could help make the distinction, if any such input actually exists. Perhaps it is a problem that could be solved with machine learning (i.e., give an algorithm many examples of true and false cycles to train with). Such algorithms are capable of deriving the inputs needed to make the decision.
One thing I noted is that you say a cycle is either true or false, while it is actually fuzzier than that. A false cycle by your definition could have 3 out of 4 symbols correct. I wonder how many of the cycles at the top of your list are completely false. I also recognize this problem, and my solution was to come up with a cycle merging hill climber (the one in AZdecrypt). True cycles are more likely to be part of the best fit total merge. Other clues could be derived from the properties of sequential homophonic substitution. And we could be looking for the wrong cycle type.
Talking about cycle types, I have done a lot of work on them over the last week. It seems that the 340, just like the 408, suffers from increasingly random cycles. My routine has detection for random, offset, palindromic, shortened, lengthened, anti and pattern cycles, and these do not look interesting. This detection works on all the test ciphers in the main post. There is no detection yet for regional cycles. It is still a question for me where the extra randomness in the 340 comes from, but could it be as simple as the 340 just starting out more randomly?
340:
2-symbol cycles:
Cycles: 6.37 sigma
Increasingly random cycles: 2.17 sigma
Decreasingly random cycles: -1.36 sigma
3-symbol cycles:
Cycles: 6.28 sigma
Increasingly random cycles: 4.40 sigma
Decreasingly random cycles: 0.02 sigma
4-symbol cycles:
Cycles: 7.03 sigma
Increasingly random cycles: 6.21 sigma
Decreasingly random cycles: 0.77 sigma
5-symbol cycles:
Cycles: 7.43 sigma
Increasingly random cycles: 6.95 sigma
Decreasingly random cycles: 1.13 sigma
408:
2-symbol cycles:
Cycles: 9.74 sigma
Increasingly random cycles: 4.13 sigma
Decreasingly random cycles: -1.79 sigma
3-symbol cycles:
Cycles: 13.30 sigma
Increasingly random cycles: 9.40 sigma
Decreasingly random cycles: -0.44 sigma
4-symbol cycles:
Cycles: 18.98 sigma
Increasingly random cycles: 16.74 sigma
Decreasingly random cycles: 0.59 sigma
5-symbol cycles:
Cycles: 26.70 sigma
Increasingly random cycles: 27.12 sigma
Decreasingly random cycles: 1.33 sigma
Example of an increasingly random 5-symbol cycle in the 340: MUJ_9MUJ_9MUJUMJM99UM_M
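One simple way to see the "increasingly random" behaviour in that example is to check, position by position, whether a symbol equals the one a full period earlier, and then compare the match rate in the first half against the second half. This is only an illustrative measure and not the detection routine that produced the sigma figures above. The function names are made up for the example.

```python
def period_matches(seq, period):
    """True at each position where the symbol equals the one `period` steps back."""
    return [seq[i] == seq[i - period] for i in range(period, len(seq))]

def match_rates_by_half(seq, period):
    """Match rate in the first and second half of the comparable positions."""
    flags = period_matches(seq, period)
    half = len(flags) // 2
    first, second = flags[:half], flags[half:]
    return sum(first) / len(first), sum(second) / len(second)

early, late = match_rates_by_half("MUJ_9MUJ_9MUJUMJM99UM_M", 5)
print(round(early, 2), round(late, 2))
```

A perfect cycle scores 1.0 in both halves. A cycle that starts clean and then degrades scores high early and low late, which is the shape this sequence shows. Very short sequences would need a guard against empty halves.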
Thank you for your detailed explanations, smokie and Jarlve! Maybe I also have some ideas and can contribute something as soon as I find some time.
One thing I noted is that you say a cycle is either true or false, while it is actually fuzzier than that. A false cycle by your definition could have 3 out of 4 symbols correct.
That’s a great point. I should alter my measurement to permit partial credit. It does seem like many of the "false" cycles are simply multiple true cycles that have gotten mixed together.
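A partial-credit measure could look something like the sketch below. It is only one possible definition, valid for a test cipher where the key is known: the score is the largest share of symbols in the candidate cycle that map to the same plaintext letter. The `key` dictionary and the name `cycle_credit` are hypothetical.

```python
from collections import Counter

def cycle_credit(cycle_symbols, key):
    """Share of the cycle's symbols that map to its most common plaintext letter."""
    letters = Counter(key[s] for s in cycle_symbols)
    return letters.most_common(1)[0][1] / len(cycle_symbols)

# Hypothetical key: symbols 1, 2 and 3 are homophones of E, symbol 4 stands for T.
key = {"1": "E", "2": "E", "3": "E", "4": "T"}
print(cycle_credit("1234", key))  # 3 out of 4 symbols belong together
print(cycle_credit("123", key))   # a fully true cycle
```

A candidate made of two true cycles mixed together would land somewhere between the two extremes instead of being scored as plain false.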
Here are the 3-symbol cycle ngrams. Thank you smokie for this wonderful idea, I hope you do not mind my take on it?
https://drive.google.com/open?id=1mANVn … XAaqQb9nfS
EDIT: added missing ngrams
For the 340 one ngram stood out to me:
ABCABAB: 14.25
ABACABA: 12.53
ABCABCB: 12.49
ABCABAC: 12.13
ABCAABC: 11.63
ABCACAC: 11.39
BCABABC: 10.67
ABCBAAC: 10.46
ABCACBC: 10.39
ABCBAAB: 10.02
It looks like this for the 408:
ABCABCA: 37.43
BCABCAB: 31.12
ABCABCB: 28.62
CABCABC: 25.49
ABCABCC: 23.98
ABCACBA: 22.20
ABCABAB: 20.44
BCACBAC: 19.75
ABCABAC: 19.73
BCABCAC: 19.41
Questions. Why would ABCABAB almost be 2 sigma higher than all the other ngrams? ABCABCA is not even in the top 10 for the 340?
I think it’s because the real cycles, if they do exist, are perturbed by some step in the encipherment. The sigmas you got for Z408 are so much higher than the top ones for Z340. It makes me think that we are seeing a lot of noise in Z340’s cycles. The cycle isomorphism test produces very many samples, so we do expect to see many outliers since we’re getting many samples from the tail ends of the distribution.
Another point of evidence towards the "perturbed cycles" hypothesis might be the fact that in cycle shuffling experiments, I can easily increase the mean sigma of all normal cycles (ABAB, ABCABC, etc) by exchanging rows of Z340. The "best" cycles in the rearrangement don’t look better, but the rearrangement produces many more improved cycles overall. Can you run your L=2 and L=3 shuffles on this and see how it compares to the unmodified Z340?
HER>pl^VPk|1LTG2d
Np+B(#O%DWY.<*Kf)
2<clRJ|*5T4M.+&BF
z69Sy#+N|5FBc(;8R
lGFN^f524b.cV4t++
|FkdW<7tB_YOB*-Cc
>MDHNpkSzZO8A|K;+
(G2Jfj#O+_NYz+@L9
d<M+b+ZR2FBcyA64K
-zlUV+^J+Op7<FBy-
U+R/5tE|DYBpbTMKO
By:cM+UZGW()L#zHJ
Spp7^l8*V3pO++RK2
_9M+ztjd|5FP+&4k/
yBX1*:49CE>VUZ5-+
|c.3zBK(Op^.fMqG2
RcT+L16C<+FlWB|)L
++)WCzWcPOSHT/()p
p8R^FlO-*dCkF>2D(
#5+Kq%;2UcXGV.zL|
z340:
jarlve nonrepeat 1: 4462
jarlve nonrepeat 2: 1599
pcs2: 247.85
pcs3: 62.36
jarlve m_2s_cycles: 2150.72
mean l2 sigma: 0.21
z340_best_l2:
jarlve nonrepeat 1: 4282
jarlve nonrepeat 2: 1600
pcs2: 271.57
pcs3: 100.25
jarlve m_2s_cycles: 2259.03
mean l2 sigma: 0.43
Here is the illustration of the row swaps to produce the modified cipher: http://zodiackillerciphers.com/images/z … ved-L2.jpg
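For anyone wanting to try their own rearrangements, exchanging rows is easy to script. This is a minimal sketch assuming the cipher is stored as one string read row by row with 17 symbols per row. The function name `swap_rows` and the row numbers used in the demo are hypothetical and are not the swaps from the illustration.

```python
def swap_rows(cipher, row_a, row_b, width=17):
    """Return the cipher with two rows exchanged (rows are counted from 0)."""
    rows = [cipher[i:i + width] for i in range(0, len(cipher), width)]
    rows[row_a], rows[row_b] = rows[row_b], rows[row_a]
    return "".join(rows)

# Tiny demo on a 3-row grid that is 4 symbols wide.
print(swap_rows("abcdefghijkl", 0, 2, width=4))
```

The rearranged string can then be fed to whatever cycle score is being compared against the unmodified cipher.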
It may be that the average sigma will increase, but the tail ends of the distribution will still look the same. I think that’s what I’m seeing when looking at specific L=2 cycles.
This whole business with cycles is really maddening.
Thank you smokie for this wonderful idea, I hope you do not mind my take on it?
Questions. Why would ABCABAB almost be 2 sigma higher than all the other ngrams? ABCABCA is not even in the top 10 for the 340?
You are welcome, I don’t mind, and I am rather enjoying watching you work and working on my own. But you are much faster at getting results. At this point I have no idea why ABCABAB is significant, but will continue my work over here. I am still working on my L = 2 spreadsheet. I am working on different options, and will soon start making some test messages with different encoding patterns, including 100% random symbol selection, to see what happens. We are definitely detecting something.
Questions. Why would ABCABAB almost be 2 sigma higher than all the other ngrams? ABCABCA is not even in the top 10 for the 340?
You will have to excuse me, because some ngrams, such as ABC and ABA, are missing from the 3-symbol lists. I do not understand why, since these ngrams are not missing from the randomizations and the sorting algorithm is working correctly.
Another point of evidence towards the "perturbed cycles" hypothesis might be the fact that in cycle shuffling experiments, I can easily increase the mean sigma of all normal cycles (ABAB, ABCABC, etc) by exchanging rows of Z340. The "best" cycles in the rearrangement don’t look better, but the rearrangement produces many more improved cycles overall. Can you run your L=2 and L=3 shuffles on this and see how it compares to the unmodified Z340?
By the way, I have changed my n-symbol cycles routine to not include symbols that occur only once.
Normal 340:
– 2-symbol cycle score: 2136
– 3-symbol cycle score: 5922
Your 340:
– 2-symbol cycle score: 2253
– 3-symbol cycle score: 6692
First 340 characters of the 408:
– 2-symbol cycle score: 2856
– 3-symbol cycle score: 12048
The thing with cycles is that they are multiplicative: cycles start to cycle with each other, so to speak. If you do not compensate for that in your measurement, the score blows up very easily. The calculation of the measurement should not be too exponential. The cycles do not exist in isolation from the rest of the cipher.
Thanks Jarlve. So my modified 340 has slightly improved scores. Seems to be very easy to improve any kind of cycling score for Z340 with simple manipulations.
That’s a good point about the multiplicative effect of cycle scores.