So that works out to a 21.6% increase.
Okay, I will take another look at it at some point.
A couple of years ago I found that the cycles in the 340 seemed to respond very well to stacked substring reversal: find the fragment/substring that, when reversed, maximally increases the cycle score, then repeat. This fragment/substring could run from position 51 to 140, to give an example. I did not look into it very deeply but thought it was interesting back then. You may want to check it out.
I did a column swap test too:
z408:
00000000011111111 00001000101111101
12345678901234567 12342687361074596
9%P/Z/UB%kOR=pX=B 9%P/RZBU=/OkBpX%=
WV+eGYF69HP@K!qYe WV+e@G6FKYPHe!q9Y
MJY^UIk7qTtNQYD5) MJY^NU7kQItT)YDq5
S(/9#BPORAU%fRlqE S(/9%#OPfBUAERlRq
k^LMZJdrpFHVWe8Y k^LMHZrdVJFpYWe8
@+qGD9KI)6qX85zS( @+qGXDIK89q6(5z)S
RNtIYElO8qGBTQS#B RNtIBYOlTEGqBQS8#
Ld/P#B@XqEHMU^RRk Ld/PM#X@UBHEk^RqR
cZKqpI)Wq!85LMr9# cZKq5pW)LI8!#Mrq9
BPDR+j=6N(eEUHkF BPDRe+6=Ej(NFUHk
ZcpOVWI5+tL)l^R6H ZcpO)V5IlWLtH^R+6
I9DR_TYrde/@XJQA I9DR/_rY@TedAXJQ
P5M8RUt%L)NVEKH=G P5M8VR%tEUN)GKHL=
rI!Jk598LMlNA)Z(P rI!JNk89A5lMP)ZL(
zUpkA9#BVW+VTtOP zUpk+AB#V9WPTtVO
^=SrlfUe67DzG%%IM ^=SrzleUGfD7M%%6I
Nk)ScE/9%%ZfAP#BV Nk)Sfc9/AEZ%VP#%B
peXqWq_F#8c+@9A9B peXq+WF_@qc8B9A#9
%OT5RUc+_dYq_^SqW %OT5qR+c_UYdW^S_q
VZeGYKE_TYA9%#Lt_ VZeG9Y_E%KAY_#LTt
H!FBX9zXADd7L!=q H!FBXXz79dDqL!A=
_ed##6e5PORXQF%Gc _ed#X#5eQ6ROcF%PG
Z@JTtq_8JI+rBPQW6 Z@JTrt8_Bq+I6PQJW
VEXr9WI6qEHM)=UIk VEXrM96I)WHEk=UqI
2855.73 to 2959.39 (3.6%)
z340:
00000000011111111 01000000011111011
12345678901234567 35745618910436227
HER>pl^VPk|1LTG2d RG^>plHVP|kTL2E1d
Np+B(#O%DWY.<*Kf) +KOB(#N%DYW*<fp.)
By:cM+UZGW()L#zHJ :zUcM+BZG(W#LHy)J
Spp7^l8*V3pO++RK2 pR87^lS*Vp3++KpO2
_9M+ztjd|5FP+&4k/ M4j+zt_d|F5&+k9P/
p8R^FlO-*dCkF>2D( R2O^Flp-*Cd>FD8k(
#5+Kq%;2UcXGV.zL| +z;Kq%#2UXc.VL5G|
(G2Jfj#O+_NYz+@L9 2@#Jfj(O+N_+zLGY9
d<M+b+ZR2FBcyA64K M6Z+b+dR2BFAy4<cK
-zlUV+^J+Op7<FBy- lB^UV+-J+pOF<yz7-
U+R/5tE|DYBpbTMKO RME/5tU|DBYTbK+pO
2<clRJ|*5T4M.+&BF c&|lRJ2*54T+.B<MF
z69Sy#+N|5FBc(;8R 9;+Sy#zN|F5(c86BR
lGFN^f524b.cV4t++ Ft5N^fl24.b4V+Gc+
yBX1*:49CE>VUZ5-+ X541*:y9C>EZU-BV+
|c.3zBK(Op^.fMqG2 .qK3zB|(O^pMfGc.2
RcT+L16C<+FlWB|)L T|6+L1RC<F+BW)clL
++)WCzWcPOSHT/()p )(WWCz+cPSO/T)+Hp
|FkdW<7tB_YOB*-Cc k-7dW<|tBY_*BCFOc
>MDHNpkSzZO8A|K;+ DKkHNp>SzOZ|A;M8+
2150.72 to 2332.67 (8.5%)
Jarlve random plaintext:
00000000011111111 00000001011101111
12345678901234567 45613292861473507
+E7'D*!$3*.)HSF1$ 'D*+7E3)$1.S!HF*$
M&^*%Y6[ZU=($VQ,* *%YM^&Z([,=V6$QU*
$R+>)*'/KG$(*8V$B >)*$+RK(/$$8'*VGB
ITC#X3>E14#]+HLX #X3ICT1]EL#H>+4X
-3*=$)*@II^1HQ$[> =$)-*3I1@[^Q*H$I>
!YP^<RU:T,<?$IO; ^<R!PY:<UO,$?IT;
#QK*5A"$9I0&CE*^. *5A#KQ9&$^0E"C*I.
+0D]M/JV$!K'*QR46 ]M/+D0$'V4KQJ*R!6
)-4Z"MI$8H1?9'?E& Z"M)4-8?$E1'I9?H&
7^A?(!-B)V?Z+55A ?(!7A^)?B5V+-Z5A
*Q_M9KX+G_/$2-4%E M9K*_QG$+%/-X24_E
]R!;SFL)_<*8"1I3& ;SF]!R_8)3*1L"I<&
#N=#HL';Y$O0Z*F@[ #HL#=NY0;@O*'ZF$[
S!X.V"':$*-^V:DK .V"SX!$^:D-'V:*K
3UF,*E!8AB"TQ$7K> ,*E3FUAT8K"$!Q7B>
[4=*["^H#$XME*T3 *["[=4HXT$E^M*#3
C49+[]PG>-;%5#X* +[]C94>%G*;#P5X-
&N)T/RE1@?N3>2SI T/R&)N@31N2E>S?I
Q#]P$P'[JU*;O*:1@ P$PQ]#J;[1**'O:U@
@K$*#$U:?RW2,Y*9E *#$@$K?2:9WYU,*RE
2202.52 to 2285.19 (3.8%)
A couple of years ago I found that the cycles in the 340 seemed to respond very well to stacked substring reversal: find the fragment/substring that, when reversed, maximally increases the cycle score, then repeat. This fragment/substring could run from position 51 to 140, to give an example. I did not look into it very deeply but thought it was interesting back then. You may want to check it out.
Interesting! I will give that a try since it sounds very curious.
I wrote a new cycle test this morning: use the cycle ngrams of one cipher to score the cycle ngrams of another cipher. This method has many advantages: it is quick and a lot of information is captured. Furthermore, it eliminates the need to write complicated cycle detection routines, and its use could be more universal (outside of cycle detection).
@smokie and others,
Feel free to create ciphers with different cycle types to add to the list. Try to keep the ioc close to the 340. Note the correlation with shortened cycles, which in turn correlate well with increasingly random cycles.
Initial results:
340.txt (scored with) 340: 1273.26
340.txt (scored with) smokie_shortenedcycles2: 1195.03
340.txt (scored with) smokie_shortenedcycles1: 1141.35
340.txt (scored with) 408_340: 1034.69
340.txt (scored with) 408: 913.41
340.txt (scored with) smokie_palindromic2: 988.86
340.txt (scored with) smokie_palindromic1: 982.77
340.txt (scored with) moonrock_regionalcycles2: 806.52
340.txt (scored with) moonrock_regionalcycles1: 800.93
340.txt (scored with) 340_randomized1: 753.03
340.txt (scored with) jarlve_anticycles1: 286.95
A couple of years ago I found that the cycles in the 340 seemed to respond very well to stacked substring reversal: find the fragment/substring that, when reversed, maximally increases the cycle score, then repeat. This fragment/substring could run from position 51 to 140, to give an example. I did not look into it very deeply but thought it was interesting back then. You may want to check it out.
Interesting! I will give that a try since it sounds very curious.
If I don’t restrict the lengths of possible substrings, the algorithm is very greedy and will keep increasing cycle scores. I think it is because eventually it approaches arbitrarily re-writing the cipher text to manually reorder the symbols to produce the best cycles.
If I limit to only a few reversals, the cycle scores still go up dramatically. In the results below, Reverse(pos, len) means to reverse the substring of length "len" starting at position "pos":
Z408:
Starting score: 2855.73
Best operations: Reverse(165, 180)
Score: 3072.42
Cipher:
9%P/Z/UB%kOR=pX=BWV+eGYF69HP@K!qYeMJY^UIk7qTtNQYD5)S(/9#BPORAU%fRlqEk^LMZJdrpFHVWe8Y@+qGD9KI)6qX85zS(RNtIYElO8qGBTQS#BLd/P#B@XqEHMU^RRkcZKqpI)Wq!85LMr9#BPDR+j=6N(eXBF!H_tL#%9AYT_EKYGeZVWqS^_qYd_+cUR5TO%B9A9@+c8#F_qWqXepVB#PAfZ%%9/EcS)kNMI%%GzD76eUflrS=^POtTV+WVB#9AkpUzP(Z)ANlML895kJ!IrG=HKEVN)L%tUR8M5PAQJX@/edrYT_RD9IH6R^l)Lt+5IWVOpcZFkHUE9zXADd7L!=q_ed##6e5PORXQF%GcZ@JTtq_8JI+rBPQW6VEXr9WI6qEHM)=UIk
Best operations: Reverse(165, 180) Reverse(200, 86)
Score: 3211.55
Cipher:
9%P/Z/UB%kOR=pX=BWV+eGYF69HP@K!qYeMJY^UIk7qTtNQYD5)S(/9#BPORAU%fRlqEk^LMZJdrpFHVWe8Y@+qGD9KI)6qX85zS(RNtIYElO8qGBTQS#BLd/P#B@XqEHMU^RRkcZKqpI)Wq!85LMr9#BPDR+j=6N(eXBF!H_tL#%9AYT_EKYGeZVWqS^_qYd_+cURJk598LMlNA)Z(PzUpkA9#BVW+VTtOP^=SrlfUe67DzG%%IMNk)ScE/9%%ZfAP#BVpeXqWq_F#8c+@9A9B%OT5!IrG=HKEVN)L%tUR8M5PAQJX@/edrYT_RD9IH6R^l)Lt+5IWVOpcZFkHUE9zXADd7L!=q_ed##6e5PORXQF%GcZ@JTtq_8JI+rBPQW6VEXr9WI6qEHM)=UIk
Best operations: Reverse(165, 180) Reverse(200, 86) Reverse(217, 176)
Score: 3361.32
Cipher:
%P/Z/UB%kOR=pX=BWV+eGYF69HP@K!qYeMJY^UIk7qTtNQYD5)S(/9#BPORAU%fRlqEk^LMZJdrpFHVWe8Y@+qGD9KI)6qX85zS(RNtIYElO8qGBTQS#BLd/P#B@XqEHMU^RRkcZKqpI)Wq!85LMr9#BPDR+j=6N(eXBF!H_tL#%9AYT_EKYGeZVWqS^_qYd_+cURJk598LMlNA)Z(PzUpEV6WQPBr+IJ8_qtTJ@ZcG%FQXROP5e6##de_q=!L7dDAXz9EUHkFZcpOVWI5+tL)l^R6HI9DR_TYrde/@XJQAP5M8RUt%L)NVEKH=GrI!5TO%B9A9@+c8#F_qWqXepVB#PAfZ%%9/EcS)kNMI%%GzD76eUflrS=^POtTV+WVB#9AkXr9WI6qEHM)=UIk
When I left the algorithm running, the score got up to 5625.05 after dozens of substring reversals before I aborted the program. So I’m not sure how fruitful this search is without limits.
Z340:
This started with score 2150.72; after the operations Reverse(220, 68) Reverse(271, 46) Reverse(3, 163) the score is 2755.25.
Cipher:
HER<7pO+J^+VUlz-K46AycBF2RZ+b+M<d9L@+zYN_+O#jfJ2G(|Lz.VGXcU2;%qK+5#(D2>FkCd*-OlF^R8p/k4&+PF5|djtz+M9_2KR++Op3V*8l^7ppSJHz#L)(WGZU+Mc:yB)fK*<.YWD%O#(B+pNd2GTL1|kPV^lp>FBy-U+R/5tE|DYBpbTMKO2<clRJ|*5T4M.+&BFz69Sy#+N|5FBc(;8)|BWlF+<C61L+TcR2GqMf.^pO(KBz3.c|+-5ZUV>EC94:*1XBy+Y_Bt7<WdkF|p)(/THSOPcWzCW)++LRlGFN^f524b.cV4t+OB*-Cc>MDHNpkSzZO8A|K;+
Jarlve random plaintext:
This started with score 2202.52; after the operations Reverse(149, 174) Reverse(65, 48) Reverse(175, 146) the score is 2694.13.
Cipher:
+E7'D*!$3*.)HSF1$M&^*%Y6[ZU=($VQ,*$R+>)*'/KG$(*8V$BITC#X3>E14#]+H0I9$"A5*KQ#;OI$?<,T:UR<^PY!>[$QH1^II@*)$=*3-XL&CE*^.+0D]M/JV$!K'*QR46)-4Z"MI$8H1?9@1:*O;*UJ['P$P]#QIS2>3N?@E&7^A?(!-B)V?Z+55A*Q_M9KX+G_/$2-4%E]R!;SFL)_<*8"1I3&#N=#HL';Y$O0Z*F@[S!X.V"':$*-^V:DK3UF,*E!8AB"TQ$7K>[4=*["^H#$XME*T3C49+[]PG>-;%5#X*&N)T/RE1?'@K$*#$U:?RW2,Y*9E
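The greedy search described above can be sketched roughly as follows. This is a minimal illustration, not the program used for the results in this post: `pair_cycle_score` is a hypothetical stand-in for the real cycle score (which is not given in the thread), positions are assumed to be 0-based, and `max_ops` / `max_len` are made-up parameter names for the limits discussed above.

```python
from itertools import combinations


def pair_cycle_score(text):
    """Stand-in cycle score (an assumption, not the score used in the thread).

    For every pair of symbols, take the subsequence of the text made of only
    those two symbols and square the length of its longest alternating run.
    """
    total = 0
    for a, b in combinations(sorted(set(text)), 2):
        seq = [c for c in text if c == a or c == b]
        best = run = 1
        for prev, cur in zip(seq, seq[1:]):
            run = run + 1 if cur != prev else 1
            best = max(best, run)
        total += best * best
    return total


def reverse_substring(text, pos, length):
    """Reverse(pos, len): reverse `length` symbols starting at 0-based `pos`."""
    return text[:pos] + text[pos:pos + length][::-1] + text[pos + length:]


def greedy_reversals(text, score, max_ops=3, min_len=2, max_len=None):
    """Repeatedly apply the single reversal that raises the score the most."""
    ops = []
    current = score(text)
    for _ in range(max_ops):
        n = len(text)
        limit = max_len or n
        best = None  # (score, pos, length, candidate text)
        for pos in range(n):
            for length in range(min_len, min(limit, n - pos) + 1):
                cand = reverse_substring(text, pos, length)
                s = score(cand)
                if best is None or s > best[0]:
                    best = (s, pos, length, cand)
        if best is None or best[0] <= current:
            break  # no reversal improves the score any further
        current, pos, length, text = best
        ops.append((pos, length))
    return text, ops, current
```

Every step tries every (pos, len) pair, so the search is quadratic in the cipher length per operation; capping `max_len` and `max_ops` is one way to keep the greediness in check.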
@doranchak, the 340 responds better to simple manipulations than my test cipher. There seems to be that bit of extra potential in the 340. I am wondering if a high degree of 1:1 substitutions could cause this.
Added:
17. The random shift cycle, which shifts the position at which the substitution is selected in its homophone group to the left or to the right randomly. (smokie treats)
More ciphers added to the list:
340.txt (scored with) 340: 1273.26
340.txt (scored with) smokie_shortenedcycles2: 1195.03
340.txt (scored with) smokie_shortenedcycles1: 1141.35
340.txt (scored with) jarlve_perfectcycles1: 1117.23
340.txt (scored with) 408_1-340: 1034.69
340.txt (scored with) smokie_palindromic2: 988.86
340.txt (scored with) smokie_palindromic1: 982.77
340.txt (scored with) jarlve_palindromic1: 975.15
340.txt (scored with) 408: 913.41
340.txt (scored with) jarlve_26percentrandomhomophones1: 911.79
340.txt (scored with) 408_69-408: 854.65
340.txt (scored with) 340_reversed: 819.18
340.txt (scored with) moonrock_regionalcycles2: 806.52
340.txt (scored with) moonrock_regionalcycles1: 800.93
340.txt (scored with) jarlve_randomshiftcycles1: 772.15
340.txt (scored with) 340_randomized1: 753.03
340.txt (scored with) 340_randomized3: 667.14
340.txt (scored with) 340_randomized2: 645.29
340.txt (scored with) jarlve_anticycles1: 286.95
I wrote a new cycle test this morning: use the cycle ngrams of one cipher to score the cycle ngrams of another cipher. This method has many advantages: it is quick and a lot of information is captured. Furthermore, it eliminates the need to write complicated cycle detection routines, and its use could be more universal (outside of cycle detection).
How does this work? I don’t follow how the scores are computed relative to other scores.
Build cycle ngram frequencies from one cipher and score the other cipher with it. I also tried a chi-squared test to measure the difference between the cycle ngram frequencies of both ciphers. My conclusion is that these tests may not work so well/have problems. For now I am going back to my original approach: dedicated cycle detection for each individual cycle type.
16. The alternating length cycle, which alternates between shorter and longer substitution cycles: 12 – 1 – 12 – 1 – 12 – 1, 123 – 12 – 123 – 12 – 123. (Jarlve)
I wrote a routine that looks for perfect examples of this and compares the result against 1000 randomizations. These numbers are hard to interpret: various cycle types seem to correlate with alternating length cycles, though the shortened cycles ciphers by smokie seem to be a good match. In the full length 408 no perfect alternating length cycles occur.
Alternating length 2-symbol cycle sigmas:
340: 4.59
408: -0.76
408_1-340: -0.91
408_69-408: 3.33
jarlve_26percentrandomhomophones1: 0.48
jarlve_anticycles1: -1.36
jarlve_palindromiccycles1: 4.01
jarlve_perfectcycles1: 2.97
jarlve_randomshiftcycles1: -0.39
moonrock_regionalcycles1: -0.77
moonrock_regionalcycles2: 0.03
smokie_palindromiccycles1: 12.27
smokie_palindromiccycles2: 6.49
smokie_shortenedcycles1: 3.85
smokie_shortenedcycles2: 4.27
Alternating length 3-symbol cycle sigmas:
340: 5.91
408: -0.05
408_1-340: -0.18
408_69-408: -0.26
jarlve_26percentrandomhomophones: 0.85
jarlve_anticycles1: -0.35
jarlve_palindromiccycles1: 2.34
jarlve_perfectcycles1: 8.47
jarlve_randomshiftcycles1: 0.78
moonrock_regionalcycles1: -0.56
moonrock_regionalcycles2: 0.15
smokie_palindromiccycles1: 2.98
smokie_palindromiccycles2: 13.98
smokie_shortenedcycles1: 5.18
smokie_shortenedcycles2: 6.17
AZdecrypt cycle types stats for: 340.txt
--------------------------------------------------------
2-symbol cycles:
--------------------------------------------------------
Alternating length cycles: 4.59 sigma
--------------------------------------------------------
MZMMZMMZMMZ: 11
VV;VV;VV;: 9
LL/LL/LL/: 9
LL;LL;LL;: 9
GG_GG_GG_: 9
LL7LL7LL7: 9
GG/GG/GG/: 9
^^/^^/^^/: 9
GG;GG;GG;: 9
VV/VV/VV/: 9
3-symbol cycles:
--------------------------------------------------------
Alternating length cycles: 5.91 sigma
--------------------------------------------------------
^*^*&^*^*&^*^*: 14
LL/;LL/;LL/;: 12
LL7;LL7;LL7;: 12
GG7;GG7;GG7;: 12
GG_;GG_;GG_;: 12
GG/;GG/;GG/;: 12
VV/;VV/;VV/;: 12
GG7&GG7&GG7: 11
LL7&LL7&LL7: 11
LL7qLL7qLL7: 11
Runtime: 28.96
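For readers who want to experiment, here is a rough sketch of one way a 2-symbol alternating length test could look. It is a guess at the idea, not the AZdecrypt routine: it assumes that a 2-symbol "12 – 1" cycle shows up as a run with period 3 in the sequence of the two symbols (as in MZMMZMMZMMZ above), sums the longest such run over all symbol pairs, and compares that sum against shuffles of the cipher. The function names, the minimum run length of 6 and the summed statistic are all assumptions.

```python
import random
import statistics
from itertools import combinations


def longest_alternating_length_run(seq, min_len=6):
    """Length of the longest period-3 run in `seq` that uses two symbols.

    Returns 0 when no run of at least `min_len` symbols exists.
    """
    best = 0
    start = 0  # start of the current window with period 3
    for i in range(len(seq)):
        if i - start >= 3 and seq[i] != seq[i - 3]:
            start = i - 2  # keep the longest suffix that can still extend
        length = i - start + 1
        if length >= min_len and len(set(seq[start:start + 3])) == 2:
            best = max(best, length)
    return best


def alternating_stat(text):
    """Sum the longest alternating length run over all symbol pairs."""
    total = 0
    for a, b in combinations(sorted(set(text)), 2):
        seq = "".join(c for c in text if c == a or c == b)
        total += longest_alternating_length_run(seq)
    return total


def sigma_vs_shuffles(text, trials=1000, seed=0):
    """Standard deviations by which the cipher exceeds shuffled copies."""
    rng = random.Random(seed)
    observed = alternating_stat(text)
    chars = list(text)
    samples = []
    for _ in range(trials):
        rng.shuffle(chars)
        samples.append(alternating_stat("".join(chars)))
    sd = statistics.pstdev(samples)
    if sd == 0:
        return 0.0
    return (observed - statistics.mean(samples)) / sd
```

A plain ABABAB cycle has period 2 and is deliberately not counted here, which is why this statistic is separate from an ordinary cycle score.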
Taking things slowly this weekend. I want to upgrade my encoder for different patterns, and still thinking of ways to do it.
Here is a chart of the 340, all L=2 isomorphic patterns sorted from left to right. You have to zoom way in and then scroll left and right to see the details. There are interesting little clusters of spikes, and they are interesting because they are spikes of patterns and continuations of patterns.
Brown arrows: Four little spikes AABAA; AABAAB; AABAABAA; AABAABAAB
Orange arrows: Three little spikes ABAAAB; ABAAABA; ABAAABAA
Red arrows: Three spikes, one little and two big ABAAB; ABAABA; ABAABAA
Blue arrows: Two medium and two big spikes ABAB; ABABA; ABABAB; ABABABA
Purple arrow: One big spike ABABAA. It is not a cluster of spikes of continuations of patterns, just one spike.
But I would first recommend reading the paper, because it goes into detail about the speed efficiencies of their algorithm:
Here is the cipher that is contained in the paper:
1 2 3 4 5 6 7 8 9 10 11 12 5 13 14 15 16 17 18 19 4 5 20 21 22 2 3 8 1 10 23 24 9 4 5 25 8 7 3 21 24 26 1 22 4 27 24 2 28 23 24 29 5 6 7 11 26 30 16 13 28 27 19 11 15 31 16 20 23 29 22 28 6 2 11 6 18 30 14 9 12 29 16 32 30 25 13 8 26 17 10 28 12 12 15 12 29 6 19 27 18 12 23 7 26 24 9 33 4 22 33 2 5 30 27 29 7 11 23 24 20 26 10 32 32 34 13 16 19 8 15 6 4 18 20 27 9 12 13 35 3 10 32 28 22 31 31 15 19 8 18 9 11 21 20 21 13 12 30 16 10 21 4 15 19 23 6 29 14 30 32 31 20 8 26 18 5 27 29 28 21 31 24 9 23 24 13 4 26 24 10 27 11 19 23 30 2 16 7 8
Build cycle ngram frequencies from one cipher and score the other cipher with it. I also tried a chi-squared test to measure the difference between the cycle ngram frequencies of both ciphers. My conclusion is that these tests may not work so well/have problems. For now I am going back to my original approach: dedicated cycle detection for each individual cycle type.
Decided not to give up on it yet and have improved the results by capturing more cycle ngram information and using the logarithm of the cycle ngram frequencies. It really seems to be working now; this made my day. The results are still fuzzy but can be worked with.
Here is a run of a uniquely randomized 340 versus the ciphers in the batch file. Note how the other randomized 340 ciphers are at the top (none of these randomized 340 ciphers share the same randomization, by the way):
340.txt (scored with) 340_randomized2: 437.49
340.txt (scored with) 340_randomized3: 434.90
340.txt (scored with) 340_randomized1: 404.95
340.txt (scored with) moonrock_regionalcycles2: 375.97
340.txt (scored with) moonrock_regionalcycles1: 373.10
340.txt (scored with) jarlve_randomshiftcycles1: 346.92
340.txt (scored with) 340_reversed: 341.79
340.txt (scored with) 340: 325.64
340.txt (scored with) smokie_shortenedcycles1: 303.96
340.txt (scored with) jarlve_26percentrandomhomophones1: 291.81
340.txt (scored with) tonyb1_perfectcycles1: 290.78
340.txt (scored with) 408_69-408: 278.20
340.txt (scored with) smokie_palindromic1: 270.64
340.txt (scored with) 408_1-340: 263.31
340.txt (scored with) jarlve_palindromic1: 261.63
340.txt (scored with) 408: 261.32
340.txt (scored with) jarlve_perfectcycles1: 245.52
340.txt (scored with) smokie_palindromic2: 238.78
340.txt (scored with) smokie_shortenedcycles2: 236.45
340.txt (scored with) jarlve_anticycles1: 213.58
340.txt (scored with) rayn_perfectcycles1: 211.12
And here is the normal 340. The 408 sub strings are now nearer to the top:
340.txt (scored with) 340: 951.98
340.txt (scored with) smokie_shortenedcycles1: 596.95
340.txt (scored with) 408_1-340: 566.98
340.txt (scored with) 408_69-408: 551.54
340.txt (scored with) jarlve_26percentrandomhomophones1: 550.21
340.txt (scored with) rayn_perfectcycles1: 549.15
340.txt (scored with) jarlve_palindromic1: 542.22
340.txt (scored with) jarlve_randomshiftcycles1: 537.68
340.txt (scored with) smokie_shortenedcycles2: 521.50
340.txt (scored with) jarlve_perfectcycles1: 514.75
340.txt (scored with) moonrock_regionalcycles2: 507.62
340.txt (scored with) 408: 507.22
340.txt (scored with) 340_reversed: 490.91
340.txt (scored with) smokie_palindromic2: 489.08
340.txt (scored with) smokie_palindromic1: 473.01
340.txt (scored with) moonrock_regionalcycles1: 459.69
340.txt (scored with) tonyb1_perfectcycles1: 450.24
340.txt (scored with) 340_randomized1: 446.73
340.txt (scored with) 340_randomized3: 395.16
340.txt (scored with) 340_randomized2: 377.12
340.txt (scored with) jarlve_anticycles1: 86.06
And here is the full 408. Notice how the cipher itself is not the number 1 result; this happens sometimes and has to do with the ngram frequencies. Still, all the 408 and perfect cycles ciphers are at the top. It is really working:
408.txt (scored with) 408_1-340: 903.36
408.txt (scored with) 408: 873.98
408.txt (scored with) 408_69-408: 573.92
408.txt (scored with) rayn_perfectcycles1: 501.29
408.txt (scored with) jarlve_perfectcycles1: 467.32
408.txt (scored with) tonyb1_perfectcycles1: 454.52
408.txt (scored with) smokie_shortenedcycles1: 414.29
408.txt (scored with) smokie_palindromic2: 407.16
408.txt (scored with) smokie_shortenedcycles2: 392.84
408.txt (scored with) jarlve_26percentrandomhomophones1: 384.52
408.txt (scored with) jarlve_randomshiftcycles1: 369.63
408.txt (scored with) jarlve_palindromic1: 361.26
408.txt (scored with) smokie_palindromic1: 333.51
408.txt (scored with) 340: 329.22
408.txt (scored with) 340_reversed: 326.03
408.txt (scored with) moonrock_regionalcycles1: 289.52
408.txt (scored with) moonrock_regionalcycles2: 279.60
408.txt (scored with) 340_randomized1: 273.43
408.txt (scored with) 340_randomized3: 230
408.txt (scored with) 340_randomized2: 214.01
408.txt (scored with) jarlve_anticycles1: 50.16
Here is the cipher that is contained in the paper:
Thanks!
Build cycle ngram frequencies from one cipher and score the other cipher with it.
How exactly do you score the other cipher relative to the frequencies of the first cipher’s cycle ngrams? I’m probably just missing something obvious.
1. Get the cycle ngram frequencies of cipher A. For that my routine goes through all 5-symbol cycles with 10-gram frequencies and adds 9, 8, 7, 6, 5, 4, 3, 2-gram frequencies at the end of the cycle or when the cycle is very short.
2. Normalization. Divide these frequencies by the total number of ngrams for that ngram length, and divide the shorter ngrams some more, since these will have higher frequency counts and should add to the score, not dominate it. After the division, take the logarithm.
3. Go through the cycles of cipher B and sum all the corresponding ngram logs of cipher A. Multiply the sum by some factor.
Currently this approach does not really normalize the symbol frequencies of the cipher; for instance, smokie_shortenedcycles1 with a raw ioc of 2666 is close to the top for a lot of ciphers. Step 2 is a band-aid fix for these kinds of problems, but it is still an issue. Getting the sigma of each ngram would be much better as normalization, but it would take so much time.
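The three steps above could be sketched like this. It is a simplified illustration under several assumptions that are not stated in the thread: ngrams are relabelled by order of first appearance (so XYXX becomes ABAA) to make them comparable across ciphers with different symbols, the defaults use 2-symbol cycles and up to 4-grams instead of 5-symbol cycles and 10-grams, the extra division for shorter ngrams is a hypothetical `short_penalty`, unseen ngrams get an arbitrary `floor` value, and the sum is averaged rather than multiplied by a factor.

```python
import math
from collections import Counter
from itertools import combinations


def canonical(ngram):
    """Relabel symbols by first appearance, e.g. XYXX -> ABAA."""
    mapping = {}
    return "".join(mapping.setdefault(c, chr(65 + len(mapping))) for c in ngram)


def cycle_ngram_counts(text, group_size=2, max_n=4):
    """Step 1: count ngram patterns inside every group_size-symbol cycle."""
    counts = {n: Counter() for n in range(2, max_n + 1)}
    for group in combinations(sorted(set(text)), group_size):
        members = set(group)
        seq = [c for c in text if c in members]
        for n in range(2, max_n + 1):
            for i in range(len(seq) - n + 1):
                counts[n][canonical(seq[i:i + n])] += 1
    return counts


def log_model(counts, short_penalty=2.0):
    """Step 2: normalize per ngram length, damp short ngrams, take the log."""
    max_n = max(counts)
    model = {}
    for n, table in counts.items():
        total = sum(table.values())
        if total == 0:
            continue
        extra = short_penalty ** (max_n - n)  # shorter ngrams are divided more
        for pattern, c in table.items():
            model[pattern] = math.log(c / total / extra)
    return model


def score_with(model, text, group_size=2, max_n=4, floor=-20.0):
    """Step 3: average cipher A's log values over the cycle ngrams of cipher B."""
    counts = cycle_ngram_counts(text, group_size, max_n)
    total = 0.0
    seen = 0
    for table in counts.values():
        for pattern, c in table.items():
            total += c * model.get(pattern, floor)
            seen += c
    return total / seen if seen else 0.0
```

In this sketch a higher (less negative) value means cipher B's cycle ngrams look more like those of cipher A. It inherits the weakness mentioned above: nothing here corrects for differing symbol frequencies.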
Here is a new run that considers that each cycle ABCDE could as well be BCDEA, CDEAB, DEABC and EABCD. I like how the 408 ciphers take second and third place here.
340.txt (scored with) 340: 316.81
340.txt (scored with) 408_69-408: 279.58
340.txt (scored with) 408_1-340: 279.28
340.txt (scored with) smokie_shortenedcycles1: 269.27
340.txt (scored with) 408: 256.80
340.txt (scored with) rayn_perfectcycles1: 254.92
340.txt (scored with) jarlve_topbottomcycles2: 240.52
340.txt (scored with) jarlve_topbottomcycles1: 239.39
340.txt (scored with) 340_reversed: 235.11
340.txt (scored with) jarlve_26percentrandomhomophones1: 233.47
340.txt (scored with) jarlve_palindromic1: 232.63
340.txt (scored with) jarlve_perfectcycles1: 232.04
340.txt (scored with) smokie_palindromic2: 229.78
340.txt (scored with) tonyb1_perfectcycles1: 229.03
340.txt (scored with) smokie_shortenedcycles2: 228.92
340.txt (scored with) moonrock_regionalcycles2: 228.68
340.txt (scored with) jarlve_randomshiftcycles1: 217.30
340.txt (scored with) 340_randomized1: 212.48
340.txt (scored with) moonrock_regionalcycles1: 212.34
340.txt (scored with) smokie_palindromic1: 209.09
340.txt (scored with) 340_randomized3: 192.02
340.txt (scored with) 340_randomized2: 190.94
340.txt (scored with) largo_oddevencycles1: 179.61
340.txt (scored with) jarlve_anticycles1: 89.21
I completely ditched the idea of scoring one set of cycle ngrams with another.
It now works like this:
Get all cycle ngram frequencies of cipher A and B and calculate the sigma of each versus randomizations thereof. Sum the absolute differences between the cycle ngram frequency sigmas of cipher A and B. This sum is the final number, and a lower sum denotes a better correlation between cipher A and B by this system. To that I added the option to sum only the sigma differences that are above a certain value (say 1) to reduce noise.
You can find this functionality in AZdecrypt 1.091 under the file menu as "Batch ciphers (match symbol sequences)". On my old i7 it checks about 5 million symbol sequences (cycles) per second using 6 threads. It goes through all 3-symbol sequences and uses 6-gram frequencies and it could take up to a minute per cipher to process.
AZdecrypt 1.091 executable: https://drive.google.com/open?id=1EtJ_W … Xoh9XJ8CYI
And the batch file I have been using: https://drive.google.com/open?id=1Hl6yz … kFsGRhG-Rn
The sigma option can be found under options, solver as "(Batch ciphers) Match symbol sequences, only use sigma over", which probably should read "only sum sigma difference over".
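For anyone who wants to try the idea outside AZdecrypt, here is a small sketch of the sigma comparison. It is not the AZdecrypt code: the relabelling of ngrams by first appearance, the shuffle-based randomization, the default of 50 trials and all function names are assumptions. The defaults for 3-symbol sequences and 6-grams follow the description above.

```python
import random
import statistics
from collections import Counter
from itertools import combinations


def pattern_counts(text, group_size=3, n=6):
    """Count n-gram patterns over every group_size-symbol sequence (cycle)."""
    counts = Counter()
    for group in combinations(sorted(set(text)), group_size):
        members = set(group)
        seq = [c for c in text if c in members]
        for i in range(len(seq) - n + 1):
            mapping = {}
            # relabel by first appearance so patterns compare across ciphers
            key = "".join(mapping.setdefault(c, chr(65 + len(mapping)))
                          for c in seq[i:i + n])
            counts[key] += 1
    return counts


def pattern_sigmas(text, trials=50, seed=0, group_size=3, n=6):
    """Sigma of each pattern count versus shuffled copies of the same cipher."""
    rng = random.Random(seed)
    observed = pattern_counts(text, group_size, n)
    chars = list(text)
    samples = []
    for _ in range(trials):
        rng.shuffle(chars)
        samples.append(pattern_counts("".join(chars), group_size, n))
    patterns = set(observed)
    for s in samples:
        patterns |= set(s)
    sigmas = {}
    for p in patterns:
        vals = [s[p] for s in samples]
        sd = statistics.pstdev(vals)
        sigmas[p] = (observed[p] - statistics.mean(vals)) / sd if sd > 0 else 0.0
    return sigmas


def sigma_distance(sig_a, sig_b, min_diff=0.0):
    """Sum of absolute sigma differences; lower means a closer match."""
    total = 0.0
    for p in sorted(set(sig_a) | set(sig_b)):
        d = abs(sig_a.get(p, 0.0) - sig_b.get(p, 0.0))
        if d > min_diff:  # optional noise filter, e.g. min_diff=1
            total += d
    return total
```

Comparing a cipher with itself gives 0, which matches the first row of the result lists; `min_diff` plays the role of the "only sum sigma difference over" option.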
Here are some results:
340 versus batch file:
1: 340 (versus) 340: 0
2: 340 (versus) 340_reversed: 632.16
3: 340 (versus) jarlve_topbottomcycles1: 742.42
4: 340 (versus) jarlve_topbottomcycles2: 902.65
5: 340 (versus) 408_69-408: 925.43
6: 340 (versus) jarlve_randomshiftcycles1: 963.44
7: 340 (versus) moonrock_regionalcycles2: 1011
8: 340 (versus) moonrock_regionalcycles1: 1176.88
9: 340 (versus) largo_oddevencycles1: 1202.21
10: 340 (versus) 340_randomized1: 1316.82
11: 340 (versus) jarlve_26percentrandomhomophones1: 1327.58
12: 340 (versus) tonyb1_perfectcycles1: 1402.42
13: 340 (versus) smokie_palindromic2: 1419.78
14: 340 (versus) smokie_palindromic1: 1424.45
15: 340 (versus) 340_randomized2: 1497.90
16: 340 (versus) 340_randomized3: 1504.96
17: 340 (versus) smokie_shortenedcycles1: 1591.68
18: 340 (versus) smokie_shortenedcycles2: 1596.30
19: 340 (versus) 408: 1656.97
20: 340 (versus) 408_1-340: 1719.17
21: 340 (versus) jarlve_palindromic1: 1798.17
22: 340 (versus) jarlve_perfectcycles1: 2406.96
23: 340 (versus) rayn_perfectcycles1: 2686
24: 340 (versus) jarlve_anticycles1: 3891.48
Randomized 340 versus batch file:
1: 340_randomized4 (versus) 340_randomized3: 241.32
2: 340_randomized4 (versus) 340_randomized1: 394.06
3: 340_randomized4 (versus) 340_randomized2: 541.61
4: 340_randomized4 (versus) moonrock_regionalcycles2: 610.62
5: 340_randomized4 (versus) moonrock_regionalcycles1: 802.61
6: 340_randomized4 (versus) 340_reversed: 1565.53
7: 340_randomized4 (versus) largo_oddevencycles1: 1634.59
8: 340_randomized4 (versus) 340: 1700
9: 340_randomized4 (versus) jarlve_topbottomcycles1: 1915.01
10: 340_randomized4 (versus) jarlve_randomshiftcycles1: 2004.45
11: 340_randomized4 (versus) 408_69-408: 2339.06
12: 340_randomized4 (versus) smokie_palindromic1: 2359.25
13: 340_randomized4 (versus) tonyb1_perfectcycles1: 2560.36
14: 340_randomized4 (versus) jarlve_topbottomcycles2: 2590.79
15: 340_randomized4 (versus) jarlve_26percentrandomhomophones1: 2864.42
16: 340_randomized4 (versus) smokie_palindromic2: 2930.01
17: 340_randomized4 (versus) 408_1-340: 3145.80
18: 340_randomized4 (versus) 408: 3158.73
19: 340_randomized4 (versus) smokie_shortenedcycles2: 3181.26
20: 340_randomized4 (versus) smokie_shortenedcycles1: 3257.79
21: 340_randomized4 (versus) jarlve_palindromic1: 3276.70
22: 340_randomized4 (versus) jarlve_perfectcycles1: 3913.67
23: 340_randomized4 (versus) rayn_perfectcycles1: 4112.89
24: 340_randomized4 (versus) jarlve_anticycles1: 5030.45
408 versus batch file:
1: 408 (versus) 408: 0
2: 408 (versus) 408_1-340: 57.69
3: 408 (versus) 408_69-408: 480.71
4: 408 (versus) tonyb1_perfectcycles1: 767.74
5: 408 (versus) smokie_shortenedcycles2: 1104.35
6: 408 (versus) rayn_perfectcycles1: 1117.10
7: 408 (versus) jarlve_topbottomcycles2: 1147.83
8: 408 (versus) jarlve_26percentrandomhomophones1: 1189.82
9: 408 (versus) jarlve_perfectcycles1: 1203.92
10: 408 (versus) smokie_shortenedcycles1: 1207.90
11: 408 (versus) smokie_palindromic2: 1309.57
12: 408 (versus) jarlve_palindromic1: 1497.30
13: 408 (versus) 340: 1636.33
14: 408 (versus) jarlve_topbottomcycles1: 1662.10
15: 408 (versus) jarlve_randomshiftcycles1: 1755.26
16: 408 (versus) 340_reversed: 1888
17: 408 (versus) largo_oddevencycles1: 1900.24
18: 408 (versus) smokie_palindromic1: 1905.17
19: 408 (versus) moonrock_regionalcycles1: 2593.94
20: 408 (versus) moonrock_regionalcycles2: 2701.20
21: 408 (versus) 340_randomized1: 2777.48
22: 408 (versus) 340_randomized2: 2882.50
23: 408 (versus) 340_randomized3: 2938.02
24: 408 (versus) jarlve_anticycles1: 4091.29