I may have found something interesting that could overthrow the word search idea: a difference of almost 15% in bigram counts between input and output (doing and undoing) of full grid directional transpositions, most notably vertical and diagonal NE-SW. The thing here is that bigram counts are higher for undoing the transposition. I will refer to a difference as either positive or negative.
The 408 also has a 15% difference but it is positive, which I think is normal because of information carry over to other directions. The ray_n cipher used in this thread exhibits a 7% positive difference and the 408 redone almost 32% positive. Even the word search is positive but only a few %.
It seems to be somewhat consistent for the test ciphers to be positive. I will try to recreate the "effect". The base difference for the plaintext used (340 characters of the 408, found in an earlier post) is 99.4%, which is slightly negative, so to say. After encoding with a new, less perfect algorithm, hopefully more like Zodiac's encoding, it becomes 111.6%. So again positive.
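For anyone who wants to reproduce this kind of comparison, here is a minimal sketch in Python. It is not the code used for the numbers above and it makes several assumptions: "bigram count" is taken to mean the number of repeated bigrams, the transposition shown is the plain vertical (column-wise) one on a full grid, and the percentage is taken as doing divided by undoing, so that values below 100% come out "negative". The function names are made up for illustration.

```python
from collections import Counter

def repeated_bigrams(text):
    """Count bigram repeats: every occurrence beyond the first (assumed metric)."""
    counts = Counter(text[i:i + 2] for i in range(len(text) - 1))
    return sum(n - 1 for n in counts.values() if n > 1)

def do_vertical(text, width):
    """Write the text row by row into a full grid, read it column by column."""
    rows = len(text) // width  # assumes len(text) is a multiple of width
    return "".join(text[r * width + c] for c in range(width) for r in range(rows))

def undo_vertical(text, width):
    """Inverse of do_vertical: treat the text as a column-wise readout."""
    rows = len(text) // width
    return "".join(text[c * rows + r] for r in range(rows) for c in range(width))

def do_undo_ratio(text, width):
    """Assumed definition: 100 * repeats after doing / repeats after undoing."""
    done = repeated_bigrams(do_vertical(text, width))
    undone = repeated_bigrams(undo_vertical(text, width))
    return 100.0 * done / undone if undone else float("nan")
```

For the 340 the call would be something like `do_undo_ratio(cipher, 17)`, with `cipher` being a 340-character transcription read row by row.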
Let’s apply a full grid diagonal NE-SE transposition:
iikiocifriaeirtil leleethokgrnelgis klpbicmndraglgrit igeeusalomnanhtew nlsmihifeafitigfp puottwesdohtenfth asinghutltscaoswr snuntasaeonhsetei ufincommmetkbabdi eliemioerrcehlaew lebensheeohtlrhdm mthaltptrtsiateew stslextrliwpllmii suimeeurtinllosgy oksgboiieiaicetme oennygfindkevoube viegaodrnebanoesw lvnhtioavlllymuul eitrnbealsleaaolr tiaeechiyivncyity
The difference for this plaintext is 92%. Aha! That is a negative.
Encoding CHS:
g(ihIOUj?Z';Q1m"A nLf=;9&NiJ*cL)Y>p iA6<g-VE@/b5nJ?(a hY=;C%.fIW7'TGRL2 c)D+UeZ4=bjQm"546 6_N9aK;:@IBRLEjm& .X>7JGC9AapO'N%21 DT_cRb:.=IEeX;mLg C4(7-NVW+=9i'<@h ;nULVZI=*/O;BfbLK )=;Tp&L=NGaA?e@W +RB.nm691a%Q'R;L2 Dm:f=39*)"K6AnV>g X_(W;LC/ahcf)IpY0 Ni%5<IUZ=Qb"-;R+L N=E70Jj>T@i;SI_L Sg=Y.N@?c;<'EILD2 AS7&m(NbSnf)0VC_A =h91T;.n:fL'bI)* aU.=;OGZ0QSc-0"R0
81%. So the encoding actually seems to accentuate the bigram difference.
What does it mean? What could it mean? I’m not sure. Maybe it is a fluke, maybe an indication of a vertical/diagonal transposition scheme, or a word search with the majority of the words in these directions. Furthermore, I want to add that I’m no longer so sure of my previous assumption that the transposition was done after encoding, because a full grid plaintext transposition plus poor encoding can have a significant impact on the numbers of the non-repeats as well.
Jarlve, this is fascinating work and I’m eager to hear more. I’m trying to catch up on what you’ve done so far.
Can you give more details about how you do the calculation of non-repeats? How are you generating the substrings to score from the cipher text? I’m not following how that works.
Thanks!
Hey doranchak, thanks for your interest.
About the non-repeats,
Consider every symbol as a starting point, and then count the length of the unique non-repeating string that follows. Once this is done for the entire cipher, multiply the count of each length by that length, to give more weight to longer non-repeating strings. For the 340 in horizontal direction the score you then get is 4462. I also have an alternative "IoC" calculation for this since frequencies are involved. Graphing these frequencies is interesting.
for i=1 to 340 'each symbol as a starting point
  for j=i to 340 'count the length of the unique non-repeating string that follows
    counter+=1 'until repeat is found
  next j
  if counter>max_length then max_length=counter
  nr_frequencies(counter)+=1
  counter=0
next i
for i=1 to max_length
  nr_score+=nr_frequencies(i)*i
next i
print nr_score
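The pseudocode above leaves the repeat check implicit. A minimal runnable sketch of the same idea in Python could look like this. The function names are my own, and one assumption is made: a run that reaches the end of the cipher without a repeat simply stops there.

```python
from collections import Counter

def non_repeat_lengths(cipher):
    """For each starting symbol, length of the run before any symbol repeats."""
    lengths = []
    for start in range(len(cipher)):
        seen = set()
        for symbol in cipher[start:]:
            if symbol in seen:  # repeat found, the run ends here
                break
            seen.add(symbol)
        lengths.append(len(seen))
    return lengths

def non_repeat_score(cipher):
    """Weight each run length by how often it occurs, then sum."""
    frequencies = Counter(non_repeat_lengths(cipher))
    return sum(length * count for length, count in frequencies.items())
```

`Counter(non_repeat_lengths(cipher))` gives the frequencies per length, which is the data to graph when looking for a peak.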
The 340 peaks at a unique string length of 17 with a count of 26 and then drops rather sharply. I find it strange, it’s quite high and not so smooth. At first I thought the "+" symbols were somehow involved because of 340 / 24 being close to 17 but after removing the "+" symbols it still peaks at 17.
What follows is an image that has the length of the unique string that follows for each symbol of the 340.
https://www.dropbox.com/s/gk8bhh3htwy7g … 2.png?dl=0
Oh, ok. So if I understand the measurement correctly, it is a way to test randomness. It is an interesting measurement and seems inexpensive to compute. In the past, I explored some more expensive measurements, such as detecting rare patterns and estimating their probabilities. I was curious if any routes or transpositions of the cipher text produce increased appearances of improbable patterns.
Examples of improbable patterns include: Long sequences of homophone cycles, and large numbers of repeated n-grams and other repeating fragments. The candidate homophone cycle "l*M", for instance, appears like this in the 340: [l*M] [l*M] [l*M] lM [l*M] [l*M] [l*M]. Based on that sequence and the frequency of its constituent symbols it’s possible to estimate the probability of it occurring by chance. If the pattern was instead "l+M", the probability would be higher since "+" appears so often.
Other low probability repeated pairs of patterns in the 340 include "J??p7", "5?4?.", and "O?*?C" (where "?" are wildcards). I’ve been wanting to explore more candidate transpositions/routes that might produce more such patterns, perhaps indicating more structured underlying plaintext (assuming the transformation was performed after applying the symbol substitutions).
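As a rough illustration of how repeated wildcard patterns like these could be searched for, here is a minimal Python sketch. It is only one possible approach, not the tool actually used: the function name and the mask notation ("x" for a fixed position, "?" for a wildcard) are assumptions, and it presumes "?" is not itself a symbol in the transcription.

```python
from collections import defaultdict

def repeated_wildcard_patterns(text, mask):
    """Find patterns that occur at least twice, ignoring wildcard positions.

    mask uses "x" for positions that must match and "?" for wildcards,
    e.g. "x??xx" for patterns shaped like "J??p7".
    """
    hits = defaultdict(list)
    size = len(mask)
    for pos in range(len(text) - size + 1):
        window = text[pos:pos + size]
        key = "".join(ch if m == "x" else "?" for ch, m in zip(window, mask))
        hits[key].append(pos)
    # keep only the patterns that repeat
    return {key: positions for key, positions in hits.items() if len(positions) > 1}
```

Running it over a transposed cipher text with a handful of masks gives a list of repeats, but it says nothing about how improbable each one is. That still needs the symbol frequencies.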
A problem with measurements of randomness is that they don’t distinguish between interesting and uninteresting symbol repetitions. Uninteresting repetitions are the kinds that are very easily obtained by chance. This is why I tend to focus on discovery of improbable patterns, because they might suggest underlying message structure.
I’m going to study more of your posts to try to gain some ideas about which transformations might be worth exploring. Thanks for all of your efforts!
Every time I check in on one of these cipher threads, I leave with my head spinning. Good luck with your research, guys.
There is more than one way to lose your life to a killer
http://www.zodiackillersite.com/
http://zodiackillersite.blogspot.com/
https://twitter.com/Morf13ZKS
Hey doranchak,
I probably need a break from the cipher work. That being said I have some ideas which you could explore.
Instead of information moving horizontally through the cipher as normal:
1 - 2 - 3 - 4 - 5 - 6 - 7 - 8 - 9 -
Snake like patterns throughout the whole grid.
2 - 3   6 - 7
|   |   |   |   |
1   4 - 5   8 - 9
Or diagonal transpositions in pairs of two, three, etc.
12 34 56 78
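For the snake-like pattern, a reading order could be generated along these lines. This is only a sketch of one interpretation in Python: it assumes the snake runs in bands two rows high, starts at the bottom left of each band, and that every band is traversed left to right. The function names are hypothetical.

```python
def band_snake_order(rows, cols, band=2):
    """Return grid indices (row-major) in the order the snake visits them."""
    order = []
    for top in range(0, rows, band):
        band_rows = list(range(top, min(top + band, rows)))
        for col in range(cols):
            # even columns run bottom to top, odd columns top to bottom
            column = band_rows[::-1] if col % 2 == 0 else band_rows
            order.extend(r * cols + col for r in column)
    return order

def read_along(text, order):
    """Read the symbols of a row-major text in the given visiting order."""
    return "".join(text[i] for i in order)
```

For the 340 this would be `read_along(cipher, band_snake_order(20, 17))`. Undoing the route is a matter of inverting the index list.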
Thanks.
Is it possible for you to upload the transposed cipher texts that produced the highest values of your non-repeats measurement? I’m curious if other interesting features can be detected in those transpositions.
The highest are of course the horizontals, but these are the most significant "bumps" and they are both diagonal.
Input: horizontal. Output: NE-SE, oxcart even. 3669. (doing transposition)
HR>ËIB¢Ÿ„½^µF+K¼Ä EÐP²+·BÃÐÌIPµÑL<- ÌVLÐO£MоÄ+·º¹MŸu ^TNºÆ+S»Êˆ¢ƒ++B+F GÄDKuJVÔ³D±¤ÂFRB¤ ±W»ZH´¤Ë±uy+</ˆ°³ y<G¤Ð+/>ÃNZ½µ+¼VÔ •W·OMÐFX¸RÐÔ•SÃ+K ¢L+¼¾ËG+±OEMŸ•+B¢ £+¸RCVOF+I³·ÂŸ¤OW R±^Ä•·BJDT+³B´ÐÌB KF»¤ÊÃ^yµN±X•^FI¢ Ì-LÆŸ+B»Iµ²Ã•+£/£ OIJAVÐIµÆ»IÆ<LTÐB ¢±°uÂJF^„+MC+HIO» G³ÌTRBN³-Ñ°+SFy-Ë K¤MÌÃF¼µG²£O˸CÐS -KâGCZ±LWPÄBÃN¤I O<ƒÌEuR+CÃWÔ>HZAK ±¾R>VÃT¤W<½MDO¾ƒ+
Input: SE-SW. Output: horizontal. 3592. (undoing transposition)
ı£GÆJTKH±L»¤K/²< ·RË¢I•L+³DIËy£+ˆ± L¼PW¢O+>¤LKVDWÐPF •¹³-^ºG´FËV+°ŸOÌO ZVµCG¤ABKFзu»IÄX yŸFMBR>¢+¾Ä»ÃNÃ<T ˆ¾+RBMÌÊ-u¸B½Â+ƒ+ +E+Ã^ÔO±+FÐЕ¢Ô-± HЄ½¤ÌƒO±OBMóµGL NŸÐ+Fº·R+y³BVZѣРBÐM^ÑÊZJDTFÃuMI£Ã S¼RKÆ+^Iµµ•VÆB¢C+ ¸¾+JÂ+E»IÂ>•W/-ƒÐ µ±+VÔIN³E^ÌT»K·GM uµJ+±CÐFHBI¢<Ì/R· µ¼O+SOAĤR̟Ƴ¢<O y¾-+ÃS^„KCP¸Ou<¼N »B°ÃBZ±°F²¤²WÔ¤¤G X´L¤½SÌB•+C<ËŸÃTW WÐIãÄNR+ËH+FDIM>
Maybe the second one will be of some interest. It somewhat overlaps with my findings about row 6 to 15 because they look visually similar for this transposition, although the visual similarity only exists mainly from row 8 to 13.
I’ve tried a few hundred thousand combinations of full grid transpositions like these with AZdecrypt, stacked rotations etc., and nothing interesting came out of it. I have also tried 32 distinct full grid spirals: 4 starting points * 4 directions * doing/undoing. Damn cipher won’t yield!
Also, if such a strong full grid transposition is actually present in the 340, it was surely done before the encoding. The non-repeat data strongly indicates horizontal encoding. Therefore, if transposition was done after or during encoding, it has to be somewhat subtle.
If not, then what could be the cause of some of my data sets being out of line with respect to the 408 and others? I’m thinking of one or a mix of the following: transposition of the plaintext, poor encoding and/or symbol-to-letter distribution (which can increase bigram counts by a huge factor), or a "large" number of filler symbols (untested). The plaintext being a word search is a really good fit for most of it. Untested languages and polyalphabetism do not correlate well with the bigram distribution found in the 340.