A diagonal shift?

Jarlve · 2015-03-07T15:50:42Z

Hey everyone, I may have identified a correlation between different sets of information that at first glance seems closely related to the work of traveller1st on explaining the pivots in: viewtopic.php?f=81&t=964&hilit=diagonal

The correlation seems so strong that I believe the first person who can make sense of the data has a chance of solving the 340. I will try to explain it in a way that everybody can understand.

First of all, the 340 is believed to be a homophonic substitution cipher similar to the 408. Various statistics back this up; more specifically, it very likely seems to be cyclic. Homophonic substitution means that every letter can have more than one symbol assigned to it. There are two variants, cyclic and random. Cyclic means that for each letter the symbols in its map are used in a fixed order, one after the other. For random, a die is rolled to determine the symbol. The two have very different statistical signatures. From now on I will refer to cyclic homophonic substitution as CHS and to random homophonic substitution as RHS.

The letter to symbol map for the 408.

In a sense, as I see it, homophonic substitution simply blurs the plaintext and the information in the cipher.

Some time ago I developed a system that uses the information of the non-repeats. I don't know if anyone else uses this data, but I have found it to be extremely powerful. Using every symbol in the cipher as a starting point, it measures the length of the string that follows until a symbol repeats. For instance, the CHS-like string "ABCABCABC" has 7 unique sub-strings with a length of 3. For RHS the string is more likely to be random and will have shorter sub-strings of non-repeats. Simply counting these sub-strings for any cipher gives a total equal to the cipher's length, so you have to multiply each count by the length of the string. This system is so powerful that it can easily distinguish between CHS and RHS, or between a plaintext, a random plaintext and Vigenère. It can also be used to determine writing direction.
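To make the difference between the two variants concrete, here is a minimal Python sketch. The symbol map `HOMOPHONES`, the function names and the sample text are made up for illustration; they are not the 408 key and not the code behind the numbers in this thread. It encodes the same text cyclically and randomly and scores both with the non-repeat measure described above.

```python
import random

# Hypothetical letter-to-symbol map (NOT the real 408 key): a few common
# letters get several symbols, every other character maps to its upper case.
HOMOPHONES = {"e": "123", "t": "45", "a": "67", "o": "89"}

def encode(text, cyclic, rng=None):
    """Homophonic encoding: cyclic (CHS) or random (RHS) symbol choice."""
    rng = rng or random.Random(0)
    used = {}                      # letter -> number of times encoded so far
    out = []
    for ch in text:
        symbols = HOMOPHONES.get(ch, ch.upper())
        if cyclic:                 # CHS: follow the symbol list in order
            n = used.get(ch, 0)
            out.append(symbols[n % len(symbols)])
            used[ch] = n + 1
        else:                      # RHS: "roll a die" for every occurrence
            out.append(rng.choice(symbols))
    return "".join(out)

def non_repeat_score(cipher):
    """Sum, over every starting point, of the length read before a repeat."""
    score = 0
    for i in range(len(cipher)):
        seen = set()
        for sym in cipher[i:]:
            if sym in seen:
                break
            seen.add(sym)
        score += len(seen)
    return score

sample = "thereareatleasttenotherstatesattheeasterngate"
print(non_repeat_score(encode(sample, cyclic=True)))
print(non_repeat_score(encode(sample, cyclic=False)))
```

Summing the run lengths directly gives the same number as multiplying each length's count by the length. On longer texts the cyclic encoding would be expected to score higher, but the two printed values here depend on the toy key and on the random seed.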
I took the information of the non-repeats to the next level and measured this data for 96 orientations. By orientation I mean the following: we write text in a left-to-right, top-to-bottom manner, so there is a primary direction and a secondary direction. I figured that our writing system is 2-dimensional and that using the common wind directions you get 16 different orientations. The actual system that I developed for this uses an input and an output direction, which already gives 240 (16x15) variations. I simplified this to either doing an orientation or undoing it, which relates to the output and the input respectively. For each of these 32 orientations I also added the option to alternate the primary direction, starting even or uneven. So there you have 96.

Is it necessary to gather the non-repeat data for 96 different orientations? How could this possibly relate to anything? To answer these questions I have to spill quite some data. I will use some ciphers for comparison, each capped to 340 characters. What follows is various data for various ciphers, including that of the non-repeats and other systems for reference later on. You don't have to interpret it right now. Just scroll down for now.

CHS ciphers:

408:

system 1: 3538, 11317
system 2: 5714
system 3: 3351

primary direction, secondary direction: normal, primary direction alternated starting even, uneven.

output: (do)
------------------------
normal directions ----->
e-s: 4692, 4247, 4466.
w-s: 4338, 4466, 4247.
e-n: 4338, 4466, 4247.
w-n: 4692, 4247, 4466.
rotations ------------->
s-e: 3062, 2907, 2997.
n-e: 3135, 2997, 2907.
s-w: 3135, 2997, 2907.
n-w: 3062, 2907, 2997.
diagonals 1 ----------->
ne-se: 2757, 2952, 3045.
sw-se: 3052, 3045, 2952.
ne-nw: 3052, 3045, 2952.
sw-nw: 2757, 2952, 3045.
diagonals 2 ----------->
se-sw: 3055, 3115, 2992.
nw-sw: 2802, 2992, 3115.
se-ne: 2802, 2992, 3115.
nw-ne: 3055, 3115, 2992.
------------------------

input: (undo)
------------------------
normal directions ----->
e-s: 4692, 4247, 4466.
w-s: 4338, 4466, 4247.
e-n: 4338, 4247, 4466.
w-n: 4692, 4466, 4247.
rotations ------------->
s-e: 3218, 3326, 3474.
n-e: 3315, 3474, 3326.
s-w: 3315, 3474, 3326.
n-w: 3218, 3326, 3474.
diagonals 1 ----------->
ne-se: 2578, 2531, 2828.
sw-se: 2890, 2828, 2531.
ne-nw: 2890, 2531, 2828.
sw-nw: 2578, 2828, 2531.
diagonals 2 ----------->
se-sw: 2979, 3007, 2882.
nw-sw: 3001, 2882, 3007.
se-ne: 3001, 3007, 2882.
nw-ne: 2979, 2882, 3007.
------------------------

ray_n:

system 1: 3488, 11518
system 2: 5629
system 3: 3530

primary direction, secondary direction: normal, primary direction alternated starting even, uneven.

output: (do)
------------------------
normal directions ----->
e-s: 5046, 4823, 4664.
w-s: 4452, 4664,...
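The orientation counts mentioned in the opening post (16, 240 and 96) can be reproduced with a short Python sketch. It assumes that an orientation is a primary compass direction paired with a secondary direction at a right angle to it, which matches the labels in the data (e-s, s-e, ne-se and so on). The names `COMPASS` and `orientations` are chosen here for illustration only.

```python
from itertools import permutations

COMPASS = ["n", "ne", "e", "se", "s", "sw", "w", "nw"]  # 45 degree steps

def orientations():
    """All (primary, secondary) pairs where the secondary direction is
    perpendicular to the primary one, e.g. ("e", "s") for normal writing."""
    pairs = []
    for i, primary in enumerate(COMPASS):
        for turn in (2, -2):                     # 90 degrees either way
            pairs.append((primary, COMPASS[(i + turn) % 8]))
    return pairs

pairs = orientations()
in_out = list(permutations(pairs, 2))            # distinct input/output pairs
variants = [(mode, p, alt)
            for mode in ("do", "undo")
            for p in pairs
            for alt in ("normal", "alternate even", "alternate uneven")]
print(len(pairs), len(in_out), len(variants))    # 16 240 96
```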

Jarlve

(@jarlve)

Posts: 2547

Famed Member

Topic starter

I may have found something interesting that could overthrow the word search idea: a difference of almost 15% in bigram counts between input and output (undoing and doing) full grid directional transpositions, most notably vertical and diagonal NE-SW. The thing here is that the bigram counts are higher for undoing the transposition. I will refer to the difference as either positive or negative.

The 408 also has a 15% difference, but it is positive, which I think is normal because information carries over to other directions. The ray_n cipher used in this thread exhibits a 7% positive difference and the 408 redone almost 32% positive. Even the word search is positive, but only by a few percent.

It seems to be somewhat consistent for the test ciphers to be positive. I will try to recreate the "effect". The base difference for the plaintext used (340 characters of the 408, found in an earlier post) is 99.4%, which is slightly negative, so to say. After encoding with a new, less perfect algorithm that is hopefully closer to Zodiac's encoding, it becomes 111.6%. So again positive.
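For readers who want to try a comparison of this kind, here is a rough Python sketch. The post does not define "bigram count" or how the percentage is formed, so two assumptions are made: a bigram count is the number of bigram occurrences beyond the first of each kind, and the percentage is the count for the "doing" text divided by the count for the "undoing" text. Both function names are hypothetical.

```python
from collections import Counter

def repeated_bigrams(seq):
    """Number of bigram occurrences beyond the first one of each kind."""
    counts = Counter(zip(seq, seq[1:]))
    return sum(n - 1 for n in counts.values())

def do_undo_ratio(done, undone):
    """Bigram repeats of the 'done' text as a percentage of the 'undone' one.
    Assumed reading: above 100 is 'positive', below 100 is 'negative'.
    Raises ZeroDivisionError if the 'undone' text has no repeated bigram."""
    return 100.0 * repeated_bigrams(done) / repeated_bigrams(undone)

print(repeated_bigrams("ABCABCABC"))   # 5
```

The two arguments of `do_undo_ratio` would be the same cipher read out after doing and after undoing a given transposition.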

Let’s apply a full grid diagonal NE-SE transposition:

iikiocifriaeirtil
leleethokgrnelgis
klpbicmndraglgrit
igeeusalomnanhtew
nlsmihifeafitigfp
puottwesdohtenfth
asinghutltscaoswr
snuntasaeonhsetei
ufincommmetkbabdi
eliemioerrcehlaew
lebensheeohtlrhdm
mthaltptrtsiateew
stslextrliwpllmii
suimeeurtinllosgy
oksgboiieiaicetme
oennygfindkevoube
viegaodrnebanoesw
lvnhtioavlllymuul
eitrnbealsleaaolr
tiaeechiyivncyity

The difference for this plaintext is 92%. Aha! That is a negative.
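The grid above can be reproduced with a small Python sketch. The convention is inferred from the grid itself and not from any stated definition: the first column reads i, l, k, i, n, p, which are letters 1, 2, 4, 7, 11 and 16 of "ilikekillingpeop...", so the text appears to be written along diagonals that run from the lower left to the upper right, one diagonal after the other. The function names and the 20 by 17 default are assumptions for illustration.

```python
def diagonal_order(rows, cols):
    """Cell coordinates in the assumed NE-SE order: each diagonal is walked
    from its lower-left end to its upper-right end."""
    order = []
    for d in range(rows + cols - 1):               # d = row + col
        for r in range(min(d, rows - 1), -1, -1):  # start low, move north-east
            c = d - r
            if c < cols:
                order.append((r, c))
    return order

def do_transposition(text, rows=20, cols=17):
    """Write a horizontally read text into the grid along the diagonals."""
    if len(text) != rows * cols:
        raise ValueError("text must fill the grid exactly")
    grid = [[""] * cols for _ in range(rows)]
    for ch, (r, c) in zip(text, diagonal_order(rows, cols)):
        grid[r][c] = ch
    return ["".join(row) for row in grid]

def undo_transposition(grid_rows):
    """Read the grid along the diagonals to get the horizontal text back."""
    rows, cols = len(grid_rows), len(grid_rows[0])
    return "".join(grid_rows[r][c] for r, c in diagonal_order(rows, cols))
```

Feeding the twenty rows above to `undo_transposition` should give back the horizontal plaintext if the inferred convention is right.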

Encoding CHS:

g(ihIOUj?Z';Q1m"A
nLf=;9&NiJ*cL)Y>p
iA6<g-VE@/b5nJ?(a
hY=;C%.fIW7'TGRL2
c)D+UeZ4=bjQm"546
6_N9aK;:@IBRLEjm&
.X>7JGC9AapO'N%21
DT_cRb:.=IEeX;mLg
C4(7-NVW+=9i'<@h
;nULVZI=*/O;BfbLK
)=;Tp&L=NGaA?e@W
+RB.nm691a%Q'R;L2
Dm:f=39*)"K6AnV>g
X_(W;LC/ahcf)IpY0
Ni%5<IUZ=Qb"-;R+L
N=E70Jj>T@i;SI_L
Sg=Y.N@?c;<'EILD2
AS7&m(NbSnf)0VC_A
=h91T;.n:fL'bI)*
aU.=;OGZ0QSc-0"R0

81%. So the encoding actually seems to accentuate the bigram difference.

What does it mean? What could it mean? I’m not sure, maybe it is a fluke, maybe an indication of a vertical/diagonal transposition scheme, or a word search with the majority of the words in these directions. Furthermore I want to add that I’m not so sure anymore of my previous assumption, that the transposition was done after encoding. Because a full grid plaintext transposition plus poor encoding can have a significant impact on the numbers of the non-repeats as well.

AZdecrypt

Posted : March 25, 2015 1:55 pm

doranchak

(@doranchak)

Posts: 2614

Member Admin

Jarlve, this is fascinating work and I’m eager to hear more. I’m trying to catch up on what you’ve done so far.

Can you give more details about how you do the calculation of non-repeats? How are you generating the substrings to score from the cipher text? I’m not following how that works.

Thanks!

http://zodiackillerciphers.com

Posted : March 26, 2015 11:33 pm

Jarlve

Hey doranchak, thanks for your interest.

About the non-repeats,

Consider every symbol as a starting point, and then count the length of the unique non-repeating string that follows. Once this is done for the entire cipher, multiply the count of each length by that length to give weight to longer non-repeating strings. For the 340 in the horizontal direction the score you then get is 4462. I also have an alternative "IoC" calculation for this since frequencies are involved. Graphing these frequencies is interesting.

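'assumption: cipher(1 to 340) holds the symbols as numbers from 0 to 255
dim as integer seen(0 to 255), nr_frequencies(1 to 340)
dim as integer i, j, counter, max_length, nr_score
for i=1 to 340 'each symbol as a starting point
erase seen 'forget the symbols seen from the previous starting point
counter=0
for j=i to 340 'count the length of the unique non-repeating string that follows
if seen(cipher(j)) then exit for 'until repeat is found
seen(cipher(j))=1
counter+=1
next j
if counter>max_length then max_length=counter
nr_frequencies(counter)+=1
next i
for i=1 to max_length
nr_score+=nr_frequencies(i)*i 'weight each count by its string length
next i
print nr_score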

The 340 peaks at a unique string length of 17 with a count of 26 and then drops rather sharply. I find it strange, it’s quite high and not so smooth. At first I thought the "+" symbols were somehow involved because of 340 / 24 being close to 17 but after removing the "+" symbols it still peaks at 17.
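A histogram like the one described here can be sketched in Python as follows. This is an illustration under the same reading of "non-repeat" as above (read from each starting point until a symbol repeats), with hypothetical function names, and it is run on the toy string from the first post and not on the 340.

```python
from collections import Counter

def run_lengths(cipher):
    """For every starting position, the number of symbols read before one repeats."""
    lengths = []
    for i in range(len(cipher)):
        seen = set()
        for sym in cipher[i:]:
            if sym in seen:
                break
            seen.add(sym)
        lengths.append(len(seen))
    return lengths

def length_histogram(cipher):
    """Map each non-repeat length to the number of starting points that have it."""
    return Counter(run_lengths(cipher))

hist = length_histogram("ABCABCABC")
peak_length, peak_count = hist.most_common(1)[0]
print(sorted(hist.items()), peak_length, peak_count)
# [(1, 1), (2, 1), (3, 7)] 3 7
```

`run_lengths` also gives the per-symbol values that an image like the one linked below would show.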

What follows is an image that has the length of the unique string that follows for each symbol of the 340.

https://www.dropbox.com/s/gk8bhh3htwy7g … 2.png?dl=0

AZdecrypt

Posted : March 27, 2015 3:06 am

doranchak

Oh, ok. So if I understand the measurement correctly, it is a way to test randomness. It is an interesting measurement and seems inexpensive to compute. In the past, I explored some more expensive measurements, such as detecting rare patterns and estimating their probabilities. I was curious if any routes or transpositions of the cipher text produce increased appearances of improbable patterns.

Examples of improbable patterns include: Long sequences of homophone cycles, and large numbers of repeated n-grams and other repeating fragments. The candidate homophone cycle "l*M", for instance, appears like this in the 340: [l*M] [l*M] [l*M] lM [l*M] [l*M] [l*M]. Based on that sequence and the frequency of its constituent symbols it’s possible to estimate the probability of it occurring by chance. If the pattern was instead "l+M", the probability would be higher since "+" appears so often.
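As a small illustration of how such a candidate cycle could be checked, the Python sketch below keeps only the symbols of the cycle and counts how many neighbouring pairs follow the cyclic order. It only counts; it does not attempt the probability estimate mentioned above. The function name is hypothetical.

```python
def cycle_matches(cipher, cycle):
    """Return (hits, pairs): how many neighbouring pairs in the filtered
    symbol stream follow the order given by `cycle`, and how many pairs
    there are in total."""
    stream = [s for s in cipher if s in cycle]
    nxt = {s: cycle[(i + 1) % len(cycle)] for i, s in enumerate(cycle)}
    hits = sum(1 for a, b in zip(stream, stream[1:]) if nxt[a] == b)
    return hits, max(len(stream) - 1, 0)

# The sequence quoted above, with the brackets removed:
print(cycle_matches("l*Ml*Ml*MlMl*Ml*Ml*M", "l*M"))   # (18, 19)
```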

Other low probability repeated pairs of patterns in the 340 include "J??p7", "5?4?.", and "O?*?C" (where "?" are wildcards). I’ve been wanting to explore more candidate transpositions/routes that might produce more such patterns, perhaps indicating more structured underlying plaintext (assuming the transformation was performed after applying the symbol substitutions).
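Locating a wildcard pattern like these is straightforward. The Python sketch below is one way to do it, assuming that "?" is not itself used as a cipher symbol in the transcription being searched. The function name and the test string are made up.

```python
def find_pattern(cipher, pattern, wildcard="?"):
    """Start positions where `pattern` fits, with `wildcard` matching anything."""
    hits = []
    for i in range(len(cipher) - len(pattern) + 1):
        window = cipher[i:i + len(pattern)]
        if all(p == wildcard or p == s for p, s in zip(pattern, window)):
            hits.append(i)
    return hits

print(find_pattern("xxJabp7yyJcdp7", "J??p7"))   # [2, 9]
```

Running the same search on each candidate transposition of the cipher text would show whether a route produces more such repeats.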

A problem with measurements of randomness is that they don’t distinguish between interesting and uninteresting symbol repetitions. Uninteresting repetitions are the kinds that are very easily obtained by chance. This is why I tend to focus on discovery of improbable patterns, because they might suggest underlying message structure.

I’m going to study more of your posts to try to gain some ideas about which transformations might be worth exploring. Thanks for all of your efforts!

http://zodiackillerciphers.com

Posted : March 27, 2015 5:02 pm

morf13

(@morf13)

Posts: 7527

Member Admin

Every time I check in on one of these cipher threads, I leave with my head spinning :lol: Good Luck with your research Guys

There is more than one way to lose your life to a killer

http://www.zodiackillersite.com/
http://zodiackillersite.blogspot.com/
https://twitter.com/Morf13ZKS

Posted : March 27, 2015 5:39 pm

Jarlve

Hey doranchak,

I probably need a break from the cipher work. That being said I have some ideas which you could explore.

Instead of information moving horizontally through the cipher as it normally would:

1 - 2 - 3 - 4 - 5 - 6 - 7 - 8 - 9 -

Snake like patterns throughout the whole grid.

2 - 3   6 - 7   
|   |   |   |   |
1   4 - 5   8 - 9

Or diagonal transpositions in pairs of two, three, etc.
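The snake route drawn above could be generated with a Python sketch like this one. It covers a single band of two rows exactly as in the drawing (up the first column, across, down the second, and so on). How the bands would be stacked to cover the whole grid is not specified in the post, so the `top` parameter is only one possible way to move the band down. Both function names are hypothetical.

```python
def snake_route(cols, band_rows=2, top=0):
    """Cells of one horizontal band visited column by column, going up the
    first column, down the next, and so on (row 0 is the top of the grid)."""
    route = []
    for c in range(cols):
        rows = range(top, top + band_rows)
        if c % 2 == 0:
            rows = reversed(rows)      # even columns run bottom to top
        route.extend((r, c) for r in rows)
    return route

def read_route(grid, route):
    """Read the symbols of `grid` (a list of row strings) along a route."""
    return "".join(grid[r][c] for r, c in route)

print(read_route(["2367", "1458"], snake_route(4)))   # 12345678
```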

AZdecrypt

Posted : March 28, 2015 12:30 pm

doranchak

Thanks.

Is it possible for you to upload the transposed cipher texts that produced the highest values of your non-repeats measurement? I’m curious if other interesting features can be detected in those transpositions.

http://zodiackillerciphers.com

Posted : March 28, 2015 1:31 pm

Jarlve

The highest are of course the horizontals, but these are the most significant "bumps" and they are both diagonal.

Input: horizontal. Output: NE-SE, oxcart even. 3669. (doing transposition)

HR>ËIB¢Ÿ„½^µF+K¼Ä
EÐP²+·BÃÐÌIPµÑL<-
ÌVLÐO£MÐ¾Ä+·º¹MŸu
^TNºÆ+S»Êˆ¢ƒ++B+F
GÄDKuJVÔ³D±¤ÂFRB¤
±W»ZH´¤Ë±uy+</ˆ°³
y<G¤Ð+/>ÃNZ½µ+¼VÔ
•W·OMÐFX¸RÐÔ•SÃ+K
¢L+¼¾ËG+±OEMŸ•+B¢
£+¸RCVOF+I³·ÂŸ¤OW
R±^Ä•·BJDT+³B´ÐÌB
KF»¤ÊÃ^yµN±X•^FI¢
Ì-LÆŸ+B»Iµ²Ã•+£/£
OIJAVÐIµÆ»IÆ<LTÐB
¢±°uÂJF^„+MC+HIO»
G³ÌTRBN³-Ñ°+SFy-Ë
K¤MÌÃF¼µG²£OË¸CÐS
-KÃ¢GCZ±LWPÄBÃN¤I
O<ƒÌEuR+CÃWÔ>HZAK
±¾R>VÃT¤W<½MDO¾ƒ+

Input: SE-SW. Output: horizontal. 3592. (undoing transposition)

Ä±£GÆJTKH±L»¤K/²<
·RË¢I•L+³DIËy£+ˆ±
L¼PW¢O+>¤LKVDWÐPF
•¹³-^ºG´FËV+°ŸOÌO
ZVµCG¤ABKFÐ·u»IÄX
yŸFMBR>¢+¾Ä»ÃNÃ<T
ˆ¾+RBMÌÊ-u¸B½Â+ƒ+
+E+Ã^ÔO±+FÐÐ•¢Ô-±
HÐ„½¤ÌƒO±OBMÃ³µGL
NŸÐ+Fº·R+y³BVZÑ£Ð
BÐM^ÑÊZJDTFÃuMI£Ã
S¼RKÆ+^Iµµ•VÆB¢C+
¸¾+JÂ+E»IÂ>•W/-ƒÐ
µ±+VÔIN³E^ÌT»K·GM
uµJ+±CÐFHBI¢<Ì/R·
µ¼O+SOAÄ¤RÌŸÆ³¢<O
y¾-+ÃS^„KCP¸Ou<¼N
»B°ÃBZ±°F²¤²WÔ¤¤G
X´L¤½SÌB•+C<ËŸÃTW
WÐIÃ£ÄNR+ËH+FDIM>

Maybe the second one will be of some interest. It somewhat overlaps with my findings about rows 6 to 15, because they look visually similar under this transposition, although the similarity is mainly limited to rows 8 to 13.

I’ve tried a few hundred thousand combinations of full grid transpositions like these with AZdecrypt, stacked rotations etc., and nothing interesting came out of it. I have also tried 32 distinct full grid spirals: 4 starting points * 4 directions * doing/undoing. Damn cipher won’t yield!
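For anyone who wants to repeat the spiral experiment, here is a Python sketch of one such route: starting in the top-left corner and winding clockwise towards the centre. It is not the code used for the tests mentioned above. The other variants would follow by mirroring or rotating the grid first, which is not shown. The function names are hypothetical.

```python
def spiral_order(rows, cols):
    """Cell coordinates of a clockwise inward spiral from the top-left corner."""
    top, bottom, left, right = 0, rows - 1, 0, cols - 1
    order = []
    while top <= bottom and left <= right:
        order += [(top, c) for c in range(left, right + 1)]         # east
        order += [(r, right) for r in range(top + 1, bottom + 1)]   # south
        if top < bottom:
            order += [(bottom, c) for c in range(right - 1, left - 1, -1)]  # west
        if left < right:
            order += [(r, left) for r in range(bottom - 1, top, -1)]        # north
        top, bottom, left, right = top + 1, bottom - 1, left + 1, right - 1
    return order

def undo_spiral(grid_rows):
    """Read a grid (list of row strings) along the spiral into one string."""
    return "".join(grid_rows[r][c]
                   for r, c in spiral_order(len(grid_rows), len(grid_rows[0])))

print(undo_spiral(["123", "894", "765"]))   # 123456789
```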

AZdecrypt

Posted : March 28, 2015 10:49 pm

Jarlve

Also, if such a strong full grid transposition is actually present in the 340, it was surely done before the encoding. The non-repeat data strongly indicates horizontal encoding. Therefore, if a transposition was done after or during encoding, it has to be somewhat subtle.

If not, then what could be the cause of some of my data sets being out of line with respect to the 408 and others? I’m thinking of one or a mix of the following: transposition of the plaintext, poor encoding and/or symbol to letter distribution (which can increase bigram counts by a huge factor), or a "large" number of filler symbols (untested). The plaintext being a word search is also a really good fit for most of it. Untested languages and polyalphabetism do not correlate well with the bigram distribution found in the 340.

AZdecrypt

Posted : March 29, 2015 12:08 am

Zodiac Discussion Forum