Repeated symbols by rows

Largo

(@largo)

Posts: 454

Honorable Member

Topic starter

Yesterday I watched (again) David Oranchak's great presentation about the Zodiac ciphers:
https://www.youtube.com/watch?v=BV5R3TBMWJg
I like this video because it is very informative and points out a lot of the odd things in z340. (By the way, Klaus Schmeh also appears in this video. I met him at the "Historical Ciphers Colloquium" in Germany a couple of months ago. He has written several great books and also has a very interesting blog.)

Doranchak, in your presentation you talk about the rows that do not contain repetitions (e.g. rows 1, 2, 3 and 11, 12, 13). In the past I have experimented a lot with this information and Dan Olson's assumption. But yesterday I asked myself why a repetition measurement should consider only the length of a row. If you take a text and write it into a grid, the number of lines without repetitions depends on the width of the rows. It says nothing about the "chunks" that have no repetitions. If you have e.g. a "+" at the end of row 1 and a "+" at the beginning of row 2, then no repetition is recognized. I don’t see what the repetitions within a row are supposed to tell us. This makes me wonder whether even the symmetry between the upper and lower half of z340 is something special.
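The dependence on grid width can be made concrete with a minimal Python sketch. The function name `rows_without_repeats` and the toy string are hypothetical, chosen only for illustration; this is not the code behind any of the results in this thread.

```python
def rows_without_repeats(text, width):
    """Count grid rows of the given width that contain no repeated symbol.

    A shorter final row (when len(text) is not a multiple of width)
    is counted like any other row; that is an assumption of this sketch.
    """
    rows = [text[i:i + width] for i in range(0, len(text), width)]
    return sum(1 for row in rows if len(set(row)) == len(row))


# Toy string, not a real cipher: two adjacent "+" symbols.
sample = "ABC++DEF"

# Width 4: the "+" pair is split across rows 1 and 2, so both rows look clean.
print(rows_without_repeats(sample, 4))  # 2

# Width 8: the same "+" pair now sits inside one row and counts as a repeat.
print(rows_without_repeats(sample, 8))  # 0
```

The same eight symbols give two "clean" rows or none, depending only on where the row break falls.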

I have implemented a small test which extracts chunks of symbols without a repetition. I am sure someone else has done this before. Comparing z408 to z340, I do not see many interesting differences. The only curious thing is that z408 contains longer chunks without repetition than z340 does. At first glance it seems that z340 is more repetitive than z408, but I think that is because of the large number of plus signs. What do you think, is this something worth a closer look?
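A chunk extraction of this kind can be sketched in a few lines of Python. The splitting rule is an assumption read off the listing below: a chunk ends as soon as the next symbol already occurs in it, and that symbol starts the next chunk. The function name `split_into_chunks` is hypothetical.

```python
def split_into_chunks(text):
    """Greedily split text into runs that contain no repeated symbol.

    Assumed rule: when a symbol is already present in the current
    chunk, close the chunk and start a new one with that symbol.
    """
    chunks = []
    current = ""
    for symbol in text:
        if symbol in current:
            chunks.append(current)
            current = symbol
        else:
            current += symbol
    if current:
        chunks.append(current)
    return chunks


# The first 36 symbols of the z408 transcription used in this post.
start_of_z408 = "abPcZcUBbdORefXeBWV+gGYFhaHPiKjkYgMJ"
for chunk in split_into_chunks(start_of_z408):
    print(chunk)
```

On this prefix the sketch gives back the first four z408 chunks of the listing (abPcZ, cUBbdORefX, eBWV+gGYFhaHPiKjk, YgMJ), which suggests the assumed rule matches the one used for the comparison.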

ATM I am a bit tired. I will test it again without the plus signs and bring up some statistics. At the moment it is just "visual".

Comparison of chunks without repetition between z408 and z340 with my transcription:

z408:
abPcZ
cUBbdORefX
eBWV+gGYFhaHPiKjk
YgMJ
YlUIdmkTnNQ
YDopS1carBPORAUbs
RtkEdlLMZJvyzfFHVWgwYi+
kGDaKIph
kXwoxS1RNnjYEtO
wkGBTQSr
BLvcPr
BiXkEHMUlR
RdCZKkfIpW
kjwoLMyarBPDR+uehzN1gEZHdF
ZCfOVWIo+nLptlRhH
IaDRqTYyzvgciXJQAPoMw
RUnbLpNVEKHeGyIjJdoaw
LMtNApZ1PxUfd
AarBVWz+
VTnOPleSytsUghmDxGb
bIMNdpSCEca
b
bZsAPrBVfgXkW
kqFrwC+iaA
aBbOToRUC+qvYk
qlSkWVZgGYKE
qTYAabrLn
qHjFBXax
XADvzmLjekqg
vr
rhgoPORXQFbGCZiJTnkqw
JI+yBPQWhVEX
yaWIhkEHMpeu


z340:
HERabcdVPeIfLTGghN
b+BjkOlDWYmnoKpq
BrstM+UZGWjqLkuHJSb
bvdcwoVx
bO+
+RKgyzM
+u12hI7FP
+34e5bwRdFcO-ohC
eFagDjk7+KQl8
gUtXGVmuLIj
GgJp2kO+yNYu
+9LzhnM
+0
+ZRgFBtrA#4K-ucUV
+dJ
+ObvnFBr-U
+R571EIDYBb0TMKOgntc
RJIo7T4Mm+3BFu#zSrk
+NI7FBtj8wRcG
FNdp7g40mtV
41+
+rBXfos4zCEaVUZ7-
+ItmxuBKjObd
mpMQGgRtT+Lf#Cn
+FcWBIqL
+
+qWCu
WtPOSHT5jqbIFeh
Wnv1ByYO

Same comparison with original symbols. Sorry for the misaligned rows. There is something wrong with my fonts, so the line spacing is wrong (my fonts are not even monospaced at the moment; I will correct that later and contribute them to this forum if you like):

Posted : August 12, 2016 12:36 am

doranchak

(@doranchak)

Posts: 2614

Member Admin

I think the presence of clusters of "non repeat" rows suggests that the cipher author may have begun the encipherment scheme on or near those lines. It is presumably easier to avoid symbol repetitions at the beginning of homophonic encipherment than later (this could be tested in homophonic encipherment simulations). It also seems to confirm the direction of encipherment, since the non-repeat phenomenon doesn’t occur when reading by columns. But in z408, the non-repeat rows do not appear at the beginnings of the three sections so they do not coincide with the start of the sections of cipher.
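One possible starting point for such a simulation, as a rough Python sketch. Everything here is an assumption made for illustration: the helper names (`make_key`, `encipher`, `clean_row_flags`), the toy plaintext, the homophone counts, and above all the strict cycling through each letter's homophones. A human encipherer may choose symbols far more freely.

```python
from itertools import cycle


def make_key(homophone_counts):
    """Give each plaintext letter its own block of integer cipher symbols."""
    key, next_symbol = {}, 0
    for letter, count in homophone_counts.items():
        key[letter] = list(range(next_symbol, next_symbol + count))
        next_symbol += count
    return key


def encipher(plaintext, key):
    """Encipher by cycling through each letter's homophones in order.

    Every plaintext character must be present in the key.
    """
    cyclers = {letter: cycle(symbols) for letter, symbols in key.items()}
    return [next(cyclers[ch]) for ch in plaintext]


def clean_row_flags(cipher, width):
    """Return one boolean per row: True if the row has no repeated symbol."""
    rows = [cipher[i:i + width] for i in range(0, len(cipher), width)]
    return [len(set(row)) == len(row) for row in rows]


# Hypothetical toy setup; counts are not taken from any real key.
key = make_key({"A": 3, "T": 2, "C": 1, "K": 1, "D": 1, "W": 1, "N": 1})
cipher = encipher("ATTACKATDAWN", key)
print(clean_row_flags(cipher, 4))
```

Looking at where the `True` flags fall for longer plaintexts and different selection rules would show whether non-repeating rows really cluster near the start of encipherment.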

I did some shuffle tests for repeated symbols by rows (for z408, and for z340). The results seem to confirm the presence of a deliberate scheme to avoid repetition of symbols. I suspect it may simply be a natural result of homophonic substitution (which would be some evidence that z340’s author made a real attempt to encode a real message). Also, the fact that z340 has 9 rows with no repeats (compared to z408’s 6 rows) seems to indicate even more deliberate encipherment of some kind, because random placement of symbols would tend to have a lot more repeats in each row. You are right that it depends on the grid width, so this measurement does seem arbitrary. But even at width 17, the presence of 9 non-repeating rows is still a very strong statistical anomaly when compared to random shuffles. In fact, none of my 1,000,000 shuffles produced as many as 9 non-repeating rows.
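For readers who want to try this kind of shuffle test themselves, here is a minimal Python sketch. It is not the code used for the results above; the function names, the seed, and the toy inputs are assumptions, and the trial count is kept far below 1,000,000 so it runs quickly.

```python
import random


def rows_without_repeats(symbols, width):
    """Count rows of the given width that contain no repeated symbol."""
    rows = [symbols[i:i + width] for i in range(0, len(symbols), width)]
    return sum(1 for row in rows if len(set(row)) == len(row))


def shuffle_test(text, width, trials=10000, seed=0):
    """Compare the observed count of clean rows against random shuffles.

    Returns (observed, fraction of shuffles with at least as many clean rows).
    """
    rng = random.Random(seed)
    observed = rows_without_repeats(text, width)
    symbols = list(text)
    hits = 0
    for _ in range(trials):
        rng.shuffle(symbols)  # reshuffle in place on every trial
        if rows_without_repeats(symbols, width) >= observed:
            hits += 1
    return observed, hits / trials


# Toy input, not a real cipher: every row is a permutation of "ABCD".
observed, fraction = shuffle_test("ABCD" * 4, width=4, trials=2000)
print(observed, fraction)
```

A fraction of zero, as with the 9 non-repeating rows of z340, only bounds the probability from above by roughly one over the number of trials.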

Your observation about the arbitrariness of grid width makes me think that we need to run the same tests for a wider variety of grid widths. Then we can compute the statistical significance (sigma) of each width, and see if it peaks at width 17 or at some other width. I’ve added this to my already long TO-DO list.
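A rough Python sketch of that width scan, under stated assumptions: "sigma" is taken here to mean the number of standard deviations by which the observed count of non-repeating rows exceeds the shuffle mean, and the names `sigma_by_width` and `rows_without_repeats` are hypothetical.

```python
import random
import statistics


def rows_without_repeats(symbols, width):
    """Count rows of the given width that contain no repeated symbol."""
    rows = [symbols[i:i + width] for i in range(0, len(symbols), width)]
    return sum(1 for row in rows if len(set(row)) == len(row))


def sigma_by_width(text, widths, trials=1000, seed=0):
    """For each width, return (observed - shuffle mean) / shuffle std dev.

    The value is None when all shuffles give the same count (std dev 0).
    """
    rng = random.Random(seed)
    result = {}
    for width in widths:
        observed = rows_without_repeats(text, width)
        symbols = list(text)
        counts = []
        for _ in range(trials):
            rng.shuffle(symbols)
            counts.append(rows_without_repeats(symbols, width))
        sd = statistics.pstdev(counts)
        if sd == 0:
            result[width] = None
        else:
            result[width] = (observed - statistics.mean(counts)) / sd
    return result


# Toy input, not a real cipher; a real run would pass the z340 transcription.
for width, sigma in sigma_by_width("ABCD" * 6, [3, 4, 6], trials=500).items():
    print(width, sigma)
```

Because the count of clean rows is far from normally distributed at some widths, the sigma values are best read as a ranking across widths rather than as exact probabilities.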

http://zodiackillerciphers.com

Posted : August 12, 2016 2:39 pm

doranchak

(@doranchak)

Posts: 2614

Member Admin

(By the way, Klaus Schmeh also appears in this video. I met him at the "Historical Ciphers Colloquium" in Germany a couple of months ago. He has written several great books and also has a very interesting blog.)

Klaus is a great guy; he’s really enthusiastic about collecting stories about codes and ciphers. I can be seen in the video for his talk as well. He recently added some English language posts to his blog.

I have implemented a small test which extracts chunks of symbols without a repetition. I am sure someone else has done this before. Comparing z408 to z340, I do not see many interesting differences. The only curious thing is that z408 contains longer chunks without repetition than z340 does. At first glance it seems that z340 is more repetitive than z408, but I think that is because of the large number of plus signs. What do you think, is this something worth a closer look?

There is a feature in my old "cryptoscope" tool that looks for such sequences. Scroll down to where it says "Largest non-repeating sequences", then click "Show all and chart".

I think it may be useful to compute the mean length of these chunks for z340 and z408, then compare that to random shuffles. I suspect the result will confirm that z408 and z340 have mean chunk lengths that are very significant compared to shuffles. I think all we can really conclude from it is that it is the effect of homophonic substitution (in a horizontal direction), so z408 and z340 are similar in that regard. Also, it might be interesting to answer this question: Is z340’s mean chunk length more or less statistically significant than z408’s?
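The comparison could be sketched in Python along these lines. The chunking rule (a new chunk starts at the first symbol already present in the current chunk) and the function names are assumptions, and the toy input stands in for a real transcription.

```python
import random
import statistics


def mean_chunk_length(symbols):
    """Mean length of greedy non-repeating chunks of a symbol sequence."""
    if not symbols:
        return 0.0
    chunks, seen = 1, set()
    for symbol in symbols:
        if symbol in seen:
            chunks += 1       # repeat found: close chunk, start a new one
            seen = {symbol}
        else:
            seen.add(symbol)
    return len(symbols) / chunks


def chunk_length_sigma(text, trials=1000, seed=0):
    """Return (observed mean chunk length, sigma versus random shuffles).

    Sigma is None when every shuffle gives the same mean chunk length.
    """
    rng = random.Random(seed)
    observed = mean_chunk_length(text)
    symbols = list(text)
    shuffled = []
    for _ in range(trials):
        rng.shuffle(symbols)
        shuffled.append(mean_chunk_length(symbols))
    sd = statistics.pstdev(shuffled)
    if sd == 0:
        return observed, None
    return observed, (observed - statistics.mean(shuffled)) / sd


# Toy input, not a real cipher.
print(chunk_length_sigma("ABCD" * 3, trials=500))
```

Running the same function on both transcriptions with the same number of trials would give two sigma values that can be compared directly, which addresses the question of whether z340 or z408 is the more significant of the two.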

http://zodiackillerciphers.com

Posted : August 12, 2016 2:50 pm

traveller1st

(@traveller1st)

Posts: 3583

Member Moderator

Klaus is a great guy; he’s really enthusiastic about collecting stories about codes and ciphers. I can be seen in the video for his talk as well. He recently added some English language posts to his blog.

Brilliant.

“I don’t know Chief, he’s very smart or very dumb.”

Posted : August 13, 2016 4:26 am

Largo

(@largo)

Posts: 454

Honorable Member

Topic starter

Thank you for the explanation, doranchak! Now I understand why it absolutely makes sense to check how many rows have no repeated letters. For me this is good evidence that, if any kind of transposition is involved, it was applied before the homophonic substitution. Otherwise the cipher would look much more random.
I have not spent much time on a more detailed analysis because I have so many other promising ideas at the moment. My to-do list is also very long.

Posted : August 14, 2016 3:34 pm

Zodiac Discussion Forum
