Zodiac Discussion Forum

Homophonic substitution

smokie treats
(@smokie-treats)
Posts: 1626
Noble Member
 

.
I came up with column 1 as the best possibility for gibberish. Columns 13 and 16 were a distant second place. Random shuffles of columns 2-12, 14, 15 and 17 almost never resulted in higher overall period 19 repeat counts or total scores.

On my list of things to do is work on showing the bigram repeats by color shading so that they can be matched up easily with the eye. Also, work on developing messages with high numbers of period 19 repeats with fewer than 63 symbols. How much fewer I would like to know. Work on shuffling small portions of messages to try to identify gibberish. Explain why random shuffles of small portions of transposition ciphers that are not gibberish do not result in higher period 19 repeats and scores, whereas random shuffles of gibberish do result in higher period 19 repeats and scores. And apply the column shuffle experiment on the 340 if I was correct about EDIT: column 1.
.

 
Posted : December 2, 2015 5:48 am
(@mr-lowe)
Posts: 1197
Noble Member
 

Can someone put up a set of 15 bigram repeats so I can see if / how they would fit into my setup.
Just thinking that my parking the evens adjacent to the odds is only one of several permutations. It may be that they are two totally different codes, odds and evens that is.

 
Posted : December 2, 2015 6:33 am
Jarlve
(@jarlve)
Posts: 2547
Famed Member
Topic starter
 

@Mr lowe,

Normal 340, period 19, 2-gram repeats:
+|             2
+l             2
+B             2
+k             2
+4             3
+c             2
MF             2
|<             2
|T             2
#2             3
.L             2
BO             2
*5             2
k.             2
-R             2
p+             4
(+             3
z6             2
9^             2
N:             2
D(             2
PY             2
OF             2
G+             3
<S             3
^D             2
TB             2
YA             2
;+             2
Xz             2
Total 1: 37   Total 2: 90

Mirrored 340, period 15, 2-gram repeats:
BO             2
+B             3
+k             2
+M             2
+l             3
+4             3
*5             2
k.             2
.L             2
(+             3
MV             2
MF             2
2p             2
cF             2
z6             2
|T             2
|p             2
K<             2
<S             3
G+             2
TB             2
)+             2
p+             5
N:             2
^D             2
Xz             2
YA             2
OF             2
#2             3
;+             2
9^             2
PY             2
Total 1: 41   Total 2: 106
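
A minimal sketch of how totals like these could be computed. It assumes a period-p bigram pairs the symbol at position i with the symbol at position i + p (no wrap-around), that "Total 1" adds c - 1 for each repeated bigram, and that "Total 2" adds c*(c-1). Those two formulas do reproduce the totals in the two lists above (37 / 90 and 41 / 106) from the listed counts, but the pairing convention and the function names are assumptions, not AZdecrypt's actual code.

```python
from collections import Counter

def periodic_bigrams(symbols, period):
    """Count bigrams formed by pairing position i with position i + period."""
    return Counter(
        (symbols[i], symbols[i + period])
        for i in range(len(symbols) - period)
    )

def repeat_totals(counts):
    """Return (total 1, total 2) over bigrams that occur at least twice."""
    repeated = [c for c in counts.values() if c >= 2]
    total_1 = sum(c - 1 for c in repeated)        # every extra occurrence counts once
    total_2 = sum(c * (c - 1) for c in repeated)  # frequent bigrams weigh more
    return total_1, total_2

# Toy message (not the 340): "AB" occurs three times at period 1.
toy = list("ABXABYAB")
print(repeat_totals(periodic_bigrams(toy, 1)))
```

For the 340 the input would be the 340 symbols in reading order with period=19, or the mirrored rows with period=15.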

AZdecrypt

 
Posted : December 2, 2015 1:04 pm
Jarlve
(@jarlve)
Posts: 2547
Famed Member
Topic starter
 

@smokie,

I’ve taken the liberty of automating your test idea and I’m also working on visualization. It’s very exciting! Here are some results for my test ciphers. The "distance average" is short for distance from average and is the important number to watch: higher is more indicative of a gibberish fragment. My calculation for bigrams is c*(c-1) per unique bigram.

Fragment randomization bigram response test for: jarlve_row1.txt
-----------------
Period: 19. Fragment size: 17. Step: 17.
-----------------
Fragment number, fragment range, average bigrams, distance average.
-----------------
1: 1 to 17:                 77            -7.6
2: 18 to 34:                86            1.4
3: 35 to 51:                82            -2.8
4: 52 to 68:                84            -0.6
5: 69 to 85:                82            -2.9
6: 86 to 102:               88            3.3
7: 103 to 119:              84            -1.2
8: 120 to 136:              85            0.3
9: 137 to 153:              85            0.1
10: 154 to 170:             88            3.3
11: 171 to 187:             94            8.7
12: 188 to 204:             97            12.1 <---
13: 205 to 221:             90            5.3
14: 222 to 238:             79            -5.9
15: 239 to 255:             84            -0.6
16: 256 to 272:             90            4.7
17: 273 to 289:             81            -3.9
18: 290 to 306:             77            -8.4
19: 307 to 323:             84            -0.4
20: 324 to 340:             80            -4.7

Row 12 is correct.

Fragment randomization bigram response test for: jarlve_row2.txt
-----------------
Period: 19. Fragment size: 17. Step: 17.
-----------------
Fragment number, fragment range, average bigrams, distance average.
-----------------
1: 1 to 17:                 73            5.5
2: 18 to 34:                67            -0.8
3: 35 to 51:                70            2.2
4: 52 to 68:                74            6.2
5: 69 to 85:                66            -1.5
6: 86 to 102:               61            -6.6
7: 103 to 119:              64            -3.3
8: 120 to 136:              66            -1.8
9: 137 to 153:              62            -5.9
10: 154 to 170:             58            -9.8
11: 171 to 187:             73            5.4
12: 188 to 204:             71            3.1
13: 205 to 221:             59            -9.0
14: 222 to 238:             57            -10.2
15: 239 to 255:             64            -3.2
16: 256 to 272:             69            1.3
17: 273 to 289:             79            11.8
18: 290 to 306:             81            12.8 <---
19: 307 to 323:             72            4.8
20: 324 to 340:             67            -0.8

Row 18 is incorrect; the correct row is row 17, so I’m getting the same result.

Fragment randomization bigram response test for: jarlve_row2.txt
-----------------
Period: 1. Fragment size: 17. Step: 17.
-----------------
Fragment number, fragment range, average bigrams, distance average.
-----------------
1: 1 to 17:                 45            -3.0
2: 18 to 34:                42            -5.8
3: 35 to 51:                45            -3.0
4: 52 to 68:                48            -0.5
5: 69 to 85:                50            1.6
6: 86 to 102:               46            -2.0
7: 103 to 119:              44            -4.1
8: 120 to 136:              51            3.2
9: 137 to 153:              48            -0.5
10: 154 to 170:             48            -0.3
11: 171 to 187:             49            1.0
12: 188 to 204:             53            4.3
13: 205 to 221:             49            0.7
14: 222 to 238:             46            -2.3
15: 239 to 255:             51            3.1
16: 256 to 272:             49            0.4
17: 273 to 289:             56            7.8 <---
18: 290 to 306:             49            1.2
19: 307 to 323:             46            -2.0
20: 324 to 340:             48            0.1

With period 1, row 17 is correctly identified, probably because the row was inserted after encoding.

Early attempt at visualization for my second row cipher.

Brighter means a higher bigram response and so more chance of being gibberish; row 17 clearly is at full brightness. I think that by a process of summing, row 17 should win over row 18.

267: 267 to 283:            75            7.3
268: 268 to 284:            77            9.7
269: 269 to 285:            77            9.1
270: 270 to 286:            77            9.7
271: 271 to 287:            77            9.8
272: 272 to 288:            77            9.6
273: 273 to 289:            79            11.7 <--- from
274: 274 to 290:            79            11.5
275: 275 to 291:            79            11.4
276: 276 to 292:            80            12.0
277: 277 to 293:            80            12.6
278: 278 to 294:            79            11.7
279: 279 to 295:            80            12.0
280: 280 to 296:            80            12.2
281: 281 to 297:            81            13.3
282: 282 to 298:            79            11.2
283: 283 to 299:            79            11.0
284: 284 to 300:            81            13.0
285: 285 to 301:            79            11.6
286: 286 to 302:            80            11.8
287: 287 to 303:            80            12.3
288: 288 to 304:            81            12.8
289: 289 to 305:            80            12.5
290: 290 to 306:            81            12.9
291: 291 to 307:            80            12.8
292: 292 to 308:            79            11.0
293: 293 to 309:            79            10.8
294: 294 to 310:            78            10.5
295: 295 to 311:            79            11.8 <--- to
296: 296 to 312:            78            9.9
297: 297 to 313:            75            7.1
298: 298 to 314:            74            6.6
299: 299 to 315:            74            6.3
300: 300 to 316:            74            6.5
301: 301 to 317:            73            4.9

This table is an excerpt of my second row cipher but with a stepping of 1. It can be seen that there is a wide section (about equal to the actual random fragment size?) of increased response so we should find something like that in the 340 also. I will work on that for tomorrow. Interesting stuff!
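
The fragment randomization test in this post can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual test program: it assumes each fragment is shuffled in place many times, that "average bigrams" is the mean c*(c-1) score over those shuffles, and that "distance average" is that mean minus the average over all fragments (which appears consistent with the tables above, within rounding). The names `fragment_response`, `trials` and `seed` are made up for the example.

```python
import random
from collections import Counter

def bigram_score(symbols, period):
    """Sum of c*(c-1) over bigrams pairing position i with i + period."""
    counts = Counter(zip(symbols, symbols[period:]))
    return sum(c * (c - 1) for c in counts.values())

def fragment_response(symbols, period=19, size=17, step=17, trials=200, seed=0):
    """Shuffle each fragment in place and report its average bigram score."""
    rng = random.Random(seed)
    rows = []
    for start in range(0, len(symbols) - size + 1, step):
        total = 0
        for _ in range(trials):
            trial = list(symbols)
            fragment = trial[start:start + size]
            rng.shuffle(fragment)                 # randomize only this fragment
            trial[start:start + size] = fragment
            total += bigram_score(trial, period)
        rows.append((start + 1, start + size, total / trials))
    mean = sum(avg for _, _, avg in rows) / len(rows)
    # (first position, last position, average score, distance from average)
    return [(a, b, avg, avg - mean) for a, b, avg in rows]

# Stand-in input: 340 random symbols from a 63-symbol alphabet (not a real cipher).
demo_rng = random.Random(1)
message = [demo_rng.randrange(63) for _ in range(340)]
for row in fragment_response(message, trials=20)[:3]:
    print(row)
```

Setting step=1 gives the sliding-window version of the table, with one line per starting position.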

AZdecrypt

 
Posted : December 2, 2015 2:48 pm
Jarlve
(@jarlve)
Posts: 2547
Famed Member
Topic starter
 

I came up with column 1 as the best possibility for gibberish.

Correct! Well done.

AZdecrypt

 
Posted : December 2, 2015 3:27 pm
doranchak
(@doranchak)
Posts: 2614
Member Admin
 

I updated my repeating ngram visualizer tool:

http://zodiackillerciphers.com/period-19-bigrams/

Now it shows repeating bigrams, trigrams, and fragments for the original 340 and the three best-scoring transposition schemes we’ve discussed so far (in terms of repeating ngrams).

http://zodiackillerciphers.com

 
Posted : December 2, 2015 3:34 pm
smokie treats
(@smokie-treats)
Posts: 1626
Noble Member
 

.
Jarlve, I am glad that you are having fun with the shuffle row / column idea. Be my guest and take all the liberties that you want. If you want to skip working on smokie14 for a while or indefinitely, that is fine. Let me know and I will show you the distortion areas compared to the heat map.

Question: With a cipher that transposes plaintext into a period of 19 before encoding, why does shuffling of non-gibberish rows or columns most often not result in a higher count of period 19 bigram repeats or higher total repeat score, where the repeat scores are calculated with a probability formula using symbol counts and repeat counts as variables? Why does shuffling of gibberish rows or columns often result in higher repeat counts and total scores?

Does anybody want to take a crack at answering the question?
.

 
Posted : December 2, 2015 4:22 pm
(@mr-lowe)
Posts: 1197
Noble Member
 

@Mr lowe,

I’d say that’s a rather original interpretation and you did bring down the period 19 bigrams to period 1. I moved down your even columns by 1 as you mentioned because that gives the highest bigram count. It doesn’t solve though.

So the operations here are a column period 2 and an incremental per column (0,1,2,3…) up-shift. 38 bigrams.

Hi jarlve. Can you bring the even column down by two from my original setup and run that test again? I know I’m a pain in the ass but I think that will fix my problem.

 
Posted : December 2, 2015 10:25 pm
Jarlve
(@jarlve)
Posts: 2547
Famed Member
Topic starter
 

@smokie,

About your smokie14. I ran into a possible problem with AZdecrypt: it became unresponsive after processing about 3 million ciphers. But the program has processed more than this amount of ciphers previously. It could be a random lockup (computer issue) or a problem with the cipher input routine. I want to look at the possible program problem before working on smokie14, but if you want you can go ahead and show its scheme/etc., because I probably won’t take another look at AZdecrypt until next year.

Question: With a cipher that transposes plaintext into a period of 19 before encoding, why does shuffling of non-gibberish rows or columns most often not result in a higher count of period 19 bigram repeats or higher total repeat score, where the repeat scores are calculated with a probability formula using symbol counts and repeat counts as variables? Why does shuffling of gibberish rows or columns often result in higher repeat counts and total scores?

1: Because actual rows/fragments contain language, which by nature is more repetitive and therefore contains more bigram repeats (on average) than a random mix of characters. So randomizing these will likely destroy this information and reduce the bigram counts.

2: On average, one random should not be better than another random and I think that’s not the issue here. These gibberish rows just have a higher response than non-gibberish rows, because there is more potential for increase.

With the 408, Zodiac likely copied (pulled down) a trigram for the last row. Quite smart, isn’t it?

@Mr lowe,

39 bigrams, no solve.

H+M8|CV@K<Ut*5cZG
N:^j*Xz6-z/JNbVM)
BpzOUNyBO+l#2E.B)
SMF;+B<MF<Sf9pl/C
_Rq#2pb&R6N:(+H*;
p+fZ+B.;+G1BCOO|2
#2b^D4ct+B31c_8Tf
(MVE5FV52c+ztZ1*H
dl5||.UqLcW<Sk.#K
-RR+4>f|p+dpVW)+k
Ucy5C^W(cFHl%WO&D
29^4OFT-+M>#Z3P>L
zF*K<SBKdEB+*5k.L
lXz6PYAG)pclddG+4
y.LWBOLKJy7t-cYAy
|TC7z|<z2p+l2_cFK
R)WkPYLR/9^%OF7TB
+kN^D(+4(8KjROp+8
|DpOGp+2|5J+JYM(+
>R(UVFFz9G++|TB4-

AZdecrypt

 
Posted : December 3, 2015 1:44 am
smokie treats
(@smokie-treats)
Posts: 1626
Noble Member
 

.
Jarlve, here is a message with about 77 period 19 bigram repeats where AB AB are counted as two. Do the row shuffle analysis, compare to the 340, and tell me what you think. Does shuffling individual rows typically or often times cause the count and score for period 19 repeats to increase or decrease? Why or why not?

EDIT: This is smokie16a.

9 11 25 10 5 10 29 27 35 9 13 7 15 24 14 15 11
26 2 30 3 1 19 17 5 25 18 29 34 17 33 12 19 3
8 6 19 32 9 28 25 34 12 14 16 24 6 19 13 18 8
31 32 10 29 27 35 9 13 7 21 25 11 4 7 1 12 26
29 32 28 8 30 6 33 16 16 26 17 18 2 24 3 31 19
34 26 12 1 13 9 17 28 8 30 12 14 34 18 4 31 32
7 5 16 33 19 17 1 13 25 12 31 27 14 6 1 35 11
28 33 10 29 27 35 34 14 12 24 18 8 9 32 7 7 25
20 26 17 12 30 28 8 5 12 33 34 18 28 8 16 29 11
27 9 13 7 24 22 15 26 16 30 32 3 5 25 18 29 17
24 20 26 13 2 30 28 18 5 16 28 8 31 32 7 24 18
28 9 13 7 23 14 19 16 16 33 3 10 34 14 6 6 21
25 18 8 1 7 29 16 35 28 8 26 2 30 17 18 15 31
16 28 33 6 9 18 25 34 28 8 1 18 21 8 5 32 29
4 9 24 25 21 29 11 27 2 26 16 30 2 14 16 13 9
32 15 31 16 1 4 25 3 5 31 13 4 1 35 11 28 8
24 29 8 31 20 26 10 9 27 35 30 4 21 25 11 27 2
5 3 33 12 24 12 23 17 35 1 20 26 34 29 21 9 11
27 32 14 18 7 25 20 30 23 33 19 12 23 13 31 12 5
2 24 3 1 19 17 26 23 14 19 21 29 35 11 28 16 23

Have fun with that and I will try to post tomorrow morning before my commute.
.

 
Posted : December 3, 2015 6:27 am
Jarlve
(@jarlve)
Posts: 2547
Famed Member
Topic starter
 

@smokie,

If possible, I think we should come to agree on a basic counting method for bigrams. AB AB = 1, because we are counting repeats. AB AB AB AB AB = 4, etc. I don’t see how your latest cipher could relate to the 340. It has only 35 symbols, 97 period 1 bigram repeats and 43 period 19 bigram repeats.

Does shuffling individual rows typically or often times cause the count and score for period 19 repeats to increase or decrease? Why or why not?

On average, random fragments will show a higher increase than non-random fragments, because non-random fragments contain language information that is more repetitive by nature.

Take p1 and r1 in my plaintext library, r1 is just the randomized counterpart of p1 with the same frequencies.

p1 period 1:
bigram count method 1: 191
bigram count method 2: 511 <– more than double of r1

r1 period 1:
bigram count method 1: 133
bigram count method 2: 221

Count method 2 is more relevant when factoring in encoding because it gives more weight to bigrams that appear more frequently. It is as follows. Per fragment, count the occurrences (AB AB AB, c=3) and then compute c*(c-1). Total over all fragments. You can also divide by 2 afterwards since all the added numbers are even. That gives the set of triangular numbers: 1, 3, 6, 10, 15, 21… (+1, +2, +3, +4, +5, +6…).

AB AB = 2 (1)
AB AB AB = 6 (3)
AB AB AB AB = 12 (6)
AB AB AB AB AB = 20 (10)
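
The small table above can be checked with a few lines of Python. This is only a worked illustration: the dictionary keys are made-up labels, and treating the plain "repeats" count as c - 1 follows the "AB AB = 1" convention proposed earlier in this post.

```python
def repeat_weights(c):
    """Score a bigram seen c times under the three conventions discussed."""
    return {
        "repeats": max(c - 1, 0),     # AB AB = 1, AB AB AB = 2, ...
        "method_2": c * (c - 1),      # AB AB = 2, AB AB AB = 6, ...
        "halved": c * (c - 1) // 2,   # triangular numbers 1, 3, 6, 10, ...
    }

for c in range(2, 6):
    print(c, repeat_weights(c))
```

Because c*(c-1) grows quadratically, a bigram seen five times contributes ten times as much as one seen twice, which is the extra weight on frequent bigrams.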

AZdecrypt

 
Posted : December 3, 2015 1:40 pm
Jarlve
(@jarlve)
Posts: 2547
Famed Member
Topic starter
 

I’ll ask a question now. Right shifting the columns/rows by 1 causes radical and strange changes/shifts in the 340’s bigrams at various periods, is that explicable?

dHER>pl^VPk|1LTG2
)Np+B(#O%DWY.<*Kf
JBy:cM+UZGW()L#zH
2Spp7^l8*V3pO++RK
/_9M+ztjd|5FP+&4k
(p8R^FlO-*dCkF>2D
|#5+Kq%;2UcXGV.zL
9(G2Jfj#O+_NYz+@L
Kd<M+b+ZR2FBcyA64
--zlUV+^J+Op7<FBy
OU+R/5tE|DYBpbTMK
F2<clRJ|*5T4M.+&B
Rz69Sy#+N|5FBc(;8
+lGFN^f524b.cV4t+
+yBX1*:49CE>VUZ5-
2|c.3zBK(Op^.fMqG
LRcT+L16C<+FlWB|)
p++)WCzWcPOSHT/()
c|FkdW<7tB_YOB*-C
+>MDHNpkSzZO8A|K;

Before shift:
Period 15 (mirrored): 41 bigrams, 2 trigrams.
Period 19: 37 bigrams, 2 trigrams.

After shift:
Period 15 (mirrored): 36 bigrams, 3 trigrams.
Period 19: 45 bigrams, 4 trigrams. <—

Basically, performing the operation causes period 15 dominance to shift toward period 19 while increasing the bigram and trigram count even further. What sort of transposition could cause such a response?

AZdecrypt

 
Posted : December 3, 2015 2:38 pm
smokie treats
(@smokie-treats)
Posts: 1626
Noble Member
 

.
Jarlve, I can count bigram repeats your way as well, but my count method uses the number of repeats in my probability formula:

Score A19B = 1 / [ ( count A / 340 ) * ( count B / 340 ) ] ^ count A19B. The bracketed term raised to that power is the probability of any particular bigram repeat with a randomly shuffled message, so a higher score means a less likely repeat. But I also find the natural logarithm of the score when ranking a list of different repeats, such as A19B, C19D, and E19F.
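
A minimal sketch of the score formula as written above, with the natural logarithm computed directly so that large repeat counts do not produce huge numbers. The function names and the example symbol counts are hypothetical; only the formula itself comes from the post.

```python
import math

def repeat_score(count_a, count_b, repeats, length=340):
    """1 / [(count A / length) * (count B / length)] ^ repeats."""
    p = (count_a / length) * (count_b / length)
    return 1.0 / p ** repeats

def log_repeat_score(count_a, count_b, repeats, length=340):
    """Natural log of repeat_score, computed without forming the big number."""
    p = (count_a / length) * (count_b / length)
    return -repeats * math.log(p)

# Hypothetical counts: symbol A appears 24 times, symbol B 10 times,
# and the pair A..B at period 19 appears 3 times.
print(repeat_score(24, 10, 3))
print(log_repeat_score(24, 10, 3))
```

Ranking by the logarithm gives the same order as ranking by the raw score, since the logarithm is monotonic.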

However, I hope that you still do the random row shuffle test on smokie16a, which I named the message two posts above. You are right that the message has fewer symbols and a higher number of period 1 repeats. The purpose for the message is to compare with the 340 and some of the transposition messages that you have been making, taking into account the values from your row shuffle testing.
.

 
Posted : December 3, 2015 2:43 pm
smokie treats
(@smokie-treats)
Posts: 1626
Noble Member
 

I’ll ask a question now. Right shifting the columns/rows by 1 causes radical and strange changes/shifts in the 340’s bigrams at various periods, is that explicable?

Do you mean moving columns 2-16 one column to the right and taking column 17 and moving it to column 1?

 
Posted : December 3, 2015 2:47 pm
Jarlve
(@jarlve)
Posts: 2547
Famed Member
Topic starter
 

However, I hope that you still do the random row shuffle test on smokie16a, which I named the message two posts above. You are right that the message has fewer symbols and a higher number of period 1 repeats. The purpose for the message is to compare with the 340 and some of the transposition messages that you have been making, taking into account the values from your row shuffle testing.

Okay. Period 19 right?

Do you mean moving columns 2-16 one column to the right and taking column 17 and moving it to column 1?

Yes. Here’s a numeric version; I have kept the original numbering.

17 1  2  3  4  5  6  7  8  9  10 11 12 13 14 15 16
33 18 5  19 20 21 22 23 24 25 26 27 28 29 30 31 32
41 20 34 35 36 37 19 38 39 15 26 21 33 13 22 40 1
16 42 5  5  43 7  6  44 30 8  45 5  23 19 19 3  31
54 46 47 37 19 40 48 49 17 11 50 51 9  19 52 53 10
21 5  44 3  7  51 6  23 55 30 17 56 10 51 4  16 25
11 22 50 19 31 57 24 58 16 38 36 59 15 8  28 40 13
47 21 15 16 41 32 49 22 23 19 46 18 27 40 19 60 13
31 17 29 37 19 61 19 39 3  16 51 20 36 34 62 63 53
55 55 40 6  38 8  19 7  41 19 23 5  43 29 51 20 34
23 38 19 3  54 50 48 2  11 25 27 20 5  61 14 37 31
51 16 29 36 6  3  41 11 30 50 14 53 37 28 19 52 20
3  40 63 47 42 34 22 19 18 11 50 51 20 36 21 58 44
19 6  15 51 18 7  32 50 16 53 61 28 36 8  53 48 19
19 34 20 59 12 30 35 53 47 56 2  4  8  38 39 50 55
16 11 36 28 45 40 20 31 21 23 5  7  28 32 37 57 15
13 3  36 14 19 13 12 63 56 29 19 51 6  26 20 11 33
5  19 19 33 26 56 40 26 36 9  23 42 1  14 54 21 33
36 11 51 10 17 26 29 43 48 20 46 27 23 20 30 55 56
19 4  37 25 1  18 5  10 42 40 39 23 44 62 11 31 58
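
The right shift by 1 shown in the grid above amounts to a cyclic rotation of every row, so that the last column wraps around to become the first. A short sketch, with `shift_columns_right` as a made-up helper name:

```python
def shift_columns_right(grid, k=1):
    """Cyclically move every column k places to the right (last wraps to first)."""
    shifted = []
    for row in grid:
        k_eff = k % len(row)
        shifted.append(row[-k_eff:] + row[:-k_eff])
    return shifted

# First row of the original numbering, columns 1 to 17.
original = [list(range(1, 18))]
print(shift_columns_right(original)[0])
```

The helper works on rows given as lists or as strings, and shifting by the row width returns the grid unchanged.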

AZdecrypt

 
Posted : December 3, 2015 4:04 pm