Zodiac Discussion Forum

Homophonic substitution

Jarlve
(@jarlve)
Posts: 2547
Famed Member
Topic starter
 

I made a cipher with a 19 part message, possibly relating to the period 19 observation in the 340. I’m trying to find discrepancies.

MT4F`RQCUG'hE5-_d
0T+8<,3V1;?<cVVD1
S><WAXH*&7iVO#C'
1.`X?Oh$_M[5U&'G<
96F1R*_]E)A<"8Qc&
I3#(6S;Y=hVVIDUK
75'>Be.$]WMVFi;gV
C?1-R_A&`[JUJG0*
7'V9Y.V_&CMQEN8a+
'",d_#IHDFc3B/hVN
?RA>(W/KJ<$7.MF
+*41V9YX],JRiIY&H
OKQ)gIV=#$<'`1<aA
G+J;76C1U4Y?VIT
*eVd]E8N,D;_>VWhU
i`)H&G<(V1KC=S.E-
M8!'[_5"/BFQ&6X#<
;?0'6_ec$D]RgU-+S
1&Q4,5>W*i'THKA`a
G3hc<VVE["OdBC0+,

Some interesting observations: there are 11 bigram repeats at period 1 (very low), but a small peak at period 2 with 24 bigrams. It reminds me of the 2nd half of the 340. Of course there is a pronounced peak at period 19, but mirrored it is lower than normal (as expected), and it’s the other way around for the 340. The message solves perfectly when cast into a 19 by 18 grid and read NS-WE. We are looking for something that creates the highest peak at period 15 mirrored and a slightly lower peak at period 19 normally.

Bigram period comparison with the 340 to show what I mean (also note that the 340 has a higher base than my message, maybe we should try to figure that into the equation as well).

AZdecrypt

 
Posted : September 20, 2015 12:27 pm
smokie treats
(@smokie-treats)
Posts: 1626
Noble Member
 

I rank them with a formula that considers the total count of the symbols involved and the total count of repeats. For instance, 5 – 19 consists of two high-count symbols, so it is not unlikely for it to occur twice at period 18. But because it occurs four times at period 18, that is much more significant.

1/ [ [ ( Symbol A Count / 340 ) * ( Symbol B Count / 340 ) ] ^ Number of Occurrences ] * 100000

I used a reciprocal multiplied by 100,000 because it is easier for my brain to compare two scores that way. The higher the score, the lower the probability.
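A minimal sketch of the score as it is written above, assuming the grouping "100000 times the reciprocal of (chance estimate raised to the number of occurrences)". The grouping in the post is ambiguous, so absolute values from this sketch may not match the spreadsheet; only the ordering of pairs is meant to be illustrative. The function name `pair_score`, its parameters, and the counts 24 and 12 are made up for the example.

```python
def pair_score(count_a, count_b, occurrences, length=340, scale=100_000):
    """Score a symbol pair: higher means less likely to occur by chance.

    Assumption: score = scale / p**occurrences, where p is the chance that
    one position holds symbol A and the paired position holds symbol B.
    """
    p = (count_a / length) * (count_b / length)
    return scale / (p ** occurrences)


# 29 - 42 from the post: counts of six and four, three period-18 occurrences.
rare_pair = pair_score(6, 4, 3)

# Hypothetical pair of two high-count symbols that occurs only twice.
common_pair = pair_score(24, 12, 2)

print(rare_pair > common_pair)
```

Keeping `length` as a parameter makes it easy to try a different effective length, for example when bigrams are counted without wrapping around the cipher.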

29 – 42 scores the highest because 29 has count of six and 42 has count of four, but they occur in period 18 bigrams three times!

I will work on a little table this morning and post, show some numbers to compare.

EDIT: Here is a comparison of 340, c_p1 and r1_s4_p1. 340 has seven that score over 1000, c_p1 has none, and I was surprised to see that r1_s4_p1 had two. In that message, 7 – 24 occurs three times and 2 – 60 occurs two times, but they have relatively low symbol counts (I was somewhat amazed by 2 – 60 because symbol 60 occurs only twice). Nevertheless, the 340 has more, higher scores. My suggestion is to examine the period 18 bigrams that score the highest, which are the most likely to be caused by the cipher. The low-scoring period 18 bigrams are less likely to be caused by the cipher.

 
Posted : September 20, 2015 1:40 pm
Jarlve
(@jarlve)
Posts: 2547
Famed Member
Topic starter
 

Hey smokie, I really like your new in-depth bigram analysis and formula. I see that you also added a cycle score, that’s cool.

By the way, a regular bigram is period 1 and the peak in the 340 is at period 19. Sorry to have caused confusion. Also, when considering period 19 you can choose whether or not to wrap around the cipher. If you do not wrap around, you should set the length of the 340 to (340 – period) in your formula.
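To make the period and wrap-around convention concrete, here is a small sketch that counts bigram repeats at a given period. It assumes a "repeat" is every occurrence of a pair beyond its first; other tools may count differently. The name `period_bigram_repeats` is made up for this example.

```python
from collections import Counter


def period_bigram_repeats(symbols, period, wrap=False):
    """Count repeated pairs (symbols[i], symbols[i + period]).

    period=1 is a regular bigram. Without wrapping there are only
    len(symbols) - period pairs; with wrapping there are len(symbols).
    """
    n = len(symbols)
    last = n if wrap else n - period
    pairs = Counter(
        (symbols[i], symbols[(i + period) % n]) for i in range(last)
    )
    # Assumption: each occurrence after the first counts as one repeat.
    return sum(count - 1 for count in pairs.values())


toy = "ABABAB"
print(period_bigram_repeats(toy, 1))             # no wrap
print(period_bigram_repeats(toy, 1, wrap=True))  # wrap adds the pair (B, A)
```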

29 – 42 scores the highest because 29 has count of six and 42 has count of four, but they occur in period 18 bigrams three times!

What if these are 1:1 substitutes which often couple in the English language? The score for this couple seems really high, maybe reduce the weight a bit?

AZdecrypt

 
Posted : September 20, 2015 7:10 pm
smokie treats
(@smokie-treats)
Posts: 1626
Noble Member
 

Jarlve, check out Practical Cryptography. It looks to me like a regular bigram is "step" 0. Now I am not sure whether there is a difference between step and period.

http://practicalcryptography.com/crypta … id-cipher/

I shuffled the 340 30 times. I saved the shuffles, so if you want to see any one in particular just ask. The scores shown are the top ten, ranked. The statistics show that the 340’s period 18 or 19 scores are high. However, I may have to amend or take back my statement about 5 – 19. It seems that the highest scores are often associated with high count symbols.

So I am thinking, identify the period 18 or 19 bigrams that are unusually high and put those under the microscope first. This morning I woke up thinking that one possibility is that the symbols are a signal to do something. Not a letter, but a signpost of some sort. Especially since they span a total of 20 symbols. Maybe find the highest scoring ones and check 20, 40, 60 spaces backwards and 20, 40, 60 spaces frontward etc. to see if there is anything unusual.

I wrap around.

ST

 
Posted : September 20, 2015 7:46 pm
daikon
(@daikon)
Posts: 179
Estimable Member
 

There’s another few ways to increase the bigrams,

Set the cipher in a 20 by 17 grid and reading SW-SE (diagonal) produces 38 bigrams and 35 mirrored.

The thing is, Z340 is pretty low on the bigram repeats, so there are a lot of ways of increasing their count. Relatively "a lot". Ah, yeah, I knew I’ve done this before. Here’s a quote from one of my old posts: "I’ve ran a test that did all possible symbol-wise transpositions of the width 9 (i.e. all subsequent groups of 9 symbols were rearranged/shuffled the same way). There are 362,880 possible ways. For Z340, nearly 5% of the transpositions increased its bigram repeats. For Z408, *none* of them increased the bigram repeats. Using width 9 transpositions alone, there are 581 distinct ways of increasing Z340’s bigram repeats by at least 30%."

Why 9-column transposition? It’s the largest number of columns I was willing to wait for. Iterating through all possible 17-column transpositions would’ve taken years, if not centuries. So unfortunately I don’t think that optimizing for bigram repeats is a good way of solving Z340. It might reduce the number of possible candidates to take a closer look at, but it will still be 1-5% of a very-very large number.
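The exhaustive test described above can be sketched as follows. This is not daikon's program, only an illustration under stated assumptions: the same permutation is applied to every full block of `width` symbols, a trailing partial block is left untouched, and a "repeat" is every occurrence of a bigram beyond its first. All function names are made up. Width 9 means 9! = 362,880 permutations, so the demo uses a tiny width.

```python
from collections import Counter
from itertools import permutations


def bigram_repeats(symbols):
    """Count period-1 bigram repeats (occurrences beyond the first)."""
    pairs = Counter(zip(symbols, symbols[1:]))
    return sum(count - 1 for count in pairs.values())


def apply_block_permutation(symbols, perm):
    """Rearrange every full block: output slot i takes input slot perm[i]."""
    width = len(perm)
    full = len(symbols) - len(symbols) % width
    out = []
    for start in range(0, full, width):
        block = symbols[start:start + width]
        out.extend(block[j] for j in perm)
    out.extend(symbols[full:])  # assumption: partial last block stays as-is
    return out


def count_improving(symbols, width):
    """Count permutations of the given width that raise the repeat count."""
    base = bigram_repeats(symbols)
    return sum(
        1
        for perm in permutations(range(width))
        if bigram_repeats(apply_block_permutation(symbols, perm)) > base
    )


print(count_improving("AXBAYB", 3))
```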

 
Posted : September 20, 2015 10:59 pm
Jarlve
(@jarlve)
Posts: 2547
Famed Member
Topic starter
 

Daikon, what are your thoughts on the bigram peak at period 19, or 15 mirrored? Without wrapping around the cipher it might be as rare as 1 in a million (versus the 340 randomized), and we didn’t have to search through millions of variations.
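One way to put a number on "how rare versus the 340 randomized" is a shuffle test. The sketch below is an assumption about how such an estimate could be made, not the method actually used here: it shuffles the symbols many times and reports the fraction of shuffles whose repeat count at the chosen period is at least as high as the observed one, without wrapping. Confirming something as rare as 1 in a million would need well over a million trials. The function names are made up.

```python
import random
from collections import Counter


def period_repeats(symbols, period):
    """Bigram repeats at the given period, no wrap-around."""
    pairs = Counter(zip(symbols, symbols[period:]))
    return sum(count - 1 for count in pairs.values())


def shuffle_tail_fraction(symbols, period, trials=1000, seed=0):
    """Fraction of random shuffles scoring at least the observed count."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    observed = period_repeats(symbols, period)
    pool = list(symbols)
    hits = 0
    for _ in range(trials):
        rng.shuffle(pool)
        if period_repeats(pool, period) >= observed:
            hits += 1
    return hits / trials


# Toy input only; substitute the real transcription to test the 340.
print(shuffle_tail_fraction("ABCABCABCABCXYZQ", 3, trials=200))
```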

AZdecrypt

 
Posted : September 20, 2015 11:56 pm
smokie treats
(@smokie-treats)
Posts: 1626
Noble Member
 

Simple Proposed Cipher:

1. Zodiac drafted the plaintext horizontally and there were slightly more than 323 symbols. The resulting number of rows is 19 in most of the columns.

2. Zodiac re-drafted the plaintext horizontally by copying the plaintext as found in its vertical order, from leftmost column to rightmost column.

3. Zodiac used cycle substitution with randomization.

4. Zodiac filled in gibberish at the end.

Note in the picture that high scoring period 18/19 bigrams are not included in the top rows.

The plaintext that appears in positions 21 and 22 in this example are a high frequency bigram, with a repeat at 141 and 142. When Zodiac got done with Step 2, they appeared as period 18/19 bigram repeats in plaintext. Then Zodiac used cycles, with some randomization, and because we know that more cyclic messages do have more bigram repeats, they appear as a period 18/19 bigram repeat in ciphertext.
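Steps 1 and 2 of the proposal can be sketched as below, assuming a 17-column draft that is read off column by column, top to bottom, left to right. With exactly 19 full rows, two symbols that were horizontal neighbours in the draft end up 19 positions apart. With an incomplete 20th row, the first few columns are one symbol longer, so some neighbours end up 20 apart instead. The function name `read_columns` and the stand-in text are made up for the example.

```python
def read_columns(plaintext, width=17):
    """Write the text in rows of `width`, then read it off by columns."""
    rows = [plaintext[i:i + width] for i in range(0, len(plaintext), width)]
    return "".join(
        row[col] for col in range(width) for row in rows if col < len(row)
    )


# Stand-in text of 323 symbols (19 full rows of 17); not a real message.
draft = "".join(chr(65 + (i * 7) % 26) for i in range(323))
redrafted = read_columns(draft)

# Horizontal neighbours at (row, col) and (row, col + 1) in the draft
# are 19 positions apart after the column read.
row, col = 1, 4
left = draft[row * 17 + col]
right = draft[row * 17 + col + 1]
print((left, right) == (redrafted[col * 19 + row], redrafted[(col + 1) * 19 + row]))
```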

Smokie

 
Posted : September 21, 2015 2:38 am
daikon
(@daikon)
Posts: 179
Estimable Member
 

Daikon, what are your thoughts on the bigram peak at period 19, or 15 mirrored? Without wrapping around the cipher it might be as rare as 1 in a million (versus the 340 randomized), and we didn’t have to search through millions of variations.

The bigram spike at period 19 is one of the things I noticed about Z340 (although I’m sure I wasn’t the first one). I think it’s significant, so I spent some time exploring various transposition possibilities, but couldn’t find anything. Why do I think 19 columns make sense for a columnar transposition? Think of how Z would construct the cipher. He uses his beloved 17-column matrix and writes out the message into it. He ends up with 19 rows, and the last, 20th row would be incomplete (say, only 1 or 2 letters). Then he would start reading his message by columns, and writing to another 17-column matrix. And then he would add filler at the end to complete the 20th row. I know it’s a bit confusing, but you just need to remember that we are de-coding, or un-transposing, the cipher that Z coded, or transposed. We need to think in terms of undoing the transposition. And 19 (and a bit) rows/columns by 17 columns/rows just fits. But clearly if he did that, there is something else going on. Or I made a mistake somewhere, which is also possible. 🙂

In any case, the important thing to remember about transpositions is that you need to know the exact length of the message. We already know that Z408 had filler at the end, and the whole "ZODAIK" at the end of Z340 already looks mighty suspicious, so it is fairly certain that the length of the message in Z340 is *not* 340. Why is it important? When you wrap around the edge of the cipher to the next column, you need to know where the incompletely filled last row ends, otherwise it changes the resulting untransposed cipher. Not hugely, but nonetheless. I’ve actually tested all possible cipher lengths from 304 (19×16) to 342 (19×18), but it didn’t solve to anything. Actually, that was a while back, when I was still using 5-grams, I think. Might be time to revisit with higher-order N-grams, and test mirror and snake routes as well.
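The dependence on message length can be sketched like this. It is an illustration under assumptions, not the program used for the tests mentioned above: the message was written into `width` columns, read off by columns, and filler was appended after it. For a candidate length, the first `length % width` columns are one symbol longer than the rest, which is why a wrong guess shifts the result. The function name `undo_column_read` is made up.

```python
def undo_column_read(cipher, message_length, width=17):
    """Invert a column read for a guessed message length.

    Returns (untransposed message, trailing filler).
    """
    body, filler = cipher[:message_length], cipher[message_length:]
    full_rows, extra = divmod(message_length, width)
    # Columns 0..extra-1 contain one symbol from the incomplete last row.
    col_lengths = [full_rows + (1 if col < extra else 0) for col in range(width)]
    columns, pos = [], 0
    for length in col_lengths:
        columns.append(body[pos:pos + length])
        pos += length
    n_rows = full_rows + (1 if extra else 0)
    plain = "".join(
        columns[col][row]
        for row in range(n_rows)
        for col in range(width)
        if row < len(columns[col])
    )
    return plain, filler


# Toy example: "ABCDEFG" written 3 wide and read by columns, plus filler "ZZ".
toy_cipher = "ADGBECF" + "ZZ"
for guess in (6, 7, 8):
    print(guess, undo_column_read(toy_cipher, guess, width=3))
```

Each candidate would then be handed to a solver or an n-gram scorer, which is not shown here.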

 
Posted : September 21, 2015 3:36 am
daikon
(@daikon)
Posts: 179
Estimable Member
 

Simple Proposed Cipher:

1. Zodiac drafted the plaintext horizontally and there were slightly more than 323 symbols. The resulting number of rows is 19 in most of the columns.

2. Zodiac re-drafted the plaintext horizontally by copying the plaintext as found in its vertical order, from leftmost column to rightmost column.

3. Zodiac used cycle substitution with randomization.

4. Zodiac filled in gibberish at the end.

Hehe, while I was writing my post, you said the exact same thing, how cool is that. 🙂

 
Posted : September 21, 2015 3:38 am
daikon
(@daikon)
Posts: 179
Estimable Member
 

Not wrapping around the cipher it might be as rare as 1 in a million (versus 340 randomized) and we didn’t have to search through millions of variations.

Forgot to ask, what do you mean by this exactly? Specifically, "not wrapping around the cipher" and in connection with the rest?

 
Posted : September 21, 2015 3:40 am
(@mr-lowe)
Posts: 1197
Noble Member
 

I am with you on all of these comments. Filler will create train wrecks, as will spelling mistakes, since they act like a wild card… Drop the last line, or two. I like that a lot.

This is just some sifting I have been playing with. I spaced it out and added my own touch, but not much. It’s still ugly nonsense. I was trying to see if I could tie together some words that are apart, which would let me see how he constructed it, but it’s too garbled, although somewhat amusing with some words. :oops: Anyway, below is a "solve" someone put up a few weeks back; I will try and dig out who did.

un for of played the
noseding came r in their
mised to the make
cover alifives of th
avisanal action alive to rangle verthed
icon in the may be a bad the games and how ever is t so far
eminists are for e sever this
lowing land sorth tr
acount a rang her ist
him a strangled then
a strong s are dbygrose
curry back blongraded
was the mouth is a
won of use cola went fort
we had practition the wings are sided
the come based in tho

 
Posted : September 21, 2015 8:57 am
Jarlve
(@jarlve)
Posts: 2547
Famed Member
Topic starter
 

daikon and smokie, I really like your assessments because they seem to be the simplest explanation. But if that is what actually happened, there must be some extra step somehow, and I’m wondering what it could be. You could consider that he did another transposition after the vertical one, but that would have killed the bigram repeats. I’ve explored the mirrored, flipped and reversed variations and some oxcart routes, and it doesn’t make much sense because we can solve oxcart ciphers just the way they are.

I’m beginning to think the options must be somewhat limited. You could say that he perhaps skipped/flipped/whatever every n’th symbol while doing the vertical transposition or something in that manner (parts cipher) but then we would find a bigram peak at period 2 etc. What have we missed that does not destroy bigram repeats?

AZdecrypt

 
Posted : September 21, 2015 10:43 am
smokie treats
(@smokie-treats)
Posts: 1626
Noble Member
 

I don’t know. I am still trying to figure out how the number of repeats is so high.

I have been making messages with "I like killing," shuffling the plaintext like the last step in a columnar transposition cipher, different one key distributions of 63 symbols and randomization percentages. I got total count 50 one time, which is only 25 repeats. Zodiac 340 has a lot more. It seems very difficult to have so many repeats when using 63 symbols, unless high frequency plaintext are replaced with only a few symbols or there was another process.
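For readers who want to repeat this kind of experiment, here is a minimal sketch of cyclic homophonic substitution with a randomization percentage. It is not the spreadsheet used above. The key, the function name `encode_cyclic`, and the choice that a random pick does not advance the cycle are all assumptions made for illustration.

```python
import random


def encode_cyclic(plaintext, key, randomization=0.0, seed=0):
    """Replace each letter with its next homophone in cyclic order.

    With probability `randomization` a homophone is picked at random
    instead, which breaks the cycle at that position.
    """
    rng = random.Random(seed)
    next_index = {letter: 0 for letter in key}
    out = []
    for letter in plaintext:
        homophones = key[letter]
        if rng.random() < randomization:
            # Assumption: a random pick leaves the cycle position unchanged.
            out.append(rng.choice(homophones))
        else:
            out.append(homophones[next_index[letter]])
            next_index[letter] = (next_index[letter] + 1) % len(homophones)
    return out


# Hypothetical key: more homophones for the more frequent letters.
toy_key = {"I": [1, 2, 3], "L": [4, 5], "K": [6], "E": [7], "N": [8], "G": [9]}
print(encode_cyclic("ILIKEKILLING", toy_key))
print(encode_cyclic("ILIKEKILLING", toy_key, randomization=0.3))
```

Fewer homophones per letter, or a lower randomization value, should leave more repeated bigrams in the output, which is the trade-off described above.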

EDIT: Or maybe the actual message has a lot of repeating words. Maybe he said "killing" a dozen times or something like that.

There are several high count 1:1. Some cycle well with others, and some do not. I am thinking, we should make a list of all of these repeats, and look at the overall cycle activity of the respective symbols. Rank the symbols by period 18/19 repeats and by cycle score, see if there is a correlation. It would seem that to get so many repeats, the symbols with higher period 18/19 repeat scores would only cycle well with none, one or two other symbols. They could have high overall cycle scores, but the number of symbols that they cycle with should be low.

Still tough to use 63 symbols and get so many repeats. I will mess around with it. Maybe start with a key with 26 symbols. See what happens. Then start adding symbols. Maybe add in increments of 26 to make a flat key.

EDIT: It is also very difficult to have so many repeats with an overall cycle score of 63400. When I flatten out the key or use less randomization to increase the number of bigrams, the cycle score is much higher.

The cycles are fractured but there are patterns to them… I still think that some of the symbols map to more than one letter. That may help with the 63 symbol problem…

EDIT: I also cannot get close to the period 18/19 repeat scores that I posted above. My experiments so far only score about the same or maybe a little better than high scoring randomizations.

 
Posted : September 21, 2015 12:49 pm
smokie treats
(@smokie-treats)
Posts: 1626
Noble Member
 

Tonight I updated my spreadsheet so it will transpose a 17×20 grid and make period 19/20 bigram repeats. I have to do some testing to find out if there is the same number of period 0/1 bigram repeats before transposing as there are period 19/20 bigram repeats after transposing. I did it once, and it worked.

In any case, Jarlve, when you get time, may I request that you post to the first page un-transpositions of the 340 with 323, 324, 325, etc. length messages which would cause period 18/19 bigram repeats and with gibberish at the end. Daikon said that he tried to solve them and it didn’t work. But I am thinking that we should consider another aspect in addition to transposition and cyclic homophonic substitution.

Just off the top of my head, I am thinking about expanding the symbols that do not cycle well with other symbols, before un-transposition, starting with the +. He couldn’t have shuffled columns around, because that would have destroyed the period 18/19 bigram repeats.

I should look back, but am thinking that symbols that don’t cycle well with other symbols may be polyalphabetic.

EDIT: Or maybe a more practical approach would be to first compare the different un-transposed messages and try to figure out how long the message was, among other things.

I will continue to work on creating messages that, after transposition and cyclic homophonic substitution with randomization, have similar statistics to the 340.

Good night.

 
Posted : September 22, 2015 5:09 am
Jarlve
(@jarlve)
Posts: 2547
Famed Member
Topic starter
 

@smokie, I added 3 alterations of the 340 based on the bigram period 19 repeats (check under alterations of the 340). I think doranchak’s original interpretation (based on the scheme of the 21 by 21 magic square?) is the most interesting because it also creates the most 3-grams.

And there’s something else going on with it: when any vertical transposition scheme is applied to it, the cycle scores approach that of the normal 340! Is this just a magic trick or is there more going on?

340_d1_n-e.txt

NO+p8>kAMS|DzKHZ;
kB-d_CWYc<O|7BFt*
+WT+c/)P(WO)CSpzH
)LFL1lR6WcCBT<|++
M.Oq3pGz^2B.|Kfc(
Vy4UB9ZXC51E-*>+:
b+^.+fcl5VG24F4tN
N(9|;S58yFR#Bz+c6
JM2|.<*+c5&lTBR4F
/YK5BOtpUEb+|TRDM
zJFl+BUOyVp-+7-^<
K+cdZy<RAM26+F4bB
@J_LfN9jY(#zGO+2+
V52.+UzKcLqX|%G#;
C(lkpOF8->R*2^dDF
|4+5kzF/tP_j+9d&M
8+p*+pVR73K^p2lOS
M(J+)BULyZ#:GzcWH
+DKBWf(Y)#.NO<p%*
H^LEVTRPG>k2p|dl1

1  2  3  4  5  6  7  8  9  10 11 12 13 14 15 16 17
7  18 19 20 21 22 23 24 25 26 2  11 27 18 28 29 30
3  23 31 3  25 32 33 34 35 23 2  33 22 10 4  13 15
33 36 28 36 37 38 39 40 23 25 22 18 31 26 11 3  3
9  41 2  42 43 4  44 13 45 46 18 41 11 14 47 25 35
48 49 50 51 18 52 16 53 22 54 37 55 19 30 6  3  56
57 3  45 41 3  47 25 38 54 48 44 46 50 28 50 29 1
1  35 52 11 17 10 54 5  49 28 39 58 18 13 3  25 40
59 9  46 11 41 26 30 3  25 54 60 38 31 18 39 50 28
32 24 14 54 18 2  29 4  51 55 57 3  11 31 39 12 9
13 59 28 38 3  18 51 2  49 48 4  19 3  27 19 45 26
14 3  25 20 16 49 26 39 8  9  46 40 3  28 50 57 18
61 59 21 36 47 1  52 62 24 35 58 13 44 2  3  46 3
48 54 46 41 3  51 13 14 25 36 42 53 11 63 44 58 17
22 35 38 7  4  2  28 5  19 6  39 30 46 45 20 12 28
11 50 3  54 7  13 28 32 29 34 21 62 3  52 20 60 9
5  3  4  30 3  4  48 39 27 43 14 45 4  46 38 2  10
9  35 59 3  33 18 51 36 49 16 58 56 44 13 25 23 15
3  12 14 18 23 47 35 24 33 58 41 1  2  26 4  63 30
15 45 36 55 48 31 39 34 44 6  7  46 4  11 20 38 37
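The numeric grid above appears to number each symbol by the order of its first appearance in the symbol grid, reading row by row. A small sketch of that conversion, checked here against the first two rows only; the function name is made up for the example.

```python
def number_by_first_appearance(rows):
    """Give each distinct symbol the next free number, starting at 1."""
    ids = {}
    numbered = []
    for row in rows:
        # setdefault returns the existing number or assigns the next one.
        numbered.append([ids.setdefault(sym, len(ids) + 1) for sym in row])
    return numbered


first_rows = [
    "NO+p8>kAMS|DzKHZ;",
    "kB-d_CWYc<O|7BFt*",
]
for numbers in number_by_first_appearance(first_rows):
    print(" ".join(str(n) for n in numbers))
```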

AZdecrypt

 
Posted : September 22, 2015 11:44 am