Zodiac Discussion Forum

Homophonic substitution

1,434 Posts
21 Users
0 Reactions
304.3 K Views
smokie treats
(@smokie-treats)
Posts: 1626
Noble Member
 

So I think that what you are saying is that with the 340, there are a lot of strings of symbols that are 17 symbols long where there are not any repeating symbols. Is that correct? I think that would also have something to do with the number of different symbols in the message. If I understand you correctly, I also wonder what the test ciphers show because they also have 63 symbols. Can the observation be simulated? You may have already stated so. Just need to get up to speed with some of the other stuff that you have been talking about.

I have been doing some testing on my own. To answer the question in my mind about randomization and cycles. I have been showing quantities of certain cycle lengths in prior posts. Statistics about how many ABABAB cycles are in the 340, etc. And I have randomly scrambled the 340 in its entirety to compare with the 340 cycle statistics to determine about how many of the cycles in the 340 are true or false. My question was whether, if I make a message with about the same number of high, medium and low count symbols, and select the symbols at random, will the statistics be comparable to the statistics for a randomly scrambled 340?

The answer to that question is yes. I will post the stats a bit later. And another question: what degree of randomization did Zodiac use with the 340? The answer is more than 20% and less than 35%, about 25% to maybe 30%. I will post those stats a bit later too.
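To make the scramble comparison concrete, here is a minimal sketch of one way it could be done. The scoring is an assumption and not necessarily the scoring used for the stats above: it simply counts how often the symbol changes in the sequence formed by two symbols, so a perfect ABABAB cycle scores highest. The function names and the toy cipher are hypothetical.

```python
import random

def pair_sequence(cipher, a, b):
    """Return the order in which symbols a and b occur in the cipher."""
    return [s for s in cipher if s == a or s == b]

def alternation_score(seq):
    """Count neighbouring positions where the symbol changes (ABAB... scores high)."""
    return sum(1 for x, y in zip(seq, seq[1:]) if x != y)

def mean_scrambled_score(cipher, a, b, trials=1000, seed=0):
    """Average alternation score for the pair over random scrambles of the cipher."""
    rng = random.Random(seed)
    work = list(cipher)
    total = 0
    for _ in range(trials):
        rng.shuffle(work)
        total += alternation_score(pair_sequence(work, a, b))
    return total / trials

# Toy cipher (not the 340): symbols 1 and 2 cycle perfectly.
toy = [1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3]
actual = alternation_score(pair_sequence(toy, 1, 2))
baseline = mean_scrambled_score(toy, 1, 2)
print(actual, baseline)
```

If the actual score sits well above the scrambled average, the pair probably cycles. The same comparison can be run against messages built with random symbol selection instead of scrambles.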

The next question is, does the apparent random symbol selection have anything to do with why the 340 cannot be solved? Is that evidence of the second step? He did that to some degree with the 408, and it solves fine. So, is there a relationship between apparent random symbol selection and the reason why the 340 has not yet been solved?

Mr. Lowe: Zodiac cycled his symbol selection when enciphering the message, and the cycles work from left to right, from top to bottom. Just like normal English writing and the 408. I think that if Zodiac did anything like make a crossword puzzle or wrote stuff backwards or diagonal or whatever, it seems that he would have had to do that with the plaintext before encipherment.

 
Posted : August 20, 2015 3:16 am
(@mr-lowe)
Posts: 1197
Noble Member
 

smokie, if you replace all of the "and"s with a + sign, add filler at the end to make up for the leftover spaces, and throw in a few spelling errors on the "THEREDGLAREOF" cipher, how well, how fast, and how differently would it solve then?

 
Posted : August 20, 2015 4:58 am
Jarlve
(@jarlve)
Posts: 2547
Famed Member
Topic starter
 

So I think that what you are saying is that with the 340, there are a lot of strings of symbols that are 17 symbols long where there are not any repeating symbols. Is that correct? I think that would also have something to do with the number of different symbols in the message. If I understand you correctly, I also wonder what the test ciphers show because they also have 63 symbols. Can the observation be simulated? You may have already stated so. Just need to get up to speed with some of the other stuff that you have been talking about.

Yes. With perfect cycling and increasingly more symbols the red line will continue to left-shift and the mountain will become flatter. It actually represents, to some degree, the distribution of the cycles, short, medium and long cycles. The further the line is left-shifted the more chance of finding longer cycles. It does show that the cycles probably cut off sharply at some point (17) and that there are probably longer cycles to be found in the horizontally mirrored version of the 340. I’m not sure how well that correlates with your findings.

I will create non-repeat images for all of the test ciphers in the main post and we’ll see what comes out.
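For anyone who wants to reproduce the non-repeat observation, this is a rough sketch of one plausible reading of it: for every start position, measure how long the string of symbols runs before any symbol repeats. How the images are actually generated is not stated here, so treat the function name and the approach as assumptions.

```python
def nonrepeat_lengths(cipher):
    """For each start position, length of the longest run with no repeated symbol."""
    lengths = []
    for start in range(len(cipher)):
        seen = set()
        for s in cipher[start:]:
            if s in seen:
                break  # first repeat ends the run
            seen.add(s)
        lengths.append(len(seen))
    return lengths

# Toy example (not the 340).
toy = [1, 2, 3, 1, 4, 4]
print(nonrepeat_lengths(toy))
print(max(nonrepeat_lengths(toy)))
```

A histogram of these lengths, compared between the 340, its mirrored version, and the test ciphers, would show where the runs cut off.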

The next question is, does the apparent random symbol selection have anything to do with why the 340 cannot be solved? Is that evidence of the second step? He did that to some degree with the 408, and it solves fine. So, is there a relationship between apparent random symbol selection and the reason why the 340 has not yet been solved?

Yes, it’s what I expressed some posts back. Could it be so simple that the 340 is just like the 408 but with just much more "random/individual polyalphabetic symbols"? It seems that over 10% random symbols could account for the 340 being unsolved.

I created a polyalphabetic cipher with just this scheme. Per symbol I introduced a 15% chance to select a random symbol instead of following the cycle. The message does not solve, though it scores a bit higher than the 340, so I would guess that if this is what happened with the 340, at least 15% to 20% of the symbols should be random. Expanding high count symbols should not work because the scheme is spread evenly across all symbols. Consider this an easy one, if we would try to crack it.

Perfect cycles + 1 high count 1:1 substitute + 15% chance of assigning a random symbol: c_s1_pi15_p1.txt
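The scheme could be sketched roughly as below. This is not the code that produced c_s1_pi15_p1.txt. It assumes that a random symbol is drawn from the symbols of all cycled letters, that a random pick does not advance the letter's cycle, and that a 1:1 letter always keeps its single symbol. The function name, key layout and toy key are hypothetical.

```python
import random

def encode(plaintext, key, random_chance=0.15, seed=0):
    """Encode with perfect cycles, except for an occasional random symbol.

    key maps each plaintext letter to its list of symbols, in cycle order.
    """
    rng = random.Random(seed)
    # Assumption: random picks come from the symbols of cycled letters only,
    # so a 1:1 letter keeps its single symbol to itself.
    pool = [s for symbols in key.values() if len(symbols) > 1 for s in symbols]
    position = {letter: 0 for letter in key}
    cipher = []
    for letter in plaintext:
        symbols = key[letter]
        if len(symbols) > 1 and rng.random() < random_chance:
            cipher.append(rng.choice(pool))  # wildcard: may belong to another letter
        else:
            cipher.append(symbols[position[letter] % len(symbols)])
            position[letter] += 1
    return cipher

# Toy key: "e" and "t" cycle, "l" is a 1:1 substitute.
toy_key = {"e": [1, 2, 3], "t": [4, 5], "l": [6]}
print(encode("eeee", toy_key, random_chance=0.0))
print(encode("letteltee", toy_key))
```

With random_chance set to 0.0 the output is perfectly cycled, which gives a baseline for comparing cycle statistics at 15%, 20% and so on.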

I’m also thinking that the "+" symbol is very much in line with a 1:1 substitute because it frequently appears as a double. kiLLing, thriLLing, etc.

AZdecrypt

 
Posted : August 20, 2015 9:45 pm
smokie treats
(@smokie-treats)
Posts: 1626
Noble Member
 

Jarlve, I agree with everything that you just said. Sorry I haven’t been prompt lately. I have been a little under the weather and excited about a new job prospect.

I counted the inter-cycle random symbols in c_s1_pi15_p1. There were 43 of them for 11.9%. The first several symbols for the E cycle were random. I can see how the scheme would make for a difficult solve. I think that is approximately what Zodiac may have done. Like you said, spread out the randomization over a lot of different symbols and keep the high count 1:1 as just that.

I did a little intra-cycle random symbol analysis to figure out just how much Zodiac may have selected his cycle symbols at random. I used the first 340 of the 408 and made a key to approximately match the 340 distribution of symbol count:

test 1 key:

A 1 36
B 5
C 6 7
D 8 9
E 10 11 12 13 14 15 16
F 18
G 19 20 52
H 21 22 23 24 2 4
I 25 26 27 28 29 37
J
K 30
L 31 32 33
M 34
N 35
O 39 40 41 42 43
P 44 45 3
Q
R 46 47 48
S 49 50
T 53 54 55 56 57 38
U 59 60 51
V 61
W 62 63
X 17
Y 58
Z

test 1 total number of symbols in left column, count per symbol in right column. For example, there are twelve symbols with a count of 4.

1 1
9 2
5 3
12 4
10 5
9 6
7 7
1 8
3 9
2 10
2 11
0 12
1 13
1 21
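A table like the one above can be produced from any ciphertext with two passes of counting. This is a small sketch with hypothetical names and a toy cipher. The table values are copied from above only to check that they add up to 63 symbols.

```python
from collections import Counter

def count_distribution(cipher):
    """Return (number_of_symbols, count) pairs, ordered by count."""
    per_symbol = Counter(cipher)                 # how often each symbol occurs
    by_count = Counter(per_symbol.values())      # how many symbols share each count
    return [(n_symbols, count) for count, n_symbols in sorted(by_count.items())]

# Toy cipher: symbols 1 and 2 occur twice, symbol 3 occurs once.
print(count_distribution([1, 1, 2, 2, 3]))

# The test 1 table from above as (number_of_symbols, count) pairs.
test1_table = [(1, 1), (9, 2), (5, 3), (12, 4), (10, 5), (9, 6), (7, 7),
               (1, 8), (3, 9), (2, 10), (2, 11), (0, 12), (1, 13), (1, 21)]
print(sum(n for n, _ in test1_table))  # total number of distinct symbols
```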

1. Every column except A is the mean number of cycles over ten randomizations.

2. Comparing A and B, we see what we already know, that Zodiac used cycles because the 340 has more cycles than if we completely scramble the 340.

3. Compare B and C. I made ten test 1 messages with 100% random symbol selection. The closeness of the numbers shows that there is little if any difference between scrambling a message and using 100% random symbol selection when making a comparison to the actual 340 cycle stats.

4. Compare A with D through H. The numbers show that Zodiac used intra-cycle randomization of about 25% – 30%. It’s kind of tough to draw an exact number. But with only 20% randomization, there are going to be some cycles that score higher than with the 340. With 35% to 40% randomization, there are generally not as many high scoring cycles.

5. Compare A to I, or c_s1_pi15_p1. Despite your 11.9% use of inter-cycle random symbols, your new message has a lot more cycles in it than the 340.

I think that you are on to something. Spreading out the inter-cycle randomization over a lot of different symbols makes it pretty difficult to find the symbols that don’t really cycle well with other symbols, or in other words, to find the wildcards. I would have to conclude that although this may not be the exact scheme that Zodiac used, he may not have necessarily been a total amateur after all.

I am in a little state of slowing down on this site, needing to gather myself and having other issues. But I am still thinking about the 340 and what he may have done. I have been thinking about maybe diagramming the cycles somehow, to figure out which symbols cycle with each other. I have a lot of ideas, and I think that this is a good start. He used some combination of intra-cycle randomization, and a scheme of inter-cycle randomization, or inter-cycle use of symbols that may not have been random. I imagine him plucking a symbol from the next letter to the right or the left on the key once in a while. It would be nice if he had a scheme so that another person who was meant to be able to solve it would know what to do. Aren’t most ciphered messages sent from person A to person B, where person B knows how to solve? It would be interesting to know if the message was constructed that way, or just to confuse.

Mr. Lowe, I read your idea about replacing the word "and" with a + symbol. I don’t have Jarlve’s high multiplicity solver on my computer, but I guess that it would make solving more difficult. I am sure that I have read about that idea on some website somewhere, where the + may have represented a word instead of just a letter. It may be, but Zodiac didn’t do that with the 408. He did, however, use the filled triangle to represent three different letters in the 408, making it polyalphabetic. You may be right, I don’t know. But my thinking from the beginning is that Zodiac just did "more" of that with the 340. A slight variation of the 408 scheme that makes the message unsolvable.

 
Posted : August 21, 2015 3:37 am
smokie treats
(@smokie-treats)
Posts: 1626
Noble Member
 

Jarlve, I cannot do much right now. I just got a very demanding new job. But I also have prospects for another job that would allow me to spend evenings and some time on the weekends on the 340. In any case, I was wondering what you are thinking lately.

I am wondering about your new high multiplicity solver. What is the smallest portion of a 340 ciphertext message that you could solve? I wonder if there are areas of the 340 that may be easier to solve. I think about taking out the dark areas shown by the hot spot program and seeing what happens. Maybe just try to solve the lighter areas.

 
Posted : August 23, 2015 5:07 pm
Jarlve
(@jarlve)
Posts: 2547
Famed Member
Topic starter
 

@smokie,

I understand and I’m also taking it slowly, and will get back to this thread (and your previous post) eventually.

As for the solver, with a high scoring plaintext AZdecrypt097 (with 5-grams) can go a little over 0.4 multiplicity (if the encoding is perfect). The smallest portion for which I feel reasonably comfortable is about 170 characters and 63 symbols (0.37) since it has solved quite a few of these in testing. As I said, it all depends though.
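For reference, the multiplicity figures here are just the number of distinct symbols divided by the message length. A quick check of the arithmetic (the function name is only for illustration):

```python
def multiplicity(num_symbols, length):
    """Distinct symbols divided by message length."""
    return num_symbols / length

print(round(multiplicity(63, 170), 2))  # 0.37, the 170 character portion
print(round(multiplicity(63, 340), 3))  # 0.185, the full 340 with 63 symbols
```

So cutting the 340 in half while keeping all 63 symbols doubles the multiplicity, which is why a shorter portion is so much harder for the solver.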

I’ve tried removing some of the dark spots. Most notably, removing the 2nd line improves the score more than any other. But it doesn’t make much sense since all of these symbols are first time appearances.

Good luck with your new job and prospect!

AZdecrypt

 
Posted : August 23, 2015 8:31 pm
smokie treats
(@smokie-treats)
Posts: 1626
Noble Member
 

I am at a bit of a stalemate, but have been messing around with an idea today. I was thinking about trying a manual solve (yeah right) by identifying some high frequency words or couples of words that have two pairs of letters. Then finding pairs of high scoring two symbol cycles that match the pattern. After working with Zodiac’s letters, I came up with a nice short list:

KILLING
PEOPLE
THISIS
WITHTHE
COLLECTING
WHENTHE
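One way to test candidate words for this kind of pattern is sketched below. It assumes the pattern means exactly two letters occur twice and every other letter occurs once, which is only my reading of the list above (all six entries fit it). The function name is hypothetical.

```python
from collections import Counter

def has_two_pairs(word):
    """True if exactly two letters occur twice and no letter occurs more often."""
    counts = list(Counter(word.upper()).values())
    return counts.count(2) == 2 and max(counts) == 2

candidates = ["KILLING", "PEOPLE", "THISIS", "WITHTHE", "COLLECTING", "WHENTHE"]
for word in candidates:
    print(word, has_two_pairs(word))
```

The same filter could be run over a word list or over the Zodiac letters to build a longer candidate list.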

I also have a short list of two symbol cycles that closely step together and where about 70% are true, depending on the score:

I’ll see if I can find a match or two just for fun.

COLLECTING = No find.

 
Posted : August 30, 2015 1:59 am
doranchak
(@doranchak)
Posts: 2614
Member Admin
 

I couldn’t resist my curiosity about how many words have those kinds of patterns, so here’s the result of an exhaustive dictionary search:

http://zodiackillerciphers.com/two-pair-pattern.html

The first number is the relative frequency in English.

(Warning: it’s a very large list)

http://zodiackillerciphers.com

 
Posted : August 30, 2015 5:31 am
glurk
(@glurk)
Posts: 756
Prominent Member
 

You must have a ridiculously huge dictionary.

-glurk

——————————–
I don’t believe in monsters.

 
Posted : August 30, 2015 4:09 pm
doranchak
(@doranchak)
Posts: 2614
Member Admin
 

I use "all.num.o5" from here: http://www.kilgarriff.co.uk/bnc-readme.html

It has over 200,000 words in it.

http://zodiackillerciphers.com

 
Posted : August 30, 2015 4:19 pm
smokie treats
(@smokie-treats)
Posts: 1626
Noble Member
 

Your dictionary is cool. I did a pretty exhaustive search for the configurations. And I lowered the bar for 6-30:

6 30 6 30 6 30 6 6 30 6 30 6 30

But I did not find many situations where pairs of two symbol cycles cluster together. Most of them are at the end of Row 11 or beginning of Row 12. I don’t know if anyone has noticed anything different about these areas or not. I found another configuration on Row 13.

I don’t know what else to do. :-(

 
Posted : August 30, 2015 4:55 pm
smokie treats
(@smokie-treats)
Posts: 1626
Noble Member
 

And maybe I found "killing people".

Here are the cycles:

14 23 23 23 23 23 14 23 14 23 14 23 14 23 23 ( Z340 has 23 such cycles; a random scramble will have between two and ten )

31 37 31 37 31 37 31 37 31 37 31 37 37 31 ( Z340 has two such cycles; a random scramble will have between 0.5 and 1.0 )

36 41 36 41 36 41 36 41 36 36 36 36 36 36 ( Z340 has 23 such cycles; a random scramble will have between two and ten )

6 30 6 30 6 30 6 6 30 6 30 6 30 ( Z340 has 35 such cycles, not including the second half; a random scramble will have between twelve and twenty three )

And here is "killing people." I did not find the same configurations anywhere else in the message for high scoring cycles.

Note that three of the cycles have a break or change in pattern at about mid point. I wonder if Zodiac shifted the letters on his key at mid point. Statistically, the "solution" is a bit heavy on letter P.

Is that far fetched?

EDIT: There are a lot of high scoring two symbol cycles that look like this: 36 41 36 41 36 41 36 41 36 36 36 36 36 36. Zodiac did that too with the 408. And some look like this: 14 23 23 23 23 23 14 23 14 23 14 23 14 23 23. And I only found the configurations in the bottom half of the message.

I wonder if he shifted the letters on his key, maybe with every other letter, or an even number of letters. Maybe the 17 columns hides something. Certainly people have tried to rearrange the message into 20 x 17 or whatever. I wonder if rearranging into 20 x 17 would show that some cycles only appear in certain columns. I wonder if he used a different key shift in the second half. I wonder if making a message with perfect cycles but with a letter shift at the key every other symbol or whatever would result in similar cycle statistics.
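The break in pattern in the cycles quoted above can be measured in a simple way: find the longest stretch in which the two symbols strictly alternate. This is just a sketch with a hypothetical function name, not the scoring used for the cycle statistics.

```python
def longest_alternating_run(seq):
    """Length of the longest stretch in which neighbouring symbols always differ."""
    if not seq:
        return 0
    best = run = 1
    for prev, cur in zip(seq, seq[1:]):
        run = run + 1 if cur != prev else 1  # a repeat restarts the run
        best = max(best, run)
    return best

cycle_36_41 = [36, 41, 36, 41, 36, 41, 36, 41, 36, 36, 36, 36, 36, 36]
cycle_14_23 = [14, 23, 23, 23, 23, 23, 14, 23, 14, 23, 14, 23, 14, 23, 23]
print(longest_alternating_run(cycle_36_41))  # 9
print(longest_alternating_run(cycle_14_23))  # 9
```

Both cycles have a perfect stretch of nine symbols, at the start for 36/41 and in the second part for 14/23. Recording where each stretch begins and ends in the message would show whether the breaks line up near the mid point.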

 
Posted : August 30, 2015 6:00 pm
daikon
(@daikon)
Posts: 179
Estimable Member
 

You must have a ridiculously huge dictionary.

Based on the word counts (I’m assuming that’s the number in front of the word), it’s pretty small actually. This one is quite a bit larger. And if you want truly ridiculously huge: one trillion words. But it’ll cost you $150 last time I checked.

 
Posted : August 30, 2015 8:58 pm
daikon
(@daikon)
Posts: 179
Estimable Member
 

And here is "killing people."

Interestingly, the tail end of "people" forms one of the 2 large pivots in Z340. With "niople" vertically above.

 
Posted : August 30, 2015 9:12 pm
smokie treats
(@smokie-treats)
Posts: 1626
Noble Member
 

And I had yet another idea. I was thinking about whether Zodiac could have used more than one key, whether he could have used some sort of key shift. And I was also thinking about Jarlve’s suggestion that maybe he used a different key for different rows. I noticed that the qq and ++ doubles all occur at even numbered rows. So maybe this is a 1:1 substitute for only certain rows. And I have been wondering about cycles like this one:

14 23 23 23 23 23 14 23 14 23 14 23 14 23 23.

Is this evidence of using multiple keys? Because sometimes 14 may cycle with 23, and sometimes 23 may not cycle with another symbol. If Zodiac used a key shift, like a Caesar shift, then that could explain the cycle where there are a lot of 23’s together at one part of the cycle. That may explain the lonely 14 at the beginning of the cycle because at that point, 14 doesn’t cycle with 23. It is a 1:1 or cycles with other symbols.
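Checking which rows the doubles fall on is easy to automate. A minimal sketch, assuming the message is written 17 symbols per row and ignoring pairs that straddle a row break (the function name and toy grid are hypothetical):

```python
def doubles_by_row(cipher, width=17):
    """Return (symbol, row) for each doubled symbol, with rows numbered from 1."""
    found = []
    for i in range(len(cipher) - 1):
        same_row = i // width == (i + 1) // width
        if same_row and cipher[i] == cipher[i + 1]:
            found.append((cipher[i], i // width + 1))
    return found

# Toy grid, 4 symbols per row: doubles sit on rows 2 and 4.
toy = [1, 2, 3, 4,
       5, 6, 6, 7,
       8, 9, 1, 2,
       3, 3, 4, 5]
print(doubles_by_row(toy, width=4))
```

Running this on the 340 and on test messages made with different keys for odd and even rows would show whether doubles landing only on even rows is unusual.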

So I made an experiment to try to duplicate 340 stats. I made a key and shifted it every other row. I made it so that L would cycle with other symbols on odd numbered rows, and only be 1:1 on even numbered rows. I shifted the key 13 spaces for odd rows. You are going to be surprised by the statistics and patterns for high scoring cycles.

Note for future comparison of two symbol cycles: to determine whether two symbols map to the same plaintext, check whether they are either next to each other numerically ( i.e. 1 and 2 ), or sit across from each other on the two keys.

Here is the message, where each symbol maps to two plaintext letters ( all cycles are perfect ):

22 30 23 28 10 29 24 31 32 25 35 16 43 11 39 44 30
49 39 47 43 35 19 14 48 59 16 59 15 14 5 63 20 44
19 14 57 36 24 55 25 53 34 41 47 11 15 58 37 56 20
36 1 61 59 62 62 59 2 55 26 59 62 45 56 37 63 47
25 36 54 21 13 14 42 48 49 10 50 55 5 11 6 1 57
15 48 63 35 1 59 14 18 58 49 63 4 15 16 46 36 2
16 10 47 40 58 50 1 35 23 34 2 31 41 15 3 32 30
17 5 61 59 62 62 15 4 63 48 18 57 59 1 55 56 59
59 12 52 34 13 54 21 10 33 40 53 55 56 19 48 23 30
62 59 2 54 49 28 6 47 11 48 3 44 49 59 16 59 14
11 59 12 35 4 13 55 56 10 47 54 20 1 36 17 11 55
18 59 3 56 30 4 19 13 10 5 43 61 15 4 50 51 27
24 54 21 2 16 25 47 32 55 19 12 5 13 52 56 44 3
12 16 5 52 59 17 59 15 18 57 35 16 26 58 48 1 59
8 25 11 22 60 23 30 31 4 12 49 13 5 41 47 35 24
3 6 36 11 37 46 59 44 49 38 1 45 35 62 62 17 57
11 22 20 3 59 12 29 23 31 32 13 9 60 24 30 31 4
47 43 5 63 48 63 31 14 62 37 25 49 15 59 27 59 62
31 38 39 56 17 23 59 13 62 40 58 34 62 35 2 33 10
42 49 44 35 21 14 47 31 4 19 26 59 62 62 16 12 32

And here is the solution:

i l i k e k i l l i n g p e o p l
e b e c a u s e i t i s s o m u c
h f u n i t i s m o r e f u n t h
a n k i l l i n g w i l d g a m e
i n t h e f o r r e s t b e c a u
s e m a n i s t h e m o s t d a n
g e r o u s a n i m a l o f a l l
t o k i l l s o m e t h i n g g i
v e s m e t h e m o s t t h r i l
l i n g e x p e r e n c e i t i s
e v e n b e t t e r t h a n g e t
t i n g y o u r r o c k s o f f w
i t h a g i r l t h e b e s t p a
r t o f i t i s t h a t w h e n i
d i e i w i l l b e r e b o r n i
n p a r a d i c e a n d a l l t h
e i h a v e k i l l e d w i l l b
e c o m e m y s l a v e s i w i l
l n o t g i v e y o u m y n a m e
b e c a u s e y o u w i l l t r y

O.k. in a few minutes I will show the cycle statistics and high scoring cycles.

 
Posted : August 31, 2015 3:41 am
Page 11 / 96