Mr. Lowe, I didn’t know that you were out there working on this. This message has multiple cycled keys, all different, but I am not telling how many yet. For instance, the first symbol in the message I encoded with Key 1, the second symbol with Key 2, and so on. Almost all of the symbols in the message represent more than one letter, so it will not solve in an auto solver as presented. Because the keys are different, the message has to be modified by separating it into some number of "parts," as Jarlve calls them. For example, with a message that has two keys, there are symbols encoded with Key 1 (the "odds") and symbols encoded with Key 2 (the "evens"); Jarlve calls those Part 1 and Part 2. Determine how many parts there are with cycle scores, then compare the symbols in each part. Find the part that has the largest number of mutually exclusive symbols and keep that one the way it is. Then expand or change the symbols in the other part(s) so that each has a count of 1. You would end up with a lot of unique symbols, which would increase the multiplicity. Then feed it to an auto solver, probably Jarlve’s, and see if it will solve.
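The part-splitting described above is easy to sketch in code. This is a minimal illustration of the idea (my own toy example with made-up symbol values, not anyone's actual tool):

```python
def split_into_parts(cipher, n_parts):
    """Split a cipher (a list of symbols) into n interleaved parts.

    Part k collects the symbols at positions k, k + n, k + 2n, ...
    so for n_parts = 2 you get the "odds" and the "evens".
    """
    return [cipher[k::n_parts] for k in range(n_parts)]

# Toy 2-key message: Key 1 on the odd positions, Key 2 on the even ones.
cipher = [5, 12, 7, 12, 5, 9, 7, 12]
part1, part2 = split_into_parts(cipher, 2)
print(part1)  # symbols encoded with Key 1: [5, 7, 5, 7]
print(part2)  # symbols encoded with Key 2: [12, 12, 9, 12]
```

Each part can then be analyzed on its own, since within a part every symbol was produced by a single key.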
Jarlve, take your time. I wouldn’t be surprised if you or someone else blows this one out very fast though. I am not sure.
By the way, I found the exercise very instructive. I did it by hand. If you make a key and encode by hand, putting a check mark by the symbols as you go, you get a real feel for how Zodiac must have encoded the symbols. I don’t know if he used multiple keys, and would like to dispense with that idea if possible.
But with the 408, and I suspect with the 340, he started out with perfect cycles that became more random as he went. My theory is that there is a very simple reason for that. He wanted to make sure that he used all of the symbols in a cycle at least once. Once all were checked off a few times, he started to randomize more and more. With high frequency letters, he would have had a lot of check marks on the symbols. There would have been so many that it would have been difficult to keep track of the exact order of symbol selection.
If his key was written on a small piece of paper, or a bit condensed, that would explain the randomization because the check marks would start to be very difficult to count and keep track of with high frequency letter symbols. They would have been very clustered together. The only way to cycle symbols perfectly is to make a big key, with the symbols spread out away from each other so that there is room for a nice, neat line of check marks that are easy to count. That would have taken a lot more effort. Thus the increased randomization of the cycles in the second half.
I am going to have to make a table or graph maybe of cycle score totals row by row. Something to see if there is a trend or a big change somewhere to try to show more conclusive evidence of this.
Okay, I’ll start with some analysis first, because trying to solve a cipher that needs expansion may take hours or more, so I can’t blindly start expanding stuff. I hope to find out what is going on. You already gave some hints, so that helps.
Unique string frequencies (non-repeats):
Possible discrepancy between normal and horizontally mirrored. At first glance smokie7 seems quite a bit less cyclic than the 340. The IoC is lower than the 340’s, which means a flatter distribution of symbols.
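For reference, the IoC mentioned here can be computed with the standard index of coincidence formula (a generic sketch, not necessarily how it is computed in the posts above):

```python
from collections import Counter

def index_of_coincidence(symbols):
    """IoC = sum(f_i * (f_i - 1)) / (N * (N - 1)).

    Lower values mean a flatter symbol distribution.
    """
    n = len(symbols)
    counts = Counter(symbols)
    return sum(f * (f - 1) for f in counts.values()) / (n * (n - 1))
```

For example, a perfectly flat text like "aabb" gives 1/3, while "aaaa" gives the maximum of 1.0.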
Bigram distribution over different orientations/directions:
Horizontals: 25.75%
Verticals: 27.34%
Diagonals 1: 21.09%
Diagonals 2: 25.78%
Bigrams are equally spread over the directions, typically there should be a 30-40%+ bump for the horizontals. Horizontal bigrams are not hiding at higher periods.
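A directional bigram count like the one above could be sketched as follows (my reconstruction of the general idea, not Jarlve's actual code): walk the grid along each direction, tally the bigrams, and count how many are repeats.

```python
from collections import Counter

def bigram_repeats(grid, dr, dc):
    """Count repeated bigrams along direction (dr, dc) in a 2D grid."""
    rows, cols = len(grid), len(grid[0])
    bigrams = Counter()
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                bigrams[(grid[r][c], grid[r2][c2])] += 1
    # A bigram seen f times contributes f - 1 repeats.
    return sum(f - 1 for f in bigrams.values() if f > 1)

# Directions: horizontal, vertical, and the two diagonals.
directions = {"horizontal": (0, 1), "vertical": (1, 0),
              "diagonal 1": (1, 1), "diagonal 2": (1, -1)}

# Tiny made-up grid; horizontal has the most repeats here.
grid = [[1, 2, 1],
        [1, 2, 1]]
for name, (dr, dc) in directions.items():
    print(name, bigram_repeats(grid, dr, dc))
```

Normalizing each direction's repeat count by the total over all four directions gives percentages like those listed above.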
Row-repeats:
Some lines have quite a bit more repeats than others; not sure if that means anything. Only 5 rows have no repeats, which adds to smokie7 appearing less cyclic than the 340.
Next up will be parts analysis with my new cycle measurement.
Cycle analysis for smokie7 (summed unique string frequencies versus cycle measurement):
(summed unique string frequencies a.k.a. non-repeats are normalized over the character count with 340 as base)
Splits, and other stuff:
Full: 3675 / 140
Mirrored: 4107 / 161
Uneven rows flipped: 3525 / 151
Even rows flipped: 3627 / 147
Uneven rows only: 3394 / 119
Even rows only: 3966 / 150
1st half: 3486 / 105
2nd half: 3860 / 129
Character 1-113: 3198 / 98
Character 114-226: 3758 / 125
Character 227-340: 3835 / 108
Rows 1-5: 3000 / 93
Rows 6-10: 3552 / 105
Rows 11-15: 3520 / 105
Rows 16-20: 3868 / 107
Parts
(a 2-part split would be one part sitting on the uneven positions and another on the even positions, and so on)
2 parts:
Part 1: 3310 / 118
Part 2: 3734 / 113
3 parts:
Part 1: 4169 / 170
Part 2: 6526 / 155
Part 3: 5304 / 163
4 parts:
Part 1: 3644 / 101
Part 2: 2932 / 115
Part 3: 3472 / 120
Part 4: 3508 / 131
Wow this is fun!
Strangely, mirrored does better than normal; I can’t explain this yet.
It’s looking quite clear that it’s a 3-part message since it yields the strongest returns; if not, then something really fluky is going on. Sadly this is all I can do for today. Tomorrow I’ll start with some cipher score return tests from the solver to see if something can be learned from that. I don’t want to break the cipher immediately; I’m aiming to improve my information systems so that it may carry over to the 340.
Jarlve and Mr. Lowe, don’t get too excited about expanding symbols quite yet. I am afraid that the best we may be able to do on this project is identify whether Zodiac used multiple keys. If you have a three-part message, you will have to expand the non-mutually-exclusive symbols in two of those parts. Multiplicity will be too high. One important factor is the total count of symbols in each part that are not mutually exclusive.
EDIT: You expanded to 148 symbols with m5p1, but that was a bit different. I think that you only used one key, and randomized every fourth symbol starting with the second symbol.
With the 340 odds and evens:
Shared number of symbols in Part 1 and Part 2 = 49.
Part 1 count of shared symbols (the number to expand) = 139.
Part 1 number of mutually exclusive symbols = 9.
Part 2 count of shared symbols (the number to expand) = 158.
Part 2 number of mutually exclusive symbols = 5.
So it seems to me that to minimize multiplicity, I would have to add:
139 shared and expanded symbols from Part 1
9 mutually exclusive symbols from Part 1
49 shared but not expanded symbols from Part 2
5 mutually exclusive symbols from Part 2
———————————————————-
= 202 unique symbols / 340 = 0.594 multiplicity
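The sum above can be checked directly (using the counts from this post; note that the EDIT below says the underlying logic was later retracted, but the arithmetic itself is straightforward):

```python
# Counts taken from the post above (340 odds and evens).
part1_shared_expanded = 139   # shared symbols in Part 1, all to be expanded
part1_exclusive = 9           # mutually exclusive symbols in Part 1
part2_shared = 49             # shared symbols kept as-is in Part 2
part2_exclusive = 5           # mutually exclusive symbols in Part 2

unique_symbols = (part1_shared_expanded + part1_exclusive
                  + part2_shared + part2_exclusive)
multiplicity = unique_symbols / 340
print(unique_symbols, round(multiplicity, 3))  # 202 0.594
```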
I think that this logic is correct, and I may have to just limit my work to trying to make a message with multiple keys that mimics 340 stats. If you see things differently, let me know. EDIT: The logic was incorrect; see Jarlve’s post below for an explanation about expanding and multiplicity.
There is a difference between parts with different keys and wildcards or randomized symbols.
We need to create an entirely new set of symbols for wildcards or randomized symbols because the original mapping has been destroyed. But for parts with different keys we can keep the symbols, because they map correctly within each part, so I suggest we just add a numeric value to set them apart. There are 58 symbols in the uneven part and 54 symbols in the even part. That should give us no more than 112 symbols.
Here is the 340 numbered by appearance; just add 100 to the uneven or even part. 112 symbols, multiplicity 0.32.
101 2 103 4 105 6 107 8 109 10 111 12 113 14 115 16 117 18 105 19 120 21 122 23 124 25 126 27 128 29 130 31 132 33 120 34 135 36 137 19 138 39 115 26 121 33 113 22 140 1 141 42 105 5 143 7 106 44 130 8 145 5 123 19 119 3 131 16 146 47 137 19 140 48 149 17 111 50 151 9 119 52 153 10 154 5 144 3 107 51 106 23 155 30 117 56 110 51 104 16 125 21 122 50 119 31 157 24 158 16 138 36 159 15 108 28 140 13 111 21 115 16 141 32 149 22 123 19 146 18 127 40 119 60 113 47 117 29 137 19 161 19 139 3 116 51 120 36 134 62 163 53 131 55 140 6 138 8 119 7 141 19 123 5 143 29 151 20 134 55 138 19 103 54 150 48 102 11 125 27 120 5 161 14 137 31 123 16 129 36 106 3 141 11 130 50 114 53 137 28 119 52 120 51 140 63 147 42 134 22 119 18 111 50 151 20 136 21 158 44 103 6 115 51 118 7 132 50 116 53 161 28 136 8 153 48 119 19 134 20 159 12 130 35 153 47 156 2 104 8 138 39 150 55 119 11 136 28 145 40 120 31 121 23 105 7 128 32 137 57 115 16 103 36 114 19 113 12 163 56 129 19 151 6 126 20 111 33 113 19 119 33 126 56 140 26 136 9 123 42 101 14 154 21 133 5 111 51 110 17 126 29 143 48 120 46 127 23 120 30 155 56 136 4 137 25 101 18 105 10 142 40 139 23 144 62 111 31 158 19
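This renumbering scheme can be sketched like so (my reconstruction of the idea: number symbols by order of first appearance, then add 100 to the symbols on uneven positions so the two interleaved parts use disjoint symbol sets; the toy message is made up):

```python
def renumber_with_offset(cipher, offset=100):
    """Number symbols by order of first appearance, then add `offset`
    to the symbols sitting on the uneven (1st, 3rd, ...) positions so
    the two interleaved parts share no symbol numbers."""
    first_seen = {}
    numbered = []
    for s in cipher:
        if s not in first_seen:
            first_seen[s] = len(first_seen) + 1
        numbered.append(first_seen[s])
    # Index 0 corresponds to position 1, i.e. the uneven part.
    return [n + offset if i % 2 == 0 else n
            for i, n in enumerate(numbered)]

toy = ["A", "B", "A", "C", "B", "B"]
print(renumber_with_offset(toy))  # [101, 2, 101, 3, 102, 2]
```

With 58 symbols in the uneven part and 54 in the even part, the combined alphabet is at most 58 + 54 = 112 symbols, as stated above.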
O.k. you are right. I was being a Zodiac but that helps me feel better about smokie7 and I will edit that post soon. I hope that it can be solved. I have been thinking also about a new technique for making messages.
I’m not going to try to solve the smokie7 because it is too difficult (will make attempts when my solver improves). Don’t worry about it, I think it’s far more important to be able to figure out what is going on with a cipher than solving it right away (ask questions first, shoot later). Do we have strong leads that the Zodiac 340 cipher is a 2/3/4 keyed (even/uneven/every n’th) part cipher?
By the way, is my analysis for smokie7 correct that it’s a 3-part keyed cipher? If we can figure out a couple of these, then perhaps we could also figure out whether anything like this is actually going on in the 340. If not, then we can move on to other polyalphabetic schemes.
I’m in the process of creating a cipher for you, but I need to make adaptations to my software, so it might take a few days. By the way, if you make another cipher for me, don’t tell me anything about it, but let me make a cipher for you first.
Yes, smokie7 has three keys. With both smokie6 and smokie7, where I cycled the symbols, your stats showed that as soon as you broke the message down into the right number of parts. Your 340 numbers did not show that. Actually, I should probably do a table/line-up of messages soon to compare with the 340, with my smokie7 numbers included. Go ahead and make another one, and I will make another one that I have plans for. Take your time. I will start on a simple table, and we can fill in stats for the next two messages. Then we can decide if Zodiac may have used more than one key or done something similar to one of the recent messages. Thanks.
Have fun: m6p28.txt
Excellent.
O.k., here is smokie8, where I attempted to emulate 340 cycle stats using multiple keys:
50 45 43 50 1 57 36 33 41 61 44 10 31 51 32 12 46
55 23 7 25 56 11 55 51 18 12 58 24 48 16 52 45 34
29 13 20 60 16 39 44 15 50 59 28 62 63 54 8 22 22
36 13 46 37 48 17 36 26 56 15 32 6 26 38 30 23 32
52 44 1 15 11 57 39 29 9 5 12 19 13 20 10 30 17
50 8 59 49 43 47 27 16 8 33 23 24 34 25 46 21 51
9 47 36 36 8 6 18 40 7 44 42 24 48 15 15 16 46
52 9 32 40 14 20 22 29 47 3 12 10 21 4 58 53 41
57 60 49 63 57 54 50 46 1 38 45 7 10 36 55 60 22
32 54 18 12 14 19 41 19 2 15 55 31 36 34 56 17 43
3 15 41 52 8 45 4 29 9 48 56 36 46 24 16 55 26
61 31 36 35 56 15 22 11 63 19 34 49 62 33 55 44 9
5 38 1 18 58 46 22 58 44 49 46 10 21 60 62 48 1
34 29 28 33 26 34 50 4 32 24 50 18 51 44 16 40 8
34 56 13 55 30 38 27 60 44 62 22 34 10 47 4 38 35
18 10 47 29 63 15 46 49 48 51 1 44 45 4 7 26 32
17 13 36 9 6 18 13 56 16 60 23 13 11 4 10 33 22
32 12 55 53 44 13 18 15 50 44 1 49 18 55 43 11 16
12 55 31 24 34 56 45 9 49 45 15 56 51 45 44 27 55
47 20 13 21 54 6 10 57 50 19 32 15 34 38 37 60 61
I had to get through this so that I could get to the table analysis, which may take some time.
Thought of the day, unless you already thunked it: whilst we are discussing two-part or three-part cipher possibilities, the end of each part may have filler. So in effect two or three lines could have filler in them, so as to start the code off again on the next line in the standard reading format and not give away that it’s a new cipher. This would add to its complexity.
Hope ya get what I’m thinking.
@Mr lowe, that is a possibility.
@smokie, thanks for another cipher.
Cycle analysis for smokie8:
Unique string frequencies:
Flat top?
Encoding direction assessment:
Full: 4037 / 196
Mirrored: 3900 / 172
Uneven rows flipped: 3644 / 176
Even rows flipped: 3948 / 195
Seems to be encoded normally (left-to-right) but perhaps some discrepancies.
Order of encoding, polyalphabetic and further analysis:
Rows and columns assessment:
Uneven rows: 3790 / 120
Even rows: 3868 / 123
Rows +3 starting from 1: 2988 / 120
Rows +3 starting from 2: 3545 / 111
Rows +3 starting from 3: 3413 / 109
Rows +4 starting from 1: 3456 / 109
Rows +4 starting from 2: 3416 / 134
Rows +4 starting from 3: 2896 / 110
Rows +4 starting from 4: 4892 / 137
Rows, uneven pairs: 3750 / 164
Rows, even pairs: 3636 / 116
Uneven columns: 3847 / 188
Even columns: 3302 / 126
Columns 1-9: 3717 / 121
Columns 10-17: 3393 / 135
Uneven columns and uneven paired rows are quite high, possible outliers.
Division parts assessment:
1st half: 3938 / 140
2nd half: 3952 / 163
Character 1-113: 4022 / 126
Character 114-226: 4173 / 123
Character 227-340: 3784 / 147
Rows 1-5: 3844 / 114
Rows 6-10: 3384 / 104
Rows 11-15: 3998 / 127
Rows 16-20: 3652 / 135
Looks fairly normal though rows 6-10 don’t seem to cycle very well.
Interval (interlaced) parts assessment:
2 parts:
Part 1: 3882 / 141
Part 2: 3560 / 145
3 parts:
Part 1: 3203 / 111
Part 2: 3204 / 128
Part 3: 3724 / 108
4 parts:
Part 1: 3088 / 130
Part 2: 3616 / 127
Part 3: 3292 / 106
Part 4: 3224 / 113
The numbers are weaker than in the division table, so splitting into interval parts is making it worse.
So far I can’t say much yet, although one of my other measurements shows that some of the symbols seem to prefer sitting on either even or uneven positions. It may be a lead worth looking into. Don’t tell me anything; I’m going to dig deeper but will need some time.
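The kind of even/uneven position preference mentioned here could be checked with something like the following (a hypothetical sketch, not Jarlve's actual measurement):

```python
from collections import defaultdict

def position_bias(cipher):
    """For each symbol, count occurrences on uneven (1st, 3rd, ...)
    versus even positions; a strong skew hints at interleaved keys."""
    counts = defaultdict(lambda: [0, 0])
    for i, s in enumerate(cipher):
        counts[s][i % 2] += 1   # index 0 = uneven position
    return dict(counts)

# Toy message where symbol 7 only appears on uneven positions.
print(position_bias([7, 3, 7, 3, 7, 5]))
# {7: [3, 0], 3: [0, 2], 5: [0, 1]}
```

Symbols whose counts pile up entirely on one side would be candidates for belonging to one key of a 2-part interleaved cipher.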
I like how you break down your assessments into direction, division parts and interval parts. It is high-quality work. I have learned a lot from this recent project and am thinking about how I may conclude. I will begin to assemble the table/data summary tonight.
Thanks,
I added another section to my previous post (rows and columns) because I didn’t want to miss anything obvious. You don’t have to include everything in your table just yet but some general categories in which to classify things under would be nice. I’m going the extra mile because when we get back to the 340 I don’t want to leave a stone unturned.
Tomorrow I’ll make a start on a symbol analysis for the smokie8.
Jarlve, I am having a lot of fun with this project. So far I have a nice simple format set up for the tables and data filled in for smokie8, which I made by hand and caused me to have some new thoughts. You are thorough and I am very interested in what you come up with for symbol analysis. I want to do more tonight, but am short on sleep and feeling very tired.