Okay doranchak, good luck with it. It’s an amazing project!
@smokie, the mystery cipher is perfectly cyclic with a few 1:1’s but columnar transposition was applied after encoding. Transposition schemes where information only moves horizontally (gridwise) are the least disruptive to the cycles and such a ‘subtle’ transposition is a possibility for the 340 when applied after or during encoding.
You did catch on here and there and you were able to flush out the dominant 1:1.
Thank you very much and that was the last cipher I wanted you to test. All the results are in my head but I still need to process the information a bit. If you require any more specific ciphers let me know.
We need a way to differentiate between (randomisation in cycles+1:1 substitutes/transposition) and (wildcards/nulls/filler). I moved these schemes into two groups because I believe we may not be able to make a distinction between those inside a group. If we can come up with such a test then we will be able to narrow down on what is going on with the 340.
homophone(s): 5,38 (5,38,5,38)
homophone(s): 6,24,29,43,41,59 (6,24,29,43,41,59,6,24,29,43,41,59,6,24,29,43,41,59,6,24,29,43,41,59,6)
homophone(s): 16,15,18,31,52 (16,15,18,31,52,16,15,18,31,52,16,15,18,31,52,16,15,18,31,52,16,15,18)
homophone(s): 2,63 (2,63,2,63,2,63)
homophone(s): 8,13,25 (8,13,25,8,13,25,8,13,25,8,13)
homophone(s): 10,26,27 (10,26,27,10,26,27,10,26,27,10,26,27,10,26,27,10,26,27,10,26,27,10,26,27,10,26,27,10,26,27,10,26,27,10)
homophone(s): 1,21,33,38,41 (1,21,33,38,41,1,21,33,38,41,1,21,33,38,41,1,21,33,38,41,1,21,33,38,41,1,21,33,38,41)
homophone(s): 17 (17,17,17,17,17,17,17,17,17,17,17,17,17,17,17,17,17,17,17,17,17,17,17,17)
homophone(s): 11,37,40,47 (11,37,40,47,11,37,40,47,11,37,40,47,11,37,40,47,11,37,40,47,11,37,40)
homophone(s): 12,23,51 (12,23,51,12,23,51,12,23,51,12,23,51,12,23,51,12,23,51)
homophone(s): 4,7,22,19 (4,7,22,19,4,7,22,19,4,7,22,19,4,7,22,19,4,7,22,19,4,7,22,19,4,7,22,19,4,7,22,19,4,7,22,19,4,7,22)
homophone(s): 3,14 (3,14,3,14,3,14,3,14,3,14)
homophone(s): 9,32 (9,32,9,32,9)
homophone(s): 28,34,35,36 (28,34,35,36,28,34,35,36,28,34,35,36,28,34,35,36,28,34,35,36,28,34,35,36,28,34,35,36)
homophone(s): 20,54,57 (20,54,57,20,54,57,20,54,57,20,54,57)
homophone(s): 30,53 (30,53,30,53,30,53,30,53)
homophone(s): 45 (45,45)
homophone(s): 43,48,58 (43,48,58,43,48,58,43,48,58,43,48,58,43,48)
homophone(s): 46,51 (46,50,46,50,46)
homophone(s): 49,55 (49,55,49,55,49,55,49,55)
homophone(s): 56,60 (56,60,56)
homophone(s): 61,62 (61,62,61,62,61,62,61,62)
Short time out for me. I have some thoughts after the Mystery Cipher experiment, which I will outline for you guys very soon. Doranchak, good luck with your new hillclimber program. Smokie
Basic Concepts
These efforts and experiments have been very interesting for me. So far in this thread, we have:
1. Resolved that Zodiac used cycles;
a. Discussed how to flush these cycles out of the huge list;
b. Discussed whether the cycles are symbols in perfect order, or whether Zodiac may have switched some of the symbols around as he applied them;
2. Found that a few of the high count symbols are not in cycles (5, 19, 20 and probably 51), and there is at least one way to identify them;
a. Discussed what these high count symbols may be; and
3. Discussed and performed some high level computer programming that may help to solve the 340.
Issues with Wildcard
I was a little bit surprised by the results of the Mystery Cipher experiment. I think that we have an error somewhere, and that perhaps Jarlve gave me the cycles for another message. I am sure that I would have picked up on the 10 26 27 cycle. Check above and make sure, because in the Mystery Cipher I get 10 26 27 26 10 26 27 etc.
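For anyone who wants to check a candidate cycle by script instead of by eye, here is a minimal Python sketch. It assumes the cipher is available as a flat list of symbol numbers read left to right; the function names are made up for illustration and do not come from any of the solver programs.

```python
def appearance_order(cipher, symbols):
    """Return the chosen symbols in the order they occur in the cipher."""
    wanted = set(symbols)
    return [s for s in cipher if s in wanted]


def is_perfect_cycle(cipher, cycle):
    """True if the symbols only ever occur in the exact repeating order given."""
    seq = appearance_order(cipher, cycle)
    return all(s == cycle[i % len(cycle)] for i, s in enumerate(seq))


# The sequence reported above for the Mystery Cipher is not a perfect 10 26 27 cycle.
reported = [10, 26, 27, 26, 10, 26, 27]
print(is_perfect_cycle(reported, [10, 26, 27]))

# A sequence that repeats in strict order is.
print(is_perfect_cycle([10, 26, 27, 10, 26, 27, 10], [10, 26, 27]))
```

Note that a transposition applied after encoding changes the reading order, so a check like this would only be expected to pass on the pre-transposition text.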
I think that there are some overarching issues however that I need to talk about. I came up with Wildcard because of four facts:
1. Zodiac used cycles in the 408 and 340;
2. Some of the symbols are not in cycles and sit next to each other (qq and ++);
3. Zodiac used a darkened triangle to represent A, S and I in the 408; and
4. It is highly probable that I made false observations of missing symbols in cycles and q or + being where the missing symbol should have been found.
EDIT: Although the 6 30 37 cycle in the 340 scores the highest for L=3, it is possible that it is not a cycle at all. We just can’t be sure without knowing what the cycles are. Therefore, my observation that the 6 30 37 could have a missing symbol with a wildcard substituted may not be true.
I don’t want to take the steam out of anyone’s efforts. I asked for help and I got it. I don’t know if Wildcard is correct, but if it is, I came up with it based on almost certainly false observations. It would at least in part have been a lucky guess.
The Cycles
If Wildcard blows up like all of the other efforts, I wonder if you guys are going to continue pursuing the cycles to try to use them somehow.
I am not a computer or math expert. But my hunch is that trying to flush out the cycles is not going to be easy. I don’t think that there is going to be a quick fix. I am just saying that because I can find long sequences in these experiments and have them turn out to be false cycles.
Just general concepts here, but I think that finding the cycles, which I hope that your solving programs will render unnecessary, will take some very high level math, computer programming, and creative thinking.
1. I think of Google and how the site must find relevant websites and rank them. There is an interrelatedness between the symbols, and I wonder if some type of program that examines complicated networks of those relationships might help.
2. I also think about looking at different parts of the 340 and comparing them.
3. I also think about maybe looking at the beginning of the 340. A cycle for E will be there. We don’t know what the L value is, but wouldn’t the E Cycle show up before all of the others? E is going to be in Row 1 twice, Row 2 twice, roughly, etc. So I wonder about limiting the scope of the search to find some of the high frequency letter cycles, picking their most likely candidates from a smaller list of cycles. Then taking those symbols out, expanding the scope of the search, and looking for mid range frequency letter cycles. Take those symbols out, expand the scope of the search some more, and look for low frequency letter cycles. Etc. I wonder how that would work.
Could that program work its way down from most likely candidates for high frequency letter cycles to most likely candidates for low frequency letter cycles, and send that information to another program that tries those symbols while looking for a solution at the same time?
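One possible way to sketch the "limit the scope first" idea in Python is below. It only looks at two-symbol cycles, it assumes the cipher is a flat list of symbol numbers, and the names alternation_score and best_pairs plus the min_count cutoff are invented for this example. It is not a tested method, and as noted above, long sequences can still turn out to be false cycles.

```python
from collections import Counter
from itertools import combinations


def alternation_score(cipher, a, b):
    """Fraction of consecutive occurrences of a/b that switch symbol (1.0 = perfect)."""
    seq = [s for s in cipher if s in (a, b)]
    if len(seq) < 2:
        return 0.0
    switches = sum(1 for x, y in zip(seq, seq[1:]) if x != y)
    return switches / (len(seq) - 1)


def best_pairs(cipher, scope, min_count=3):
    """Rank symbol pairs inside the first `scope` symbols, best score first."""
    window = cipher[:scope]
    counts = Counter(window)
    frequent = sorted(s for s, c in counts.items() if c >= min_count)
    scored = [(alternation_score(window, a, b), a, b)
              for a, b in combinations(frequent, 2)]
    return sorted(scored, reverse=True)


toy = [1, 9, 2, 8, 1, 7, 2, 1, 9, 2]
print(best_pairs(toy, scope=10))
```

The widening step would then be a loop: take the top candidates out of the cipher, raise scope, lower min_count, and run best_pairs again.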
Nevertheless, there are a bunch of different factors involved. Maybe we should work smart instead of working hard, and see if the hillclimber programs will come up with a weak solution. If not, then perhaps a pursuit of the cycles. I am sincerely hoping that some of the issues that we have identified will contribute to a solution.
Thanks,
Smokie
Oh yeah Jarlve, with my spreadsheet, I can use some trial and error to find cycles. I enter the symbol values into cells and the spreadsheet shows me the patterns. The pattern cells are conditionally formatted so that they turn different colors for different symbols. However, I always have to structure my trials like 1 2, 1 2 3, 1 3, etc. If I try with 1 3 2, the spreadsheet won’t work right. EDIT: Transposition makes the spreadsheet not work right when I enter the symbols, but the spreadsheet will still show the symbols in the order that they are found. Check above though for the Mystery Cipher cycles; mine were different.
Thanks for the summation smokie.
Those are the original pre-transposition cycles for the mystery cipher.
I’d like to continue on the wildcard idea but for now I have to let it rest for a while.
Jarlve,
No problem. Thank you for your efforts so far. If you have any summations of your own, that would be helpful for anyone reading the thread in the future. I think that you tested various ciphers with ZKDecrypto. If you have any information to summarize about your findings that are not already posted, I would appreciate that. The same goes for whether your observations support or do not support Wildcard, or anything else that you want to add.
See you when you return. If we get a solve, I will PM you.
Smokie
I have an idea.
I found the download sites for zkdecrypto and AZdecrypt today. Under the AZdecrypt site information, the program can work with up to 100 symbols.
Is it possible to do this:
1. Assign one or more of the alleged wildcard symbols individual numbers for each wildcard symbol;
2. Find a (weak?) solution;
3. identify some of the cycles and collapse the cycles into one new symbol to represent all of the symbols in the cycle;
4. Assign one or more of the alleged wildcard symbols individual numbers for each symbol; and
5. Continue doing this until we have a solution?
For instance, there are eleven "q" symbols, which I call Symbol 5. Make those symbols 64, 65, 66, etc. Find out what happens when using one of the programs. If there is any intelligible language to work with, use that to find cycles for the other symbols. Collapse those cycle symbols into one new symbol number. See if the program finds the same solution. Then add new symbols to represent each individual "+", which I call Symbol 19. Keep doing that.
Is that possible?
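The collapsing part of step 3 could look roughly like this in Python. This is only a sketch: it assumes the cipher is a flat list of symbol numbers, the function name collapse_cycles is invented, and starting the new symbol numbers at 200 is an arbitrary choice to stay clear of numbers already in use.

```python
def collapse_cycles(cipher, cycles, first_new_symbol=200):
    """Replace every symbol of each identified cycle with one new symbol per cycle."""
    mapping = {}
    for offset, cycle in enumerate(cycles):
        for symbol in cycle:
            mapping[symbol] = first_new_symbol + offset
    # Symbols that are not part of any listed cycle are left unchanged.
    return [mapping.get(s, s) for s in cipher]


sample = [25, 31, 26, 30, 10, 30, 27]
print(collapse_cycles(sample, [[25, 26, 27], [30]]))
```

Each collapsed cycle lowers the number of unique symbols, which offsets the unique symbols added by splitting up the alleged wildcards.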
Smokie
Hey smokie,
1) Yes, of course you can replace each wildcard with new individual symbols. But the problem here is 26 ^ (total wildcard symbols). See point about multiplicity in 2.
2) Possible if you keep the multiplicity of the cipher (unique symbols divided by total symbols) not much higher than 0.23. You may need some luck in finding a right combination. Solvers have problems with higher multiplicity ciphers to the point where it just will not solve.
3) Yes, such a thing is discussed in the paper doranchak linked in this thread. REMOVE_HOMOPHONES function.
4, 5) Worth trying on some test ciphers first with increasing difficulty.
If you decide to go into higher multiplicity (m>2) I strongly recommend you use ZKDecrypto. AZdecrypt assumes a multiplicity around that of the Zodiac 340 cipher, and that a reasonable full-grid solution without too many interruptions is available (speed solver for many ciphers).
I’ll see if I can give 5 a shot tomorrow. I can add an outer hill climber to AZdecrypt that does that. I’ll start with a very simple test cipher that has 1 wildcard totalling 10 symbols. The idea is just to improve upon it. If that can be done then it remains to be seen how many symbols could be restored this way (again, multiplicity).
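As a quick illustration of the two numbers in play here, the following Python sketch computes multiplicity as defined above (unique symbols divided by total symbols) and the 26 ^ n search space for n wildcard positions. The function names are made up for this example; the 63 symbols over 340 positions are the Z340 figures used in this thread.

```python
def multiplicity(cipher):
    """Unique symbols divided by total symbols."""
    return len(set(cipher)) / len(cipher)


def wildcard_search_space(wildcard_positions):
    """Number of letter assignments if every wildcard position is free (26 ^ n)."""
    return 26 ** wildcard_positions


print(multiplicity([1, 2, 1, 3]))   # toy cipher: 3 unique symbols over 4 positions
print(round(63 / 340, 3))           # 63 symbols over 340 positions
print(wildcard_search_space(10))    # the planned test: 1 wildcard totalling 10 symbols
```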
O.k., I am learning how to use Decrypto.
I did try to substitute one new symbol for each alleged polyalphabetic symbol to see what would happen. Multiplicity was 0.34 and score leveled out at about 40,000. Got a few words here and there and some next to each other. Otherwise gibberish.
I also solved your Shaman message. Fun program!
Take your time.
At multiplicity 0.34 you may need at least 6 and possibly also 7-grams to distinguish the plaintext from gibberish. I will need to make some multiplicity upgrade to my solver first before I attempt your point 5. I’m working on it but it’s hard.
O.k.
Below is the numeric message with wildcards, which I shall forthwith refer to as alleged polyalphabetic symbols ("APS"), broken down into new individual symbols.
I started with 1-63. Replaced 5’s with 101-111. Replaced 19’s with 121-144. Replaced 20’s with 151-162. Replaced 51’s with 171-180.
1 2 3 4 101 6 7 8 9 10 11 12 13 14 15 16 17
18 102 121 151 21 22 23 24 25 26 27 28 29 30 31 32 33
152 34 35 36 37 122 38 39 15 26 21 33 13 22 40 1 41
42 103 104 43 7 6 44 30 8 45 105 23 123 124 3 31 16
46 47 37 125 40 48 49 17 11 50 171 9 126 52 53 10 54
106 44 3 7 172 6 23 55 30 17 56 10 173 4 16 25 21
22 50 127 31 57 24 58 16 38 36 59 15 8 28 40 13 11
21 15 16 41 32 49 22 23 128 46 18 27 40 129 60 13 47
17 29 37 130 61 131 39 3 16 174 153 36 34 62 63 53 31
55 40 6 38 8 132 7 41 133 23 107 43 29 175 154 34 55
38 134 3 54 50 48 2 11 25 27 155 108 61 14 37 31 23
16 29 36 6 3 41 11 30 50 14 53 37 28 135 52 156 176
40 63 47 42 34 33 136 18 11 50 177 157 36 21 58 44 3
6 15 178 18 7 32 50 16 53 61 28 36 8 53 48 137 138
34 158 59 12 30 35 53 47 56 2 4 8 38 39 50 55 139
11 36 28 45 40 159 31 21 23 109 7 28 32 37 57 15 16
3 36 14 140 13 12 16 56 29 141 179 6 26 160 11 33 13
142 143 33 26 56 40 26 36 9 23 42 1 14 54 21 33 110
11 180 10 17 26 29 43 48 161 46 27 23 162 30 55 56 36
4 37 25 1 18 111 10 42 40 39 23 44 62 11 31 58 144
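The renumbering above can be reproduced with a few lines of Python. This is a sketch only: it assumes the original message is a flat list of the numbers 1-63, and the names expand_symbols and APS_START are invented here. The starting numbers are the ones listed above (5 from 101, 19 from 121, 20 from 151, 51 from 171).

```python
def expand_symbols(cipher, first_new):
    """Give every occurrence of the listed symbols its own new number, in reading order."""
    next_free = dict(first_new)  # copy so the caller's dict is not modified
    out = []
    for s in cipher:
        if s in next_free:
            out.append(next_free[s])
            next_free[s] += 1
        else:
            out.append(s)
    return out


APS_START = {5: 101, 19: 121, 20: 151, 51: 171}

# First 22 symbols of the original numbering: 1-17, then 18 5 19 20 21.
original_start = list(range(1, 18)) + [18, 5, 19, 20, 21]
print(expand_symbols(original_start, APS_START))

# Unique symbols afterwards: 63 - 4 replaced + (11 + 24 + 12 + 10) new ones, over 340 positions.
print(round((63 - 4 + 11 + 24 + 12 + 10) / 340, 2))
```

The second figure agrees with the multiplicity of 0.34 reported earlier for this substitution.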
ZKDecrypto races to a score of about 36k, slows down a little, then eventually scores around 39k. I got over 40k a few times.
Here is an example, which scored 40223. It starts out with "I cried," and interestingly, has the word "Herman" in it. Short words like "pain" and "love" are in there, and the word "other" shows up twice. Looking again at the middle portion, this could easily read "let these others I love" if only two high frequency letters are changed.
There is a lot of gibberish, and I understand that some of those words may disappear with higher scoring results.
ICRIEDANCETHENHER
SHEATTHOSEENMUSTH
ISPAINGTHETHETLIS
ALLYADOUNDTHEIRSE
RWILLFORTHECONDED
FORANDHOURRETIEST
THESTOBEGACHNNLET
THESTOTHERSELOVEW
RMITISTREMBASSEDS
OLDGNWASTHEYMANSO
GARDHFCTSELFINISH
EMADRSTUHNDININGG
LEWASHASTHEMATBOR
DHISATHEDINANDFOR
SUCHUPDWRCINGTHOU
TANDLESTHEANTITHE
RANCEHERMANDERTHE
OTHERLEACHAINDTHA
THEREMYFOREHOUORA
IISISWEALTHOSTSBK
If anyone wants to play around with this, you are welcome to. I am just learning ZKDecrypto, and maybe someone has techniques to squeeze a higher scoring result out of the program.
I also tried to substitute only for each individual APS by itself, all of which scored over 32000. But substituting for all APS scored the highest.
It makes sense to me that the "+" would be polyalphabetic. When I used to try to solve by hand, even when I got several good words started, there was no way that I could change the "+" without ruining those words. I think that having the "+" locked to only one plaintext makes the message pretty much impossible to solve.
S.T.
Hey Jarlve, I have another idea and was wondering what you think.
What about changing the dictionary.txt file to trick Decrypto by adding wildcards into the words. For example, the word "slave" could have six different versions:
slave
*lave
s*ave
sl*ve
sla*e
slav*
Then take out a lot of the words that Zodiac probably didn’t use, like the word "ABATTOIR," to mitigate the increased file length.
In Decrypto, lock the alleged polyalphabetic symbols to * (or perhaps don’t lock them) and see what happens. If we get a readable solution, substitute the appropriate letters for the *s.
Could something like that work?
With ZKD, the "dictionary.txt" file is used ONLY to list found words in the cipher. It is not used in the actual solving process AT ALL.
So changing it has no effect in any way on the actual solution that is found. The program will work fine without it, in fact.
-glurk
——————————–
I don’t believe in monsters.
Glurk,
Thanks for answering that question. And yes I see that the program will in fact work without the dictionary.txt file. I tried it.
Isn’t that your program?
We have established that Zodiac did use cycles, and that those symbols and two others do not cycle well with any other symbols.
Do you have any thoughts about whether the "+" symbol, the "q" symbol, or others could be polyalphabetic?
S.T.
Hey smokie,
The files bigraphs, trigraphs, tetragraphs and pentagraphs under the language/eng directory are used to score the ciphers. I changed them to your specifications and used the "Z" letter as wildcard because that letter is unused by ZKDecrypto in the ciphers we usually work with (you can check this by looking at the Init Key distribution and it has to do with its low frequency in English). Then in ZKDecrypto lock every wildcard symbol to the letter "Z" and you are good to go. ZKD ngrams with "Z" added for every position and duplicates removed.
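For readers who want to build similar files themselves, here is a rough Python sketch of the expansion step. It makes several assumptions: the n-grams are handled as bare strings (any counts or scores stored in the real files are ignored, since their format is not described here), only one position at a time is replaced by the wildcard, and the function names are invented for illustration.

```python
def wildcard_variants(ngram, wildcard="Z"):
    """The n-gram itself plus one copy per position with that position wildcarded."""
    variants = {ngram}
    for i in range(len(ngram)):
        variants.add(ngram[:i] + wildcard + ngram[i + 1:])
    return variants


def expand_ngrams(ngrams, wildcard="Z"):
    """Expand a whole list of n-grams; the set removes duplicates automatically."""
    expanded = set()
    for ngram in ngrams:
        expanded |= wildcard_variants(ngram, wildcard)
    return sorted(expanded)


# The "slave" example from earlier in the thread, using * as the wildcard.
print(sorted(wildcard_variants("SLAVE", wildcard="*")))

# THE and THA share the variant THZ, so 8 candidates become 7 unique entries.
print(expand_ngrams(["THE", "THA"]))
```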
I’m not sure how well this would work and you may need to change IoC weight a bit. Problem with suspected wildcards is that they create a lot of double symbols.
Wow thanks Jarlve!
I constructed a message (first 340 letters of the 408) with perfect cycles, no 1:1 or wildcards and the new n-graph files with the Z’s worked.
Count Letter Symbols
4 A 1 2 3 4
1 B 5
2 C 6 7
2 D 8 9
8 E 10 11 12 13 14 15 16 17
1 F 18
2 G 19 20
4 H 21 22 23 24
5 I 25 26 27 28 29
0 J
1 K 30
3 L 31 32 33
1 M 34
4 N 35 36 37 38
5 O 39 40 41 42 43
2 P 44 45
0 Q
3 R 46 47 48
3 S 49 50 51
5 T 52 53 54 55 56
2 U 58 59
1 V 60
2 W 61 62
1 X 57
1 Y 63
0 Z
63
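The table can be turned into a small encoder that hands out each letter's symbols in strict rotation, which is what "perfect cycles" means here. The Python below is a sketch with invented names (KEY, encode_with_perfect_cycles); the key itself is copied from the table, and the plaintext is the start of the 408 solution as it appears in the solved grid in this post.

```python
from itertools import cycle

# Homophone table from the post: each letter owns a fixed list of symbols.
KEY = {
    "A": [1, 2, 3, 4], "B": [5], "C": [6, 7], "D": [8, 9],
    "E": [10, 11, 12, 13, 14, 15, 16, 17], "F": [18], "G": [19, 20],
    "H": [21, 22, 23, 24], "I": [25, 26, 27, 28, 29], "K": [30],
    "L": [31, 32, 33], "M": [34], "N": [35, 36, 37, 38],
    "O": [39, 40, 41, 42, 43], "P": [44, 45], "R": [46, 47, 48],
    "S": [49, 50, 51], "T": [52, 53, 54, 55, 56], "U": [58, 59],
    "V": [60], "W": [61, 62], "X": [57], "Y": [63],
}


def encode_with_perfect_cycles(plaintext, key):
    """Encode letter by letter, always taking the next symbol in that letter's cycle."""
    wheels = {letter: cycle(symbols) for letter, symbols in key.items()}
    return [next(wheels[ch]) for ch in plaintext]


first_rows = encode_with_perfect_cycles("ILIKEKILLINGPEOPLEBECAUSEITISSOMUC", KEY)
print(first_rows[:17])
print(first_rows[17:])
```

The two printed rows should match the first two rows of the numeric message that follows.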
25 31 26 30 10 30 27 32 33 28 35 19 44 11 39 45 31
12 5 13 6 1 58 49 14 29 52 25 50 51 40 34 59 7
21 18 58 36 26 53 27 49 34 41 46 15 18 59 37 54 22
2 38 30 28 32 33 29 35 20 61 25 31 8 19 3 34 16
26 36 55 23 17 18 42 47 48 10 50 56 5 11 6 4 58
51 12 34 1 37 27 49 52 24 13 34 43 50 53 9 2 38
20 14 46 39 59 51 3 35 28 34 4 32 40 18 1 33 31
54 41 30 29 32 33 49 42 34 15 55 21 25 36 19 20 26
60 16 50 34 17 56 22 10 34 43 51 52 53 23 47 27 31
32 28 37 19 11 57 44 12 48 13 38 7 14 29 49 15 60
16 35 5 17 54 55 10 46 56 24 2 36 20 11 52 53 25
37 19 63 39 58 47 48 40 6 30 50 41 18 18 62 26 54
21 3 20 27 46 33 55 22 12 5 13 51 56 45 4 47 52
42 18 28 53 29 49 54 23 1 55 61 24 14 38 25 62 26
31 32 5 15 48 16 5 43 46 35 27 36 44 2 47 3 8
28 7 17 4 37 9 1 33 31 56 21 10 29 22 2 60 11
30 25 32 33 12 8 61 26 31 32 5 13 6 39 34 14 34
63 50 33 3 60 15 51 27 62 28 31 32 38 40 52 19 29
60 16 63 41 59 34 63 35 4 34 17 5 10 7 1 58 49
11 63 42 59 61 25 33 31 53 48 63 54 43 50 32 39 62
Solves in about a minute with both the original and the wildcard n-graph files.
2. Took 20 (the Symbol "B" in the Z340) from G cycle and moved it to Z cycle. Put 20 in the same places where it appears in the 340.
a. Using the original n-graph files, and without locking 20 to Z, solved in less than 2 minutes but thought that 20 was E (maybe because of the expected frequency?).
b. Using the wildcard n-graph files, and without locking 20 to Z, solved in about 3 minutes and found the letter Z where 20 was. Decrypto seemed to have matched the n-grams with the Z’s in them and used them in the solution.
ILIKEKILLINGPEOPL
EBEZAUSEITISSOMUC
ZFUNITISMOREFUNTH
ANKILLINGWILDGAME
INTHEFORRESTBECAU
SEMANISTHEMOSTDAN
GEROUSANIMALOFALL
TOKILLSOMETHINGGI
VESMETHEMZSTTHRIL
LINGEXPERENCEIZEV
ENBETTERTZANGETTI
NGYOURROCKSOFFWZT
HAGIRLTHEBEZTPART
OFITISTHATWHENIWI
LZBEREBORNINPARAD
ICEAZDALLTHEIHAVE
KILLEDWILLBECZMEM
YSLAVESIWILLNOTGI
VEYOUMYNZMEBZCAUS
EYOUWILLTRYTOSLOW
I also tried experiments with Symbol 19, but they were messed up partly by me and partly by the double symbols I think. I have more experiments to do.
S.T.