So I had this idea about Z340, inspired by how the Copiale cipher was cracked. What if some symbols in the cipher stand not for 1 letter, but for 2 (or possibly more)? For example, Z could have encoded "TH" as one symbol (or a set of symbols), "LL" as another symbol/set, "IN" as yet another, etc. To test my theory, I created this test cipher. I won’t tell you what was used as the plaintext, but I can give you a hint: I encoded the 12 most frequent 2-letter combinations as separate symbols. The result was a 390-letter plaintext encoded into a 337-symbol cipher, so I added a 3-symbol filler to make it an even 340-symbol cipher. I’ll call it daikon5 for future reference:
CV@pIpAuvik+9jTZn0_5 bLdB]wH6ea^SgbGC4wc7 PIfaDx1Ep@uvhlAUYi2e +vxZg8MN0xnd_3bJIc5F wx+e91<W2wy[aZ3G5c1V 6f2uzpBuK7e0xvhiCozc dxIe83]xO@uvh+rkyZD^ 0A4wdoIEn+<[yx5wZ]4v im9bPM6_pL7gflBx1hCN Tx0ndxj2Ozg@<A3x5Il> FBXC+@lAunyZn8PGvk1M 2YB^zDW3ux0CQ5odp@uI XlAun+_9cZemHU1ozBlC uE6[i@o0m7acmF2ednI^ 3bJ+m8alAu]NmzKV9BY6 lG7Ox8jcm_9uZ^4vh6fL T5ozg7Pem1f4yT@f+ZDK
I haven’t tried to match Z340 in any respect other than the length. It can’t be solved with any of the current auto-solvers as far as I can tell. You’ll be able to recognize some of the words and even phrases in some of the solutions, but the rest of it won’t make much sense, so you’re unlikely to be able to reconstruct the entire message unless you figure out where the phrases are from. In hindsight, I should’ve used a more obscure text, but the goal wasn’t to make it a challenge. I was just making a test for myself to see if I could solve it.
How to solve this type of cipher then? The first idea I had was pretty obvious. I changed my solver to allow 2-letter combinations when trying to substitute a symbol in the cipher. The problem is, there are just too many possible combinations: precisely 26*26 = 676. Add the 26 possibilities of a single letter, and you have 702 possibilities to try for each symbol, vs. just 26 before. My solver was just not getting anywhere with such a wide search field, even though I tried many, many more iterations per restart. It was taking nearly an hour per restart and still nothing.
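To put numbers on that search space, here is a minimal Python sketch. It is not the solver's code, just the counting argument from the paragraph above; the function name `key_space_log10` and the 60-symbol cipher are made up for illustration.

```python
import math
import string

LETTERS = string.ascii_uppercase

# Every candidate a single symbol could decode to: 1 letter or 2 letters.
singles = list(LETTERS)
pairs = [a + b for a in LETTERS for b in LETTERS]
candidates = singles + pairs

def key_space_log10(num_symbols: int, options_per_symbol: int) -> float:
    """log10 of the number of possible keys if every symbol picks freely."""
    return num_symbols * math.log10(options_per_symbol)

print(len(pairs), len(candidates))  # 676 702

# Hypothetical cipher with 60 distinct symbols (illustrative number only).
print(key_space_log10(60, len(singles)))
print(key_space_log10(60, len(candidates)))
```

The second pair of numbers shows how many extra orders of magnitude the key space gains once each symbol may also stand for a letter pair.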
The next idea I had was to try just the most frequent bigrams as possible candidates, instead of all 676. I was still not getting anywhere. The solver was just too greedy. It was happily using the top bigrams "TH", "HE", "IN", etc. for too many symbols, as they also appear in the highest-scoring N-grams, so most of the time the solver maximized the total score by proposing solutions like "THETHINGTHETHINGTHETHING…". So I had to allow only one symbol per bigram (i.e. 1-to-1 substitutions for symbols standing for 2 letters). That matches my test cipher, but not necessarily Z340. Only when I did that and reduced the set of possible bigrams to about 20 was I finally able to solve my test cipher. But the problem was that the set of bigrams I used in the test cipher had to match the ones I tried pretty closely, or it wouldn’t solve.
Next, I tried solving Z340 that way, but the problem was to find the correct set of bigrams to try. I’ve tried the top 20 English bigrams. No solve. Tried the top 20 bigrams in other Zodiac letters. No solve. I tried to mix the two. Still nothing. I could be trying possible sets of 2-letter combinations for a long time… I got stuck at this point for a while. I needed something that wouldn’t be limited to only 1-to-1 substitutions for bigrams, and something that will try all possible bigrams, not a pre-selected set of 20.
Eventually I had the idea to "expand" each symbol into 2. It’s similar to the wildcards idea, but instead of expanding each occurrence of the wildcard symbols into a new unique one, you expand each into a pair of symbols, where only the second one is new. Something like this: if you have a cipher "ABACDA", you change it to "AaBbAaCcDdAa". I.e. "A" becomes 2 separate symbols: "Aa", "B" = "Bb", etc. And you allow a special "null" letter, which stands for "nothing here", or "delete this symbol", when trying substitutions. This letter can only be tried/used for the newly introduced set of symbols (i.e. the second letter, in even positions). And that’s it. Some of the symbols will end up standing for 1 letter, some for 2. The multiplicity of the cipher technically remains the same: you double the number of unique symbols, but you also double the length of the cipher. There is still the problem with "greediness", but I added a penalty to the fitness function (the one that computes the score of the proposed solution) to reduce the score depending on how long the solution is. It forced the solver to produce solutions as compact as possible (i.e. not expand each and every symbol into 2 letters). It solves the test cipher above pretty cleanly.
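The expansion trick can be sketched in a few lines of Python. This is only an illustration under assumptions, not the solver itself: the slot naming (`X` and `X'`), the `NULL` marker, the sample key, and the linear form of the length penalty are all hypothetical. The post only says that the score is reduced depending on how long the solution is.

```python
NULL = "-"  # hypothetical marker for the "nothing here" letter

def expand(cipher: str) -> list[str]:
    """Turn each symbol X into two slots: X itself and a companion slot X'."""
    slots = []
    for sym in cipher:
        slots.append(sym)        # first slot: always a real letter
        slots.append(sym + "'")  # second slot: a letter or NULL
    return slots

def decode(slots: list[str], key: dict[str, str]) -> str:
    """Apply the key and drop every slot that maps to NULL."""
    return "".join(key[s] for s in slots if key[s] != NULL)

def penalized_score(raw_score: float, plaintext: str,
                    cipher_len: int, weight: float = 1.0) -> float:
    """Assumed penalty: subtract a cost for each letter beyond one per symbol."""
    extra_letters = len(plaintext) - cipher_len
    return raw_score - weight * extra_letters

cipher = "ABACDA"
slots = expand(cipher)

# Made-up key: A = "TH", B = "E", C = "IN", D = "G".
key = {"A": "T", "A'": "H", "B": "E", "B'": NULL,
       "C": "I", "C'": "N", "D": "G", "D'": NULL}

plaintext = decode(slots, key)
print(plaintext)
print(penalized_score(100.0, plaintext, len(cipher), weight=2.0))
```

Only the primed slots are allowed to take `NULL`, which matches the rule that the null letter applies only to the newly introduced second symbols.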
Now as far as Z340… Didn’t get a solution. Ran it for 24 hours, just in case. Nada. Not even close to anything legible. Thought I’d share the results anyway, just in case I missed something, or in hopes that what I said above would inspire someone else to come up with something useful.
Oh, and as far as using 3 (or even more) letters per symbol. I don’t think it’s very likely. Trigrams just don’t repeat that often (compared to bigrams), so it doesn’t make much sense. Even if Z decided to do it anyway, there won’t be a lot of occurrences of such symbols, so the cipher should be solvable regardless of them (similar to typos).
I do think it’s a cool way to look at things, and I have heard of ciphers being done in a similar manner. I think TH was one of the most common, but there were things like -able, and other prefixes/suffixes. Has anyone ever tried to use all of Z’s communications to make a "dictionary" of his word usage? To me, some of his letters are hard to make out with his terrible handwriting, and I would also fear it may not correlate, as the "neatness" of his ciphers may not align with his rambling in others.
And not to create more work (if I have time I could help with a Z usage dictionary), but how to weight his usage of words, letters, combinations… That I don’t know. I call that programming magic.
I just know things like that have been done in genetics, the most common being BLAST ( http://blast.ncbi.nlm.nih.gov/Blast.cgi ) to search for probable coding regions (i.e. genes or regulatory elements), alignments among different species, filler, etc with a matrix score of some sort to allow for "errors." Magic, I tell you.
-m
The problem when solved will be simple– Kettering
marie-
I think it has more to do with statistics than "magic," LOL.
daikon-
That type of cipher (homophonic with letter pairs encoded) is often known as a nomenclator in classical crypto.
You should try your solver on 340.tonyb2.example.txt included with ZKD. It is encoded in that same way.
-glurk
——————————–
I don’t believe in monsters.
Thank you for doing this, daikon, and for another cipher.
It can’t be solved with any of the current auto-solvers as far as I can tell.
Edit your statement please.
Score: 19170 Ioc: 663 M: 194 C: 340 S: 66 ipikekinagpeoplebeca useitssomuchfunitsmo refundorkinartillgam eadeforledbecauseman sdemoonsasidueanamop ofanekinsomedargivem edemoatdrinarespienc eitseverbendidasetta gyourrocksofftidoril ldebedparefiniadaeth nicieitinbiebornapor alicensandeihavekine ctinbecomemysloveiti nrodgiveyoumynamebec auseyoutintlyespoilo tnordopmyconectarofs laveformyoftilifeens CV@pIpAuvik+9jTZn0_5 bLdB]wH6ea^SgbGC4wc7 PIfaDx1Ep@uvhlAUYi2e +vxZg8MN0xnd_3bJIc5F wx+e91<W2wy[aZ3G5c1V 6f2uzpBuK7e0xvhiCozc dxIe83]xO@uvh+rkyZD^ 0A4wdoIEn+<[yx5wZ]4v im9bPM6_pL7gflBx1hCN Tx0ndxj2Ozg@<A3x5Il> FBXC+@lAunyZn8PGvk1M 2YB^zDW3ux0CQ5odp@uI XlAun+_9cZemHU1ozBlC uE6[i@o0m7acmF2ednI^ 3bJ+m8alAu]NmzKV9BY6 lG7Ox8jcm_9uZ^4vh6fL T5ozg7Pem1f4yT@f+ZDK
That type of cipher (homophonic with letter pairs encoded) is often known as a nomenclator in classical crypto.
Yep, it’s similar to a nomenclator cipher. Although I thought nomenclators usually contained words or even entire phrases, and the cipher was numeric due to the large number of entries in the nomenclator (often thousands).
You should try your solver on 340.tonyb2.example.txt included with ZKD. It is encoded in that same way.
Hmm, I didn’t know that. Although my solver could already crack it before, it was one of the harder ciphers (requires a lot of restarts). Makes sense, it’s like it has a lot of typos.
It can’t be solved with any of the current auto-solvers as far as I can tell.
Edit your statement please. 😉
That’s not really a solution though. The only coherent part is "people because it is so much fun", the rest doesn’t make much sense. Admit it, if you didn’t know the text already, you wouldn’t be able to even recognize it. 🙂 Anyway, like I said, it wasn’t meant to be a challenge, just a test. I could’ve easily made it much harder.
There are more readable fragments in there and given the circumstances I find it a very good (partial) solve. I understand it’s not a challenge but I just wanted to point out that solvers can make some sense of low-level nomenclators.
So, you want a challenge, eh? 🙂 Alright. Here, I present to you daikon6:
LZhiCM+ks8t9uNuVFab0 mvklDWkGLMw4xfVEjIyW 1JoCkxgz2fAc3HdksN5y VZhksXyFbiODm+ksPmGL aBTlZ6h0EiU1l2RH!ksj C@l#lT3$M+0z1g8c2NDz 3kj%UeEQY+ks$LwWCjKz FMGNDI0J9L^ra%HlOFZb XE!wGCKla7IDjMZhiEN1 zAuTC$dh2DiU3j%THfJE th+vitLMatyVN4yWZhz0 tj%Ul#5FihxgiCx6j^tL w7xfVDjKj%Tl1$EWGMBV myug2sNyW3mvklCVkj%U l+eDPY0W$L4yVah$M12z 3f8c+yWHdksNwVEjIksj %TS9m#UCLMzFNGLyDzvJ AM^rZ%HlQFabElNvm0RC jz&jD1rZ%GlOHB8tKE52 raFm9T3@yWksXy6g&s$e +GzvCdz0cjHzsuLFbZU7 y1sMYDEPVxA42WBX5GCI
It’s longer, 440 symbols, so it should be easier to solve. 15 bigrams encoded as one symbol each (1-to-1 substitutions). The rest are normal homophones (52), no typos, no polyalphabetic symbols, no transcription errors, no super-rare words, no poetry, no number sequences, no tricks. I did use a more obscure text that you hopefully won’t be able to recognize by just a couple of words. You don’t even need to solve the plaintext, just tell me what is being discussed (i.e. the overall subject). Good luck. 🙂 I’d be super impressed if you solve this.
Wasn’t really looking for a challenge at the moment but I’ll work on it (on first sight it may be about fish). When I finish (solved or unsolved) I’ll make a similar cipher (nomenclator) for you to test out your solver.
How do you encode these messages? Non-repeat scores and graphs are rather similar to the 340.
Update: can’t solve it normally. I’d like to try some other things. Could you perhaps give away the original length of the message?
The full plaintext length for daikon6 is 508.
To encode such ciphers, I just take the plaintext, make it all uppercase, then find the top N bigrams, then replace them with lowercase letters. For example, "TH" becomes "a", "IN" -> "b", "HE" -> "c", etc. So "This is that" will become "THISISTHAT", and with the top bigrams "TH"x2 = "a" and "IS"x2 = "b" it will be encoded as "abbaAT". Then I run that semi-encoded plaintext through a normal homophonic substitution routine.
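For anyone who wants to make their own test ciphers, the pre-encoding step described above might look roughly like this in Python. It is a sketch under assumptions, not the actual encoder: the function name `encode_bigrams`, counting overlapping bigrams, breaking ties by first appearance, and the greedy left-to-right replacement are guesses at details the description leaves open. The homophonic substitution pass is not shown.

```python
from collections import Counter
import string

def encode_bigrams(plaintext: str, top_n: int) -> str:
    """Replace the top_n most frequent bigrams with lowercase letters."""
    if top_n > len(string.ascii_lowercase):
        raise ValueError("this sketch supports at most 26 bigram symbols")

    # Uppercase and keep letters only: "This is that" -> "THISISTHAT".
    text = "".join(ch for ch in plaintext.upper() if ch in string.ascii_uppercase)

    # Count overlapping bigrams; sorted() is stable, so ties keep
    # the order in which the bigrams first appear in the text.
    counts = Counter(text[i:i + 2] for i in range(len(text) - 1))
    top = sorted(counts, key=lambda bg: -counts[bg])[:top_n]
    table = {bg: string.ascii_lowercase[i] for i, bg in enumerate(top)}

    # Greedy left-to-right scan: consume 2 letters when a top bigram matches.
    out = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair in table:
            out.append(table[pair])
            i += 2
        else:
            out.append(text[i])
            i += 1
    return "".join(out)

print(encode_bigrams("This is that", 2))  # abbaAT
```

The output still has to go through a normal homophonic substitution routine, with the lowercase letters treated as extra plaintext letters.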
Another reason I thought this might be used in Z340 is that encoding bigrams as single symbols naturally reduces the bigram repeats. So it would be much more likely to happen by pure chance that almost no bigram repeats are present in the second half of Z340.
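A toy check of that effect in Python, using the "THISISTHAT" -> "abbaAT" example from the encoding description. The function `bigram_repeats` and its definition of a repeat (each occurrence of a bigram beyond its first) are assumptions chosen for illustration. A real comparison would be made on full-length ciphers after the homophonic step.

```python
from collections import Counter

def bigram_repeats(seq: str) -> int:
    """Count bigram occurrences beyond the first one for each distinct bigram."""
    counts = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    return sum(c - 1 for c in counts.values() if c > 1)

plain = "THISISTHAT"   # plaintext with spaces removed
encoded = "abbaAT"     # same text with TH -> a, IS -> b

print(bigram_repeats(plain), bigram_repeats(encoded))  # 2 0
```

In this tiny example both repeated bigrams disappear once they are folded into single symbols, which is the tendency described above.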
Here you go daikon, the original message is 398 letters. For each bigram I introduced one new symbol. It was quite difficult to create a cipher this way so that I couldn’t solve it normally. I’m working on solving your messages with a new routine, just slowly trying some things of my own.
5*bD0<3F17+Bh*cab Y0.UZfBS%,;-CQ+(1 'JDMRTL?X]*9!E(% UTiQ?%;E,[ZD3Y7G -']Ua/ebA<cS7A0Q+ R9VgJXiW)[.Zf0=! 1hT&D&S?7OE*CF)+f Bce).]GUW:&9SAIW B;X3TL0<?]iM(;ce9 LQY7SICgEO+TIWUf@ 6?&IRBE%J,'-Fb@! 5afD7]9[<TMiQ#Jcg h?e`!'XCGM(PS;IWU OgffD%5*1O"YhZfB EiM!e"AOX30R%-UiQ [1ge+(5#]bZMY;9<D Ffi0Q[*T?5%7ObCG eP+W:0,.]=!-h9acE S%aD7SFQ#TIWR1'YK CG7I-gM!U1i?FY+X
Here you go daikon, the original message is 398 letters.
Here’s what I got so far:
DENNISFONGALSOKNOWNATHRESHOLDOFPAINISANAMERICANENTREPENEURANDRETIREDPROFESIONALGAMERWHATINOWFOLLOWSISANEXCERPTOUTOF
THRESHSQUAKEBIBLELEARNINGTOSHOOTTOTHEGROUNDBELOWYOUROPPONENTSFEETCANPOTENTIALLYINCREASEYOURHORSEBYANORDEROMAGINOUDE
WHILEEFFECTIVEROCKETJUMPINGCANHELPYOUREACHHIDDENAREATAKESHORTCUTTOWEAPONSANDARTIFACTSANDEVEINESCAPEFIGHTSIFNEEDED
LEARININGTHESOUNDSOTHEQUAKEWORLDWILLGIVEYOUANAMAZINGLYACCURATEGRASP
Not the cleanest, but pretty good, I think.
For each bigram I introduced one new symbol.
That’s actually not a requirement for my solver at all, doesn’t have to be 1-to-1. Pretty much the only requirement is that symbols have to represent no more than 2 letters. It might get a somewhat readable solution if you try more than 2 letters per symbol, but I haven’t tried it.
It was quite difficult to create a cipher this way so that I couldn’t solve it normally. I’m working on solving your messages with a new routine, just slowly trying some things of my own.
That’s always good to get an independent confirmation! Just in case I missed something. Also, you might get a partial solution in the process, that might help us figure out Z340.
Well done! That’s amazing daikon, I didn’t think you could get it.
If you’re not getting a solve for the 340, I’m pretty confident that you’ve at least ruled out your own hypothesis to some degree.
dennisfongalsokno wnasthresholdofpa inisanamericanent repeneurandretire dproffesionalgame rwhatnowfollowsis anexcerptoutofthr eshsquakebiblelea rningtoshoottothe groundbelowyourop ponentsfeetcanpot entiallyincreasey ourhitratebyanord erofmagnitudewhil eeffectiverocketj umpingcanhelpyour eachhiddenareasta keshortcutstoweap onsandartifactsan devenescapefights ifneededlearningt hesoundsofthequak eworldwillgiveyou anamazinglyaccura tegrasp
Jarive,
Have you ever held the 340 cipher up to a mirror? The letters look altogether different. Number 1 on the original text looks like the letter J. The whole mirror image looks interesting. Can you check out this idea with your code breaking background?
Holmes