Zodiac Discussion Forum

340..partially solved 😉

364 Posts
44 Users
0 Reactions
81.2 K Views
Quicktrader
(@quicktrader)
Posts: 2598
Famed Member
Topic starter
 

The air is thin.

http://www.data-compression.com/english.html#notes

delivers the statistical data for third-order probabilities, from which we can derive, e.g., the probability of the three-letter combination ‘FMB’ (0.00051 or 0.051%). The file itself only gives the distribution of the third letter following a given pair of letters. However, we can multiply this by the probability of the pair and, using e.g. Scott Bryce’s letter distribution (multiplying again), obtain the actual chance of finding a given three-letter combination (trigram) in a cleartext.
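The multiplication described above is the chain rule for a three-letter sequence. A minimal sketch follows, assuming three lookup tables (single-letter, second-letter-given-first, third-letter-given-pair). The function name, the table layout and all numbers are placeholders for illustration only and are not taken from either data source.

```python
def trigram_probability(trigram, p_first, p_second_given_first, p_third_given_pair):
    """Chain rule: P(abc) = P(a) * P(b | a) * P(c | ab)."""
    a, b, c = trigram
    return p_first[a] * p_second_given_first[a][b] * p_third_given_pair[a + b][c]


# Placeholder values for illustration only; NOT real English statistics.
p_first = {"T": 0.09}
p_second_given_first = {"T": {"H": 0.30}}
p_third_given_pair = {"TH": {"E": 0.60}}

p_the = trigram_probability("THE", p_first, p_second_given_first, p_third_given_pair)
print(f"P(THE) with placeholder inputs = {p_the:.4f}")
```

With real tables in place of the placeholders, the same function would return the trigram probabilities discussed in this post.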

In the 340 there are two trigrams that each appear twice. Based on this statistical information, a trigram should have a probability of at least 0.00588 or 0.588% (2/340) to be expected twice in a 340-letter cleartext. It is likely that those two trigrams do indeed fulfill this criterion, as there might be additional trigrams of this kind hidden behind other homophones.

If we check the overall probabilities of trigrams, there are not many with such a ‘high’ value. Only the trigrams

ERE
HER
AND
YOU
THE
ING

fulfill this criterion. ‘THI’, for example, is a frequent trigram, too. However, its probability is ‘only’ 0.0029 or 0.29%. Therefore it would not be expected to appear twice, especially if a third or even more instances of this trigram are hidden behind other homophone combinations.

Although just an assumption, the trigrams considered above might represent the trigrams visible in the 340.

Of those, only two trigrams can be joined so that they overlap by one letter, which luckily is exactly the structure of the symbol combination ‘IOFBc’ (line 13) in the 340. It would be a severe statistical outlier if a trigram with a probability of, e.g., 0.29% showed up twice in the 340, especially if additional instances might be hidden among other homophones. It is more likely that, e.g., ‘THE’, with a rock-solid probability of 2.6%, appears approximately 8-9 times in the cleartext but is only visible twice as identical (homophone) trigrams.

The only two trigrams with an adequate statistical value that can be combined are ‘THE’ and ‘ERE’. Both have a probability of >0.5%. Therefore it may be assumed that the combination ‘IOFBc’ represents the cleartext word ‘THERE’. Even if someone guesses a different trigram to be present here, it is most likely that at least one of the two (I believe both are correct) represents the trigram ‘THE’.

Update:
When lowering the benchmark to those trigrams that are expected to appear ‘at least once’ in the cipher, the list of trigram combinations for ‘IOFBc’ becomes slightly longer. Out of 17,576 (26^3) possible trigrams, only 12 (!) are expected to appear at least once in a 340-letter cleartext. In fact, most of the trigrams that we could think of (e.g. AAA, AAB, AAC etc.) do not have a probability of > 1/340 (which equals 0.0029411 or 0.29%).

Those ‘frequent’ trigrams are:

AND ENT ERE FOR HAT HER ING ION TER THA THE YOU

The following are therefore the most likely readings of ‘IOFBc’ (although their trigrams are actually expected to show up once, not twice, in the 340):

ENTER
HATER
HATHA
HATHE
THAND
THENT
THERE

For the reasons described above, the trigram combination ‘THERE’ is still the most likely one. However, others, e.g. ‘ENTER’, ‘HATER’, ‘THAND’, ‘THENT’, appear to be plausible solutions, too. So let us be aware of this: the symbol ‘I’, for example, represents either the letter ‘E’, ‘H’ or ‘T’, or it is part of a trigram that would not be expected even once in the cipher yet in fact appears twice (or even more often, considering the various homophones in the cipher). The latter would have to hold not just for one but for both of the trigrams.
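The pairing rule used above (last letter of the first trigram equals first letter of the second) can be sketched in a few lines. This is only an illustration of the overlap rule applied to the twelve trigrams listed in the post; the function name is made up for the example.

```python
# The twelve 'frequent' trigrams listed in the post.
FREQUENT_TRIGRAMS = ["AND", "ENT", "ERE", "FOR", "HAT", "HER",
                     "ING", "ION", "TER", "THA", "THE", "YOU"]


def overlap_pairs(trigrams):
    """Join every ordered pair whose shared middle letter matches (abc + cde -> abcde)."""
    joined = []
    for first in trigrams:
        for second in trigrams:
            if first[2] == second[0]:
                joined.append(first + second[1:])
    return sorted(joined)


candidates = overlap_pairs(FREQUENT_TRIGRAMS)
for candidate in candidates:
    print(candidate)
```

Run on the twelve trigrams, the overlap rule alone also yields ENTHA, ENTHE, ERENT and ERERE in addition to the seven combinations listed, so the list in the post reflects some further selection beyond pure overlap.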

QT

*ZODIACHRONOLOGY*

 
Posted : September 6, 2014 6:47 pm
AK Wilks
(@ak-wilks)
Posts: 1407
Noble Member
 

Excellent work! I will have some more substantive comments tomorrow. But I just wanted to say this is very interesting, you are making progress and keep going.

MODERATOR

 
Posted : September 7, 2014 4:38 am
Quicktrader
(@quicktrader)
Posts: 2598
Famed Member
Topic starter
 

And we keep going..

So far we have figured out that bigrams as well as trigrams each have individual probability values. Those were, so far, given under the assumption that a bigram or trigram might end with a ‘space’, e.g. at the end of a word. However, Z didn’t show us where words end, except at the beginning and the end of the cipher. Meanwhile, this data has been adjusted ‘as if’ no ‘space’ occurs at all.

We also know that certain trigrams are more frequent than others, and that the first of the two trigrams in ‘IOFBc’ ends with a symbol (‘F’) which is in fact the first symbol of the second (repeating) trigram. What we don’t know so far is the frequency of the combination of those two trigrams, i.e. the frequency of a pentgram such as ‘IOFBc’. This frequency, however, is a combination of both trigram frequencies (either combined by multiplication, according to its frequency in the cipher, OR related to the bigram frequency of the fourth letter following the ‘F’; both methods appear to be useful for proceeding further).

With the adjusted frequencies (without ‘space’) we can now get values such as:

AND – 1.114703%
DED – 0.1807452%

therefore ‘ANDED’ with a combined frequency of 0.0020089% (multiplication).

So now we can finally look at which combination of trigrams is the most frequent for the ciphertext IOFBc. Working down from this, we statistically have now the most efficient approach to figure out the rest of the cipher text (‘THERE’ appears to be the pentgram with the highest frequency – but which one is the second most frequent?).

Still working on the excel file, so we’ll see,

QT

*ZODIACHRONOLOGY*

 
Posted : September 13, 2014 11:48 pm
Quicktrader
(@quicktrader)
Posts: 2598
Famed Member
Topic starter
 

And we keep going on..

So far we had figured out that bigrams as well as trigrams each have individual probability values. Those were, so far, understood under the assumption that a bigram or trigram might end with a ‘space’, i.e. at the end of a word. However, Z doesn’t show us where his words end, except at the beginning and the end of the cipher.

Meanwhile the available data was adjusted ‘as if’ there is no ‘space’ occurring at all.
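To illustrate what counting ‘as if’ there were no spaces can look like, here is a minimal sketch that strips everything except letters and then counts overlapping trigrams, so that trigrams spanning word boundaries are included. It is an assumption about the general idea, not the procedure actually used to adjust the data, and the sample sentence is a toy input.

```python
from collections import Counter


def trigram_frequencies(text):
    """Relative frequencies of overlapping trigrams, ignoring spaces and punctuation."""
    letters = "".join(ch for ch in text.upper() if ch.isalpha())
    counts = Counter(letters[i:i + 3] for i in range(len(letters) - 2))
    total = sum(counts.values())
    return {trigram: n / total for trigram, n in counts.items()}


# Toy input for illustration; a real table needs a large English corpus.
freqs = trigram_frequencies("THE CAT SAT ON THE MAT")
print(freqs["THE"])   # 'THE' occurs twice among the 15 trigrams
print("ECA" in freqs)  # a trigram that only exists once the space is removed
```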

We also know that certain trigrams are more frequent than others, and that the first of the two trigrams in ‘IOFBc’ actually ends with the first symbol of the second trigram (‘F’). What we hadn’t known so far is the frequency of the combination of those two trigrams, i.e. the frequency of a ‘pentgram’ such as ‘IOFBc’. This frequency, however, is a combination of both trigram frequencies (combined either by multiplication, according to its frequency in the cipher, OR, even better, related to the bigram frequency depending on the fourth letter following that ‘F’; both methods appear to be useful in some way).

With the adjusted frequencies (without ‘space’) we now got values such as:

AND – 1.114703%
DED – 0.1807452%

therefore ‘ANDED’ has a combined frequency of 0.0020089% (multiplication). This means that in a text with 100,000 letters (equal to approximately 2,500 sentences) the sequence ‘ANDED’ would appear approximately twice (e.g. in words such as ‘stranded’ or ‘landed’).

Finally we can look at which combination of those trigrams is the most frequent and therefore most likely to represent the ciphertext IOFBc.
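The multiplication can be checked directly with the two percentages quoted above. This is just the arithmetic from the post; the variable names are made up for the example.

```python
# Trigram frequencies quoted in the post, converted from percent to fractions.
and_freq = 1.114703 / 100
ded_freq = 0.1807452 / 100

# Combined frequency by plain multiplication, as described in the post.
combined = and_freq * ded_freq

# Expected number of 'ANDED' occurrences in a text of 100,000 letters.
per_100k = combined * 100_000

print(f"combined frequency: {combined * 100:.7f}%")
print(f"expected occurrences per 100,000 letters: {per_100k:.2f}")
```

Multiplying the quoted percentages gives roughly 0.00201%, marginally different from the 0.0020089% stated, presumably because of rounding in the inputs; either way the result is about two occurrences per 100,000 letters.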

Working down from this, we statistically get the most efficient approach to figure out further structures of the cipher (‘THERE’ appears to be the pentgram with the highest frequency – but which one is the second most frequent?).

Still working on the excel file, so we’ll see,

QT

*ZODIACHRONOLOGY*

 
Posted : September 14, 2014 4:41 am
vasa croe
(@vasa-croe)
Posts: 493
Honorable Member
 

Honestly…i think it is a lot simpler than this. I understand the need to over complicate everything, but there is no possible way that a human mind could have built an unbreakable code that this many people have attempted to Crack with brain and computer power, back in the 1960’s….

I firmly believe this code was built on livestock brands, and from a personal book…unless someone finds the particular book, it will never be solved.

 
Posted : September 14, 2014 6:17 am
Jarlve
(@jarlve)
Posts: 2547
Famed Member
 

Hey Quicktrader,

I’m following your thread with interest, just gotta catch up a bit with reading the early pages. =)

@vasa croe

So far no one has been able to prove it is not a normal homophonic substitution cipher besides the fact that no one has been able to solve it this way. And I think it’s not hard to make an unbreakable code, layer transposition+vigenere+homophonic substitution.

I don’t think Quicktrader is overcomplicating, in fact he’s doing the opposite.

But yes, it may be something entirely different. And it’s good that people are trying different approaches.

AZdecrypt

 
Posted : September 14, 2014 12:27 pm
Quicktrader
(@quicktrader)
Posts: 2598
Famed Member
Topic starter
 

Honestly…i think it is a lot simpler than this. I understand the need to over complicate everything, but there is no possible way that a human mind could have built an unbreakable code that this many people have attempted to Crack with brain and computer power, back in the 1960’s….

I firmly believe this code was built on livestock brands, and from a personal book…unless someone finds the particular book, it will never be solved.

Well, there is nothing overcomplicated..this is a very simple analytical step of multiplying two probabilities together. Personally I do believe that Z possibly took his symbols from livestock brand books. However, I also believe (for various, not only obvious reasons, but also because of issues such as entropy or floating frequency analysis) that the 340 has a similar structure to the 408, the latter implying that Z again used homophone sequences when choosing the order of the homophones. Nevertheless I like the idea of book codes, such as the ‘Beale’ ciphers…
viewtopic.php?t=907&p=9999
http://en.wikipedia.org/wiki/Beale_ciphers

Indeed it is interesting to understand not only one but several encipherment methods, as there are plenty of them. Simon Singh has a great website and an even greater book about it…I read a few hundred pages in only three days, but found only a few sentences regarding homophone substitution ciphers and almost nothing about how to crack those. Definitely an interesting read, though.

I would still be interested in which brand book the Halloween card symbol actually came from..

QT

*ZODIACHRONOLOGY*

 
Posted : September 14, 2014 1:16 pm
vasa croe
(@vasa-croe)
Posts: 493
Honorable Member
 

I will check out the website and book…thanks!

And I think I posted the book and page of the Otto Kuehster one as well as the circle 8 yesterday.

 
Posted : September 14, 2014 5:22 pm
Quicktrader
(@quicktrader)
Posts: 2598
Famed Member
Topic starter
 

Small update to the previous posts..

We may assume that the repeating trigram ‘IOF’ statistically appears at least twice in a text with 340 characters. We may further assume that this statistical expectation is valid for the second repeating trigram, ‘FBc’, too.

Both assumptions are backed by the idea that even more instances of each of those trigrams could be ‘hidden’ among additional homophones, at as yet unidentified positions in the cipher. The trigram ‘ERE’, for example, is expected to show up 2.8 times in a 340-character cipher. It could therefore very well appear twice with identical homophones and, in addition, once more with a different set of homophones somewhere else in the cipher.

Now let’s have a closer look at those trigrams:

Out of 17,576 (26^3) potential trigrams deriving from the 26-letter alphabet, there are only thirteen combinations of two trigrams (‘IOF’ and ‘FBc’) that meet BOTH criteria: an expected frequency of at least 1.5 per 340 (therefore appearing twice rather than once) AND being linguistically ‘compoundable’ in such a way that an ‘IOFBc’ cipher structure is obtained. For this second criterion, the 3rd letter of the first trigram must be identical to the 1st letter of the second.

Out of 17,576 trigrams (many of which are not used in the English language at all, e.g. ‘QYX’), the following combinations of two trigrams tend to appear (each trigram for itself) at least twice in a 340-character cipher:

THE + ERE = THERE (0.0238%)
THE + ENT = THENT (0.0200%)
ENT + THE = ENTHE (0.0200%)
THE + ESS = THESS (0.0135%)
ERE + ERE = ERERE (0.0065%)
ERE + ENT = ERENT (0.0055%)
NDE + ERE = NDERE (0.0054%)
THA + AND = THAND (0.0049%)
NDE + ENT = NDENT (0.0046%)
ERE + ESS = ERESS (0.0037%)
NDE + ESS = NDESS (0.0030%)
ENT + THA = ENTHA (0.0030%)
ENT + TER = ENTER (0.0030%)

Please note that there are no more combinations possible with two trigrams expected to show up at least 1.5 times in a 340 text.

So IF the cleartext is something other than one of those 13 combinations above, either one or both of its trigrams are statistical outliers (no matter how strong). All this is of course based on the given statistical data as well as the assumption that the cipher is a homophone cipher.

It is obvious that some of these trigram combinations appear rather ‘artificial’ (e.g. ‘NDESS’) whilst others appear to represent frequent words (e.g. ‘ENTER’ or ‘THERE’). The artificial ones occur because we only look at the trigrams themselves, not at the frequency of the 5-letter combination as a whole, and because some frequent trigrams simply do not match well with other frequent ones. Nevertheless, most of the combinations can be found in certain phrases of the English language.

In fact these 13 cleartext phrases are the only (!) combinations (out of 26^5 = 11,881,376) with the length = 5 that match the considered criteria of trigram frequencies.

Other trigram combinations, such as ‘ITHER’, might be frequent word fragments, too. But their trigrams are not both expected to appear at least 1.5 times per 340 (and therefore twice when rounded): ‘ITH’, for example, has a frequency of 0.378% and is therefore expected to show up only 1.3 times. The second trigram, ‘HER’, has a frequency of 1.140% and is expected to show up 3.9 times. Therefore only the second trigram would fulfill the statistical expectation.

We still cannot say ‘for sure’ that some other letter combination is not in fact the cleartext. But if we trust the statistics, it is unlikely.
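The ‘1.5 per 340’ test is simply frequency times text length compared against a threshold. A minimal sketch, using the ‘ITH’ and ‘HER’ frequencies quoted above (the function and constant names are made up for the example):

```python
CIPHER_LENGTH = 340   # characters in the 340 cipher
MIN_EXPECTED = 1.5    # threshold used in the post


def expected_count(frequency, length=CIPHER_LENGTH):
    """Expected number of occurrences of a trigram with the given relative frequency."""
    return frequency * length


def meets_threshold(frequency, minimum=MIN_EXPECTED, length=CIPHER_LENGTH):
    """True if the trigram is expected at least `minimum` times in the text."""
    return expected_count(frequency, length) >= minimum


ith = 0.378 / 100   # 'ITH', as quoted in the post
her = 1.140 / 100   # 'HER', as quoted in the post

print(round(expected_count(ith), 1), meets_threshold(ith))
print(round(expected_count(her), 1), meets_threshold(her))
```

Only ‘HER’ passes the threshold, which is why ‘ITHER’ drops out of the list even though it looks like ordinary English.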

So what happens if we do assume two trigrams with a statistical value of less than 1.5 per 340?

The air is getting thinner..as an example let’s take the last of some 400 analyzed trigram combinations (‘WHIGH’), which is structured as follows:

IOF = ‘WHI’ is expected to show up 0.5 times per 340 (0.154%)
FBC = ‘IGH’ is expected to show up 0.6 times per 340 (0.172%)
(both nevertheless do show up twice!)

leading to a ‘IOF’ and ‘FBc’ probability of 0.0002656% in a length=340 text.

Chances therefore are approximately 1 : 1,100 that BOTH of the trigrams, ‘WHI’ and ‘IGH’, will appear twice at a certain position of the cipher.

We now would like to know the probability of the two trigrams appearing together at any position of the cipher. For us it doesn’t matter at which position our urgently sought ‘IOFBc’ structure is present (in reality it is line 13, but any other position would do equally well), whereas our previous observation focused on one in 100 (percent) or one of 340 positions.

We should therefore now apply these values to all possible locations in the 340 cipher where such a trigram (combination) might be placed. These are not 340 but 338 positions, assuming that a trigram doesn’t wrap around from the end of the cipher to its first homophone again. Trigrams can therefore be placed at 338 different positions, and we conclude:

(AbsoluteFrequTrigram#1 / 338 positions) x (AbsoluteFrequTrigram#2 / 338 positions) x 338 positions

or

AbsoluteFrequTrigram#1 x AbsoluteFrequTrigram#2 / 338 positions

which in our previous ‘WHIGH’ example leads us to:

0.5 x 0.6 / 338 = 0.08875%,

representing approximately a 1 : 1,100 chance that the combination of ‘WHIGH’ does appear in a text with the length = 340.

And this is our chance: The multiplication of two trigram probabilities, occurring in the English language, exponentially boosts or destroys our expectation ratios for two repeating trigrams in combination (‘IOFBc’). The two trigrams next to each other are therefore a chance to ‘crack’ the cipher.

Not sure yet? So let’s have an equal look at the combination of ‘THERE’:

IOF = ‘THE’ is expected to show up 10.0 times per 340 (2.944%)
FBC = ‘ERE’ is expected to show up 2.8 times per 340 (0.811%)
(both even likely to show up twice)

therefore

10.0 x 2.8 / 338 = 8.1624%,

thus representing a 1 : 12.25 chance that ‘THERE’ will appear in a 340 text!
Compared to 1 : 1,100 of the trigram combination ‘WHIGH’!
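The two worked examples can be reproduced with a short sketch of the formula as stated in this post (expected count #1 x expected count #2 / 338 positions). The function name is made up for the example, and treating the result as a chance is the post's own assumption. The sketch uses the unrounded expected counts (frequency x 340) rather than the rounded 0.5, 0.6, 10.0 and 2.8.

```python
CIPHER_LENGTH = 340


def pair_chance(freq_first, freq_second, length=CIPHER_LENGTH):
    """Post's estimate: expected count #1 * expected count #2 / number of trigram positions."""
    positions = length - 2  # a trigram can start at 338 positions in a 340-character text
    return (freq_first * length) * (freq_second * length) / positions


# Trigram frequencies quoted in the post (percent converted to fractions).
whigh = pair_chance(0.154 / 100, 0.172 / 100)   # 'WHI' + 'IGH'
there = pair_chance(2.944 / 100, 0.811 / 100)   # 'THE' + 'ERE'

print(f"WHIGH: {whigh * 100:.4f}%  (about 1 : {1 / whigh:.0f})")
print(f"THERE: {there * 100:.4f}%  (about 1 : {1 / there:.2f})")
```

With unrounded inputs the ratios come out close to the 1 : 1,100 and 1 : 12.25 quoted; plugging in the rounded 10.0 and 2.8 instead gives about 8.28% for ‘THERE’, so the 8.1624% in the post was presumably computed from unrounded counts.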

The air is getting thicker now. If someone would ask if the phrase ‘THERE’ is written in the 340 or the phrase ‘WHIGH’, I’d easily bet a 100 bucks on ‘THERE’ without thinking further about it.

CONCLUSION:

– It is more likely that a frequent trigram combination represents the ‘IOFBc’ part of the cipher.

– Out of 17,576 trigrams, there only exist 13 combinations of two that fulfill the criteria of appearing at least 1.5 per 340 AND being able to create a linguistic combination (3rd letter of trigram #1 is equal to 1st letter of trigram #2)

– Amongst the combinations of the Top 400 most frequent trigrams, the probability ratios range from 1:12.25 (‘THERE’) to 1:1,100 (‘WHIGH’)

– Summarized probability of the ‘Top 23’ trigram combinations is approximately 75%. This means that with a probability of 75% at least one of those ‘Top 23’ trigram combinations is present in the 340 (no matter if visible or not).

– Of those ‘top 23’, an overall of 19 contain the trigram ‘THE’, either as the first or as the second trigram.

– With a total of approximately 11.9m alphabetical letter combinations of length = 5, it is still possible that the ‘IOFBc’ structure derives from a ‘rare’ trigram combination.

– However, the statistical expectation for those rare trigram combinations drops dramatically (e.g. to over 1:2,000). It can be estimated that the decrease in probabilities follows a roughly logarithmic behaviour (e.g. top 10 – 10%, top 100 – 10%, top 1000 – 10% etc.)

– Although many combinations exist, their individual probabilities are low. The two ideas, that each trigram should have an expected count of >1.5 per 340 AND the overall expectation for the trigram combination ‘IOFBc’, might be combined. This again shows that the ‘Top 38’ appear to be attractive for further analysis, with all 13 statistically valuable trigram combinations amongst them.

– A theory of ‘frequent language’ is an additional approach and delivers a certain ‘cracking approach’ of its own: some trigrams are frequent on their own but not in combination with each other. Others, however, are indeed ‘frequent language’: ‘ENTER’, ‘THERE’, ‘ERENT’ etc. do in fact occur in frequent words of the English language. Those may therefore be selected first from a list such as the ‘top 38’. Using those, further methods can be applied, e.g. ZDK, the Oranchak Webtoy or other linguistic and cipher structure analysis.

– Given certain frequencies for the ‘IOFBc’ symbols, certain letters might even be ‘excluded’. For example, it is quite unlikely that the fourth symbol (‘B’) represents the letter Q, because ‘B’ by itself has a frequency of approximately 3.5%. All ‘IOFBc’ homophones, btw, represent letters with a frequency of at least approximately 2.0-3.5%.

– Last but not least: particularly in homophone ciphers, combinations of ‘rare’ trigrams are even less likely. It must be assumed that a trigram which shows up twice in a homophone cipher could in fact be present more often, a third, fourth, fifth etc. time (hidden among different homophones representing the same alphabetical letters). This supports the idea of frequent trigrams (e.g. both > 1.5 per 340) being dominant in the overall expectation ratio, although the latter actually includes the trigram frequencies, too.

– Finally: The e.g. ‘Top 400’ can be entered into other methods of cryptanalysis, such as additionally trying certain letters for the ‘+’ symbol, to obtain longer chains of cleartext. This again can be cross-checked to other areas of the cipher.

QT

*ZODIACHRONOLOGY*

 
Posted : September 16, 2014 5:34 pm
(@masootz)
Posts: 415
Reputable Member
 

is it possible part of the cipher isn’t encrypted? would leaving random letters as plain text add to the difficulty in deciphering?

 
Posted : September 17, 2014 9:13 pm
smithy
(@smithy)
Posts: 955
Prominent Member
 

Yes it is, yes it would. Interesting thought.

 
Posted : September 17, 2014 11:04 pm
(@masootz)
Posts: 415
Reputable Member
 

Yes it is, yes it would. Interesting thought.

i know it’s easy to see whatever you want to see in the 340, but when i play with the webtoy i’m often struck by how easily a character fits into a space without being decrypted (e.g. – if a "C" is actually just a C) and i think something like that would be up his alley – essentially making it harder by making it simpler.

here’s another thought (that i stole from one of those long rambling blog posts that add and subtract numbers to reach a conclusion that makes your head spin) – what if the 340 isn’t a substitution cipher but rather a grille cipher? you solve it by taking a specific cut out (paper, map, etc) and overlaying it on top of the letters so that the exposed letters make the message. we’ve seen the somewhat apparent four "by" and part of "paradise", however i can’t find anything he sent in that timeframe that would work. the bus bomb diagram was suggested by someone but they pretty much cut out random parts to make it fit what they wanted it to say.
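for anyone who wants to play with the grille idea, here is a minimal sketch of how reading through a cut-out works: the grid is the block of symbols, the holes are the (row, column) positions left exposed, and the message is whatever shows through, read row by row. the grid and the hole positions below are a made-up toy example, not taken from the 340 or from any proposed overlay.

```python
def read_grille(grid, holes):
    """Return the characters exposed by the cut-out positions, read row by row."""
    return "".join(grid[row][col] for row, col in sorted(holes))


# Toy grid and cut-out for illustration only.
grid = [
    "QBXZ",
    "KYMP",
]
holes = {(0, 1), (1, 1)}  # (row, column) positions left uncovered

print(read_grille(grid, holes))
```

the hard part with the 340 is of course the overlay itself: with 340 cells there are far too many possible cut-outs to try without some idea of what the mask looked like.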

 
Posted : September 18, 2014 4:38 pm
(@masootz)
Posts: 415
Reputable Member
 

below is what i get when i represent each cipher symbol as its actual corresponding letter. obviously a letter frequency significantly lower than the expected value could indicate an additional representative character elsewhere, but i’m trying to keep this simple and linear. a letter frequency much higher than the expected value would be a poor candidate for an unencrypted character. the ones that match the closest (within four occurrences of expected value) are C, F, G, J, K, M, P, U, V, W, X, Y, Z. the closest matches are F (+2), G (-1), M (-1), W (-2), X (+1).

LETTER   PLAINTEXT   EXPECTED
A        2 (1%)      28 (8%)
B        12 (4%)     5 (1%)
C        5 (1%)      9 (3%)
D        4 (1%)      14 (4%)
E        3 (1%)      43 (13%)
F        10 (3%)     8 (2%)
G        6 (2%)      7 (2%)
H        4 (1%)      21 (6%)
I        10 (3%)     24 (7%)
J        4 (1%)      1 (0%)
K        7 (2%)      3 (1%)
L        6 (2%)      14 (4%)
M        7 (2%)      8 (2%)
N        5 (1%)      23 (7%)
O        10 (3%)     26 (8%)
P        3 (1%)      7 (2%)
Q        not represented
R        8 (2%)      20 (6%)
S        4 (1%)      22 (6%)
T        5 (1%)      31 (9%)
U        5 (1%)      9 (3%)
V        6 (2%)      3 (1%)
W        6 (2%)      8 (2%)
X        2 (1%)      1 (0%)
Y        4 (1%)      7 (2%)
Z        4 (1%)      0 (0%)
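the comparison in the table can be reproduced with a short sketch: take the observed and expected counts, subtract, and keep the letters whose difference is at most four. the counts are copied from the table above; the variable names are made up for the example.

```python
# Counts copied from the table above (Q is not represented in the cipher).
observed = {"A": 2, "B": 12, "C": 5, "D": 4, "E": 3, "F": 10, "G": 6, "H": 4,
            "I": 10, "J": 4, "K": 7, "L": 6, "M": 7, "N": 5, "O": 10, "P": 3,
            "R": 8, "S": 4, "T": 5, "U": 5, "V": 6, "W": 6, "X": 2, "Y": 4, "Z": 4}
expected = {"A": 28, "B": 5, "C": 9, "D": 14, "E": 43, "F": 8, "G": 7, "H": 21,
            "I": 24, "J": 1, "K": 3, "L": 14, "M": 8, "N": 23, "O": 26, "P": 7,
            "R": 20, "S": 22, "T": 31, "U": 9, "V": 3, "W": 8, "X": 1, "Y": 7, "Z": 0}

# Signed difference: positive means the symbol occurs more often than expected.
difference = {letter: observed[letter] - expected[letter] for letter in observed}

# Letters whose count is within four occurrences of the expected value.
within_four = sorted(letter for letter, diff in difference.items() if abs(diff) <= 4)

for letter in within_four:
    print(f"{letter}: {difference[letter]:+d}")
```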

 
Posted : September 18, 2014 5:16 pm
Talon
(@talon)
Posts: 183
Estimable Member
 

Yes it is, yes it would. Interesting thought.

here’s another thought (that i stole from one of those long rambling blog posts that add and subtract numbers to reach a conclusion that makes your head spin) – what if the 340 isn’t a substitution cipher but rather a grille cipher? you solve it by taking a specific cut out (paper, map, etc) and overlaying it on top of the letters so that the exposed letters make the message. we’ve seen the somewhat apparent four "by" and part of "paradise", however i can’t find anything he sent in that timeframe that would work. the bus bomb diagram was suggested by someone but they pretty much cut out random parts to make it fit what they wanted it to say.

Yes, I have often thought a grille cipher would make sense. Another possibility would be a 17 x 20 crossword puzzle where the blacked out spaces covered up the insignificant symbols.
It’s possible that it could have been taken from a Sunday crossword from an area newspaper just prior to sending the 340.

 
Posted : September 18, 2014 5:29 pm
(@capricorn)
Posts: 567
Honorable Member
 

😆 I’ll be glad when you all solve this cypher. I’ve developed a permanent headache from trying to understand what is being said. :lol:

Give me psychology any day. :P

Me too! Why can’t the + just stand for "z" or "I" as that is the symbol he used on his costume?

I really think the grille idea is the most likely! I know my guy had a tablet with that which he showed me and explained how it is used in computer programming. I also think the crossword idea is good as well as word search and word-within-a-word. What I really like is symmetry and mirrors and the letters that can be reversed or written upside down such as an "m" would look like a "w" if flipped. These are all devices I think Z would have used….simple but tricky and easy once you knew his trick.

 
Posted : September 25, 2014 7:27 pm