[My first post – I thought I would mix things up with some probability theory…]
I have been reading and viewing various material on analysis of z340, and I’ve noticed something people are doing that should probably come with a health warning.
It will typically go like this…
“I’ve noticed an interesting pattern in z340’s cipher text (e.g. pivots, repeated symbols, something odd when you add 10, etc.). If I look for this pattern in a large set of random ciphers, it only appears 1% of the time. So that’s quite interesting.”
The implication is that because a given pattern only occurs 1% of the time by chance, it might be significant. It’s not much of a step to go on and convince yourself that there’s a 99% chance it is significant, and caused by something in the way the message was enciphered.
Here’s the problem.
When we spot a pattern and work out how likely that pattern is to occur by chance, we have chosen the pattern. What we are not doing is considering all of the patterns that have not appeared.
The best example of this is the Birthday Coincidence problem. How many people do you need at a party for there to be a 50% chance that two of them share a birthday? The answer is a surprisingly low 23 people. But if you pick a specific person and ask for the probability that someone else at that party shares their birthday, it’s only about 6% (the chance that at least one of the remaining 22 was born on that date). What looks like only a 6% chance is in fact a 50% chance when you don’t care what the date/pattern is.
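For anyone who wants to check those two birthday figures, here is a minimal Python sketch. It assumes 365 equally likely birthdays and ignores leap years; the function names are my own and purely illustrative.

```python
def p_any_shared_birthday(n, days=365):
    """Chance that at least two of n people share some birthday (any date)."""
    p_all_distinct = 1.0
    for i in range(n):
        p_all_distinct *= (days - i) / days
    return 1.0 - p_all_distinct


def p_match_specific_person(n, days=365):
    """Chance that at least one of the other n - 1 people shares the
    birthday of one person picked in advance."""
    return 1.0 - ((days - 1) / days) ** (n - 1)


if __name__ == "__main__":
    print(f"any pair, 23 people:        {p_any_shared_birthday(23):.3f}")    # about 0.507
    print(f"one chosen person, 23 people: {p_match_specific_person(23):.3f}")  # about 0.059
```

The first function is the "I don't care which pattern" question, the second is the "this particular pattern" question. Same party, very different odds.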
So, whilst doing a random pattern trial on the cipher is valid and useful, be careful not to be convinced you’ve found something significant when it may not be quite as you thought.
Unless anyone disagrees of course…
Hi cipherkid,
You’re absolutely right. It is quite possible that we are all chasing a ghost. Doranchak has written something similar to you:
I would add to smokie’s comments that the shuffle test is best at identifying which phenomena really are happening purely by chance (for example, the Jazzerman patterns). It’s such an easy test to implement, so it’s a bit of "low hanging fruit". Where things get difficult, however, is making conclusions when the test suggests phenomena are rare. For example, you could say that the string "HER>pl^VPk" is rare because it seldom or never comes up in shuffles. But the number of similarly rare strings is astronomical. So the fact that we saw one rare example doesn’t mean much. Here you have a case of a specific very rare pattern representing a very common category. So when we say, "the pivots are very rare", could they possibly belong to a very common category of similarly interesting patterns that appear in shuffles but we just aren’t looking for them?
Because of that difficulty, I’ve moved away from making overly conclusive statements about things we observe in the Z340. To me it is better to collect all the interesting observations and compare them to examples from known encipherment schemes. And the observations themselves help guide the search for potential encipherment schemes (hopefully).
Source: viewtopic.php?f=81&t=3070&hilit=kasiski&start=40#p48406
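A rough Python sketch of the shuffle test described in the quote, together with the "very common category" arithmetic. The statistic used here (repeated bigrams), the function names, the random stand-in cipher, and the assumption that the candidate patterns are independent of each other are all illustrative choices of mine, not anything taken from the posts.

```python
import random
from collections import Counter


def repeated_bigrams(symbols):
    """Count bigram repeats: each bigram seen c times contributes c - 1."""
    counts = Counter(zip(symbols, symbols[1:]))
    return sum(c - 1 for c in counts.values())


def shuffle_p_value(symbols, statistic, trials=1000, seed=0):
    """Fraction of shuffles whose statistic is at least the observed value."""
    rng = random.Random(seed)
    observed = statistic(symbols)
    pool = list(symbols)
    hits = 0
    for _ in range(trials):
        rng.shuffle(pool)
        if statistic(pool) >= observed:
            hits += 1
    # The +1 terms keep the estimate away from an impossible p-value of 0.
    return (hits + 1) / (trials + 1)


def chance_of_some_rare_pattern(alpha, n_patterns):
    """Chance that at least one of n_patterns independent patterns, each with
    chance probability alpha, turns up purely by accident."""
    return 1.0 - (1.0 - alpha) ** n_patterns


if __name__ == "__main__":
    rng = random.Random(1)
    stand_in = [rng.randrange(63) for _ in range(340)]  # random stand-in, NOT the real Z340
    print("shuffle p-value:", shuffle_p_value(stand_in, repeated_bigrams))
    print("69 patterns at 1% each:", round(chance_of_some_rare_pattern(0.01, 69), 3))
```

Under the independence assumption, roughly 69 unrelated kinds of pattern, each with a 1% chance of appearing, already give a better-than-even chance that at least one of them shows up in a meaningless text.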
z340 is a great hobby from which you can learn a lot (programming, math, statistics etc.). As with everything in life, you have to be careful not to overdo it. For my part, I know that z340 may never be solved and there may even be no solution at all. I am aware that I may invest my time in something that will never be finished. As long as I know that, everything is fine.
Largo, absolutely right..the whole case is like that..
QT
*ZODIACHRONOLOGY*
Funnily enough, it was watching Mr Oranchak’s excellent 2018 ACA Presentation on YouTube last year that made me think of this, specifically because I noticed how careful he was not to draw any positive inferences. A master class in how to treat patterns correctly.
The analogy I like to use is a golf ball getting hit into a big field of grass.
Try to pick which blade of grass gets hit with the golf ball. From the perspective of the blade of grass: He won the lottery! What are the odds? Extremely low.
But the ball HAD to land on grass. A rare thing in particular happened, but in general HAD to happen.
Many of the things we find in the 340 belong to a field of grass, because that interesting pattern we like belongs to a category of individually rare things.
But the really tricky thing is to figure out those categories, and to see if they are strongly associated with certain cipher construction methods.
I would have to defend trying to attach significance to a pattern. If you were trying to decide how you were going to spend the next 1,000 hours working on a solution, and tying up your computer for the next 6 months on one particular approach, you would want to make a good decision. Failure after putting that time into any one hypothesis can be painful, and I am just saying that it is part of trying to avoid wasting time and the disappointment. It is the starting point for a hypothesis, trying to figure out if one of the classical ciphers caused the pattern.
I do agree. I guess it’s that middle ground between spotting a pattern that is so common it is almost definitely a coincidence, and something so rare it is screaming for further investigation. That ground around the 1% by random chance is where we might see false hope, but then what else have we got to go on?
That probably wasn’t the world’s best first post, given that it provided nothing constructive.
CK
No problem. Welcome to the discussion.
I am really very thankful for this site. I like the record, even if scattered, and I like the sandbox and collaboration and openness to new people.
For me it’s the journey that is fun. I’ve learned a lot and solved some fun problems while coding. I’m sure I won’t crack it.
There are some interesting parts and patterns in Z340. However, I don’t find it very likely that you end up with a cipher looking like Z340 if you encode something real by hand. I believe it’s fake. But I’m fine with that.
Since I’m putting all my unhelpful stuff in this thread…
I’m aware that it’s very difficult for a human to create a truly random pattern if they try. It’s fine for short sequences, but as we are asked to invent longer and longer sequences we can’t help leaving out the less likely patterns (such as long runs) as we go, and this is detectable.
I think the classic experiment is to toss a coin for heads and tails, versus asking someone to invent a sequence of fake H and T from their mind. The human version never has the expected distribution of sequence combinations compared to truly random, and some analysis can usually identify the real coin sequences.
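To make the coin experiment concrete, here is a small Python sketch of two checks that are often used for it: the longest run of identical tosses and how often the sequence switches between H and T. The function names and the "invented" sequence are made up for illustration; the invented one is just a stand-in that switches too often, not real experimental data.

```python
import random


def longest_run(seq):
    """Length of the longest stretch of identical consecutive items."""
    best = current = 0
    previous = None
    for item in seq:
        current = current + 1 if item == previous else 1
        best = max(best, current)
        previous = item
    return best


def alternation_rate(seq):
    """Fraction of adjacent pairs that differ (about 0.5 for a fair coin)."""
    if len(seq) < 2:
        return 0.0
    switches = sum(a != b for a, b in zip(seq, seq[1:]))
    return switches / (len(seq) - 1)


def fair_coin_sequence(n, seed=0):
    """Simulated fair coin tosses as a string of 'H' and 'T'."""
    rng = random.Random(seed)
    return "".join(rng.choice("HT") for _ in range(n))


if __name__ == "__main__":
    real = fair_coin_sequence(200)
    invented = "HTHHTHTTHT" * 20  # hypothetical stand-in for a hand-invented sequence
    for name, seq in (("simulated coin", real), ("invented", invented)):
        print(name, "longest run:", longest_run(seq),
              "alternation rate:", round(alternation_rate(seq), 2))
```

A sequence with no long runs and an alternation rate well above 0.5 is the typical giveaway of a hand-made "random" sequence.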
Which feels mighty close to what we see in z340 – particularly some of those horizontal left to right patterns, and lengths of non-repeats. Feels like someone is just trying a bit too hard to not repeat a symbol.
Anyway, on with the journey.
Very interesting hypothesis. You are suggesting that the sequentiality of the 340 is an unintended by-product of Zodiac trying to create random sequences! Would like to test, but then we need some unaware test subjects.
It’s so difficult to create mathematical randomness by hand that knowing the goal in advance probably doesn’t matter too much.
So the hypothesis would be that Zodiac had 63 symbols (or slightly more) to select from, and then filled the grid left-right, top-bottom, either trying to be “random” as he saw it, or possibly with bias towards avoiding repetition.
I would guess (needs testing) that this would lead to detectable IoC bias horizontal vs vertical (that we see in z340).
I find the distribution of non-repeating sequence lengths (from David Oranchak’s ACA Presentation Video 38:31 – Sorry, I don’t know the source for this), where it drops off after the peak of 17 (cipher grid width) very interesting. It seems to hint to me that Zodiac might have been casually or subconsciously avoiding symbol repetition as he wrote horizontally, and naturally found this easier as he approached the end of a line (quick scan back with the eyes and recent memory), rather than near the start of a new line.
Is there any right-side bias detectable in where symbols do or don’t repeat?
It’s a stretch, but maybe this also leads to the skip-19 bigram peak covariant with the same approach.
I find the distribution of non-repeating sequence lengths (from David Oranchak’s ACA Presentation Video 38:31 – Sorry, I don’t know the source for this), where it drops off after the peak of 17 (cipher grid width) very interesting.
I am the source, it is some of my earliest work dating back to 2014-2015.
It seems to hint to me that Zodiac might have been casually or subconsciously avoiding symbol repetition as he wrote horizontally, and naturally found this easier as he approached the end of a line (quick scan back with the eyes and recent memory), rather than near the start of a new line.
I had the same hypothesis a while ago! In a different setting. In the 408 we can extract the homophone groups with some confidence without solving the cipher but in the 340 we cannot. By simply avoiding repetitions as he wrote horizontally the cipher gets the appearance of sequential homophonic substitution but no homophone groups can be extracted and it turns out that it is also prone to create stronger right-shifted peaks in the non-repeats.
So I came to the conclusion that if Zodiac tried to hide unigram repeats, he could have done it another way: by not repeating symbols within a given window of his view, without having to keep track of the cycles. The statistics of the 340 seem to agree: low unigram repeats over short/medium distance, high unique sequence peak of 26 at length 17, no apparent homophonic sequences such as in the 408. A silver bullet.
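A minimal Python sketch of the window-avoidance idea, for anyone who wants to play with it. Everything here is an assumption made for illustration: the fixed window of 17, the uniform choice among 63 symbols, the function names, and in particular the way non-repeating stretches are counted, which may well differ from the measure behind the plot discussed above.

```python
import random
from collections import Counter


def fill_avoiding_recent(n_cells=340, n_symbols=63, window=17, seed=0):
    """Fill a sequence left to right, never reusing a symbol that appears in
    the previous `window` cells. Requires window < n_symbols."""
    rng = random.Random(seed)
    out = []
    for _ in range(n_cells):
        recent = set(out[-window:]) if window > 0 else set()
        choices = [s for s in range(n_symbols) if s not in recent]
        out.append(rng.choice(choices))
    return out


def unique_run_lengths(symbols):
    """For every start position, measure how far you can read before a symbol
    repeats, and return a histogram of those lengths."""
    lengths = []
    for start in range(len(symbols)):
        seen = set()
        for s in symbols[start:]:
            if s in seen:
                break
            seen.add(s)
        lengths.append(len(seen))
    return Counter(lengths)


if __name__ == "__main__":
    avoided = fill_avoiding_recent(window=17)
    plain = fill_avoiding_recent(window=0)  # no avoidance, for comparison
    print("with avoidance:", sorted(unique_run_lengths(avoided).items()))
    print("without:       ", sorted(unique_run_lengths(plain).items()))
```

Comparing the two histograms, and varying `window`, is one way to see how much of a non-repeat peak simple avoidance can produce on its own.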
—
Is there any right-side bias detectable in where symbols do or don’t repeat?
Yes, but it is also there in the 408, as it is a product of sequential homophonic substitution applied from left-to-right, top-to-bottom.
Here is my original 340 plot where the red graph is the left-to-right, top-to-bottom direction and the green graph is right-to-left, top-to-bottom (mirrored). The other colors are other directions such as by columns and diagonals.
It’s a stretch, but maybe this also leads to the skip-19 bigram peak covariant with the same approach.
Not connected imo.
My hat is off to you, sir. You’ve done a solid engine strip-down on this. I hope you don’t mind a rank amateur joining in.
That wasn’t the right-side bias I envisaged.
(This is chasing a ghost but…) If you identify the repeating bigrams in the skip-19 transposition, and then highlight their source location on the original 340, do you see any position bias for the centre of the pair left-right, or top-bottom?
I ask with the hypothesis that as Zodiac picked symbols for his random 340, did he subconsciously pick a symbol paired with a particular symbol on the line above, 2 symbols to the left?
Granted this is a weak argument, and if the theory is it’s a non-cipher crafted by hand, every pattern can just be attributed to choice – which isn’t very helpful. But I figured the above pattern would be telling since it’s not the sort of thing that one might choose to do consciously.
Most welcome.
Not sure what you mean. There are 37 period 19 bigrams for LRTB and RLBT. RLBT is reverse of LRTB. And there are 41 period 15 bigrams for RLTB and its reverse LRBT. Period 15 is the same as period 19 with the other directions.
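For readers new to the term, a period-n bigram pairs each symbol with the one n positions later in reading order. Here is a short Python sketch of counting repeats of such bigrams. The counting convention (each bigram seen c times contributes c - 1) and the function names are my assumptions, so the sketch is not guaranteed to reproduce the 37 and 41 figures quoted above.

```python
from collections import Counter


def period_bigram_repeats(symbols, period):
    """Count repeated bigrams formed from symbols `period` positions apart."""
    counts = Counter(zip(symbols, symbols[period:]))
    return sum(c - 1 for c in counts.values())


def mirror_rows(symbols, width=17):
    """Re-read a grid right-to-left, top-to-bottom by reversing each row."""
    rows = [symbols[i:i + width] for i in range(0, len(symbols), width)]
    return [s for row in rows for s in reversed(row)]


if __name__ == "__main__":
    import random
    rng = random.Random(0)
    stand_in = [rng.randrange(63) for _ in range(340)]  # random stand-in, NOT the real Z340
    # Reading the whole text backwards flips every pair, so the count is unchanged.
    print(period_bigram_repeats(stand_in, 19),
          period_bigram_repeats(stand_in[::-1], 19))
    # On a width-17 grid, period 19 is "one row down, two to the right";
    # after mirroring the rows that step becomes 17 - 2 = 15 (away from row ends).
    print(period_bigram_repeats(mirror_rows(stand_in), 15))
```

Reversing the whole text always gives the same count, which matches the remark that RLBT is just the reverse of LRTB. The period 15 versus 19 correspondence under mirroring is only exact for pairs that do not wrap around a row end.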
In the *new* Zodiac FBI files there is a magic square (image below) that follows period 15/19. That’s just the kind of stuff you have to deal with in the Zodiac case. It is not a normal magic square either: the special rule used was first discovered by an early Arabic mathematician and basically forgotten about until 1990-2000, when two guys rediscovered the rule and wrote a paper about it. Largo found this information, and I have forgotten the exact dates and names. The magic square below is from about 1970. The idea is that it could double as a transposition matrix, but it has not panned out.
To top off the Zynchronity, Gareth Penn had the following in his Zodiac book "Times 17". Apparently he received some one-ring phone calls, which he marked on his calendar, and somehow managed to interpret the rule from that… daikon and I discovered the period 19 bigrams sometime in 2015/2016. It just seems really odd that you find stuff like this that points in the direction of period 15/19.