While trying to find bigram repeats at period 19 in the Z340 with AZdecrypt’s Merge sequential homophones, I found mostly spikes around period 71 (transposed). Example:
Period 68: 74, 69 (1.34, 0.46)
Period 69: 75, 72 (1.52, 0.99)
Period 70: 79, 65 (2.22, -0.23)
Period 71: 81, 62 (2.57, -0.76) <---
Period 72: 78, 65 (2.04, -0.23)
Period 73: 77, 77 (1.87, 1.87)
Period 74: 78, 67 (2.04, 0.11)
It is related to period 5 as 5 * 68 = 340. But 71 is not quite 68.
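To make the 5 * 68 = 340 relation concrete, here is a minimal sketch of a period transposition. It assumes that "transposing at period p" means reading every p-th symbol starting at offset 0, then offset 1, and so on; AZdecrypt may define the direction differently, and the function names are made up for illustration.

```python
def period_transpose(seq, period):
    """Read every `period`-th symbol, starting from offset 0, then 1, and so on."""
    return [seq[i] for offset in range(period) for i in range(offset, len(seq), period)]


def period_untranspose(seq, period):
    """Inverse of period_transpose: put each symbol back where it came from."""
    order = [i for offset in range(period) for i in range(offset, len(seq), period)]
    restored = [None] * len(seq)
    for out_pos, src_pos in enumerate(order):
        restored[src_pos] = seq[out_pos]
    return restored


positions = list(range(340))          # stand-in for the 340 symbol positions
moved = period_transpose(positions, 5)

# Symbols that are neighbours after the transposition were 5 apart before it.
gap_before = moved[1] - moved[0]
# Symbols that were neighbours before it (positions 0 and 1) end up 340 / 5 apart.
gap_after = moved.index(1) - moved.index(0)
print(gap_before, gap_after)
```

Under that assumption a period 5 signal in one direction shows up at period 68 in the other direction for a 340-symbol cipher, and at period 72 for a 360-symbol one.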
In 2015 and 2016 I noticed that inserting an 18th column after the 17th column increases period 5 bigram repeats a lot.
Another strange observation: add one random column to the end of the 340 and suddenly period 5 bigram repeats pop up with counts around 40. I tried various randomizations of the last column and they all had the same effect.
The graph on the bottom is for the 340 in an 18 by 20 grid with a new set of symbols inserted in the 18th column. Look at how period 5 comes into play now. How can this be? I don’t understand it at all. smokie, I’m hoping that you could take a look at it. Are we possibly looking at a cipher with 2 different interlaced periods?
Then the cipher would be 360 characters long and period 5 would be 5 * 72 (close to 71). Adding the 18th column also causes period 19 to shift to period 20 and this allows overlap between period 5 and period 20 (old period 19) since period transposition is multiplicative.
Period 5 in 18 by 20 example matrix (not hard to spot period 19 in there):
I thought of adding an 18th column at every possible column position while measuring bigram repeats at period 5:
AZdecrypt combine statistics for: Zodiac 340.txt
---------------------------------------------------------
Add column(1), Period(UTP,5): 34 <---
Add column(2), Period(UTP,5): 31
Add column(3), Period(UTP,5): 23
Add column(4), Period(UTP,5): 25
Add column(5), Period(UTP,5): 22
Add column(6), Period(UTP,5): 22
Add column(7), Period(UTP,5): 22
Add column(8), Period(UTP,5): 25
Add column(9), Period(UTP,5): 25
Add column(10), Period(UTP,5): 23
Add column(11), Period(UTP,5): 21
Add column(12), Period(UTP,5): 21
Add column(13), Period(UTP,5): 25
Add column(14), Period(UTP,5): 26
Add column(15), Period(UTP,5): 24
Add column(16), Period(UTP,5): 28
Add column(17), Period(UTP,5): 28
Add column(18), Period(UTP,5): 35 <---
I then did the same with a control cipher: a period 5 transposed Z408 in an 18 by 20 grid with the last column removed, resulting in a 17 by 20 cipher:
AZdecrypt combine statistics for: Zodiac 408.txt
---------------------------------------------------------
Add column(1), Period(UTP,5): 36 <---
Add column(2), Period(UTP,5): 32
Add column(3), Period(UTP,5): 31
Add column(4), Period(UTP,5): 33
Add column(5), Period(UTP,5): 25
Add column(6), Period(UTP,5): 22
Add column(7), Period(UTP,5): 20
Add column(8), Period(UTP,5): 21
Add column(9), Period(UTP,5): 22
Add column(10), Period(UTP,5): 20
Add column(11), Period(UTP,5): 22
Add column(12), Period(UTP,5): 20
Add column(13), Period(UTP,5): 21
Add column(14), Period(UTP,5): 24
Add column(15), Period(UTP,5): 28
Add column(16), Period(UTP,5): 31
Add column(17), Period(UTP,5): 31
Add column(18), Period(UTP,5): 37 <---
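For anyone who wants to reproduce this kind of scan outside AZdecrypt, here is a rough Python sketch. It is not AZdecrypt's code: the cipher is assumed to be a list of non-negative integers (one per symbol), a "repeat" is assumed to mean every occurrence of a bigram beyond its first, and the function names are invented for illustration, so the absolute counts may not match the tables above.

```python
from collections import Counter


def bigram_repeats(cipher, period):
    """Count repeated bigrams formed by symbols `period` positions apart."""
    pairs = Counter(zip(cipher, cipher[period:]))
    return sum(count - 1 for count in pairs.values())


def insert_column(cipher, width, position, filler):
    """Insert one filler symbol per row so that it becomes column `position` (1-based)."""
    rows = [cipher[i:i + width] for i in range(0, len(cipher), width)]
    out = []
    for row, new_symbol in zip(rows, filler):
        out.extend(row[:position - 1])
        out.append(new_symbol)
        out.extend(row[position - 1:])
    return out


def scan_columns(cipher, width, period):
    """Return {position: repeats} for a new column at every possible position."""
    n_rows = -(-len(cipher) // width)  # ceiling division
    # Negative integers stand in for new symbols that do not occur in the cipher.
    filler = [-(r + 1) for r in range(n_rows)]
    return {
        pos: bigram_repeats(insert_column(cipher, width, pos, filler), period)
        for pos in range(1, width + 2)
    }
```

Calling `scan_columns(cipher, 17, 5)` on a 340-symbol list gives one count per column position 1 to 18, which can be compared against the same scan on shuffled or control ciphers.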
WOW, I’m hooked!
I then tried to look at whether the column was removed before or after/during homophonic substitution. The results may be in favor of after homophonic substitution but it seems very hard to measure.
If one were to try to hill-climb the 18th column to "restore" the cycles, the 63^20 search space would certainly allow overshooting the target by a large margin. So instead I tried 1,000,000 shuffles, inserting a random 18th column at column positions 9 and 18 (position 9 is farthest away from 18), to see how a 2-symbol cycles measurement would respond. While the response is very small (due to the 63^20 search space), it does show an increase of around +4 for both the control cipher and the Z340.
Z408 (17 by 20):
------------------------------
Position 9: 2652.77
Position 18: 2651.78 (-0.99)

Z408 (hypothesis testing, control cipher):
------------------------------
Position 9: 2434.04
Position 18: 2438.41 (+4.37)

Z340:
------------------------------
Position 9: 2051.81
Position 18: 2055.66 (+3.85)
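The shape of this test can be sketched as follows. The scoring function is only a toy stand-in for a 2-symbol cycles measurement (sum of squared alternating-run lengths over all symbol pairs), not the measurement used for the numbers above, and drawing the random column uniformly from the cipher's own symbols is likewise an assumption.

```python
import random
from itertools import combinations


def two_symbol_cycle_score(cipher):
    """Toy score: sum of squared lengths of alternating runs for every symbol pair."""
    positions = {}
    for index, symbol in enumerate(cipher):
        positions.setdefault(symbol, []).append(index)
    score = 0
    for a, b in combinations(positions, 2):
        merged = sorted(positions[a] + positions[b])
        run = 1
        for prev, cur in zip(merged, merged[1:]):
            if cipher[prev] != cipher[cur]:
                run += 1            # still alternating, e.g. A B A B
            else:
                score += run * run  # alternation broke, bank the run
                run = 1
        score += run * run
    return score


def insert_column(cipher, width, position, column):
    """Insert one symbol per row so that it becomes column `position` (1-based)."""
    out = []
    for row_index, start in enumerate(range(0, len(cipher), width)):
        row = cipher[start:start + width]
        out.extend(row[:position - 1])
        out.append(column[row_index])
        out.extend(row[position - 1:])
    return out


def mean_score(cipher, width, position, trials, seed=0):
    """Average the toy score over `trials` random columns inserted at `position`."""
    rng = random.Random(seed)
    alphabet = sorted(set(cipher))
    n_rows = -(-len(cipher) // width)  # ceiling division
    total = 0
    for _ in range(trials):
        column = [rng.choice(alphabet) for _ in range(n_rows)]
        total += two_symbol_cycle_score(insert_column(cipher, width, position, column))
    return total / trials
```

Comparing `mean_score(cipher, 17, 9, trials)` with `mean_score(cipher, 17, 18, trials)` mirrors the position 9 versus position 18 comparison; with a million trials a pure Python version like this would be slow.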
I’ve tried a few short runs with AZdecrypt’s simple transposition solver and the Z340 with an 18th column consisting of 20 new unique symbols, but no luck.
Given that, I think the first order of business is to conduct further testing. Sadly my testing code currently has a bug that does not allow me to do any quick testing, so I would appreciate some help.
– 1) Determine the odds of period 5 bigrams increasing from 30 to 35 by adding an 18th column of new unique symbols to shuffled Z340s which have 30 bigrams in place at period 5.
– 2) Can someone figure out another test to see if an 18th column was removed after homophonic substitution? Given the 63^20 search space I say no but perhaps someone can figure out something clever?
– 3a) Undetermined relation with period 19 seems very plausible but how? Perhaps period 19 is a product of period 5?
– 3b) Shoe-horning: a plain text with a lot of period 4 bigrams multiplied by a period 5 transposition would have bigrams at period 20, then after removing a column it becomes period 19.
– 1) Determine the odds of period 5 bigrams increasing from 30 to 35 by adding an 18th column of new unique symbols to shuffled Z340s which have 30 bigrams in place at period 5.
For the Z340, the odds are 42 in 100,000 or about 1 in 2,500. Pretty decent.
For the Z408 control cipher, the odds are 12 in 1,000,000 or about 1 in 100,000. More significant since it has to do a bigger jump from 28 to 37 bigrams.
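A test like this can be approximated with a simple Monte Carlo loop. The sketch below is not the code that produced the odds above; it assumes the cipher is a list of non-negative integers, that a "repeat" is every occurrence of a bigram beyond its first, and that the new column is appended as the last column. All function names are hypothetical.

```python
import random
from collections import Counter


def bigram_repeats(cipher, period):
    """Count repeated bigrams formed by symbols `period` positions apart."""
    pairs = Counter(zip(cipher, cipher[period:]))
    return sum(count - 1 for count in pairs.values())


def append_column(cipher, width, filler):
    """Add one filler symbol at the end of every row of a `width`-column grid."""
    out = []
    for row_index, start in enumerate(range(0, len(cipher), width)):
        out.extend(cipher[start:start + width])
        out.append(filler[row_index])
    return out


def estimate_odds(cipher, width, period, base, target, trials, seed=0):
    """Return (hits, accepted): shuffles that start at exactly `base` repeats,
    and how many of those reach at least `target` once the column is added."""
    rng = random.Random(seed)
    pool = list(cipher)
    n_rows = -(-len(pool) // width)  # ceiling division
    # Negative integers stand in for new unique symbols.
    filler = [-(r + 1) for r in range(n_rows)]
    hits = accepted = 0
    for _ in range(trials):
        rng.shuffle(pool)
        if bigram_repeats(pool, period) != base:
            continue  # only keep shuffles that match the starting count
        accepted += 1
        if bigram_repeats(append_column(pool, width, filler), period) >= target:
            hits += 1
    return hits, accepted
```

For the Z340 case this would be called as `estimate_odds(cipher, 17, 5, 30, 35, trials)`, and `hits / accepted` is the estimated probability. Rejecting shuffles that miss the starting count is wasteful, so a large `trials` value is needed.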
Jarlve,
when you say adding an 18th column consisting of new unique symbols, do you mean symbols not currently in the z340? So the period 5 bigrams would be created from symbols in column 14 and column 1?
Yes indeed.
Jarlve, a thought on:
– 2) Can someone figure out another test to see if an 18th column was removed after homophonic substitution? Given the 63^20 search space I say no but perhaps someone can figure out something clever?
If we presume Z is doing the homophonic substitutions in a similar method to 408 (a very hopeful/dangerous tactic I know), one pattern we could expect to notice would be with the cycles we look for. If we identify cycles where breaks seem to happen between lines, this could be due to a symbol in the cycle being missing due to the removed 18th column. We could compare how many of these candidate cycle breaks we get in the 340 (/360?) vs a subset of random 18×20 ciphers with the 18th column removed. Does that make any sense?
Edit: Of course if we’re throwing in a possible period 5 transposition for good measure we’d have to untranspose first, keeping track of where 18th column symbols end up before doing this test – we’re basically looking for how often "cycle break zones" overlap "18th column symbol positions".
why not use the last line of the 408? drop it on top not the bottom.
woops i see you are on columns not rows. same same
the last 18 of the 408
Jarlve, a thought on:
– 2) Can someone figure out another test to see if an 18th column was removed after homophonic substitution? Given the 63^20 search space I say no but perhaps someone can figure out something clever?
If we presume Z is doing the homophonic substitutions in a similar method to 408 (a very hopeful/dangerous tactic I know), one pattern we could expect to notice would be with the cycles we look for. If we identify cycles where breaks seem to happen between lines, this could be due to a symbol in the cycle being missing due to the removed 18th column. We could compare how many of these candidate cycle breaks we get in the 340 (/360?) vs a subset of random 18×20 ciphers with the 18th column removed. Does that make any sense?
Edit: Of course if we’re throwing in a possible period 5 transposition for good measure we’d have to untranspose first, keeping track of where 18th column symbols end up before doing this test – we’re basically looking for how often "cycle break zones" overlap "18th column symbol positions".
No worries about period 5 since it most certainly must have happened before homophonic substitution. One problem with looking at cycle breaks is that they will not show up on the actual position. Another problem is that there are 1000’s of potential cycles and with the Z340’s increased randomness it is really hard to determine the actual cycles.
why not use the last line of the 408? drop it on top not the bottom.
woops i see you are on columns not rows. same same
the last 18 of the 408
Pretty sure that it is filler.
One problem with looking at cycle breaks is that they will not show up on the actual position
True, but would we not expect to see a statistical bump in "break zones" that cover the 18th column position vs those that don’t. Could we not then assess the significance of this bump against a random subset of similar ciphers?
Certainly there are 1000’s of cycles that are present due to randomness, but if we limit the search to cycles of, for example, more than 4 symbols this should cut the number down significantly, no? If (again, dangerous tactic) we are assuming a similar homophonic substitution method, the logical outcome of a larger symbol alphabet despite fewer characters should be an increase in multi-symbol cycles.
Any luck hazarding a guess at the period 5/19(20) connection? I’ve wracked my brains for a little while and nothing’s popped out.
One problem with looking at cycle breaks is that they will not show up on the actual position
True, but would we not expect to see a statistical bump in "break zones" that cover the 18th column position vs those that don’t. Could we not then assess the significance of this bump against a random subset of similar ciphers?
I don’t think it is possible to identify a break zone since the Z340 has many very long strings without any repeats. But perhaps I am misinterpreting your idea.
Any luck hazarding a guess at the period 5/19(20) connection? I’ve wracked my brains for a little while and nothing’s popped out.
Only this for now:
– 3b) Shoe-horning: a plain text with a lot of period 4 bigrams multiplied by a period 5 transposition would have bigrams at period 20, then after removing a column it becomes period 19.
The period 5/19(20) connection seems so obvious. At the moment I am more than 50% positive that there is a connection. I still need to run more tests to see if there are any "normal" connections. Some examples:
Does a period 19 transposition increase bigrams at period 5? Tested: answer is no.
Does a period 20 transposition for a 18 by 20 minus one column increase bigrams at period 5? Tested: answer is no.
Playing around with the idea that the plain text was split up into 4 or 5 parts somehow and then recombined somehow, but that’s a lot of somehow.
sure its filler .. but its as good as any make up column filler.. if we add it may as well be "known trash"
Certainly there are 1000’s of cycles that are present due to randomness, but if we limit the search to cycles of, for example, more than 4 symbols this should cut the number down significantly, no? If (again, dangerous tactic) we are assuming a similar homophonic substitution method, the logical outcome of a larger symbol alphabet despite fewer characters should be an increase in multi-symbol cycles.
I wonder if this can be done via brute force search. Consider all 5-symbol cycles. Z340 has 63 unique symbols so there are 63*62*61*60*59 = about 850 million possible cycles. Somewhat tractable.
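A quick check of that count, using Python's standard `math` module (3.8 or later). The second figure is only relevant under the assumption that the order of the five symbols is not counted separately:

```python
import math

ordered = math.perm(63, 5)    # 63*62*61*60*59 ordered selections of 5 symbols
unordered = math.comb(63, 5)  # the same selections with order ignored
print(ordered, unordered)
```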
For each selection of symbols:
- Compute what the perfect cyclic sequence would be, based on those symbols’ frequencies in Z340.
- Compute the actual sequence, based on how those symbols appear in Z340.
- Compute the edits (insertions of symbols) needed to make the actual sequence match the perfect sequence
- Determine the ranges of possible positions for those edits
- Determine how many of those ranges fit the "missing column" hypothesis, which would affix the "corrections" at positions that differ by multiples of 18.
Presumably, all the missing symbols would be at positions {18n + k} where n is the row number and k is the column number. So we seek "k" that maximizes the positive hits from the above search.
For example, a perfect sequence for {a,b,c,d,e} would be abcdeabcde but appears in the cipher as abdeabce.
Making it perfect requires these inserts (shown in brackets): ab[c]deabc[d]e.
If the "c" and "d" can be placed 17 positions apart (or a multiple of 17 positions apart), then the sequence might qualify as a "real sequence that is restored when the missing 18th column is restored".
If the "restored" Z340 is like Z408 then it will still have imperfect cycles so you’d have to look for longest sequences of cycle corrections that fit in positions that differ by multiples of 18.
But maybe this, too, would produce numerous false positives due to the "wiggle room" of where the insertions are being done.
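The "compute the edits" step for a single candidate cycle could look like the sketch below. It is a greedy illustration under simplifying assumptions: the cycle always starts at its first symbol, only insertions are considered, and the sequence contains only symbols from the cycle. It does not cover the later steps (position ranges and the column fit), and the function names are hypothetical.

```python
def cycle_insertions(cycle, actual):
    """Return (index, symbol) pairs: `symbol` must be inserted before actual[index]
    to turn `actual` into a perfect repetition of `cycle`."""
    if set(actual) - set(cycle):
        raise ValueError("sequence contains symbols outside the cycle")
    inserts = []
    expected = 0  # index into `cycle` of the symbol we expect next
    for index, symbol in enumerate(actual):
        while cycle[expected] != symbol:
            inserts.append((index, cycle[expected]))
            expected = (expected + 1) % len(cycle)
        expected = (expected + 1) % len(cycle)
    return inserts


def apply_insertions(actual, inserts):
    """Rebuild the corrected sequence from the original and its insertions."""
    pending = list(inserts)
    out = []
    for index, symbol in enumerate(actual):
        while pending and pending[0][0] == index:
            out.append(pending.pop(0)[1])
        out.append(symbol)
    return "".join(out)


fixes = cycle_insertions("abcde", "abdeabce")
print(fixes, apply_insertions("abdeabce", fixes))
```

For the {a,b,c,d,e} example this reports a missing "c" before the first "d" and a missing "d" before the final "e". Mapping each insertion back to the range of cipher positions between its neighbouring cycle symbols is what would then be tested against the missing-column positions.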
I thought that Xmassloth meant 2-symbol cycles where each symbol has at least a frequency of 4.
Even after removing the 18th column from an 18 by 20 Z408 it is still way more cyclic than the Z340. I used to think that the Z340 has about 25% cycle randomization but now think that 30% to 35% is more likely. Not all observations would agree on that though. For example, the number of unigram repeats is very low.
Ran another test with a perfect 2-symbol cycles measurement. It is much in favor of the Z340 not having had an 18th column after sequential homophonic substitution:
Z408 control:
-------------------------
9: 1477.72
18: 1489.88 (+12.16)

Test cipher:
-------------------------
9: 1217.8704
18: 1227.990416 (+10.12)

Z340:
-------------------------
9: 1247.23
18: 1248.00 (+0.77)
Both of the following ciphers have had their 18th column removed:
Z408 control:
9%P/Z/UB%kOR=pX=B
V+eGYF69HP@K!qYeM
Y^UIk7qTtNQYD5)S(
9#BPORAU%fRlqEk^L
ZJdrpFHVWe8Y@+qG
9KI)6qX85zS(RNtIY
lO8qGBTQS#BLd/P#B
XqEHMU^RRkcZKqpI)
q!85LMr9#BPDR+j=6
N(eEUHkFZcpOVWI5+
L)l^R6HI9DR_TYrd
/@XJQAP5M8RUt%L)N
EKH=GrI!Jk598LMlN
)Z(PzUpkA9#BVW+V
tOP^=SrlfUe67DzG%
IMNk)ScE/9%%ZfAP#
VpeXqWq_F#8c+@9A9
%OT5RUc+_dYq_^SqW
ZeGYKE_TYA9%#Lt_H
FBX9zXADd7L!=q_e

Test cipher:
=8E0I#:9]5:'"EWP
>I^@[5:#)ZCRY/5?X
7.F)Y"85VB"@UCI:I
#7E9Y&0R1)7<'=Y?5
.>)JP1VM&D>#T)SBG
0RD'IW95,HE[R;+'U
)5MM#0RY;1*Q]:N)7
8Z@=BL5XC.Y3>PFI?
<#*RE!]2-Z1VJ%S/)
M5'7.^>1_VT&D>I6
S:"85OF)H7@#E0R!Q
'V?.5KM)SBLI+O#JX
LTX5)=1)95K[CR'J
PWEF^>65H:L8EVM."
BGS%RZ<7'.LUIY#*R
)7"'G&X;4)2.I-#QG
V=Y9I*?[CIF5%I!PJ
]W#F)E:6R705LY]B%
S/41(.J-^W]@!F5E
@"*)3/]B$E:ZCIF<&
– 3b) Shoe-horning: a plain text with a lot of period 4 bigrams multiplied by a period 5 transposition would have bigrams at period 20, then after removing a column it becomes period 19.
One way to achieve this would be mixing of periods *somehow*. For example, the cipher could be divided in 2 parts with the first part having a period 4 transposition, then do a period 5 transposition on the whole and there will be a bigram peak at period 5 and 20:
IBUEBEDESEIAOTCILV
POLENAMFGCRESNMTMV
IEOYICIFSUAAILTAET
AENNIVKATAENMUTSHM
THISGBAWEUIEITESWL
EAHHLMEEENKSANOHIE
ITMLIDGEXTIIIEMEVA
NMIEOONWYTPTIOLIOL
UNTALMAFGBYHEEENLT
RRAKHNEMTAGBEERROB
IITREIEHNCDLAIAMEN
NUNSGPSLFEAEALTRYO
NTAOGSIDULONIWNTGC
IACYELPOOALIRUNTGO
RDRTERAYEMOLTNRCLO
EKKTSTISHLOUOEOGEF
IIRIFHDHTWAOPCIVWW
STENTLHKORIAIRLHGL
AITILTULREOISLDPEF
TWMLBHIYESELCLEBLO

Bigrams: Periodic: (transposition, untransposition)
---------------------------------------------------------
Period 1: 145, 145 (-0.35, -0.35)
Period 2: 149, 143 (0.45, -0.75)
Period 3: 145, 141 (-0.35, -1.16)
Period 4: 151, 144 (0.85, -0.55)
Period 5: 156, 168 (1.86, 4.29) <---
Period 6: 152, 149 (1.05, 0.45)
Period 7: 149, 143 (0.45, -0.75)
Period 8: 152, 149 (1.05, 0.45)
Period 9: 156, 144 (1.86, -0.55)
Period 10: 147, 151 (0.04, 0.85)
Period 11: 145, 148 (-0.35, 0.25)
Period 12: 144, 151 (-0.55, 0.85)
Period 13: 140, 154 (-1.36, 1.46)
Period 14: 137, 147 (-1.96, 0.04)
Period 15: 142, 151 (-0.95, 0.85)
Period 16: 134, 146 (-2.57, -0.15)
Period 17: 147, 150 (0.04, 0.65)
Period 18: 171, 153 (4.89, 1.26) <---
Period 19: 154, 154 (1.46, 1.46)
Period 20: 153, 171 (1.26, 4.89) <---
Though the problem with the above is that the 18th column still has to be removed to make period 20 into 19. And after adding back in the column (by means of new unique symbols for example) bigram repeats should go up for both period 5 and 20 (19). In the Z340 adding in a 18th column decreases the amount of period 20 (19) repeats.
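The way two periods multiply in this kind of mix can be checked with a small sketch that tracks positions instead of letters. It assumes a particular direction for the transposition (consecutive input symbols are placed `period` positions apart) and a 180/180 split of a 360-symbol text; the test cipher above may have been built differently, and `spread` is a made-up helper name.

```python
from collections import Counter


def spread(seq, period):
    """Place consecutive input symbols `period` positions apart, offset by offset."""
    order = [i for offset in range(period) for i in range(offset, len(seq), period)]
    out = [None] * len(seq)
    for src, dst in enumerate(order):
        out[dst] = seq[src]
    return out


plain = list(range(360))                      # stand-in for a 360-letter plain text
first_half = spread(plain[:180], 4)           # period 4 on the first part only
mixed = spread(first_half + plain[180:], 5)   # period 5 on the whole

# How far apart do letters that were originally adjacent end up?
where = {value: index for index, value in enumerate(mixed)}
gaps = Counter(where[v + 1] - where[v] for v in range(len(plain) - 1))
print(gaps.most_common(2))
```

In this sketch the two most common gaps are 5 (from the part that only saw the period 5 step) and 20 (from the part that saw both steps, 4 * 5), which is the period 5 plus period 20 pattern described above. Removing a column afterwards is not modelled here.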
If we identify cycles where breaks seem to happen between lines, this could be due to a symbol in the cycle being missing due to the removed 18th column. We could compare how many of these candidate cycle breaks we get in the 340 (/360?) vs a subset of random 18×20 ciphers with the 18th column removed. Does that make any sense?
Planning to look into this today. I will post updates along the way as things develop.
Test, log the first cycle break position on a heat map for every unique 2-symbol cycle, results:
Z340:
Z408: