Zodiac Discussion Forum


Pivot periodic bigram study

26 Posts
7 Users
0 Reactions
7,944 Views
doranchak
(@doranchak)
Posts: 2614
Member Admin
Topic starter
 

Z340 has a pair of “pivot” patterns, each pointing in the same direction:

How often does this occur completely by chance when shuffling the cipher text like a deck of cards? I ran 41,280,000 shuffles. About 1 in 100 shuffles had at least one pivot pointing in any direction. But only about 1 in 240,000 had at least two pivots pointing in the same direction. Here is an example of a pivot pair appearing during a random shuffle:

In a previous analysis of pivots, I estimated the probability of this happening to be around 1 in 280,000, by taking homophonic encipherment into consideration after looking at how often pivot patterns appear in many plaintexts.

If you untranspose Z340 by period 39, the symbols involved in the pivot patterns become repeating bigrams:

Similarly, the pivots look like repeating bigrams if you mirror 340 horizontally and untranspose by period 29:

So I was curious: How does the number of repeating bigrams influence the production of pivot pairs when the transposition is performed? It seems obvious that a higher bigram count will randomly produce pivot pairs more often. To confirm this, I ran another shuffle experiment with these steps:

1) Shuffle the Z340 cipher text
2) Randomly modify the cipher text until there are exactly N bigram repeats (without changing the symbol distribution)
3) Transpose by period 39
4) Count pivots and pivot pairs

I ran the experiment for millions of shuffles and for N = 20, 40, 60 and 80. Here are the results:

Notice that “average shuffles per pivot pair” goes down as we proceed from 40 to 60 to 80 bigram repeats. This suggests that higher bigram counts will cause pivot pairs to appear more often, but it is still a somewhat rare event to occur by chance. Even with 80 bigram repeats, pivot pairs only appeared in about 1 in 124,000 shuffles. But for some reason, the average shuffles per pivot pair for 20 bigram repeats is about 193,000 which is less than the average for 40 and 60 bigram repeats. This seems odd, and is either a reflection of an error in my experiment or some aspect of pivots I haven’t considered.

Nevertheless, I think this experiment affirms what we already know: The pivot pairs are unique and seem likely to reflect some aspect of the encipherment method.

http://zodiackillerciphers.com

 
Posted : January 8, 2017 4:37 pm
Jarlve
(@jarlve)
Posts: 2547
Famed Member
 

Nice study doranchak!

2) Randomly modify the cipher text until there are exactly N bigram repeats (without changing the symbol distribution)

This must then be some sort of hill climbing?

Notice that “average shuffles per pivot pair” goes down as we proceed from 40 to 60 to 80 bigram repeats. This suggests that higher bigram counts will cause pivot pairs to appear more often, but it is still a somewhat rare event to occur by chance. Even with 80 bigram repeats, pivot pairs only appeared in about 1 in 124,000 shuffles. But for some reason, the average shuffles per pivot pair for 20 bigram repeats is about 193,000 which is less than the average for 40 and 60 bigram repeats. This seems odd, and is either a reflection of an error in my experiment or some aspect of pivots I haven’t considered.

I suppose that the "average shuffles per pivot pair" is based on the counts 20, 16, 19 and 31. With such low counts there is more room for variation and outliers, so I would assume that explains it until you can run at least 10 times as many iterations.
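The point about small counts can be made concrete with a rough sketch. It assumes the pivot-pair counts behave like Poisson counts, which is an assumption and not something stated in the thread; under that assumption the relative standard error of a count k is 1/sqrt(k). The class and method names are hypothetical.

```java
class CountUncertaintySketch {
    // Relative standard error of a Poisson-distributed count k: sqrt(k) / k.
    // Treating the pivot-pair counts as Poisson is an assumption.
    static double relativeError(int k) {
        return 1.0 / Math.sqrt(k);
    }

    public static void main(String[] args) {
        int[] counts = {20, 16, 19, 31}; // counts quoted in the post above
        for (int k : counts) {
            System.out.printf("count=%d  relative error ~ %.0f%%%n",
                    k, 100 * relativeError(k));
        }
    }
}
```

With counts this small the uncertainty is on the order of 18 to 25 percent, which is large enough to reorder the four averages. Ten times as many iterations would shrink it by a factor of about 3 (the square root of 10).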

In the transposition misalignment hypothesis we assume that the period 29/39 bigram repeats were originally period 2 bigram repeats. If you can manage the time, I’m interested in how it would look with the following alteration, possibly with N up to 13 (at least I would like to see 1, 2 and 13 compared):

2) Randomly modify the cipher text until there are 40 period N bigram repeats (without changing the symbol distribution)
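For reference, here is a minimal sketch of counting period N bigram repeats. It assumes a period N bigram is the pair of symbols at positions i and i+N, and that a bigram seen k times contributes k - 1 repeats; the thread does not spell out the exact counting rule, so both are assumptions. The class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

class PeriodicBigramSketch {
    // Counts repeats among bigrams formed by symbols `period` positions apart.
    // Assumption: every occurrence of a bigram after its first counts as one repeat.
    static int countRepeats(String cipher, int period) {
        Map<String, Integer> seen = new HashMap<>();
        int repeats = 0;
        for (int i = 0; i + period < cipher.length(); i++) {
            String bigram = "" + cipher.charAt(i) + cipher.charAt(i + period);
            int count = seen.merge(bigram, 1, Integer::sum); // new tally for this bigram
            if (count > 1) {
                repeats++;
            }
        }
        return repeats;
    }

    public static void main(String[] args) {
        System.out.println(countRepeats("ABABAB", 1)); // AB x3, BA x2 -> 3
        System.out.println(countRepeats("ABCABC", 2)); // AC x2 -> 1
    }
}
```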

AZdecrypt

 
Posted : January 8, 2017 10:00 pm
doranchak
(@doranchak)
Posts: 2614
Member Admin
Topic starter
 

This must then be some sort of hill climbing?

Yeah, sort of. It’s a bit more direct than a hillclimber. Let N be the target number of repeats. The steps are:

1) Shuffle the cipher
2) Count repeating bigrams, R.
3) While R<N:
   A) Pick a random bigram B.
   B) Find a random position P1 that starts with the same symbol that B starts with.
   C) Find a random position P2 that has the 2nd symbol of B.
   D) Exchange the symbols at P1+1 and P2, which forces B to have another occurrence.

Sometimes those steps destroy existing repeats so retries are often required.
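One injection attempt (steps A to D) might look like the sketch below. This is an illustration under assumptions, not the actual experiment code: the class and method names are hypothetical, and the caller is expected to recount repeats and retry, since the swap can destroy other bigrams or land on the bigram that was picked.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

class BigramInjectionSketch {
    // Returns a random index i < limit with c[i] == symbol.
    static int randomPositionOf(char[] c, char symbol, int limit, Random rnd) {
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i < limit; i++) {
            if (c[i] == symbol) {
                hits.add(i);
            }
        }
        return hits.get(rnd.nextInt(hits.size()));
    }

    // One injection attempt. Swapping two symbols keeps the symbol
    // distribution unchanged, as the experiment requires.
    static void injectOnce(char[] c, Random rnd) {
        int b = rnd.nextInt(c.length - 1);                      // A) bigram B = c[b], c[b+1]
        char first = c[b];
        char second = c[b + 1];
        int p1 = randomPositionOf(c, first, c.length - 1, rnd); // B) P1 must have a successor
        int p2 = randomPositionOf(c, second, c.length, rnd);    // C) P2 holds B's 2nd symbol
        char tmp = c[p1 + 1];                                   // D) swap P1+1 and P2
        c[p1 + 1] = c[p2];
        c[p2] = tmp;
    }
}
```

Searching up to `c.length - 1` for P1 guarantees that P1+1 exists; both searches always find at least the picked bigram itself.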

I suppose that the "average shuffles per pivot pair" is based on the counts 20, 16, 19 and 31. With such low counts there is more room for variation and outliers, so I would assume that explains it until you can run at least 10 times as many iterations.

Very good point. Perhaps I can get more iterations if I work harder to optimize the code.

In the transposition misalignment hypothesis we assume that the period 29/39 bigram repeats were originally period 2 bigram repeats. If you can manage the time, I’m interested in how it would look with the following alteration, possibly with N up to 13 (at least I would like to see 1, 2 and 13 compared):
2) Randomly modify the cipher text until there are 40 period N bigram repeats (without changing the symbol distribution)

Nice idea – I will be happy to run this. To be clear, you want me to keep the third step, right? (Transpose by period 39, after a shuffled cipher with 40 period N bigram repeats is created)

http://zodiackillerciphers.com

 
Posted : January 9, 2017 2:46 am
doranchak
(@doranchak)
Posts: 2614
Member Admin
Topic starter
 

Here’s a random shuffle that has been randomly modified until it has 40 period 2 bigram repeats. Can you confirm that it has 40 repeats for period 2?

<p;c)K3;MWN<G2F+O
+LBFlcypZU+4+CVYE
kMUS#R+2KF8pz5Y|-
GG+1+Y++2+;LM:O^|
<++F6JS+4F+|H9p-j
kJpRcDcFc)NtLVGKT
L2B2J%5dM)HEHlkcj
p_>/&U*lZO9D|N4:>
2y9bk<ROp.OTfcBHB
N>lWK7--q*|8P1YJF
8z^4OM/tRCW.V.y+|
Vbp<p4KT@ZGy(y|++
2+XCOKDdL)F*EVB6R
f&f|/(UOlB*W7zd-(
Mzl+kRF+#(Bz7c^_#
*2dAMz|PScBd5LBBR
U5^.^.q+Z#*^BCC6+
+RXpO>pWT35b|W9SD
fG1(5(4%z_P.8TcBK
l#Nz2FA)zt(5V+tO<

And here’s what it looks like when I transpose it by period 39:

<LU16D5lp*WyEz^B+
f#clR++c)9TP.G)l+
|.O(N;F#YSFMOO8Vy
B-#UR1zKy22FNE|cY
.(**(S+p(2)c++4)H
Df1y+RM2^p5F;ZF;|
Ll4HF+|V7zB#T%A3p
K++tHNBJ|2&lA^>4)
W+pM9Gc>NzV+6dc5^
5_zMU8LHVk:B8bX|k
zqWzt<+5O-Tpyl4p+
f(_BC|.(N4z:pKj2>
^<O(FPZ3P52V||k2>
bKMpCfz*R69TVGCY^
jL_9WO4DO#c*b8++E
G+p2&<-tKK/+d5+DB
tFY-<JB/k7/TLBBdB
WcO+M+Fc%*OqC@dUR
M.XGl<OkG+RJUR-RZ
FW7LCSKpBS+JcdZ.|

Does that match what you would expect? I want to make sure we’re on the same page before kicking off the experiment. Thanks.

The transposition algorithm I’m using is this:

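		// Transposes by period n: consecutive input symbols are written to
		// positions i, i+n, i+2n, ... so adjacent symbols end up n apart.
		static String transpose(String cipher, int n) {
			StringBuilder sb = new StringBuilder(cipher);
			int k = 0; // next input symbol to place
			for (int i = 0; i < n; i++) {
				for (int j = i; j < cipher.length(); j += n) {
					sb.setCharAt(j, cipher.charAt(k++));
				}
			}
			return sb.toString();
		}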

To transpose in the other direction (i.e., going back to period 1 when you have a cipher with period N bigrams), this is the algorithm I’m using:

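		// Inverse direction: reads positions i, i+n, i+2n, ... in order, so
		// symbols that were n apart become adjacent again.
		static String untranspose(String cipher, int n) {
			StringBuilder result = new StringBuilder(cipher.length());
			for (int i = 0; i < n; i++) {
				for (int j = i; j < cipher.length(); j += n) {
					result.append(cipher.charAt(j));
				}
			}
			return result.toString();
		}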
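A quick way to check that the two routines are inverses of each other is a round trip on a short string. The sketch below wraps the two loops in methods; the class name, the method names `transpose` and `untranspose`, and the test string are hypothetical and chosen only for illustration.

```java
class TranspositionRoundTrip {
    // Writes consecutive input symbols to positions i, i+n, i+2n, ...
    static String transpose(String cipher, int n) {
        StringBuilder sb = new StringBuilder(cipher);
        int k = 0;
        for (int i = 0; i < n; i++) {
            for (int j = i; j < cipher.length(); j += n) {
                sb.setCharAt(j, cipher.charAt(k++));
            }
        }
        return sb.toString();
    }

    // Reads positions i, i+n, i+2n, ... back into consecutive order.
    static String untranspose(String cipher, int n) {
        StringBuilder result = new StringBuilder(cipher.length());
        for (int i = 0; i < n; i++) {
            for (int j = i; j < cipher.length(); j += n) {
                result.append(cipher.charAt(j));
            }
        }
        return result.toString();
    }

    public static void main(String[] args) {
        String plain = "ABCDEFGHIJ";
        String moved = transpose(plain, 3);
        System.out.println(moved);                 // AEHBFICGJD
        System.out.println(untranspose(moved, 3)); // ABCDEFGHIJ
    }
}
```

In the transposed string the original neighbours A and B sit 3 positions apart, which is the sense in which adjacent bigrams become period 3 bigrams.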

http://zodiackillerciphers.com

 
Posted : January 9, 2017 3:00 am
Jarlve
(@jarlve)
Posts: 2547
Famed Member
 

This must then be some sort of hill climbing?

Yeah, sort of. It’s a bit more direct than a hillclimber. Let N be the target number of repeats. The steps are:

1) Shuffle the cipher
2) Count repeating bigrams, R.
3) While R<N:
   A) Pick a random bigram B.
   B) Find a random position P1 that starts with the same symbol that B starts with.
   C) Find a random position P2 that has the 2nd symbol of B.
   D) Exchange the symbols at P1+1 and P2, which forces B to have another occurrence.

Sometimes those steps destroy existing repeats so retries are often required.

Okay, this is a rough optimization idea that you may already have in place. After step 1 create an array that holds all the positions of each symbol, list(unique_symbols,positions). You’ll also have to keep the frequency of each symbol somewhere, I usually keep that in the zero element of the positions index. Now use this list to create bigrams without hill climbing and then you won’t need to do all these cpu intensive random rolls. If you worry about destroying existing bigram information then you can keep a list of which positions are already used.

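' cipher to list: list(s,0) is the frequency of symbol s, list(s,1..frequency) its positions
for i=1 to l ' l = cipher length
	list(cipher(i),0)+=1 ' one more occurrence of this symbol
	list(cipher(i),list(cipher(i),0))=i ' store position i in the next free slot
next i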
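The same lookup structure can be sketched in Java, the language of the transposition code earlier in the thread. This is only an illustration of the idea: the class and method names are hypothetical, positions are 0-based here, and each list's size plays the role of the frequency kept in the zero element.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class SymbolPositionsSketch {
    // Maps each symbol to the (0-based) positions where it occurs.
    // The size of a symbol's list is its frequency.
    static Map<Character, List<Integer>> positions(String cipher) {
        Map<Character, List<Integer>> map = new TreeMap<>();
        for (int i = 0; i < cipher.length(); i++) {
            map.computeIfAbsent(cipher.charAt(i), k -> new ArrayList<>()).add(i);
        }
        return map;
    }

    public static void main(String[] args) {
        System.out.println(positions("ABCA")); // {A=[0, 3], B=[1], C=[2]}
    }
}
```

With this map, picking a random position for a given symbol is a single random index into its list instead of repeated random rolls over the whole cipher.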

Nice idea – I will be happy to run this. To be clear, you want me to keep the third step, right? (Transpose by period 39, after a shuffled cipher with 40 period N bigram repeats is created)

Yes.

AZdecrypt

 
Posted : January 9, 2017 12:34 pm
Jarlve
(@jarlve)
Posts: 2547
Famed Member
 

Here’s a random shuffle that has been randomly modified until it has 40 period 2 bigrams. Can you confirm that it has 40 repeats for period 2?

Yes it has.

Does that match what you would expect? I want to make sure we’re on the same page before kicking off the experiment. Thanks.

I have one small issue, which may be nothing more than a random result but the bigram list of your example cipher looks like this, "++" repeats 6 times.

2-gram frequencies > 1:
--------------------------------------------------
++: 6
LF: 2
cp: 2
F): 2
)t: 2
LB: 2
/U: 2
Ul: 2
OD: 2
Rp: 2
>W: 2
b<: 2
+2: 2
XO: 2
BR: 2
2A: 2
Az: 2
zP: 2
+R: 2
cK: 2
Op: 2
pc: 2
Bl: 2
+V: 2
5|: 2
|+: 2
+F: 2
|9: 2
cc: 2
d): 2
.T: 2
..: 2
.+: 2
G(: 2
(z: 2

AZdecrypt

 
Posted : January 9, 2017 12:45 pm
doranchak
(@doranchak)
Posts: 2614
Member Admin
Topic starter
 

Thanks for the optimization tip, Jarlve.

I have one small issue, which may be nothing more than a random result but the bigram list of your example cipher looks like this, "++" repeats 6 times.

That may happen fairly often in shuffles due to the high frequency of + symbols. Do you think I should purposefully exclude them? I wonder whether that would bias the results.

http://zodiackillerciphers.com

 
Posted : January 9, 2017 3:19 pm
Jarlve
(@jarlve)
Posts: 2547
Famed Member
 

Thanks for the optimization tip, Jarlve.

I have one small issue, which may be nothing more than a random result but the bigram list of your example cipher looks like this, "++" repeats 6 times.

That may happen fairly often in shuffles due to the high frequency of + symbols. Do you think I should purposefully exclude them? I wonder whether that would bias the results.

No, you shouldn’t exclude them. I was just wondering whether something in the selection procedure and mechanics would give rise to a lot of "++" bigram repeats. If so, could it be a problem for pivot creation? I guess it would not matter if the test is kept at 40.

AZdecrypt

 
Posted : January 9, 2017 3:35 pm
doranchak
(@doranchak)
Posts: 2614
Member Admin
Topic starter
 

Bigrams involving "+" are more likely to appear due to high frequency of + in the cipher text (my shuffles use the symbol distribution of Z340). I ran a test of 10,000 shuffles where the shuffled cipher texts are made to have 40 bigram repeats. Here are the top 100 raw bigram counts across all shuffles:

++=7893
B+=5509
+B=5393
p+=5216
+p=5137
O+=4954
F+=4909
+O=4872
+F=4842
+c=4841
+|=4837
|+=4831
c+=4819
2+=4548
z+=4512
+z=4511
+2=4476
R+=4149
+R=4133
+5=3782
M+=3771
+K=3764
+(=3756
K+=3747
+l=3745
+M=3708
(+=3706
l+=3670
5+=3668
.+=3411
+W=3363
^+=3337
+G=3337
+^=3329
+.=3321
W+=3317
V+=3313
*+=3307
+*=3305
+<=3297
4+=3290
+4=3285
+L=3278
G+=3274
+V=3263
<+=3254
L+=3237
pB=3086
Bp=3026
BB=2985
FB=2933
|B=2915
+#=2915
N+=2898
OB=2894
)+=2893
+T=2890
U+=2872
Bc=2869
BF=2861
d+=2847
C+=2842
+-=2842
+y=2837
+C=2832
B|=2831
+d=2829
+N=2825
-+=2823
+k=2816
+)=2814
k+=2810
T+=2809
BO=2803
+U=2799
cB=2774
y+=2770
#+=2763
pp=2686
p|=2670
|p=2655
pO=2645
cp=2638
Fp=2603
zB=2594
B2=2584
Bz=2573
pF=2568
pc=2560
2B=2537
|F=2503
cO=2478
Op=2477
FO=2452
c|=2436
cF=2430
|O=2407
zp=2404
t+=2404
F|=2404

Does that look OK? I don’t think the algorithm is biased towards "++" but maybe I’m not seeing it. In each bigram injection step it should pick a bigram to clone completely at random, then perform one symbol swap to complete the second bigram.

http://zodiackillerciphers.com

 
Posted : January 9, 2017 3:47 pm
Jarlve
(@jarlve)
Posts: 2547
Famed Member
 

Yes, it looks okay.

AZdecrypt

 
Posted : January 9, 2017 4:09 pm
doranchak
(@doranchak)
Posts: 2614
Member Admin
Topic starter
 

Here is an early summary of a run in progress.

I haven’t done nearly as many shuffles as with N=1 (N here is the bigram period, not the repeat count). But an interesting result is that pivot pairs occur most frequently when N=4 and repeats=80. That is roughly a 38% drop in the number of shuffles needed to get a pair at random, compared to N=1 and repeats=80. I’m also surprised that N=2 repeats=20 did better than N=1 repeats=80. But as you pointed out, there may be a lot of variance in these comparisons due to the small numbers of occurrences. And I’m not 100% certain my code is free of bugs.

http://zodiackillerciphers.com

 
Posted : January 9, 2017 6:10 pm
Jarlve
(@jarlve)
Posts: 2547
Famed Member
 

Thanks for the update, I find it interesting and can’t wait to see where it is going.

AZdecrypt

 
Posted : January 10, 2017 1:35 am
doranchak
(@doranchak)
Posts: 2614
Member Admin
Topic starter
 

Here are updated stats with more shuffles:

The experiment for Period 4 + 80 repeats still produces pivot pairs more frequently, apparently. A tentative conclusion from this is: A ciphertext that has 80 repeating bigrams at period 4 has a greater tendency to produce pivot pairs at period 39 than if the same number of bigrams appeared at periods 1, 2, 3, 5 or 13. This doesn’t seem intuitive to me. Can you come up with an explanation? Why would period 4 + period 39 favor pivot patterns? (Assuming it is not a mistake in my experiment code)

Here’s a sample pair found for period 4 + 80 repeats, if you want to help confirm a sample result.

Pair: 4, S4BR, directions: S W , positions: 24 41 23 58 22 75 21 ,  4, 1cM8, directions: S W , positions: 180 197 179 214 178 231 177 ,  

Cipher untransposed, 80 reps: 
+RL2:3XcT4MJZDf(V
f(#^yC-YF^pt*OWLp
+R|#yJL|lBM(Vkt7p
pB/S1UUU;<t.Nz6pO
BOW_<p+RMlFckp/V%
&5AS+<-H4l+S|zK(O
Bcd+b1G*B1G-_<-2Z
C3Kk|Gp+<zY++R8d&
5+q+.+RG*O#yLXc(V
5c*C.K5C.)+R8F5^:
))c2|llFl*2TKN925
A#|c2EGp.T>8^yzO+
F#kd+t79U;NTKy2BM
JOW+z6WHU7)b8f(Nb
+zOB.+FEkpM9D|;|F
2ZF^4cdF2>jE(z6pT
KOB+PW_HNLzYF^4M>
jB+9V%B)|zOBH4>JY
4cdKV|lP+fM@C5WLD
B+qZDRG+P/S+<-c*R

Cipher at period 4: 
+&)2R5)ZLAcF2S2^:
+|43<lcX-ldcHFFT4
l24l*>M+2jJSTEZ|K
(DzNzfK96((2pVO5T
fBAK(c#O#d|B^+c+y
b2PC1EW-GG_Y*pHFB
.N^1TLpG>zt-8Y*_^
FO<y^W-z4L2OMpZ+>
+CFjR3#B|Kk+#kd9y
|+VJGt%Lp7B|+9)l<
U|Bz;zMYNO(+TBV+K
HkRy4t82>7dBJp&MY
p5J4B+Oc/qWdS++K1
.zVU+6|URWlUGHP;*
U+<O7ft#)M.yb@NL8
CzXf56c(Wp(NLOVbD
B5+BOcz+W*Oq_CBZ<
..DpK+R+5FGRCE+M.
kPl)p/F+MScR9+k8D
<pF|-/5;cV^|*%:FR

Cipher at period 39: 
+A|d4SNp(+EHp_z>|
|7|(RB4S+HfN6OO_p
RlcpV&c4clTzVccWF
G^4+K+BB+yJB+6PtL
cVcCKC)RF^)F3H*Ef
O#+-B>FLCkV|zT4p+
+|;#8(bzB+Ep9||22
<F>ZK5OyG.zO2F+J+
;Bt&OKU*)CWD+ZR+/
+-*RSlFM|9T#bGNt<
Oj#G9zV8Mc1RUMzpB
W<+MFk/%52cT+K6fd
2_^-yMRkt)M+2Y/.W
+.X(5*.5.+85:)^X4
2((B|PY18^p3d%lYK
>pqzl<yfN+O.FkMD;
FZ:-ljD(ABC*TYWZ#
9L<NH75WVUOb5LBqD
GPS<cRL+l2Jz2K^1p
L*-+BypUOkdJdUG7@

Here’s how the pivot pair appears for the sample shuffle at period 39:
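The reported pair can be checked with a small pivot detector. The sketch below assumes a pivot is a run of 4 symbols that appears along two perpendicular directions starting from one shared corner position, in a grid 17 symbols wide; both the arm length and this definition are read off the sample output above and are assumptions, as are all class and method names.

```java
import java.util.ArrayList;
import java.util.List;

class PivotSketch {
    static final int WIDTH = 17; // symbols per row in the listings above
    static final int ARM = 4;    // arm length, taken from the sample output (assumption)

    // The "cipher at period 39" sample from the post, rows joined.
    static final String SAMPLE =
        "+A|d4SNp(+EHp_z>|" + "|7|(RB4S+HfN6OO_p" + "RlcpV&c4clTzVccWF"
      + "G^4+K+BB+yJB+6PtL" + "cVcCKC)RF^)F3H*Ef" + "O#+-B>FLCkV|zT4p+"
      + "+|;#8(bzB+Ep9||22" + "<F>ZK5OyG.zO2F+J+" + ";Bt&OKU*)CWD+ZR+/"
      + "+-*RSlFM|9T#bGNt<" + "Oj#G9zV8Mc1RUMzpB" + "W<+MFk/%52cT+K6fd"
      + "2_^-yMRkt)M+2Y/.W" + "+.X(5*.5.+85:)^X4" + "2((B|PY18^p3d%lYK"
      + ">pqzl<yfN+O.FkMD;" + "FZ:-ljD(ABC*TYWZ#" + "9L<NH75WVUOb5LBqD"
      + "GPS<cRL+l2Jz2K^1p" + "L*-+BypUOkdJdUG7@";

    // Returns the ARM symbols starting at pos and stepping by (dRow, dCol),
    // or null if the arm would leave the grid.
    static String arm(String c, int pos, int dRow, int dCol) {
        int rows = c.length() / WIDTH;
        StringBuilder sb = new StringBuilder();
        for (int s = 0; s < ARM; s++) {
            int r = pos / WIDTH + s * dRow;
            int col = pos % WIDTH + s * dCol;
            if (r < 0 || r >= rows || col < 0 || col >= WIDTH) {
                return null;
            }
            sb.append(c.charAt(r * WIDTH + col));
        }
        return sb.toString();
    }

    // Lists pivots as "position:symbols:directions", e.g. "24:S4BR:SW".
    static List<String> findPivots(String c) {
        int[] step = {-1, 1};
        String vertical = "NS";   // row step -1 is north, +1 is south
        String horizontal = "WE"; // column step -1 is west, +1 is east
        List<String> found = new ArrayList<>();
        for (int pos = 0; pos < c.length(); pos++) {
            for (int v = 0; v < 2; v++) {
                String down = arm(c, pos, step[v], 0);
                if (down == null) {
                    continue;
                }
                for (int h = 0; h < 2; h++) {
                    String across = arm(c, pos, 0, step[h]);
                    if (down.equals(across)) {
                        found.add(pos + ":" + down + ":"
                                + vertical.charAt(v) + horizontal.charAt(h));
                    }
                }
            }
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(findPivots(SAMPLE));
    }
}
```

On the sample, the result should include `24:S4BR:SW` and `180:1cM8:SW`, matching the positions and directions reported in the "Pair:" line above.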

http://zodiackillerciphers.com

 
Posted : January 10, 2017 4:18 pm
Jarlve
(@jarlve)
Posts: 2547
Famed Member
 

If you consider the following: there are N configurations in which period A bigram repeats can translate into a period B pivot pair. Then, perhaps N is different for combinations of A and B. That doesn’t help with intuition but these things are mechanically complicated to say the least.

AZdecrypt

 
Posted : January 11, 2017 2:03 am
(@mr-lowe)
Posts: 1197
Noble Member
 

So, with all that you have jointly discussed and learnt, is the general consensus in favour of the pivots being a path of some sort of transposition cipher? A 1 in 240,000 chance of two pivots pointing in the same direction seems like a big statistical yes. Or is it just an anomaly?

 
Posted : January 12, 2017 6:29 am
Page 1 / 2