Zodiac Discussion Forum

Cycle spectrum and its implications

Jarlve
(@jarlve)
Posts: 2547
Famed Member
Topic starter
 

EDIT: I updated the terminology.

In doranchak’s Let’s Crack Zodiac – Episode 2 at around 10:12 he shows that simply rearranging the Z340 by placing the symbols in order produces a text with 215 bigram repeats.

I suggest that there exists a cycle spectrum. On the left end sits doranchak’s example, or this simpler perfectly anti-cyclic example: "AAABBBCCC". On the right end sits a perfectly cyclic example: "ABCABCABC". In between are ALL other possible rearrangements of these symbols. In other words, the spectrum goes gradually from "perfectly anti-cyclic" to "perfectly cyclic".

Both extremes, "AAABBBCCC" and "ABCABCABC", produce many bigram repeats, while the distributions in the middle of the spectrum should have the fewest. This spectrum relates to language, which has both "anti-cyclic" and "cyclic" features, such as double letters and repeating elements, which in turn raise bigram repeats above the average.
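To make the comparison concrete, here is a small Python sketch that counts bigram repeats for the two toy strings. It assumes that "bigram repeats" means every occurrence of a bigram beyond its first one (total bigrams minus distinct bigrams); that definition is my reading and is not spelled out in this post. The function name `bigram_repeats` is made up for illustration.

```python
from collections import Counter

def bigram_repeats(text):
    """Count bigram occurrences beyond the first of each distinct bigram.

    Assumed definition: total bigrams minus distinct bigrams.
    """
    counts = Counter(text[i:i + 2] for i in range(len(text) - 1))
    return sum(n - 1 for n in counts.values())

# The two ends of the spectrum from the post, plus one arbitrary rearrangement.
for sample in ("AAABBBCCC", "ABCABCABC", "ABACBCCBA"):
    print(sample, bigram_repeats(sample))
```

Under this definition the anti-cyclic string scores 3, the cyclic string scores 5, and the arbitrary rearrangement scores 2, so both ends sit above the middle and the cyclic end is highest.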

If "perfectly anti-cyclic" is applied to the Z340, as doranchak did, you get the following text with 215 bigram repeats:

+++++++++++++++++
+++++++BBBBBBBBBB
BBppppppppppp||||
||||||OOOOOOOOOOc
cccccccccFFFFFFFF
FF222222222zzzzzz
zzzRRRRRRRRllllll
l(((((((KKKKKKKMM
MMMMM5555555^^^^^
^VVVVVVLLLLLLGGGG
GGWWWWWW......<<<
<<<******444444kk
kkkTTTTTdddddNNNN
N#####)))))yyyyyU
UUUU-----CCCCCHHH
H>>>>DDDDYYYYffff
ZZZZJJJJSSSS88889
999ttttEEEPPP1117
77___///;;;bbb666
%%::33jj&&qqXXAA@

And if "perfectly cyclic" is applied to the Z340 you get the following text with 264 bigram repeats:

+Bp|OcF2zRl(KM5^V
LGW.<*4kTdN#)yU-C
H>DYfZJS89tEP17_/
;b6%:3j&qXA@+Bp|O
cF2zRl(KM5^VLGW.<
*4kTdN#)yU-CH>DYf
ZJS89tEP17_/;b6%:
3j&qXA+Bp|OcF2zRl
(KM5^VLGW.<*4kTdN
#)yU-CH>DYfZJS89t
EP17_/;b6+Bp|OcF2
zRl(KM5^VLGW.<*4k
TdN#)yU-CH>DYfZJS
89t+Bp|OcF2zRl(KM
5^VLGW.<*4kTdN#)y
U-C+Bp|OcF2zRl(KM
5^VLGW.<*4+Bp|OcF
2zRl(KM5+Bp|OcF2z
R+Bp|OcF2z+Bp|OcF
+Bp+B++++++++++++

While both ends of the spectrum generate texts with many bigram repeats, the "perfectly cyclic" end has the most, and should have the most bigram repeats of all distributions.
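The two grids above can be rebuilt from any text with a few lines of Python. This is only a sketch of how I read the construction: symbols are ordered from most to least frequent, the anti-cyclic end writes each symbol's full run in turn, and the cyclic end deals the symbols out round-robin until each one is used up. The function names are hypothetical, and the tie-breaking between equally frequent symbols (first seen wins) is an assumption.

```python
from collections import Counter

def perfectly_anti_cyclic(text):
    """Group identical symbols together, most frequent symbol first."""
    return "".join(sym * n for sym, n in Counter(text).most_common())

def perfectly_cyclic(text):
    """Deal symbols round-robin in frequency order until all are used up."""
    # dict keeps the most_common() order, so each pass walks symbols by rank
    remaining = dict(Counter(text).most_common())
    out = []
    while remaining:
        for sym in list(remaining):  # copy: we delete exhausted symbols
            out.append(sym)
            remaining[sym] -= 1
            if remaining[sym] == 0:
                del remaining[sym]
    return "".join(out)

print(perfectly_anti_cyclic("AABBBC"))  # BBBAAC
print(perfectly_cyclic("AABBBC"))       # BACBAB
```

Both functions only rearrange the input, so symbol frequencies are unchanged; only the position on the spectrum moves.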

When examined, both ends of the spectrum have many interesting properties:

– "Perfectly anti-cyclic" has a cycle score of 0. Anti-cyclic.

– "Perfectly cyclic" has a cycle score of 8429. Highest possible cycle score.

– After applying a period 6 untransposition to "perfectly anti-cyclic" the cycle score increases to 4246. Twice that of the Z340!

– After applying a period 63 untransposition to "perfectly cyclic" the cycle score decreases to 324. Anti-cyclic.

– In short, both "perfectly anti-cyclic" and "perfectly cyclic" can be shifted/transmuted a very large distance in the spectrum with a periodical transposition, whereas a distribution that sits in the middle of the spectrum would probably be very hard to shift.
Thus it is expected that ciphers like the Z340 and Z408, which are both "more cyclic", have more periodical "outliers" in measurements that relate to the spectrum!

– After applying a period 63 untransposition to "perfectly cyclic" the text has 62 doublets.
I suggest that this is why the Z340 has a Kasiski spike of 18 at offset 78, which is identical to having 18 doublets at period 78: the above observation acts as a mechanism that allows this to happen.
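For readers who want to reproduce the doublet and Kasiski counts, a minimal Python sketch follows. It assumes the Kasiski count at a given offset is simply the number of positions where a symbol equals the symbol that many places later, so that offset 1 counts ordinary doublets. The name `repeats_at_offset` is hypothetical, and AZdecrypt's own counting may treat the text boundaries differently.

```python
def repeats_at_offset(text, offset):
    """Count positions i where text[i] equals text[i + offset].

    offset=1 counts ordinary doublets; larger offsets give the
    unigram Kasiski count at that shift (assumed definition).
    """
    return sum(1 for i in range(len(text) - offset)
               if text[i] == text[i + offset])

print(repeats_at_offset("AAABBBCCC", 1))  # 6 doublets at the anti-cyclic end
print(repeats_at_offset("ABCABCABC", 1))  # 0 doublets at the cyclic end
print(repeats_at_offset("ABCABCABC", 3))  # 6 repeats at offset 3
```

The cyclic toy string has no doublets at offset 1 but as many as possible at its cycle length, which is the same mechanism proposed here for the spike at offset 78.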

Takeaway: we can definitely expect ciphers with a proper "sequential homophonic substitution" layer to behave like a coincidence generator by default with respect to some measurements. And we should be extra careful when comparing them to shuffles.

AZdecrypt

 
Posted : May 18, 2020 9:00 pm
(@tlaz444)
Posts: 13
Active Member
 

Sorry, just so I am following this correctly: ciphers that are more cyclic are more susceptible to periodic transpositions that could significantly alter the cycle score?
And so then, increasing doublets would be a byproduct of this shift in the cyclic spectrum, which is why we see a Kasiski spike at a period of 78?

 
Posted : May 19, 2020 12:30 am
Jarlve
(@jarlve)
Posts: 2547
Famed Member
Topic starter
 

Hey tlaz444, I think you follow correctly.

I suggest a spectrum from "perfectly non-cyclic" to "perfectly cyclic". From "AAABBBCCC" to "ABCABCABC".

Observe that these two ends are "transmutable" by periodic transposition: "AAABBBCCC" period 3 transposed becomes "ABCABCABC", and period 3 untransposed brings it back.
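The period 3 example can be checked with a short Python sketch. It assumes that "period p transposed" means reading every p-th symbol starting from offset 0, then from offset 1, and so on, and that "untransposed" is the exact inverse. The function names are hypothetical and the convention used inside AZdecrypt may differ in detail.

```python
def transpose(text, period):
    """Read every period-th symbol, starting at offsets 0, 1, ..., period-1."""
    return "".join(text[offset::period] for offset in range(period))

def untranspose(text, period):
    """Invert transpose(): put each symbol back at its original index."""
    # order[k] is the original index of the k-th symbol of the transposed text
    order = [i for offset in range(period)
             for i in range(offset, len(text), period)]
    out = [""] * len(text)
    for k, src in enumerate(order):
        out[src] = text[k]
    return "".join(out)

print(transpose("AAABBBCCC", 3))    # ABCABCABC
print(untranspose("ABCABCABC", 3))  # AAABBBCCC
```

Because `untranspose` is built as the inverse permutation, the round trip returns the original text for any length and period, not only when the length is a multiple of the period.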

Of course this all makes perfect sense. And I think that the "clever" part is that it also applies to ciphers which have a proper "sequential homophonic substitution" layer or similar, and that periodic transposition can "transmute" such a cipher heavily towards the non-cyclic spectrum, in turn promoting the formation of many doublets. That then becomes the most likely explanation for the Kasiski spike in the Z340 and Z408.

AZdecrypt

 
Posted : May 19, 2020 9:31 pm
doranchak
(@doranchak)
Posts: 2614
Member Admin
 

Very interesting analysis, Jarlve. Thanks for posting it. It makes me wonder, why does Z408 have max bigram repeats only at period 1? It has very cyclic behavior. But perhaps it does have some periodic outliers (doublets and/or bigrams) that I’m not recalling at the moment.

EDIT: it does: http://zodiackillerciphers.com/wiki/ind … xamination

A Kasiski examination performed on unigrams in Z408 reveals a spike of 18 repeats at shift of 49
Among 1,000,000 random shuffles, only 2.2% of them had a spike as good or better as the one observed in Z408
The 408 at period 4 transposed or period 102 untransposed has an even more significant (than Z340’s) doublet peak.

So: Is there a way to subtract out, or normalize, the effect of cycles on periodic repeats, in order to distinguish between coincidence generators and evidence of real transposition schemes?

http://zodiackillerciphers.com

 
Posted : May 20, 2020 1:39 pm
Jarlve
(@jarlve)
Posts: 2547
Famed Member
Topic starter
 

First of all I would say that a doublet falls in the non-cyclic spectrum since "ABCABCABC" can never produce one. And vice versa, bigram repeats fall (more) in the cyclic spectrum as "AABBCC" cannot produce them. Of course "AAAABBBBCCCC" can, but the cyclic string will have more.

Since "cycles on periodic transposition" act like a non-cycle (doublets) generator I am not so worried about the validity of periodic bigrams.

Say that our cipher would be non-cyclic (below average) then periodic transposition would act as a cycle (bigrams) generator.

So we do not need to worry about periodic bigrams in the Z340. But we do need to worry about tests like the Kasiski.

AZdecrypt

 
Posted : May 20, 2020 7:07 pm
Jarlve
(@jarlve)
Posts: 2547
Famed Member
Topic starter
 

Here’s a measurement I came up with that allows us to capture the "cycle spectrum" from 0 to 1 such that the average shuffle will score 0.5, perfectly anti-cyclic will score 0 and perfectly cyclic will score 1. In that way it should be safe to use in conjunction with the standard deviation. It is probably unsafe to use as a fitness measurement in hill-climbers because it only captures 2-symbol cycles whose two symbols have similar frequencies (counts that are equal or differ by 1). Allowing all 2-symbol cycles to be measured will make it unsafe for use with the standard deviation. It is also possibly unsafe with very short texts.

Even in ciphers with "perfect sequential homophonic cycles" the measurement will never get to 1 because it is still constrained by the plain text underneath.

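function m_2cyclespectrum(nba() as long,byval l as integer,byval s as integer) as double
	
	'nba(1 to l): cipher as symbol numbers 1 to s, l: text length, s: number of symbols
	'safe for use with the standard deviation
	'unsafe as fitness measurement in hill-climbers
	'unsafe with very short texts
	
	dim as short i,j,k,p1,p2,a,b,d,g,cl,fm
	dim as short frq(s)
	dim as double score,maxscore,al
	'count symbol frequencies and find the highest frequency
	for i=1 to l
		frq(nba(i))+=1
	next i
	for i=1 to s
		if frq(i)>fm then fm=frq(i)
	next i
	'map(sym,0)=count, map(sym,1 to count)=positions of sym, then l+1 as end marker
	dim as short map(s,fm+1)
	for i=1 to l
		map(nba(i),0)+=1
		map(nba(i),map(nba(i),0))=i
	next i
	for i=1 to s
		map(i,map(i,0)+1)=l+1
	next i
	dim as short z(fm*2)
	for i=1 to s
		for j=i+1 to s
			'only pairs of repeating symbols with counts equal or differing by 1
			if frq(i)>1 andalso frq(j)>1 then
				if abs(frq(i)-frq(j))<2 then
					p1=1
					p2=1
					al=0
					cl=0
					'merge both position lists in text order, z(k)=0 for symbol i, 1 for symbol j
					do
						a=map(i,p1)
						b=map(j,p2)
						if a<b then
							d=a
							p1+=1
							g=0
						else
							d=b
							p2+=1
							g=1
						end if
						if d=l+1 then exit do 'both lists exhausted
						cl+=1
						z(cl)=g
					loop
					'count alternations between the two symbols
					for k=1 to cl-1
						if z(k)<>z(k+1) then al+=1
					next k
					al-=1 'a fully grouped pair has 1 alternation, which scores 0
					score+=al
					maxscore+=cl-2 'a fully alternating pair has cl-1 alternations
				end if
			end if
		next j
	next i
	'note: if no symbol pair qualifies, maxscore is 0 and the result is undefined
	return (score/maxscore)

end function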
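For those who do not use FreeBASIC, here is a rough Python port of the same idea. It is a sketch based on my reading of the function above, not a drop-in replacement: it takes a string instead of a numbered array, the name `cycle_spectrum` is hypothetical, and returning `None` when no symbol pair qualifies is my own choice rather than something the original does.

```python
from collections import defaultdict
from itertools import combinations

def cycle_spectrum(text):
    """Score 2-symbol cycles from 0 (anti-cyclic) to 1 (cyclic)."""
    positions = defaultdict(list)
    for idx, sym in enumerate(text):
        positions[sym].append(idx)

    score = 0
    max_score = 0
    for a, b in combinations(positions, 2):
        pa, pb = positions[a], positions[b]
        # Only repeating symbols whose counts are equal or differ by 1.
        if len(pa) < 2 or len(pb) < 2 or abs(len(pa) - len(pb)) > 1:
            continue
        # Merge both position lists in text order, keeping only the labels.
        labels = [g for _, g in sorted([(i, 0) for i in pa] +
                                       [(i, 1) for i in pb])]
        alternations = sum(1 for x, y in zip(labels, labels[1:]) if x != y)
        score += alternations - 1      # fully grouped pair scores 0
        max_score += len(labels) - 2   # fully alternating pair scores the max
    if max_score == 0:
        return None  # assumption: undefined when no pair qualifies
    return score / max_score

print(cycle_spectrum("AAABBBCCC"))  # 0.0
print(cycle_spectrum("ABCABCABC"))  # 1.0
```

The two toy strings land exactly on the ends of the scale, which is a quick sanity check before running it on a real cipher.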

AZdecrypt

 
Posted : May 23, 2020 6:21 pm
Jarlve
(@jarlve)
Posts: 2547
Famed Member
Topic starter
 

Armed with the standard deviation safe "cycle spectrum" measurement we can now do some cool stuff like estimating odds without having to go through millions or billions of calculations.

Z340: cycle spectrum = 0.6072 (5.17 standard deviations above the mean, odds are about 1 in 2,000,000)

Z408: cycle spectrum = 0.6487 (8.15 standard deviations above the mean, odds much greater than 1 in 390,682,215,445)

W.B.Tyler 2: cycle spectrum = 0.6818 (13.22 standard deviations above the mean, odds are astronomical)
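For reference, a number of standard deviations can be turned into odds with the normal distribution. The Python sketch below assumes the shuffle scores are normally distributed and uses the two-sided tail (a deviation at least that large in either direction). The function name is hypothetical. The quoted 1 in 390,682,215,445 appears to match the two-sided value at a whole 7 standard deviations, which would explain the "much greater than" for 8.15.

```python
import math

def two_sided_odds(z):
    """Return N such that a normal deviate is at least |z| standard
    deviations from the mean about 1 time in N (two-sided tail)."""
    p = math.erfc(abs(z) / math.sqrt(2))
    return 1.0 / p

for z in (5, 5.17, 7, 8.15):
    print(f"{z} sd: about 1 in {two_sided_odds(z):,.0f}")
```

If only deviations above the mean are of interest, the one-sided tail halves the probability and so doubles the odds figure.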

AZdecrypt

 
Posted : May 23, 2020 6:35 pm
doranchak
(@doranchak)
Posts: 2614
Member Admin
 

Can you explain what you mean by being safe to use with standard deviation?

http://zodiackillerciphers.com

 
Posted : May 23, 2020 10:21 pm
Jarlve
(@jarlve)
Posts: 2547
Famed Member
Topic starter
 

doranchak wrote: "Can you explain what you mean by being safe to use with standard deviation?"

From Wikipedia:

In a certain sense, the standard deviation is a "natural" measure of statistical dispersion if the center of the data is measured about the mean.

In my measurement the center is at 0.5. The mean of shuffles will be very close to 0.5, with a possible range from 0 to 1. There may be a slight skew towards the anti-cyclic spectrum though.

[Image: normal distribution, bell curve]

If the distribution is skewed then the standard deviation becomes inaccurate, in that it will not properly convert to odds.

AZdecrypt

 
Posted : May 24, 2020 12:30 pm
doranchak
(@doranchak)
Posts: 2614
Member Admin
 

Gotcha. So, the cycle spectrum measurement passes the normality test. Thanks for the explanation!

http://zodiackillerciphers.com

 
Posted : May 24, 2020 12:58 pm
Jarlve
(@jarlve)
Posts: 2547
Famed Member
Topic starter
 

doranchak wrote: "Gotcha. So, the cycle spectrum measurement passes the normality test. Thanks for the explanation!"

Oh, there is a name for that. 8-)

Here’s a "cycle spectrum" roll-out from 10,000 Z340 shuffles:

Standard deviation distribution counts:
---------------------------------------------------------
+4: 0
+3: 12
+2: 225
+1: 1342
+0: 3390
=0: 0
-0: 3469
-1: 1346
-2: 205
-3: 11
-4: 0

And if I use my old "2-symbol cycles" measurement observe that there is a skew towards the positive values:

Standard deviation distribution counts:
---------------------------------------------------------
+5: 0
+4: 1
+3: 34
+2: 265
+1: 1256
+0: 3264
=0: 0
-0: 3603
-1: 1424
-2: 147
-3: 6
-4: 0
-5: 0
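Tables like the two above can be produced from any list of shuffle scores. The Python sketch below is a guess at the bucketing from the row labels: each score is converted to standard deviations from the mean and truncated towards zero, so "+0" holds scores between 0 and 1 standard deviations above the mean and "=0" holds scores exactly on the mean. The function name and the use of the population standard deviation are assumptions.

```python
import statistics
from collections import Counter

def sd_bucket_counts(values):
    """Bucket values by whole standard deviations from their mean."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)  # assumption: population standard deviation
    counts = Counter()
    for v in values:
        z = (v - mean) / sd  # raises ZeroDivisionError if all values are equal
        if z == 0:
            counts["=0"] += 1
        else:
            sign = "+" if z > 0 else "-"
            counts[f"{sign}{int(abs(z))}"] += 1
    return dict(counts)

print(sd_bucket_counts([1, 2, 3, 4, 5]))
```

A roughly symmetric table, with similar counts in each "+n" and "-n" row, is what one would hope to see from a measurement that is safe to use with the standard deviation.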

AZdecrypt

 
Posted : May 24, 2020 5:46 pm
doranchak
(@doranchak)
Posts: 2614
Member Admin
 

I’m curious if a more formal measurement of normality would be useful. For example, the Kolmogorov-Smirnov test (K-S) and Shapiro-Wilk (S-W) test could be used.

Here’s an online version of the K-S test: https://www.socscistatistics.com/tests/ … fault.aspx

http://zodiackillerciphers.com

 
Posted : May 24, 2020 7:39 pm
Jarlve
(@jarlve)
Posts: 2547
Famed Member
Topic starter
 

It passed. Though, as I mentioned, there is a very slight skew towards the anti-cyclic spectrum.

In case you are wondering, I multiplied the numbers by 10,000, so the values range from 0 to 10,000 rather than from 0 to 1, and the median should be close to 5000.

AZdecrypt

 
Posted : May 25, 2020 9:34 am