Zodiac Discussion Forum


Homophonic substitution

Jarlve
(@jarlve)
Posts: 2547
Famed Member
Topic starter
 

I think that what I have learned is that the period 19 repeats involving the + symbol may not be quite as significant. They may not be the ones to study more closely.

There seems to be a tendency for high frequency symbols to carry a lot of repeating fragments, perhaps especially if these are 1:1 substitutes in the encoding? I’ve made some test ciphers some time ago to see if it was uncommon for the 340 to have so many repeats tied to the "+" symbol and concluded that it is normal (but somewhat counterintuitive).

Your smokie18e surprised me.

AZdecrypt

 
Posted : December 15, 2015 1:44 am
smokie treats
(@smokie-treats)
Posts: 1626
Noble Member
 

Your smokie18e surprised me.

Well, I took the message from your library that had the highest ratio of period 19 to period 1 repeats. Then I had to really work at making a message that had juxtaposed period 1 and period 19 stats. There are more high count symbols than the 340, the difference between period 1 and period 19 repeats still isn’t quite as high as that with the 340, and the repeat probability scores are still lower than the 340 across the board.

On the other hand, it is pretty easy to make a transposition message that has stats comparable to the 340. I don’t think that I disproved the transposition idea. I didn’t confirm it exactly either. I think that smokie18e leaves a small window of doubt as to whether the 340 is a transposition cipher. At some point in the future you may want to consider writing a program that randomly selects messages from your library and populates the key with certain variables that you can adjust so that the distribution of symbol counts is similar to the 340, etc. Do that 1 million times and then see what happens. Based on smokie18e, I am guessing that the 340 will be a statistical outlier, as you would say.

I think that random shuffling a message a lot of times is a valuable tool for making statistical judgments, but making messages to simulate 340 stats is also another good tool in our toolbox. They have slightly different uses, depending on the situation.

I wouldn’t stop working on any transposition analysis based on smokie18e at this point, and still think that the 340 is almost certainly a transposition cipher with skipped or added symbols, a gibberish row, a handful of polyalphabetic symbols ( there is some evidence of this based on cycle scores ), or even multiple keys. I am very interested in exploring how to make a simple cipher that causes all of the improbable 340 stats, including the cycle scores in conjunction with the odd even phenomenon and prime phobia. Perhaps random shuffling isn’t the only way to make a statistical judgment about prime phobia. Or perhaps one cipher could explain all.

EDIT: Jumping a bit ahead of myself here, I would like to conduct a little poll about whether the 340 is a transposition cipher. I know that there are other readers who do not frequently post, but it would be nice to hear some different opinions. Does the evidence show that at least one of the 340 cipher steps is transposition:

A) Beyond a reasonable doubt ( the highest standard );
B) With clear and convincing evidence;
C) With a preponderance of the evidence ( more likely than not );
D) With probable cause ( probably );
E) With reasonable suspicion;
F) With only a scintilla of evidence; or
G) With no evidence at all?

I vote somewhere between A and B.

 
Posted : December 15, 2015 4:52 am
Jarlve
(@jarlve)
Posts: 2547
Famed Member
Topic starter
 

I vote C or D. Just like to keep my options open, you never know, the 340 is very strange.

Managed to identify the smokie18e correctly (period 1) with a small adjustment to my 5-gram reps measurement. It adds to the significance of the period 19 scheme in the 340.

340:
Period 1: 325
Period 1 (mirrored): 331
Period 15 (mirrored): 490 <—
Period 19: 486

Smokie18e:
Period 1: 574
Period 1 (mirrored): 592 <—
Period 15 (mirrored): 469
Period 19: 383

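' 5-gram reps score: compare every pair of 5-symbol windows, position by position
for i=1 to total_symbols-4
	for j=i+1 to total_symbols-4
		for k=0 to 4
			if cipher(i+k)=cipher(j+k) then c+=1 ' count matching positions
		next k
		if c>1 then score+=10^(c-2) ' 2 matches add 1, 3 add 10, 4 add 100, 5 add 1000
		c=0 ' reset the match counter for the next pair of windows
	next j
next i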
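
For readers who want to try the measurement outside AZdecrypt, here is a minimal Python sketch of the loop above. The function name `fivegram_rep_score` and the `width` parameter are illustrative and not taken from AZdecrypt. It scores the text as given (period 1); for another period the text would presumably be un-transposed first.

```python
def fivegram_rep_score(cipher, width=5):
    """Score partial repeats between every pair of width-symbol windows.

    A pair of windows that agrees in c positions (c > 1) adds 10**(c - 2),
    so a full 5-gram repeat adds 1000 and a 2-position match adds 1.
    """
    n = len(cipher)
    score = 0
    for i in range(n - width + 1):
        for j in range(i + 1, n - width + 1):
            # Count positions at which the two windows hold the same symbol.
            c = sum(1 for k in range(width) if cipher[i + k] == cipher[j + k])
            if c > 1:
                score += 10 ** (c - 2)
    return score


if __name__ == "__main__":
    print(fivegram_rep_score("ABCDEABCDE"))
```

The quadratic pair loop is fine for a 340-symbol message (roughly 56,000 window pairs).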


 
Posted : December 15, 2015 1:52 pm
(@masootz)
Posts: 415
Reputable Member
 

i vote b or c. we have the advantage of knowing information about the cipher creator. that’s not always the case. we also have an example of another cipher he created that was solved, which shows important characteristics. i’d be surprised if the 408 was less than his full effort and it was solved quickly. using the characteristics of the 408 we can assume he tried to make the 340 more complicated in the same way any person who wasn’t an expert at encipherment would – taking what you know and adding a twist. i highly doubt he read books, took a class, etc to gain more knowledge on different techniques. he’s a pretty lazy criminal and the ciphers are more important to him as a mechanism for thumbing his nose at everyone than as things that contain actual useful information. that said, i think the 340 is the 408 plus transposition. that seems like the easiest and most logical next step given what we know. just my 2 cents.

 
Posted : December 15, 2015 5:31 pm
smokie treats
(@smokie-treats)
Posts: 1626
Noble Member
 

Thanks for your two cents, Masootz.

Jarlve, I like your subroutine. I think that I understand. It looks like you scroll through the message and compare the symbols for period 1 through 5. Because period 2 bigrams have a lot of the same letters as period 2 bigrams, and so on creating words, the subroutine seems to work really well. Fascinating and makes me think of other ideas. You may have persuaded me to change my vote to B or C.

 
Posted : December 15, 2015 5:57 pm
(@mr-lowe)
Posts: 1197
Noble Member
 

i vote b or c. we have the advantage of knowing information about the cipher creator. that’s not always the case. we also have an example of another cipher he created that was solved, which shows important characteristics. i’d be surprised if the 408 was less than his full effort and it was solved quickly. using the characteristics of the 408 we can assume he tried to make the 340 more complicated in the same way any person who wasn’t an expert at encipherment would – taking what you know and adding a twist. i highly doubt he read books, took a class, etc to gain more knowledge on different techniques. he’s a pretty lazy criminal and the ciphers are more important to him as a mechanism for thumbing his nose at everyone than as things that contain actual useful information. that said, i think the 340 is the 408 plus transposition. that seems like the easiest and most logical next step given what we know. just my 2 cents.

yep well put masootz.. I think it will be a transposition that was simple to perform. but any transposition will be hard to find. as jarlve so eloquently put to me the other day "welcome to the sea of transposition" which puts it into perspective ..

 
Posted : December 16, 2015 9:40 am
Jarlve
(@jarlve)
Posts: 2547
Famed Member
Topic starter
 

Sorry guys I don’t have so much time at the moment. This is something that doranchak found, a repeating 5-gram on the same line for period 101. I’ve selected all the involved symbols and some other strange patterns occur. I wonder how many randomizations it takes to produce a 5-gram.

340_p101.txt

H(B(E#F)R5zp>+6|p
K9FlqSk^%ydV;#WP2
+<kUN7|c|t1X5BLGF
_TVBYG.cO2z(BdL;*
N|8-p(RC+GlcB2G>(
JFM#fNDOj^H%#fNDO
5pW+2kY_4S.Nbz<Y.
Z*zcOK+V8f@4A)Lt|
B9+Kyd+;:<y+cMBM+
X+b1U+*ZZ:GR4W29(
FC)BELc>#yVzAUH6Z
J45SK-p-+pz|7lc^U
.lV38+z*^BVJK3+(p
OOOpp+7^+<.RFfKBM
2yq_-G9U2M+R+Rcz/
Tt5+jtLdE1||65DCF
Y<PB++pF&bl4TWkMB
/K|pO)82LR<+^c+Fl
)lRWOJC-|z**Wd5cC
TPk4OFMS>.H2+TD&/


 
Posted : December 17, 2015 1:32 pm
smokie treats
(@smokie-treats)
Posts: 1626
Noble Member
 

How did he un-transpose it?

I have also been busy, and I have holiday engagements for the next few weeks. But I am going to do a row shuffle on smokie18e and compare to the 340.

 
Posted : December 17, 2015 1:42 pm
doranchak
(@doranchak)
Posts: 2614
Member Admin
 

I’ve selected all the involved symbols and some other strange patterns occur. I wonder how many randomizations it takes to produce a 5-gram.

Interesting patterns.

I ran a one-million shuffle test on the 340. Result: 180 in 1,000,000 shuffles (0.02%) had at least one period 1 5-gram.
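
A shuffle test of this kind can be sketched in a few lines of Python. This is only an illustration under assumptions: the names `has_repeated_ngram` and `shuffle_test` are made up here, and a "repeated 5-gram" is taken to mean any 5-symbol sequence occurring at least twice at period 1 (overlapping occurrences included), which may differ from the exact definition used for the figures above.

```python
import random


def has_repeated_ngram(symbols, n=5):
    """Return True if any run of n consecutive symbols occurs more than once."""
    seen = set()
    for i in range(len(symbols) - n + 1):
        gram = tuple(symbols[i:i + n])
        if gram in seen:
            return True
        seen.add(gram)
    return False


def shuffle_test(symbols, trials, n=5, seed=0):
    """Count how many random shuffles contain at least one repeated n-gram."""
    rng = random.Random(seed)  # seeded so that a run can be reproduced
    pool = list(symbols)
    hits = 0
    for _ in range(trials):
        rng.shuffle(pool)
        if has_repeated_ngram(pool, n):
            hits += 1
    return hits
```

Calling `shuffle_test(cipher_340, 1_000_000)` on the 340 symbols (the variable name is hypothetical) would give a count to compare against the 180 reported above.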

It made me wonder: Does each period have the same 0.018% chance (rounded to 0.02% above) of having a repeated 5-gram during one particular shuffle? If you consider periods from 1 to 170, is there a 170 × 0.018% = 3.06% chance for a repeated 5-gram to appear at some period? Intuitively, it seems the answer is yes. I’m running another shuffle test, and so far in 10,000 shuffles it found 274 (2.74%) that had a 5-gram at some period from 1 to 170. (UPDATE: I let it run to 100,000 shuffles and 2806 (2.8%) of them had a 5-gram at some period from 1 to 170).
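
As a quick check on the arithmetic, the sketch below compares the simple sum over 170 periods with the chance of at least one hit, assuming (and this is only an assumption) that the periods behave independently and each has the period 1 rate of 180 in 1,000,000.

```python
# Per-period chance of a repeated 5-gram, taken from the period 1 shuffle test.
p_single = 180 / 1_000_000
periods = 170

# Simple sum over periods (an upper bound on the chance of at least one hit).
naive = periods * p_single

# Chance of at least one hit if the 170 periods were independent.
at_least_one = 1 - (1 - p_single) ** periods

print(f"sum over periods:   {naive:.4%}")
print(f"at least one period: {at_least_one:.4%}")
```

Both figures come out near 3%, a little above the 2.8% measured over 100,000 shuffles. That gap may come from periods not all having the same rate, or from sampling noise; the sketch cannot tell which.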

Furthermore, I ran some other measurements on the period 101 untransposed cipher. Compared to the original 340, there are fewer bigrams, fewer repeating fragments, a much weaker homophone cycle response, and more randomness (according to your non-repeat score).

http://zodiackillerciphers.com

 
Posted : December 17, 2015 2:41 pm
doranchak
(@doranchak)
Posts: 2614
Member Admin
 

How did he un-transpose it?

My period=p un-transpose routine works like this:

String result
for (i from 0 to p-1)
   for (j from i to cipher.length-1, step p)
      append char at j to result

(Cipher string array is indexed from 0 to length-1)
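
The same routine as a minimal Python sketch (the function name `untranspose` is illustrative): for each offset from 0 to p-1 it takes every p-th symbol starting at that offset and appends the pieces in order.

```python
def untranspose(cipher, p):
    """Read every p-th symbol starting at offsets 0, 1, ..., p-1 in turn."""
    # cipher[i::p] is the slice starting at index i with step p.
    return "".join(cipher[i::p] for i in range(p))


if __name__ == "__main__":
    # Offsets 0, 1, 2 give "ADG", "BEH", "CF".
    print(untranspose("ABCDEFGH", 3))
```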


 
Posted : December 17, 2015 2:50 pm
smokie treats
(@smokie-treats)
Posts: 1626
Noble Member
 

I performed the row shuffle experiment on smokie18e, and compared the results with smokie16d and the 340. I also calculated the mean and standard deviation ( the sigma symbol ) of the results, which are the numbers at bottom. I was a bit surprised that shuffling several of the rows in smokie18e did not cause more period 19 repeats or higher scores. But there were several rows where shuffling did.

Smokie18e values are higher than those of smokie16d and the 340, which are comparable to each other. Random shuffles of individual rows increase the count of period 18 repeats and scores. The values in the boxes are how the original count and score rank as compared to 30 random shuffles. For example, if the value in a box for row 1 is 1, then after making 30 random shuffles of row 1 the original configuration ranks number 1 as compared to the shuffles.
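
The ranking idea can be sketched in Python as follows. This is a rough illustration, not the spreadsheet used for the results: the function names are made up, the repeat count here is a simple count of repeated bigrams at the given period, rows are assumed to be 17 symbols wide as in the 340, and ties are resolved in favor of the original.

```python
import random
from collections import Counter


def periodic_bigram_repeats(symbols, period):
    """Count repeated bigrams formed by symbols that are `period` apart."""
    counts = Counter(
        (symbols[i], symbols[i + period]) for i in range(len(symbols) - period)
    )
    # Each occurrence of a bigram beyond its first counts as one repeat.
    return sum(c - 1 for c in counts.values() if c > 1)


def row_shuffle_rank(symbols, row, period, width=17, shuffles=30, seed=0):
    """Rank the original message against copies with one row shuffled.

    Rank 1 means no shuffled copy had more repeats than the original.
    """
    rng = random.Random(seed)
    original = periodic_bigram_repeats(symbols, period)
    start, end = row * width, (row + 1) * width
    better = 0
    for _ in range(shuffles):
        trial = list(symbols)
        segment = trial[start:end]
        rng.shuffle(segment)  # shuffle only the chosen row
        trial[start:end] = segment
        if periodic_bigram_repeats(trial, period) > original:
            better += 1
    return better + 1
```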

One more bit of evidence that the 340 is likely to be a transposition cipher.

.

 
Posted : December 17, 2015 3:02 pm
smokie treats
(@smokie-treats)
Posts: 1626
Noble Member
 

.
I had an idea this morning, and it needs some work. But I was thinking about how there are a lot of period 18 bigrams in the first few rows of the 340 that have matching symbols to a lot of period 19 bigrams in the remaining rows of the message (highlighted puke yellow on the left, where symbols 7 and 24 are boxed in red as an example).

One explanation is that the message is about 319-321 plaintext long arranged in 19 rows and he transposed starting in column 17 and worked his way to the left column by column.

Another explanation is that there are two or three skipped plaintext in the first few rows of the message.

But what if the message is really a period 18 transposition scheme and there are a lot of added symbols after the first few rows making it look like a period 19 transposition scheme? A message sender and receiver could agree that certain symbol(s) or a subset of certain symbol(s) are nulls. The receiver would take those symbol(s) out before un-transposing. That may explain some of the symbols that are high count but do not cycle well with other symbols ( the +, B, F, and p ).*

* And if there was a counting scheme to this with more than one null symbol used to hide the counting scheme, then that could cause those symbols to prefer non-prime positions.
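
The receiver's side of that idea might look like the Python sketch below. Everything in it is hypothetical: the function name, the choice of which symbols are nulls, and the use of a simple every-p-th-symbol read for the un-transposition step. It only illustrates "take the nulls out, then un-transpose".

```python
def strip_nulls_then_untranspose(cipher, nulls, period):
    """Drop the agreed null symbols, then read every period-th symbol."""
    # Step 1: remove every symbol that the two parties agreed is a null.
    stripped = "".join(ch for ch in cipher if ch not in nulls)
    # Step 2: un-transpose by reading offsets 0, 1, ..., period-1 in turn.
    return "".join(stripped[i::period] for i in range(period))


if __name__ == "__main__":
    # With "+" as the only null, "A+BCD+EFGH" becomes "ABCDEFGH" before step 2.
    print(strip_nulls_then_untranspose("A+BCD+EFGH", nulls="+", period=3))
```

A scheme where only a subset of a symbol's occurrences are nulls would need an extra rule for picking them, which this sketch does not attempt.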

Add to my to-do list: Make a simple transposition message that is really a period 18 transposition scheme but looks like a period 19 transposition scheme because there are a lot of added high count non-cyclic nulls that largely avoid prime positions because of the routine for adding those nulls.

But before I do that I do have other things to do.
.

 
Posted : December 24, 2015 6:07 pm
smokie treats
(@smokie-treats)
Posts: 1626
Noble Member
 

I made a cool new spreadsheet that I can use to analyze Jarlve’s 100 plaintext message library for plaintext that are prime phobic. I will show results tonight or tomorrow, which suggest that prime phobia may be more statistically significant than previously thought.

 
Posted : December 25, 2015 8:51 pm
(@mr-lowe)
Posts: 1197
Noble Member
 

Looking forward to it.. Can you put it up so we can copy and paste

 
Posted : December 26, 2015 3:52 am
smokie treats
(@smokie-treats)
Posts: 1626
Noble Member
 

Mr. Lowe, I accidentally posted in this thread when I intended to post in the Recipes for primephobia thread. So go there for the latest results, which I will post soon. As far as pasting so that you can copy, I can paste or write some things. But I use Excel and paste onto Microsoft Paint. I think that I know what you are talking about. The other guys post and it is in green letters on a white background. I don’t know how to do that. See you in the other thread.

 
Posted : December 26, 2015 4:55 am