Zodiac Discussion Forum


Homophonic substitution

Jarlve
(@jarlve)
Posts: 2547
Famed Member
Topic starter
 

My own work along these lines is making progress, but it is very slow going. At the moment I am using a multiobjective hillclimber to generate ciphers that target many qualities of the real Z340 (I’ve even recently included the oddity of bigram distribution when all odd- and even-numbered positions are removed).

What do you mean by bigram distribution? If you remove all odd- and even-numbered positions, you have removed the whole cipher. :)

Once I’m happy with the resulting generated ciphers, I’ll rule out the conventional hypothesis ("Z340’s plaintext is written in the normal way, and enciphered using a method similar to Z408"). Then I plan to investigate Dan Olson’s hypothesis.

What do you think of other languages?

Then if I have time I’ll look at some of the other ideas (wildcard symbols, multiple keys, transposition, etc). The difficulty is that once a new scheme is selected, I have to write new code to make adjustments to the encipherment process I use to extract candidate plaintexts from a large corpus. The scheme greatly affects the constraints imposed by the many features I’m looking for in the generated ciphers (pivots, box corners, even/odd discrepancy, filler, spelling errors, transcription errors, trigram that repeats in the same column, prime phobia of top two symbols, etc).

Always enough work on the 340 and we’re making progress!

AZdecrypt

 
Posted : September 14, 2015 5:05 pm
doranchak
(@doranchak)
Posts: 2614
Member Admin
 

My own work along these lines is making progress, but it is very slow going. At the moment I am using a multiobjective hillclimber to generate ciphers that target many qualities of the real Z340 (I’ve even recently included the oddity of bigram distribution when all odd- and even-numbered positions are removed).

What do you mean by bigram distribution? If you remove all odd- and even-numbered positions, you have removed the whole cipher. :)

Hah! Yes, I meant to suggest that each would be considered separately. :)
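
To make the "considered separately" reading concrete, here is a minimal Python sketch: keep only the odd-numbered or only the even-numbered positions, then count repeated bigrams in what remains. The helper names and the toy string are illustrative assumptions, not the code used for the generated ciphers.

```python
from collections import Counter

def split_by_parity(cipher):
    """Return (odd, even) subsequences, numbering positions from 1."""
    odd = cipher[0::2]   # positions 1, 3, 5, ...
    even = cipher[1::2]  # positions 2, 4, 6, ...
    return odd, even

def bigram_repeats(seq):
    """Count repeats: a bigram seen n times contributes n - 1."""
    counts = Counter(zip(seq, seq[1:]))
    return sum(n - 1 for n in counts.values() if n > 1)

# Toy example (assumption): a strictly alternating string.
toy = "ABABABAB"
odd, even = split_by_parity(toy)
whole, odd_only, even_only = (bigram_repeats(s) for s in (toy, odd, even))
```

Comparing `whole` with `odd_only` and `even_only` for a real transcription is the kind of check being described; a large gap between the two halves would be the "oddity".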

Once I’m happy with the resulting generated ciphers, I’ll rule out the conventional hypothesis ("Z340’s plaintext is written in the normal way, and enciphered using a method similar to Z408"). Then I plan to investigate Dan Olson’s hypothesis.

What do you think of other languages?

That is a valid hypothesis but I tend to rank it low on the list of possibilities. I’m not aware of any evidence to suggest Zodiac could have or would have written a message in anything other than English.

Always enough work on the 340 and we’re making progress!

Yes, it has been very interesting following everyone’s progress. Recently it seems the cipher has gotten more active attention from scientifically-minded people. I hope this trend continues.

http://zodiackillerciphers.com

 
Posted : September 14, 2015 6:12 pm
Jarlve
(@jarlve)
Posts: 2547
Famed Member
Topic starter
 

@Smokie,

Thanks a lot for the in-depth information on the smokie8 and m7p77. I haven’t tried to solve the smokie8; these ciphers may be especially hard. I want to focus a bit on identification for now.

I measured a significant spike in odd/even symbol distribution for the smokie8, started comparing it to your other ciphers, and noticed that the smokie1 was about the same. I learned that it was the plaintext causing this; furthermore, the smokie1/plaintext has a "unique" discrepancy for odd/even between the top and bottom halves, and the smokie8 has the same. So I assumed it was the plaintext causing this and not the encoding scheme itself. Just wow, lesson learned I guess. It’s cool, you have been lucky.
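
The exact odd/even measurement is not given in the post, so the following is only a hypothetical stand-in: for each symbol, compare how often it lands on odd versus even positions, sum the absolute differences, and do this for the top and bottom halves separately. The function names are assumptions.

```python
from collections import Counter

def parity_imbalance(cipher):
    """Sum over symbols of |count at odd positions - count at even positions|."""
    odd = Counter(cipher[0::2])
    even = Counter(cipher[1::2])
    return sum(abs(odd[s] - even[s]) for s in set(cipher))

def imbalance_by_half(cipher):
    """Imbalance of the top half and of the bottom half, measured separately."""
    mid = len(cipher) // 2
    return parity_imbalance(cipher[:mid]), parity_imbalance(cipher[mid:])
```

Because the score uses absolute differences, it does not matter whether the bottom half starts on an odd or an even position. Running the same function on the plaintext and on the ciphertext is one way to see whether a spike comes from the message or from the encoding.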

Very lucky! :D

In reinspecting the smokie8 for 2 parts (odd/even) I don’t see a lot of cycling. Some of the steps you described must have added quite a bit of randomization to the cycles. I should add something to my routines to spot cycles that alternate between odd/even/etc. Yes, I’ve inspected various plaintexts with my cycle measurements and they do have this tendency. Just because it’s not random!
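
For readers following along, this is roughly what "cycling" and "randomization" mean in a homophonic cipher. The sketch is an assumption about the general technique, not smokie’s actual procedure: each plaintext letter has a list of homophones used in strict rotation, and `skip_prob` (a hypothetical parameter) occasionally picks a homophone at random, which breaks the cycle.

```python
import random

def encipher(plaintext, key, skip_prob=0.0, seed=0):
    """Homophonic encipherment with sequential homophone cycling.

    key maps each plaintext letter to a list of cipher symbols.
    With skip_prob == 0.0 every letter's homophones appear in perfect cycles.
    """
    rng = random.Random(seed)
    index = {letter: 0 for letter in key}
    out = []
    for ch in plaintext:
        homophones = key[ch]
        if rng.random() < skip_prob:
            # Random pick: this is what disturbs an otherwise perfect cycle.
            choice = rng.choice(homophones)
        else:
            choice = homophones[index[ch] % len(homophones)]
            index[ch] += 1
        out.append(choice)
    return out

# Toy key (assumption): E has three homophones, T has one.
toy_key = {"E": [1, 2, 3], "T": [4]}
toy_cipher = encipher("ETEEE", toy_key)
```

With `skip_prob=0.0` the symbols for E come out as 1, 2, 3, 1 in order; raising `skip_prob` makes that pattern progressively harder to detect.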

Yeah the 340 is amazing, and very uniquely disturbed.

I think you did well on the m7p77, it’s very hard and it’s interesting that this message is identified as an odd/even cipher since it can be considered relatively close to the actual scheme. Consonants have one key, and vowels (including "y") have another key. The even/odd distribution of the plaintext is better than average. You have to excuse me but I lost the specific keys, cycles should have no randomization (but not 100% sure).

If you’re up to it I’d gladly have another cipher. And do you want a new one?

AZdecrypt

 
Posted : September 14, 2015 6:26 pm
smokie treats
(@smokie-treats)
Posts: 1626
Noble Member
 

Jarlve,

I enjoy making messages as much as I do analyzing them. And I like updating my system. I understand that you have made significant strides lately in that department, especially with cycles. I have to leave the office today and may not be able to participate in anything new until tonight, maybe. I might not get home until late.

I was wondering what you think about the 340. I think (today) that he may have alternated two keys because of the MUJ cycle. Can you see any other strong evidence of odd/even keys except for the fractured cycles and the fact that we were able to mimic the cycle stats pretty closely? The MUJ cycle is evidence that Zodiac treated odds differently than he did evens. Otherwise it is an amazing statistical anomaly. And it is funny because we didn’t find it until after getting pretty involved with the project. I think that it was sort of lucky.

By the way, is there any other process that could mimic the MUJ symbols all being on odds? We should talk about alternatives for causing that. What do you think about the 37-38-41 MUJ cycle?

EDIT: I just had an idea and am writing it down before I step out. If there were two keys, the odds key would have 37, 38 and 41 for a letter. What about the symbols for the same letter on the evens key? Check, just to see, if the symbols that are exclusive to the evens cycle with the symbols that are exclusive to the odds.
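
A minimal sketch of the first step of that check, assuming the cipher is held as a list of symbol numbers: list the symbols that only ever occur on odd positions and those that only occur on even positions. The function name and toy data are illustrative.

```python
def parity_exclusive(cipher):
    """Return (odd_only, even_only) symbol sets, numbering positions from 1."""
    odd = set(cipher[0::2])
    even = set(cipher[1::2])
    return odd - even, even - odd

# Toy data (assumption): 37, 38 and 41 sit on odd positions only.
toy = [37, 5, 38, 5, 41, 6]
odd_only, even_only = parity_exclusive(toy)
```

The two resulting sets are the candidates to test against each other for cycling.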

I also think (today) that he may have treated the top and bottom half differently. One thing is that I have been looking at all odds or all evens instead of top half odds and top half evens. I wonder what you think about separating the top and bottom half analysis. What about analyzing only 85 symbols? My cycle numbers are so low that I don’t yet know what to do with them. So I think there is odd-even and top-bottom.

I have to take some time and another look at the bigram repeats. I think (today) that he encoded with two keys and shuffled the bottom half somehow to mask bigram repeats there. Undoubtedly someone has looked for "wanna-be" bigram repeats there. Maybe backwards ones or something like that. I need to look at that subject myself and take some time with it to learn about it and look for possible patterns of disruption. If I don’t do this stuff myself my subconscious just doesn’t work on it as well.

Try snaking at the bottom half by making Row 11 backwards, Row 13 backwards, etc. I am not saying that he did that, but I tried it and it repaired the randomization of MUJ in the bottom half and also increased my overall cycle stats. I have not done anything beyond that but it looked interesting.
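
The snaking step can be sketched as follows, assuming the usual 17-column layout of the 340 (17 x 20 = 340, so rows 11 to 20 are the bottom half). The function name is an assumption; rows are numbered from 1.

```python
def mirror_rows(cipher, rows, width=17):
    """Reverse the given 1-based rows of a cipher laid out in `width` columns."""
    lines = [cipher[i:i + width] for i in range(0, len(cipher), width)]
    for r in rows:
        lines[r - 1] = lines[r - 1][::-1]
    # Flatten back into one sequence of symbols.
    return [symbol for line in lines for symbol in line]

# Rows to mirror for the experiment described above.
SNAKE_ROWS = [11, 13, 15, 17, 19]
```

Usage would be `mirror_rows(z340, SNAKE_ROWS)`, where `z340` is a hypothetical list or string holding the 340 symbols in reading order.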

I have to run. I will make messages for you whenever I can. I learn a lot by doing so. But I need to think about stuff a little bit first. Be back soon.

 
Posted : September 14, 2015 7:53 pm
Jarlve
(@jarlve)
Posts: 2547
Famed Member
Topic starter
 

@smokie,

It’s so hard to say if a cycle is real or not, especially with what we assume is going on. I prefer to look at the big picture and ‘ignore’ the smaller stuff. I’m not so sure about the odd/even scheme for the 340. And if you want to work on the 340 for a while we can do that. Do you have a Google account? If so I’d like to create a Google document/spreadsheet and share editing permissions with you to line up our numbers (just need your Google e-mail or account name).

About 85 characters (1/4th). I think we can work with that, if we keep in mind that smaller datasets are more prone to outliers.

340, 2-symbol cycles:
Normal: 179
Lines 11, 13, 15, 17 and 19 mirrored: 176 (slightly worse)

340, 3-symbol cycles:
Normal: 212
Lines 11, 13, 15, 17 and 19 mirrored: 206 (slightly worse)

My measurement is a bit different than yours, it doesn’t give as much weight to longer perfect cycles as yours.
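
Neither scoring formula is spelled out in this thread, so the following is only a hypothetical stand-in for a 2-symbol cycle measurement: drop everything except the two symbols and count how often they alternate. A perfect ABAB... cycle gets the maximum number of alternations.

```python
def cycle_string(cipher, symbols):
    """Keep only the occurrences of the given symbols, in reading order."""
    keep = set(symbols)
    return [c for c in cipher if c in keep]

def alternations(cipher, a, b):
    """Number of adjacent pairs in the reduced sequence that switch symbol."""
    s = cycle_string(cipher, (a, b))
    return sum(1 for x, y in zip(s, s[1:]) if x != y)
```

A score like this grows linearly with cycle length, so it would not reward long perfect cycles as strongly as a measure that multiplies or squares run lengths.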

AZdecrypt

 
Posted : September 14, 2015 10:49 pm
smokie treats
(@smokie-treats)
Posts: 1626
Noble Member
 

Actually that email that I sent to you with the Excel file that I could not attach via this site’s messaging system is my gmail account.

I thought up an experiment, very simple and straightforward. It will be a small batch of 170 symbol messages. You will know some of the facts, and will have to figure out the answer to one straightforward question about each message. It may take a little time to get it organized. Once I do one, the rest should go pretty quickly. But maybe a couple of days, depending. I will also post my results for snaking the odd rows 11, 13, 15, 17 and 19, and work on making a new spreadsheet that highlights bigram repeats automatically after I paste the message into the spreadsheet.

I understand about the big picture approach. Sometimes I have to look at trees and sometimes I have to look at the forest. Honestly, I almost never look at the original copy; I look at numbers most of the time. Be back soon.

EDIT: O.k., so here is the MUJ cycle before making any changes.

Doranchak shuffled the 340 1618 times and 17 of those found a three symbol cycle ABCABCABC and with all other symbols sitting on either odd or even cycles. 17/1618 = 1.1%. See: viewtopic.php?f=81&t=2617&start=240.

Basically there is about a 1% chance that this is random. The cycles are in the top half, and the randoms are all in the bottom half. So I mirrored Row 11 to make the symbols cycle there, and also mirrored the following odd numbered rows.
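
The 1% figure is simply 17 / 1618. A shuffle test of this kind can be sketched as below; `statistic` is a hypothetical callback standing in for whatever pattern is being looked for, since the exact cycle test is not given here.

```python
import random

def shuffle_fraction(cipher, statistic, observed, trials=1000, seed=0):
    """Fraction of random shuffles whose statistic is at least `observed`.

    `statistic` receives the shuffled cipher as a list of symbols.
    """
    rng = random.Random(seed)
    symbols = list(cipher)
    hits = 0
    for _ in range(trials):
        rng.shuffle(symbols)
        if statistic(symbols) >= observed:
            hits += 1
    return hits / trials

# The figure quoted above: 17 qualifying shuffles out of 1618.
quoted = 17 / 1618
```

With only 1618 shuffles the estimate is coarse, so it is safer to read it as "roughly 1%" than as a precise probability.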

And here are my new cycle stats:

The overall numbers jump dramatically because a couple of cycles change from low scoring to unusually high scoring. And, the interesting thing, is that the odds Part 1 also jumps up. The other numbers change a little. In my next post I will show the highest scoring cycles.

 
Posted : September 15, 2015 2:06 am
smokie treats
(@smokie-treats)
Posts: 1626
Noble Member
 

Here are the top cycles before and after mirroring rows 11, 13, 15, 17, and 19:

The 11 – 36 cycle changed dramatically which affected the scores. The 11 – 36 cycle now has a score of 32770. But m7p77 has a score as high as that so I don’t think at the present moment that this is important.

11 and 36 are not exclusively odd or even in the bottom half. That’s it for this topic for now unless I find something else.

 
Posted : September 15, 2015 4:26 am
Jarlve
(@jarlve)
Posts: 2547
Famed Member
Topic starter
 

I thought up an experiment, very simple and straightforward. It will be a small batch of 170 symbol messages. You will know some of the facts, and will have to figure out the answer to one straightforward question about each message. It may take a little time to get it organized.

Sounds fun! Take all the time you need.

Sometimes I have to look at trees and sometimes I have to look at the forest.

Good analogy. :)

I’ve created a start for the 340 spreadsheet: https://drive.google.com/open?id=1VrAwV … gJ8Rb7sEKs You should have permission to fill in your numbers and if you don’t mind I’d like to create all the necessary fields and maintain the spreadsheet (tell me if you want anything added). I’ve added links to download the transformations of the 340 measured, do you want me to include a numeric version as well? If so, re-numbered by appearance or with the original 340 numbered by appearance kept so you are able to compare cycles more easily?

The spreadsheet may seem ill-formatted on some browsers, I recommend Google Chrome. If anyone wants to contribute to the spreadsheet let me know, you’ll need a Google account/e-mail.

AZdecrypt

 
Posted : September 15, 2015 8:58 pm
smokie treats
(@smokie-treats)
Posts: 1626
Noble Member
 

Jarlve, I updated my spreadsheet so that it highlights bigram repeats both for a whole message and by top half and bottom half. For what it is worth, I did 100 random shuffles of the 340 to see what would happen. I got 8 where there were only one or fewer bigram repeats in one half. One shuffle actually showed zero bigram repeats.
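
For anyone who would rather script this than use a spreadsheet, here is a minimal sketch of the same experiment. It assumes a repeat means a bigram seen n times contributes n - 1, and that the halves are simply the first and last 170 symbols; the function names are illustrative.

```python
import random
from collections import Counter

def bigram_repeats(seq):
    """Count repeats: a bigram seen n times contributes n - 1."""
    counts = Counter(zip(seq, seq[1:]))
    return sum(n - 1 for n in counts.values() if n > 1)

def half_repeats(cipher):
    """Bigram repeats in the top half and in the bottom half."""
    mid = len(cipher) // 2
    return bigram_repeats(cipher[:mid]), bigram_repeats(cipher[mid:])

def sparse_half_shuffles(cipher, trials=100, seed=0):
    """Number of shuffles in which at least one half has one or fewer repeats."""
    rng = random.Random(seed)
    symbols = list(cipher)
    hits = 0
    for _ in range(trials):
        rng.shuffle(symbols)
        if min(half_repeats(symbols)) <= 1:
            hits += 1
    return hits
```

Calling `sparse_half_shuffles(z340, trials=100)` on a transcription (here a hypothetical variable `z340`) would give a number comparable to the 8 out of 100 reported above.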

I am going to work on the google docs file, and then the next batch of messages. I have to update my spreadsheet to do that because the other messages were by hand and I want to do it faster now.

 
Posted : September 16, 2015 3:34 am
Jarlve
(@jarlve)
Posts: 2547
Famed Member
Topic starter
 

Smokie, that’s interesting about the bigrams. Though a randomization of a cipher is not an ideal substitute for this kind of test. The observation is on the back of my queue.

I currently have stronger feelings for the bigram peak at a distance of 19 (count 37; for the mirrored 340 the peak is count 39 at a distance of 15, since mirroring shortens the distance). Assuming it is nothing, a random fluke, how many randomizations of the 340 would it take on average to reproduce this?

I just ran a short test of 1 million randomizations and count 37 appears just 1 time. That’s one in a million! This can no longer be ignored.
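
A minimal sketch of a bigram count at a distance, assuming a bigram at distance d pairs the symbol at position i with the symbol at position i + d (an ordinary bigram is d = 1). The `wrap` flag, which lets pairs run off the end and continue at the start, is an optional assumption and is off by default.

```python
from collections import Counter

def periodic_bigram_repeats(cipher, period, wrap=False):
    """Repeat count for bigrams formed from positions i and i + period."""
    n = len(cipher)
    last = n if wrap else n - period
    counts = Counter(
        (cipher[i], cipher[(i + period) % n]) for i in range(last)
    )
    # A bigram seen c times contributes c - 1 repeats.
    return sum(c - 1 for c in counts.values() if c > 1)
```

Evaluating this for every distance from 1 to 100 gives one curve of the kind plotted here; running it on many shuffles of the same symbols gives the baseline to compare the peak against.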

340 bigrams for all directions up to a distance of 100:
(red=normal, green=mirrored, blue=verticals, pinkish=diagonals)

There is a major peak at distance 19 (15 for the mirrored version), with reflections at 38 and higher for verticals, which is entirely normal. It strangely coincides with the "magic square" on doranchak’s page: http://www.zodiackillerciphers.com/?p=179 because the distance between the numbers that follow happens to be 19. It is something that is in the FBI files, yet there seems to be no reference to it at all. I encoded a message with a similar transposition and (of course) bigrams peaked at 19 in almost the same way, but horizontal was a few counts higher. I’ve tried a couple of un-transposition schemes based on this but haven’t gotten a solve or a high-scoring piece.

AZdecrypt

 
Posted : September 16, 2015 10:25 pm
doranchak
(@doranchak)
Posts: 2614
Member Admin
 

I just ran a short test of 1 million randomizations and count 37 appears just 1 time. That’s one in a million! This can no longer be ignored.

340 bigrams for all directions up to a distance of 100:

Very interesting. When you consider bigrams towards the end of the cipher text, are you allowing the 2nd symbol to "wrap" around to the beginning? If not, I wonder what effect doing so would have on your numbers.

http://zodiackillerciphers.com

 
Posted : September 16, 2015 10:33 pm
doranchak
(@doranchak)
Posts: 2614
Member Admin
 

I ran a similar test and got similar results: [NOTE: My ngrams are allowed to wrap back to the beginning]

Count	Period	Ngrams

37	19	z6(2) #2(3) G+(3) PY(2) BO(2) TB(2) D((2) ^D(2) MF(2) *5(2) N:(2) +B(2) ;+(2) +4(3) 9^(2) (+(3) OF(2) p+(4) Xz(2) .L(2) -R(2) |<(2) YA(2) |T(2) +c(2) +|(2) <S(3) k.(2) +l(2) +k(2) 

32	85	z+(2) #+(2) Fb(2) G2(2) Wc(2) Fp(2) fL(2) cT(3) 5<(2) Sd(2) 5O(2) T+(2) 4B(2) RR(2) (z(2) +C(2) +B(3) Ml(2) ++(2) OE(2) Yp(2) kp(2) J9(2) Z5(2) +c(3) K4(2) +W(2) l+(2) +y(2) 

31	5	2V(2) G#(2) V+(2) G+(2) cK(2) 6l(2) d+(2) B4(2) tB(2) B+(2) 5;(2) BR(2) #Y(2) 4B(2) pO(3) +B(2) +<(2) +2(2) Mc(2) P/(2) 8O(2) /F(2) L+(2) zp(2) +|(2) Zc(2) <B(2) +p(3) +z(2) 

29	54	F.(2) FR(2) 7;(2) 5V(2) c|(2) B+(2) t+(2) SK(2) c:(2) T2(2) <8(2) *c(2) ++(3) 8c(2) 6z(2) p+(2) +L(2) |.(2) +|(3) -B(2) -F(2) .-(3) +k(2) +p(4) 

29	16	z+(3) #+(2) UO(2) Fb(2) G*(2) WH(2) cp(2) BT(3) 4B(2) c+(2) pB(2) +G(2) ^+(3) +5(2) +2(2) 8t(2) OB(2) HR(2) |L(2) +R(2) +P(2) >+(2) |2(2) +|(2) l+(2) +y(2) 

28	97	V+(2) F+(2) F*(2) 1;(2) FO(2) G+(2) RO(2) 5W(2) 4z(2) dR(2) D4(2) CJ(2) ^K(2) p6(2) Kp(2) *+(2) ++(2) (+(2) O^(2) ^c(2) Ol(2) p((2) |+(2) *l(2) +M(2) L)(2) +|(2) +y(2) 

28	39	VL(2) 2f(2) Tt(2) GK(2) H+(2) P|(2) c|(2) 55(2) CB(3) 4l(2) Rb(2) D+(2) c2(2) MF(2) *+(2) ^)(2) +F(2) +B(3) ++(4) |c(2) J.(2) Z+(2) zO(2) ;c(2) 

28	29	y+(2) F5(3) F|(2) GZ(2) S+(3) cK(2) B+(2) 4c(2) RR(2) 5+(2) #z(2) +*(2) M5(2) M2(2) <*(2) )O(2) :R(2) +B(2) Lp(2) +2(2) ++(2) +.(2) .2(2) Yl(2) +U(2) +l(2) 

27	80	z+(2) F#(2) 2c(2) #+(2) V+(2) Wf(2) &+(2) RF(2) B*(2) P^(2) BD(2) 5+(2) M6(2) +((2) N)(2) +<(2) +8(2) +2(3) _y(2) p+(2) G|(2) -V(2) jz(2) |W(2) Zz(2) Kc(2) 

27	74	z6(2) UM(2) 2/(2) S+(2) R+(2) cl(2) B+(3) C-(2) D5(2) R^(2) (|(3) pO(2) *+(2) ^+(2) +<(2) MR(2) +9(2) O+(2) )((2) NO(2) p+(3) .c(2) +W(2) +t(2) 
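
The Count column appears to be the number of repeats, where an n-gram listed with (k) contributes k - 1. That reading is inferred from the rows above rather than stated in the post; the sketch below checks it against the period 19 row, copied verbatim from the table.

```python
import re

def count_from_row(ngram_column):
    """Sum k - 1 over every 'ngram(k)' token in a row of the table."""
    total = 0
    for token in ngram_column.split():
        # The n-gram itself may contain '(' so anchor on the trailing '(k)'.
        match = re.fullmatch(r"(.+)\((\d+)\)", token)
        total += int(match.group(2)) - 1
    return total

period_19 = ("z6(2) #2(3) G+(3) PY(2) BO(2) TB(2) D((2) ^D(2) MF(2) *5(2) "
             "N:(2) +B(2) ;+(2) +4(3) 9^(2) (+(3) OF(2) p+(4) Xz(2) .L(2) "
             "-R(2) |<(2) YA(2) |T(2) +c(2) +|(2) <S(3) k.(2) +l(2) +k(2)")

total_19 = count_from_row(period_19)
```

The row lists 30 distinct bigrams, six of which occur more than twice, and the total matches the 37 in the Count column.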

Trigrams are a different story – BUT, a period 19 does appear:

Count	Period	Ngrams

3	91	*M+(2) M+U(2) /p2(2) 

3	5	<B+(2) P/F(2) ZcK(2) 

2	8	H8R(2) ++P(2) 

2	73	OK+(2) K+|(2) 

2	51	pd2(2) p+l(2) 

2	2	54.(2) O*C(2) 

2	19	Xz6(2) +|T(2) 

2	1	|5F(2) FBc(2) 

1	88	|>O(2) 

1	86	O|O(2) 

1	85	cT+(2) 

1	84	222(2) 

1	74	R^+(2) 

1	72	F+V(2) 

1	71	|c;(2) 

1	7	F+c(2) 

1	64	+++(2) 

1	61	+|+(2) 

1	58	GD+(2) 

1	57	V+2(2) 

1	55	;p;(2) 

1	54	.-F(2) 

1	53	|+z(2) 

1	49	++z(2) 

1	45	(FL(2) 

1	39	++B(2) 

1	36	t2F(2) 

1	33	*+k(2) 

1	28	F*G(2) 

1	25	+K|(2) 

1	24	.+p(2) 

1	21	^|F(2) 

1	18	2+F(2) 

1	14	+p2(2) 

1	13	+F+(2) 

1	11	p+|(2) 

1	10	<+H(2) 

I only got one repeat to appear for quadgrams:

Count	Period	Ngrams

1	73	OK+|(2) 

http://zodiackillerciphers.com

 
Posted : September 16, 2015 10:54 pm
doranchak
(@doranchak)
Posts: 2614
Member Admin
 

Here’s the "p+" bigram occurring 4 different times among the period 19 bigrams:

The other interesting thing about period 19 ngrams is that they cover exactly 20 spots in the grid, which is precisely the number of rows occupied by the cipher. Is there some connection to rows?

http://zodiackillerciphers.com

 
Posted : September 16, 2015 11:12 pm
smokie treats
(@smokie-treats)
Posts: 1626
Noble Member
 

I currently have stronger feelings for the bigram peak at a distance of 19 (count 37 . . . I just ran a short test of 1 million randomizations and count 37 appears just 1 time. That’s one in a million! This can no longer be ignored.

340 bigrams for all directions up to a distance of 100:
(red=normal, green=mirrored, blue=verticals, pinkish=diagonals)

I am very interested and would like to spend some time thinking it over, but not sure exactly what you mean. The y axis is count of 37 bigrams, but what is the x axis? Distance between step (period) 0 bigrams or step (period) x bigrams?

EDIT: Never mind. Doranchak just showed me.

 
Posted : September 16, 2015 11:22 pm
doranchak
(@doranchak)
Posts: 2614
Member Admin
 

Here’s the next best one, occurring 3 times:

I rank it "next best" because "<" and "S" have relatively low frequency in the cipher text.

Also, note how those repetitions create a false cycle.

http://zodiackillerciphers.com

 
Posted : September 16, 2015 11:23 pm