**XX. Optimization of number of homophonic substitution iterations ( most efficient )**: viewtopic.php?f=81&t=3196&start=1200

While more substitution iterations will increase the solve rate, that may not be better within the same time frame, given that every substitution iteration costs time.

From your tests:

Substitution iterations * hill climber iterations * restarts / solves = substitution iterations per solve

200,000 * 10,000 * 500 / 12 = 83,333,333,333 substitution iterations per solve (83 billion)

400,000 * 10,000 * 500 / 51 = 39,215,686,274 substitution iterations per solve (39 billion) <-- most optimal

600,000 * 10,000 * 500 / 70 = 42,857,142,857 substitution iterations per solve (42 billion)

800,000 * 10,000 * 500 / 80 = 50,000,000,000 substitution iterations per solve (50 billion)

1,000,000 * 10,000 * 500 / 94 = 53,191,489,361 substitution iterations per solve (53 billion)

1,200,000 * 10,000 * 500 / 110 = 54,545,454,545 substitution iterations per solve (54 billion)

1,400,000 * 10,000 * 500 / 130 = 53,846,153,846 substitution iterations per solve (53 billion) <-- possible outlier (upwards)

1,600,000 * 10,000 * 500 / 115 = 69,565,217,391 substitution iterations per solve (69 billion) <-- possible outlier (downwards)

– We could start with a base of 500,000 substitution iterations and slap on another 50,000 for every null & skip considered such that 12 nulls & skips would be tested with 1,100,000 substitution iterations.

– Similarly we could start with a base of 10,000 hill climber iterations and slap on another 2,500 per null & skip so that we end up with 40,000 hill climber iterations at 12 nulls & skips.

– From 10 nulls & skips use 6-grams.

– Temperature 40, shift 0-30%, div 2.
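The cost-per-solve arithmetic and the proposed scaling rule can be sketched in a few lines of Python. This is only an illustration of the numbers quoted in this post; the function and variable names (`cost_per_solve`, `schedule`, `RESULTS`) are made up for the example and are not part of AZdecrypt.

```python
# (substitution iterations, solves) pairs as quoted in the post.
RESULTS = [
    (200_000, 12), (400_000, 51), (600_000, 70), (800_000, 80),
    (1_000_000, 94), (1_200_000, 110), (1_400_000, 130), (1_600_000, 115),
]
HILL_CLIMBER_ITERATIONS = 10_000
RESTARTS = 500


def cost_per_solve(sub_iters, solves,
                   hc_iters=HILL_CLIMBER_ITERATIONS, restarts=RESTARTS):
    """Substitution iterations spent per solve (integer part)."""
    return sub_iters * hc_iters * restarts // solves


def schedule(nulls_and_skips):
    """Proposed settings: a base value plus a fixed increment per null & skip."""
    sub_iters = 500_000 + 50_000 * nulls_and_skips
    hc_iters = 10_000 + 2_500 * nulls_and_skips
    return sub_iters, hc_iters


if __name__ == "__main__":
    for sub_iters, solves in RESULTS:
        print(f"{sub_iters:>9,} -> {cost_per_solve(sub_iters, solves):,}")
    best = min(RESULTS, key=lambda r: cost_per_solve(*r))
    print("cheapest setting:", best[0])
    print("12 nulls & skips:", schedule(12))
```

The lowest cost per solve is the 400,000 setting, which is why the base of 500,000 sits near the efficient end of the range.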

__Links to Test Messages__

**Message with perfect palindromic cycles**. Idea is to create unigram repeat stats similar to the 340: http://www.zodiackillersite.com/viewtop … 6&start=90

**Smokie2017.1**: http://www.zodiackillersite.com/viewtop … &start=140

It is message #31 from Jarlve’s plaintext library, group IV, which has period 15 / 19 and period 39 / 29 repeat stats similar to the 340 regarding differences when reading left right top bottom or right left top bottom. Created during an experiment to figure out if the mirrored period 29 spike is statistically significant.

**Smokie39**: http://www.zodiackillersite.com/viewtop … 006#p52006

34 x 10 inscription rectangle, transcription LRTB except that rows 3 and 18 are transcribed RL instead of LR. One polyphone to simulate the +, and two hand made pivots that create a few additional polyphones but no distortion.

**Smokie44ABC**: http://www.zodiackillersite.com/viewtop … &start=300

All have one row that is mostly null, and about 45 symbols. One null row is close to the top or bottom and should theoretically be easiest to solve. One has a null row about 1/4 of the way up or down from the bottom or top, and should be second easiest to solve. One has a null row near the middle, and should be the most difficult to solve.

**Smokie45ABC**: http://www.zodiackillersite.com/viewtop … &start=300

Same as smokie44, but with about 63 symbols.

**Smokie46ABC**: http://www.zodiackillersite.com/viewtop … &start=310

They are all 1:1 substitutes, or plaintext. All have multiple complete inscription rectangles that are the same size, though not too many. Just a few medium sized rectangles. Since 340 doesn’t have a lot of good divisors, they all have some untransposed text at the end of the message, whether at the top or bottom of the cryptogram. Transcription could be LRTB, RLTB, LRBT or RLBT.

**Smokie47ABC +/- 45 symbols**. Same concept as smokie46: http://www.zodiackillersite.com/viewtop … &start=310

**Smokie48ABC +/- 63 symbols**. Same concept as smokie46 and smokie47: http://www.zodiackillersite.com/viewtop … &start=310

**Smokie49ABC**: http://www.zodiackillersite.com/viewtop … &start=340

Smokie 49ABC is 1:1 substitutes, with one column mostly or all gibberish. The inscription rectangle is complete, but I made them so that they all make up at least 323 symbols. One corner in most cases, or possibly one row, at the top or bottom of the message is not transposed. EDIT: Most of the following messages, including smokie50 and smokie51, are close to 340 with only a few gibberish symbols in one corner.

**Smokie50ABC +/- 45 symbols**. Same concept as smokie49: http://www.zodiackillersite.com/viewtop … &start=340

**Smokie51ABC +/- 63 symbols**. Same concept as smokie49 and smokie50: http://www.zodiackillersite.com/viewtop … &start=340

**Smokie52ABC-54ABC**: viewtopic.php?f=81&t=3196&start=350

Smokie52ABC-54ABC all have three rectangles. Inscribe into rectangle 1, transcribe into rectangle 2, then read off the symbols in rectangle 2 and transcribe into rectangle 3. There are really only two different schemes. With both, rectangle 1 and rectangle 2 have the exact same dimensions. For one scheme it just so happens that transcribing into rectangle 3 makes period 2 look like period 15 / 19, and for the other scheme it just so happens that transcribing into rectangle 3 makes period 3 look like period 15 / 19. But then with mirroring, flipping, and flipping and mirroring, there are eight matrixes to choose from. With all messages, one corner, or maybe even one entire row and a small part of the next row at either the top or bottom are gibberish. Smokie52 ( sorry, no spike at objective period with A and C; sometimes difficult with 1:1 substitutes )

**Smokie55ABC and 56ABC**: http://www.zodiackiller.net/viewtopic.p … 0a7#p52250

Smokie55ABC and 56ABC have 17 x 20 inscription rectangles LRTB, reading the symbols off vertically TBLR, and transcribing into a 17 x 20 transcription rectangle. One polyphone and cycles about the same as the 340. Smokie55ABC is the control group. There are no nulls, so I want to see what happens with the "symbol as null" detector spreadsheet suite. Smokie56ABC have nulls.
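A minimal sketch of the kind of route transposition described for Smokie55ABC and 56ABC: inscribe left to right, top to bottom, then read the symbols off by columns, top to bottom, left to right. It assumes that "17 x 20" means 17 columns by 20 rows, and the function names are hypothetical. It is not the procedure actually used to build the test messages, which also involve homophonic encoding, a polyphone and (for Smokie56) nulls.

```python
def route_transpose(text, columns, rows):
    """Inscribe text LRTB into a columns x rows grid, then read it off TBLR."""
    if len(text) != columns * rows:
        raise ValueError("text must fill the rectangle exactly")
    # Cell (r, c) of the inscription grid holds text[r * columns + c].
    return "".join(text[r * columns + c]
                   for c in range(columns)
                   for r in range(rows))


def route_untranspose(text, columns, rows):
    """Undo route_transpose for the same rectangle dimensions."""
    if len(text) != columns * rows:
        raise ValueError("text must fill the rectangle exactly")
    out = [""] * len(text)
    k = 0
    for c in range(columns):
        for r in range(rows):
            out[r * columns + c] = text[k]
            k += 1
    return "".join(out)


if __name__ == "__main__":
    plain = "ABCDEFGHIJKL"               # 4 columns x 3 rows
    mixed = route_transpose(plain, 4, 3)
    print(mixed)
    print(route_untranspose(mixed, 4, 3))
```

With this route, symbols that were neighbors in a row end up `rows` positions apart in the output, so period 1 plaintext bigrams move to a longer period.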

__Links to related topics in other threads or on other websites__

**1A. Department of the Army Technical Manual published 1955 Discussion of Route Transposition**: https://www.nsa.gov/news-features/decla … 078809.pdf

**1B. War Department Technical Manual TM 11-484**: http://radionerds.com/images/b/bb/TM_11-485.pdf

**1C. FIELD MANUAL NO 34-40-2 HEADQUARTERS DEPARTMENT OF THE ARMY Washington, DC, 13 September 1990**: http://www.umich.edu/~umich/fm-34-40-2/#pdf

**2. Doranchak’s important probability calculations regarding the period 15 / 19 bigram repeats**: http://www.zodiackillersite.com/viewtop … son#p46618

**3. Doranchak’s important probability calculations regarding the pivots**: http://zodiackillerciphers.com/wiki/ind … tle=Pivots

**4. Non-cipher way to hide period 1 bigrams with wildcard symbols ( purple haze message ) pages 5-6 starting here**: viewtopic.php?f=81&t=2617&start=40

**5. Moonrock’s cycle classifications**: http://www.zodiackiller.net/viewtopic.php?f=81&t=3179

**6. List of cryptography websites**: viewtopic.php?f=81&t=2916

**7. Doranchak’s test cipher library**: https://docs.google.com/spreadsheets/d/ … itMiA/edit

**8. Bifid experiments**: http://www.zodiackillersite.com/viewtop … fid#p47397

**9. Dick Tracy Secret Coder toy**: http://sliderulemuseum.com/Manuals/M102 … r_Inst.pdf

**10. Dick Tracy Raisin Bran decoder toy thread**: http://www.zodiackillersite.com/viewtop … Dick+Tracy

**11. Lawrence Secret Code Maker instructions**: http://www.sliderulemuseum.com/Manuals/ … t_1945.pdf

__To Do List__

□ Experiment to find out if merging two symbols, and finding the percentage increase in period x bigram repeats, multiplied by the cycle score, will help to identify true cycles

□ Concise discussion of the relationship between key efficiency, symbol count distribution, bigram repeat diffusion after encoding, inscription rectangle dimensions, generating route transposition tracing maps, and using values from testing of the above to determine if any particular transposition could be what Zodiac did

□ Concise discussion of the + symbol including cycles, representation on the period 15 / 19 bigram list, 24 randomly placed nulls as represented on the period 15 / 19 bigram list, and polyphones in the 408

□ Cross period comparison between period 29 and all other periods; specifically compare to period 30 for evidence of misalignment / skipped symbols at regular intervals in bottom half of message

□ Find out if plaintext direction can be detected with mean probability of bigram repeats scores; work with plaintext first, then encoded plaintext

□ Find out if the plaintext direction can be detected with un-transposed message; continue with experiment here: viewtopic.php?f=81&t=2617&hilit=direction&start=750

□ Find out if the pivots are nulls; delete the pivots, un-transpose and try to solve

□ Finish the cycle fingerprint project which may give some perspective on whether Zodiac changed his key at certain points throughout the message

□ Make some messages trying to match the 340 cycle fingerprint with a cycled key and creating a spike in period n bigram repeats ( any period )

□ Simple way to mask bigram repeats by just switching symbols close to each other in adjacent rows, creating a spike in period 15 / 19 bigram repeats; detection

□ Compare to masking experiment here: viewtopic.php?f=81&t=2617&start=50

□ Look for correlation between period 1 repeat void lower left of message and areas where cycles end; other periods and areas

□ Continue figuring out ways to detect complicated transpositions

□ Find out if a multiple rectangle route transposition can create a period n of period m bigram repeats that would create a coincidence count spike

□ Misunderstood grille cipher without transcription experiment


__Issue Outline__

*( some concepts more likely than others )*

I. Transposition

A. Bigram repeat statistics

1. Period 19 / 39; period 15 / 29

a. Count

b. Scoring methods

i. Probability score

ii. Comparison with random shuffle tests

B. Route Transposition

1. One inscription rectangle

a. One transcription rectangle

b. Multiple transcription rectangles

2. Multiple inscription rectangles

a. One transcription rectangle

b. Multiple transcription rectangles

3. Various inscription routes, including alternating columns or rows

4. Knight’s Tour; other non-linear route transpositions

5. Polyliteral transposition

C. Route Transposition Complications

1. Incomplete Inscription rectangle

2. Other geometric shapes

3. Random transcription skips or nulls, intentional or error

4. Regular / interval transcription skips or nulls

5. Gibberish rows / columns

a. Row / column shuffle tests to detect gibberish

D. Other Classical Transpositions

1. Scytale

2. Anagram small message chunks

a. Multiple inscription shapes ( see above )

b. Rail fence

c. Grille

E. No Transposition

1. Plaintext with high period 15 / 19 repeat counts

2. Key that diffuses period 1 repeat plaintext efficiently and period 15 / 19 plaintext inefficiently

3. Multiple keys used at different positions, some keys with few ciphertext and some keys with many ciphertext

F. Other Transposition Issues

1. Detectable periods

a. Period 1x, 2x, nx

b. Ciphertext pairs shared by multiple periods

i. Incomplete inscription rectangle

2. Period 29 / 39 spike phenomenon

a. Transcription skips or nulls, random or regular

b. Period repeat count distribution changes caused by mirroring and / or flipping

c. Grille cipher

3. Comparing graphs of period 19, 38, 57, etc. and untransposed period 1, 2, 3, etc. to detect transposition scheme

4. Phenomenon of relatively high count period 1-5 repeats after route transposition

II. Substitution

A. Ciphertext Distribution Statistics

1. Low and medium count ciphertext

2. High count cyclic ciphertext

3. High count non-cyclic ciphertext

B. Homophonic Substitution

1. All homophonic substitution

2. Some polyalphabetic symbols that map to multiple plaintext

a. 408 solid triangle

b. Hillclimbing segregation of polyalphabetic symbols

3. Key change at intervals

4. One direction encoding

5. Multiple direction encoding

6. Multiple keys used to encode alternating plaintext, or in multiple parts

C. Cycle Statistics

1. Score Methods

a. Consecutive alternations probability score

b. Percentage of ciphertext that consecutively alternate

c. Ln(consecutive alternations probability score)

2. L=2 versus L=3-5

3. Random shuffle tests

4. Test messages with varying degrees of random symbol selection within ciphertext groups

5. Test messages with 100% random symbol selection within ciphertext groups

6. Hill climbing cycle groups

III. Other Classical Ciphers

A. Detectable periods with inefficient plaintext diffusion

1. Bifid with even plaintext period

B. Detectable periods with efficient plaintext diffusion

1. Vigenere

a. Kasiski examination, coincidence counting and column IoC analysis

C. No detectable periods with efficient plaintext diffusion

1. Ciphers supported by no evidence

IV. Miscellaneous

A. Language

1. English

2. Other

B. Hoax

1. Intentional period 15 / 19 repeats

2. Unintentional period 15 / 19 repeats

3. Intentional cycles

4. Unintentional cycles

Thanks for organizing this thread, smokie. Can you provide a summary of why you think Z340 isn’t transposed prior to encoding? Do you instead think he transposed it after encoding?

I think that Zodiac may have transposed the plaintext before encoding because your shuffle tests show that the spike in period 15 / 19 repeats is difficult to duplicate without transposition, and because of the low count of period 1 repeats. However, you and Jarlve have tried to untranspose the message thousands of different ways, and it will not solve. I am not clear on exactly what all you guys have tried, but I assume it is a fairly broad spectrum of possibilities.

The regional biases of the repeats could be a clue that he used multiple inscription and / or transcription rectangles. The pivots seem to be a clue, and I speculate that if they are not created by the cipher, then it is more likely that he put them there intentionally than if they are random. Your testing shows that two random pivots are highly improbable.

I also suspect that Zodiac may have cycled his key throughout the message, or changed his key at some point in the message, for example at the halfway point where two – symbols appear in columns 1 and 17. I have been thinking about whether this could create a spike in period n repeats, and testing this idea is on my to do list.

I do not think that Zodiac transposed the message after encoding because the message is still very cyclic. On average the message is about 70% to 75% perfect cycles, or in other words, he randomly selected his symbols within homophonic cycles about 25% to 30% of the time. The message is more cyclic at the top, and becomes less cyclic toward the bottom. However, even though there are a lot of two symbol cycles, there are not very many three symbol cycles. The fact that symbols 37, 38 and 41 all land on odd numbered positions, and that they are one of the highest scoring three symbol cycles is very interesting to me.
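One simple way to put a number on "how cyclic" a pair of symbols is would be to pull out only the occurrences of those two symbols and see how often consecutive occurrences alternate. The sketch below does exactly that. It is an assumption for illustration, not the cycle scoring used in this thread, and the name `alternation_fraction` is hypothetical.

```python
def alternation_fraction(cipher, a, b):
    """Fraction of consecutive occurrences of a and b that alternate.

    cipher is any sequence of symbols; a perfect A B A B ... cycle gives 1.0.
    """
    seq = [s for s in cipher if s == a or s == b]
    if len(seq) < 2:
        return 0.0
    alternations = sum(1 for x, y in zip(seq, seq[1:]) if x != y)
    return alternations / (len(seq) - 1)


if __name__ == "__main__":
    print(alternation_fraction("AxByAzBwA", "A", "B"))  # perfectly alternating
    print(alternation_fraction("AxAyBzBw", "A", "B"))   # one switch in three
```

Transposing a message after encoding would scramble the order of occurrences, so a fraction that stays high across many symbol pairs argues against that.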

I do not see any evidence of columnar transposition, as opposed to route transposition. Perhaps I should change the name of this thread.

I believe that somewhere in the message is another statistical fact. There is some evidence, somewhere, that will give another clue as to what he did. The idea is to keep exploring a lot of different ideas until we find it.

EDIT: I would also add that the cryptography books of the era described route transposition with examples of smaller inscription shapes, such as small 5 x 5 squares. Military cryptographers preferred small shapes to help eliminate mistakes. Assuming Zodiac learned cryptography from a book, then he would have seen these examples.

EDIT 2: I do not currently believe that there was a step involving some type of polyalphabetic diffusion. I tried to make Vigenere messages with five letter keywords, hoping to create a period 15 bigram spike, but the plaintext was too diffused before I got to the homophonic substitution. I would generally rule out Vigenere and Playfair, among others like them.

Reading right left top bottom OR left right bottom top:

Spikes at 15, 29 and 110 ( pivot symbols are period 29 ).

*******

EDIT: See doranchak’s shuffle experiment where he compares the mirrored 340 repeat statistics to Higgs Boson statistics:

viewtopic.php?f=81&t=2617&p=46618&hilit=boson#p46618

*******

Reading top bottom left right OR bottom top right left:

Spikes at 41, 2, 5 and 189 ( pivot symbols are period 102, but not as pronounced reading the message this way ).

Reading top bottom right left OR bottom top left right:

Spikes at 39, 2, 98 ( pivot symbols are period 98 ) and maybe 187.

__2. Bigram repeat counts compared to mean probability score of bigram repeats__

The probability score of the bigram repeats for each period, where in a bigram repeat AB AB, the score would be:

LN ( 1 / ( ( ( COUNT OF A / 340 ) * ( COUNT OF B / 340 ) ) ^ 2 ) ).
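As a rough sketch, the repeat count and the mean probability score for one period could be computed like this. Two assumptions are made that the post does not spell out: a bigram that occurs k times at a period counts as k - 1 repeats, and bigrams are not wrapped around the end of the message. The message length is taken from the input rather than fixed at 340, and the function name is hypothetical.

```python
import math
from collections import Counter


def bigram_repeat_stats(cipher, period):
    """Return (repeat count, mean probability score) for period-n bigrams.

    A period-n bigram is the pair (cipher[i], cipher[i + period]).
    Each repeat AB AB scores LN(1 / ((count(A)/N * count(B)/N) ^ 2)).
    """
    n = len(cipher)
    counts = Counter(cipher)
    bigrams = Counter((cipher[i], cipher[i + period])
                      for i in range(n - period))
    scores = []
    for (a, b), k in bigrams.items():
        if k > 1:  # assumption: k occurrences count as k - 1 repeats
            p = (counts[a] / n) * (counts[b] / n)
            scores.extend([math.log(1 / p ** 2)] * (k - 1))
    mean = sum(scores) / len(scores) if scores else 0.0
    return len(scores), mean


if __name__ == "__main__":
    repeats, mean_score = bigram_repeat_stats("ABCDAB", 1)
    print(repeats, round(mean_score, 3))
```

Low-count symbols give a small product and therefore a high score, which is why a repeat made of rare symbols counts for more than one made of common symbols.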

Now, let’s sort things out to see what spikes may be more statistically significant than others by comparing the count of bigram repeats at each period to the mean probability score of the bigram repeats at each period.

340 reading left right top bottom OR right left bottom top. Repeat counts are at top, mean probability score at bottom:

Spikes are at period 64, 19 and 204. The fact that, out of all of these periods, period 19 ranks # 1 in count and # 2 in mean probability score seems very compelling to me. Not only are there more of them, but the symbols are relatively low in count and therefore less likely to arrange themselves into this pattern as compared to all other periods. Even though they do.

Period 64 is very interesting to me also because it is a perfect diagonal and reminds me of the pivots. The score is driven up by symbols 17 and 19. There are only five instances of symbol 17, but four of them are arranged in this pattern with symbol 19 ( the + ). See marked red on left. Another interesting pattern marked blue on right.

I wonder if an arrangement of 5 x 5 inscription squares, written into a 15 x 20 grid, with some type of transposition and then re-drafted into 17 columns, could create period 19 repeats, the period 64 repeats, and the pivots.

EDIT: left right top bottom bigrams, sorted by count on the left, and mean probability score on the right, top twenty. The only other period that ranks in the top twenty for both lists is period 1.

340 reading right left top bottom OR left right bottom top. Repeat counts are at top, mean probability score at bottom:

Spikes appear at periods 196, 223, 209 and 15, among others. Periods 15 and 17 are on the top twenty lists for both repeat counts and mean probability scores.

Period 29 ranks # 107 and period 110 ranks # 203 for mean probability score. Some periods are diffused more than others, so I don’t know how significant the period 29 bigram spike really is.

Period 64 from left right top bottom is now period 72 for right left top bottom. It ranks # 21 for mean probability score.
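The way a period changes under mirroring can be worked out from the 17 column width. A period is some number of rows down plus a column shift; mirroring keeps the rows and reverses the column shift. The sketch below assumes a 17 column grid and only holds for pairs that do not wrap around the edge of a row; the function names are hypothetical.

```python
WIDTH = 17  # the 340 is drafted in 17 columns


def mirrored_period(period, width=WIDTH):
    """Period seen after mirroring each row, for pairs that do not wrap."""
    rows_down = (period + width // 2) // width   # nearest whole number of rows
    col_shift = period - rows_down * width       # negative means shift left
    return rows_down * width - col_shift


def mirror_position(pos, width=WIDTH):
    """Position of a symbol after its row is reversed."""
    row, col = divmod(pos, width)
    return row * width + (width - 1 - col)


if __name__ == "__main__":
    for p in (19, 39, 64):
        print(p, "->", mirrored_period(p))
```

Period 64 is four rows down and four columns left ( 4 x 17 - 4 ), so mirrored it becomes four rows down and four columns right ( 4 x 17 + 4 = 72 ). The same arithmetic pairs period 19 with 15 and period 39 with 29.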

I find it interesting that period 30 ranks # 13, since with route transposition that would be period 2 transposed ( period 15 x 2 ).

EDIT: There are five repeats at period 209, with a possible regional bias.

340 reading top bottom left right ( EDIT: OR bottom top right left ), bigram repeats counts at top and mean probability scores at bottom.

Period 41 is what looks like period 15 and period 19 with the other reading directions. And there is the predicted spike on the mean probability score chart lining up with period 41. The mean probability score chart also has spikes at periods 55, 220, 226 and a massive spike at period 264.

The top twenty comparison shows periods 1, 20 and 41 in both.

EDIT: The file above should have been named 340.top.bottom.left.right.comparison.png

These are the period 264 repeats that cause the massive spike, message drafted into 66 columns making them line up. I can also make them line up with 33 columns and 44 columns, but not in the same row.

Hi smokie,

your thread is very interesting and I like your analytic approach. First I thought that the spikes could be some random noise but then I read David’s posting about the Higgs Boson test. It is always fascinating to look at those sheets and think "there is something…". I am curious what’s next

Regards

Largo

Thanks! I am not giving up yet. I think that if Zodiac transposed his plaintext before encoding, there may be another statistical clue somewhere in the message that can help us figure out what the transposition scheme is. But there is a lot of statistical information and I am trying to figure out another way to analyze it and pinpoint what is significant and what is not. I am working on a to-do list and have a lot of ideas.

Hey smokie, what is route transposition?

A route cipher, what we were working on.

I have been looking at your new program and I really like being able to see the message solved before my eyes in the right side panel. There is a certain amount of satisfaction to see what the program is doing.

I am still working on the 340.

EDIT: Jarlve, you might find this thread interesting:

Hey smokie, I have briefly glanced over the thread before. Can you summarize?

If you open the 340 in AZdecrypt and click on stats -> sequential you will see a statistic called midpoint shift. This measures how much symbols have shifted away from the midpoint of the cipher, which is about position 170.

Midpoint shift:
- Raw: 7703
- Normalized: 0.2665397923875433

At 0.266 it is quite low for the 340 and I believe this softly rules out non-trivial polyalphabetism at the encoding level and also regional bias of cycles.

My hypothesis for the encoding is that Zodiac mainly tried not to repeat symbols in a certain view window and that it is from left-to-right, top-to-bottom. There are a couple of measurements that are suggestive of this. There are only 18 unigram repeats. The 408 in a 17 by 20 grid has 21.

We would expect the 408 to be lower in unigram repeats than the 340 because the raw index of coincidence score is a good indication of the potential for repeats. The raw IoC for the 408 is 2108 and for the 340 it’s 2236. So the question then is, why does the 340 @ 63 symbols have more potential for repeats than an equal part of the 408 @ 54 symbols?

The answer simply is that the 340’s key (unigram symbol frequencies) is less flat and the flatness measurement is available in AZdecrypt under stats -> unigrams.

408, 17 by 20, flatness: 0.854241338112306

340, 17 by 20, flatness: 0.6685691569412501
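For reference, a raw index of coincidence can be sketched as the sum of f * (f - 1) over the symbol frequencies f. That this is the definition AZdecrypt uses is an assumption, though the quoted 2236 for the 340 is consistent with it: 2236 / (340 * 339) is about 0.0194. The flatness measurement is not reproduced here because its formula is not given in the post.

```python
from collections import Counter


def raw_ioc(cipher):
    """Sum of f * (f - 1) over symbol frequencies (assumed definition)."""
    return sum(f * (f - 1) for f in Counter(cipher).values())


def normalized_ioc(cipher):
    """Raw IoC divided by N * (N - 1), the usual index of coincidence."""
    n = len(cipher)
    if n < 2:
        return 0.0
    return raw_ioc(cipher) / (n * (n - 1))


if __name__ == "__main__":
    sample = "AABBBC"
    print(raw_ioc(sample), normalized_ioc(sample))
```

A less flat key concentrates the counts on fewer symbols, which raises this sum even when there are more distinct symbols overall.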

Ok, but then why is the key of the 340 less flat than the 408? The high occurrence of the "+" symbol plays a part in that, but in general the key of the 340 just has a lower flatness. Given the low midpoint shift and no real other strong indications of polyalphabetism (to my knowledge) one could assume that the selection of symbols was just more random (no extensive cycling). Which follows the logic of the hypothesis and observation.

I could suggest that the size of his view window in which he tried not to repeat symbols probably was around 17, since we find the highest amount of unique strings of that length in the 340.

Sequence frequencies:
--------------------------------------------------
Length 1: 5
Length 2: 6
Length 3: 8
Length 4: 9
Length 5: 11
Length 6: 11
Length 7: 14
Length 8: 15
Length 9: 15
Length 10: 18
Length 11: 19
Length 12: 20
Length 13: 20
Length 14: 21
Length 15: 22
Length 16: 23
Length 17: 26 <---
Length 18: 20
Length 19: 15
Length 20: 9
Length 21: 9
Length 22: 8
Length 23: 5
Length 24: 3
Length 25: 3
Length 26: 2
Length 27: 1
Length 28: 1
Length 29: 1
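The listed counts add up to 340, which fits one plausible reading of this statistic: for every starting position, take the longest run of symbols from that position that contains no repeated symbol, then tally the run lengths. The sketch below implements that reading. It is an assumption about what AZdecrypt reports, not its actual code, and the function name is hypothetical.

```python
from collections import Counter


def nonrepeat_run_lengths(cipher):
    """Histogram of the longest repeat-free run starting at each position."""
    n = len(cipher)
    hist = Counter()
    for start in range(n):
        seen = set()
        end = start
        # Extend the run until a symbol repeats or the message ends.
        while end < n and cipher[end] not in seen:
            seen.add(cipher[end])
            end += 1
        hist[end - start] += 1
    return hist


if __name__ == "__main__":
    hist = nonrepeat_run_lengths("ABCA")
    for length in sorted(hist):
        print(f"Length {length}: {hist[length]}")
```

A peak near one row width, as at length 17 above, is what one would expect if symbols were chosen to avoid repeating within roughly the last row written.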

Of course, possibly, other things may be at work but I like to keep it simple.