Thanks for your feedback – I will try to incorporate your suggestions.
Skipping backwards skips 10 instead of 20.
OK, fixed.
This seems to produce exactly the same results.
I think they are the same for the first few (since max seems strongly correlated to sum), but diverge at chart #6 when "Max increase" is selected (watch the Max values jump around as you scroll through).
Also, selecting 4-grams and then switching between the two sorting options a couple of times produces strange results.
That is probably due to the fact that changes to repeating 4-grams are rare, so the stats for them aren’t as useful. Let me know if you can point out any specific mistakes.
I don’t understand how these are sorted. Why does 1 of 1953 have a higher sigma than 2 of 1953? Many of the pages after 1 have much higher maximum increases.
It’s just how the math works out for situations where only a few periods produce +1 increases, and a single spike is found to be at +2. I think it highlights the problem with calculating sigma for only a few samples. And, a higher max increase might not have higher sigma because it doesn’t deviate from the mean as much. Also, it’s possible that none of the spikes are sufficiently significant, otherwise they would have appeared at the beginning of the charts (for instance, if a max of 20 was found when the average was 3 or something).
Only untransposed/untransformed periods are included.
Yes, I think I will run it again to generate data for transposed periods as well, since you found that anomaly for those.
Is it possible to find the maximum increase for a period?
OK – I replaced the "show period p spikes first" sort with "sort by period p". When it is selected, the number of increased bigrams at the selected period is displayed below the controls. The max increase for the period will be in the first chart displayed.
Would it be possible to weight increases in repeats in another way that includes the significance of the from-to range? For instance, merging "+B" shows the highest increase with +16 at period 114. But when looking at the original 340, this period has only 15 repeats. So it goes up from 15 to 31, which may be special in its own way, but I wonder if, for instance, going from 30 to 38 would be more statistically significant in terms of rating the period as interesting, considering transposition etc.
That’s a really good point. I’ll have to think about how to approach that. Let me know if you have any suggestions. Right now I’m waiting for full results of a significance experiment based on shuffles, and looking for repeating ngrams from actual merges that diverge significantly from the expected ngrams of merges in shuffles. I was hoping to do 3-symbol merges too but it already takes too long just to run only 100 shuffles per symbol pair and count ngrams for n=2 through 5 across 170 periods (about 20 seconds per symbol group). The number of symbol groups would jump from (63 choose 2 = 1953) to (63 choose 3 = 39711). I need a cloud supercomputer.
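To put the scaling problem in numbers, here is a rough sketch of the arithmetic. The 63 symbols and the 20 seconds per symbol group come from the post above; the function name and the assumption that the per-group time stays the same for 3-symbol merges are mine.

```python
from math import comb

SYMBOLS = 63             # distinct symbols, as stated in the post
SECONDS_PER_GROUP = 20   # rough timing quoted in the post (assumed constant)

def merge_workload(group_size):
    """Return (number of symbol groups, estimated hours) for one merge size."""
    groups = comb(SYMBOLS, group_size)
    hours = groups * SECONDS_PER_GROUP / 3600
    return groups, hours

for k in (2, 3):
    groups, hours = merge_workload(k)
    print(f"{k}-symbol merges: {groups} groups, roughly {hours:.0f} hours")
```

Under those assumptions the 2-symbol run is about half a day, and the 3-symbol run is on the order of nine days on the same machine.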
I also find the results are difficult to interpret. What ASCII symbols are these that create a 5-gram repeat?
They are: %4
doranchak,
Thanks for changing the periods to max increase.
When at 21 of 1953 it won’t skip backwards and when at 1933 of 1953 it won’t skip forward.
And try this, reload your tool and set show data to 4-grams. You will see three spikes to the right side 157, 158 and 159. Now change sort by to "max increase", nothing changes. Now change it back to "sums of increases…" and notice how the spikes change offset to positions 159, 160 and 161. This happens with IE11 but not with Chrome.
I just coded something up for you. A superfast ngramsize 2 to 5 repeat counter. It counts ngramsize repeats in the array in a single pass while building a new array of enumerated couples to iterate ngramsize+1 on, and so forth. It short-circuits when fewer than 2 repeats are found, since that is the minimum requirement for an ngramsize+1 repeat. You may only need to worry about how fast your programming environment can initialize and clear the arrays gram and iden. This is the FreeBASIC code. I had to use a constant to define the array sizes to be able to use the erase keyword: 339, which is the cipher length minus the minimum ngramsize-1 considered.
function m_ngramreps(array()as integer,byval l as integer)as string
    'array()=cipher numbered by appearance from 1 to 63 for example
    'l=cipher length
    dim as integer i,e,a,b,r,n
    dim as string reps
    dim as integer gram(339,339) 'l-1 ,l-1
    dim as integer iden(339,339) 'l-1 ,l-1
    for n=2 to 5 'n-gram range
        for i=1 to l-(n-1) 'single pass
            a=array(i)
            b=array(i+1)
            gram(a,b)+=1
            if iden(a,b)=0 then
                e+=1
                iden(a,b)=e
            end if
            array(i)=iden(a,b)
            if gram(a,b)>1 then r+=1
        next i
        if r>0 then reps+=str(r)+" "
        if r<2 then return reps 'short-circuit
        e=0
        r=0
        erase gram,iden 'clear arrays
    next n
    return reps
end function
EDIT:
Function output for 408.txt:
"62 11 2 "
2-grams: 62
3-grams: 11
4-grams: 2
Function output for p1.txt:
"191 87 41 20 "
2-grams: 191
3-grams: 87
4-grams: 41
5-grams: 20
Works perfectly.
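For anyone not using FreeBASIC, here is a rough Python port of the same idea, as I read it from the code above. It is a sketch, not Jarlve's implementation: the name `ngram_repeats` is made up, dictionaries replace the fixed-size `gram` and `iden` arrays, and it works on a copy instead of overwriting the input array.

```python
def ngram_repeats(cipher, max_n=5):
    """Count repeated n-grams for n = 2..max_n, one pass per n.

    `cipher` is any sequence of hashable symbols. Returns a list of repeat
    counts starting at n = 2, stopping early once fewer than 2 repeats are
    found (two repeats are the minimum needed for a repeat at n + 1).
    """
    seq = list(cipher)              # copy; the FreeBASIC version edits in place
    counts = []
    for n in range(2, max_n + 1):
        ids = {}                    # pair -> id, numbered by first appearance
        freq = {}                   # pair -> occurrences seen so far
        repeats = 0
        next_seq = []
        for pair in zip(seq, seq[1:]):
            freq[pair] = freq.get(pair, 0) + 1
            if freq[pair] > 1:
                repeats += 1
            next_seq.append(ids.setdefault(pair, len(ids) + 1))
        if repeats > 0:
            counts.append(repeats)
        if repeats < 2:
            break                   # short-circuit, as in the original
        seq = next_seq              # pairs of pair-ids encode (n + 1)-grams
    return counts

print(ngram_repeats("ABCABCAB"))    # [4, 3, 2, 1]
```

The trick is the same as in the original: after each pass the sequence holds one id per n-gram, so counting repeated pairs of ids in the next pass counts repeated (n+1)-grams.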
I’ve updated the tool to include significance measurements for shuffles:
http://zodiackillerciphers.com/symbol-merge-ngrams/
Select "Shuffle Sigma" in the "Sort by" dropdown list. You’ll then see spikes that seem significant shown first among the charts. Each pair was compared to only 100 shuffles, so the comparison might not be fully adequate, but it does make the more visually obvious spikes show up first.
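For readers who want to try the shuffle comparison on their own data, here is a minimal sketch of the general idea. It is an assumption-laden simplification: it counts plain repeating bigrams at period 1 rather than periodic ngrams after a symbol merge, and the function names and the fixed seed are mine, not taken from the tool.

```python
import random
from statistics import mean, pstdev

def repeated_bigrams(seq):
    """Count bigram occurrences beyond the first occurrence of each bigram."""
    seen = set()
    repeats = 0
    for pair in zip(seq, seq[1:]):
        if pair in seen:
            repeats += 1
        else:
            seen.add(pair)
    return repeats

def shuffle_sigma(seq, shuffles=100, seed=0):
    """Std deviations between the actual count and the mean count of shuffles."""
    rng = random.Random(seed)       # seeded so repeated runs agree
    actual = repeated_bigrams(seq)
    samples = []
    for _ in range(shuffles):
        copy = list(seq)
        rng.shuffle(copy)
        samples.append(repeated_bigrams(copy))
    spread = pstdev(samples)
    if spread == 0:
        return 0.0                  # no variation among shuffles: no usable sigma
    return (actual - mean(samples)) / spread

print(shuffle_sigma("ABCDEFGH" * 5))
```

With only 100 shuffles the mean and spread are themselves noisy, which is the caveat mentioned above.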
When at 21 of 1953 it won’t skip backwards and when at 1933 of 1953 it won’t skip forward.
Thanks for reporting that. It is fixed now.
And try this, reload your tool and set show data to 4-grams. You will see three spikes to the right side 157, 158 and 159. Now change sort by to "max increase", nothing changes. Now change it back to "sums of increases…" and notice how the spikes change offset to positions 159, 160 and 161. This happens with IE11 but not with Chrome.
I was able to reproduce this in IE. I noticed that the spikes change offset but they correspond to real data for another set of symbols. I think IE is not guaranteeing the sort order of the data when we change the sort criteria. So when the value it uses for sorting is the same, Chrome gives the same ordering but IE does not, for some reason. Strictly speaking, the result of the sort is still correct. I suppose I could add a secondary criterion to force a guaranteed order but it doesn’t seem like that big of a problem.
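A secondary criterion would look something like the sketch below. It is written in Python for illustration only (the tool itself runs in the browser), and the records are made-up values, not data from the charts. As far as I know, older JavaScript specifications did not require the built-in sort to be stable, which would explain browsers disagreeing on ties.

```python
# Hypothetical records: (symbol pair, value used for sorting).
charts = [("+B", 16), ("(W", 16), ("%4", 9), ("AB", 16)]

def sort_charts(rows):
    """Sort by value, highest first, then by pair name so ties have a fixed order."""
    return sorted(rows, key=lambda row: (-row[1], row[0]))

print(sort_charts(charts))
```

Because the pair name is unique, no two records compare equal, so the result no longer depends on how the sort implementation treats ties.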
I just coded something up for you. A superfast ngramsize 2 to 5 repeat counter.
Thank you for sharing that. I’m always impressed with how much speed you get out of your algorithms. My ngram counter has a lot of unnecessary overhead that I stapled onto it from other experiments (such as tracking a sorted list of frequency distributions). I should whittle it down for speed and incorporate your algorithm.
Thanks doranchak,
I’ve added some new stuff to AZdecrypt which reveals interesting behaviours in the encoding properties of the 340. The following ciphers are compared using my 2-symbol cycles measurement and a couple of transposition operations: the 340, a 340-character part of the 408, the smokie2, and a cipher of my own with 18% encoding randomization.
1. Directional:
This goes through the typical directional transposition. It stands out that the 340 shows high scores for some of the diagonal transpositions (1627). The other ciphers do not exhibit this behaviour.
340:
Directional: (transposition, untransposition)
--------------------------------------------------
Normal: 2152, 2152 <---
Mirror: 2029, 2029
Flip: 2029, 2029
Reverse: 2152, 2152 <---
--------------------------------------------------
Columnar 1: 1125, 1341
Columnar 2: 1208, 1325
Columnar 3: 1208, 1325
Columnar 4: 1125, 1341
--------------------------------------------------
Diagonal 1: 1010, 1336
Diagonal 2: 1590, 1281
Diagonal 3: 1627, 1478
Diagonal 4: 1027, 1520
Diagonal 5: 1027, 1520
Diagonal 6: 1627, 1478
Diagonal 7: 1590, 1281
Diagonal 8: 1010, 1336
--------------------------------------------------
Transposition average: 1471
Untransposition average: 1557.75

408:
Directional: (transposition, untransposition)
--------------------------------------------------
Normal: 2861, 2861 <---
Mirror: 2281, 2281
Flip: 2281, 2281
Reverse: 2861, 2861 <---
--------------------------------------------------
Columnar 1: 982, 1144
Columnar 2: 1089, 1294
Columnar 3: 1089, 1294
Columnar 4: 982, 1144
--------------------------------------------------
Diagonal 1: 983, 1044
Diagonal 2: 951, 1036
Diagonal 3: 905, 1137
Diagonal 4: 928, 1067
Diagonal 5: 928, 1067
Diagonal 6: 905, 1137
Diagonal 7: 951, 1036
Diagonal 8: 983, 1044
--------------------------------------------------
Transposition average: 1372.5
Untransposition average: 1483

smokie2:
Directional: (transposition, untransposition)
--------------------------------------------------
Normal: 2197, 2197 <---
Mirror: 1796, 1796
Flip: 1796, 1796
Reverse: 2197, 2197 <---
--------------------------------------------------
Columnar 1: 918, 1002
Columnar 2: 987, 932
Columnar 3: 987, 932
Columnar 4: 918, 1002
--------------------------------------------------
Diagonal 1: 1129, 1089
Diagonal 2: 1098, 1028
Diagonal 3: 1047, 935
Diagonal 4: 1087, 985
Diagonal 5: 1087, 985
Diagonal 6: 1047, 935
Diagonal 7: 1098, 1028
Diagonal 8: 1129, 1089
--------------------------------------------------
Transposition average: 1282.37
Untransposition average: 1245.5

jarlve18:
Directional: (transposition, untransposition)
--------------------------------------------------
Normal: 2178, 2178 <---
Mirror: 1824, 1824
Flip: 1824, 1824
Reverse: 2178, 2178 <---
--------------------------------------------------
Columnar 1: 1044, 928
Columnar 2: 1102, 973
Columnar 3: 1102, 973
Columnar 4: 1044, 928
--------------------------------------------------
Diagonal 1: 1158, 1165
Diagonal 2: 1242, 1035
Diagonal 3: 1161, 1166
Diagonal 4: 1103, 1213
Diagonal 5: 1103, 1213
Diagonal 6: 1161, 1166
Diagonal 7: 1242, 1035
Diagonal 8: 1158, 1165
--------------------------------------------------
Transposition average: 1351.5
Untransposition average: 1310.25
2. Offset row order:
This shifts the entire cipher downwards. The 340 seems normal here.
340:
Offset row order: (transposition)
--------------------------------------------------
Offset rows 1: 2080
Offset rows 2: 2112
Offset rows 3: 2118
Offset rows 4: 2009
Offset rows 5: 2044
Offset rows 6: 2129
Offset rows 7: 2113
Offset rows 8: 2101
Offset rows 9: 2125
Offset rows 10: 2139
Offset rows 11: 2121
Offset rows 12: 2012
Offset rows 13: 2091
Offset rows 14: 2013
Offset rows 15: 2150
Offset rows 16: 2077
Offset rows 17: 2014
Offset rows 18: 2070
Offset rows 19: 2065
Offset rows 20: 2152 <---
--------------------------------------------------
Transposition average: 2086.75

408:
Offset row order: (transposition)
--------------------------------------------------
Offset rows 1: 2861 <---
Offset rows 2: 2833
Offset rows 3: 2731
Offset rows 4: 2655
Offset rows 5: 2636
Offset rows 6: 2665
Offset rows 7: 2725
Offset rows 8: 2599
Offset rows 9: 2640
Offset rows 10: 2702
Offset rows 11: 2646
Offset rows 12: 2656
Offset rows 13: 2644
Offset rows 14: 2633
Offset rows 15: 2480
Offset rows 16: 2503
Offset rows 17: 2562
Offset rows 18: 2572
Offset rows 19: 2710
Offset rows 20: 2861 <---
--------------------------------------------------
Transposition average: 2665.7

smokie2:
Offset row order: (transposition)
--------------------------------------------------
Offset rows 1: 2135
Offset rows 2: 2123
Offset rows 3: 2011
Offset rows 4: 2013
Offset rows 5: 2007
Offset rows 6: 2022
Offset rows 7: 2026
Offset rows 8: 2026
Offset rows 9: 1980
Offset rows 10: 2054
Offset rows 11: 2061
Offset rows 12: 2101
Offset rows 13: 2029
Offset rows 14: 2115
Offset rows 15: 2096
Offset rows 16: 2001
Offset rows 17: 2061
Offset rows 18: 2040
Offset rows 19: 2144
Offset rows 20: 2197 <---
--------------------------------------------------
Transposition average: 2062.1

jarlve18:
Offset row order: (transposition)
--------------------------------------------------
Offset rows 1: 2092
Offset rows 2: 2058
Offset rows 3: 1983
Offset rows 4: 2038
Offset rows 5: 2092
Offset rows 6: 2025
Offset rows 7: 1984
Offset rows 8: 1893
Offset rows 9: 1946
Offset rows 10: 2050
Offset rows 11: 2113
Offset rows 12: 2075
Offset rows 13: 2033
Offset rows 14: 2051
Offset rows 15: 2090
Offset rows 16: 2035
Offset rows 17: 2049
Offset rows 18: 2108
Offset rows 19: 2170
Offset rows 20: 2178 <---
--------------------------------------------------
Transposition average: 2053.15
3. Offset column order:
This shifts the entire cipher rightwards. The 340 peaks at offset 15, though it is not much higher than its base. There is also less indication of suppression in the 340 scores at the midrange offsets compared to the other ciphers.
340:
Offset column order: (transposition)
--------------------------------------------------
Offset columns 1: 2085
Offset columns 2: 2025
Offset columns 3: 2080
Offset columns 4: 2066
Offset columns 5: 2090
Offset columns 6: 1995
Offset columns 7: 2040
Offset columns 8: 2024
Offset columns 9: 1963
Offset columns 10: 1994
Offset columns 11: 2031
Offset columns 12: 2023
Offset columns 13: 2094
Offset columns 14: 2113
Offset columns 15: 2212 <---
Offset columns 16: 2157
Offset columns 17: 2152
--------------------------------------------------
Transposition average: 2067.29

408:
Offset column order: (transposition)
--------------------------------------------------
Offset columns 1: 2752
Offset columns 2: 2720
Offset columns 3: 2698
Offset columns 4: 2523
Offset columns 5: 2515
Offset columns 6: 2503
Offset columns 7: 2427
Offset columns 8: 2449
Offset columns 9: 2362
Offset columns 10: 2412
Offset columns 11: 2429
Offset columns 12: 2453
Offset columns 13: 2507
Offset columns 14: 2538
Offset columns 15: 2596
Offset columns 16: 2699
Offset columns 17: 2861 <---
--------------------------------------------------
Transposition average: 2555.52

smokie2:
Offset column order: (transposition)
--------------------------------------------------
Offset columns 1: 2088
Offset columns 2: 2015
Offset columns 3: 1968
Offset columns 4: 1968
Offset columns 5: 1922
Offset columns 6: 1965
Offset columns 7: 1867
Offset columns 8: 1854
Offset columns 9: 1856
Offset columns 10: 1829
Offset columns 11: 1851
Offset columns 12: 1870
Offset columns 13: 1926
Offset columns 14: 2035
Offset columns 15: 2052
Offset columns 16: 2097
Offset columns 17: 2197 <---
--------------------------------------------------
Transposition average: 1962.35

jarlve18:
Offset column order: (transposition)
--------------------------------------------------
Offset columns 1: 2095
Offset columns 2: 2085
Offset columns 3: 2089
Offset columns 4: 1987
Offset columns 5: 1973
Offset columns 6: 1975
Offset columns 7: 1956
Offset columns 8: 1922
Offset columns 9: 1874
Offset columns 10: 1874
Offset columns 11: 1863
Offset columns 12: 1912
Offset columns 13: 1990
Offset columns 14: 1997
Offset columns 15: 2099
Offset columns 16: 2100
Offset columns 17: 2178 <---
--------------------------------------------------
Transposition average: 1998.17
4. Period row order:
This changes the order of the rows according to the period operation. In the 340 the scores in general do not drop as low as in the other ciphers; they are less suppressed. The lowest values observed for each cipher, usually near the midrange periods, are: 340: 1297, 408: 1096, smokie2: 1069, jarlve18: 1115. Because of this behaviour the average scores are also much higher for the 340 than for the other ciphers, excluding the 408, which has a much higher base score.
340:
Period row order: (transposition, untransposition)
--------------------------------------------------
Period rows 1: 2152, 2152 <---
Period rows 2: 1809, 1451
Period rows 3: 1392, 1578
Period rows 4: 1297, 1629
Period rows 5: 1629, 1297
Period rows 6: 1796, 1691
Period rows 7: 1578, 1392
Period rows 8: 1409, 1513
Period rows 9: 1570, 1605
Period rows 10: 1451, 1809
Period rows 11: 1498, 2000
Period rows 12: 1451, 2081
Period rows 13: 1581, 1750
Period rows 14: 1593, 1618
Period rows 15: 1552, 1640
Period rows 16: 1603, 1784
Period rows 17: 1646, 1855
Period rows 18: 1856, 1922
Period rows 19: 1958, 1974
--------------------------------------------------
Transposition average: 1622.15
Untransposition average: 1723.21

408:
Period row order: (transposition, untransposition)
--------------------------------------------------
Period rows 1: 2861, 2861 <---
Period rows 2: 1786, 1539
Period rows 3: 1262, 1340
Period rows 4: 1445, 1256
Period rows 5: 1256, 1445
Period rows 6: 1096, 1501
Period rows 7: 1340, 1262
Period rows 8: 1494, 1578
Period rows 9: 1415, 1605
Period rows 10: 1539, 1786
Period rows 11: 1531, 1773
Period rows 12: 1433, 1882
Period rows 13: 1555, 1978
Period rows 14: 1586, 1863
Period rows 15: 1812, 2211
Period rows 16: 2114, 2272
Period rows 17: 2142, 2391
Period rows 18: 2307, 2525
Period rows 19: 2516, 2811
--------------------------------------------------
Transposition average: 1710
Untransposition average: 1888.36

smokie2:
Period row order: (transposition, untransposition)
--------------------------------------------------
Period rows 1: 2197, 2197 <---
Period rows 2: 1450, 1276
Period rows 3: 1230, 1350
Period rows 4: 1167, 1214
Period rows 5: 1214, 1167
Period rows 6: 1189, 1124
Period rows 7: 1350, 1230
Period rows 8: 1166, 1069
Period rows 9: 1160, 1313
Period rows 10: 1276, 1450
Period rows 11: 1383, 1281
Period rows 12: 1375, 1442
Period rows 13: 1509, 1680
Period rows 14: 1569, 1814
Period rows 15: 1659, 1778
Period rows 16: 1742, 1853
Period rows 17: 1771, 1865
Period rows 18: 1894, 1983
Period rows 19: 2034, 2122
--------------------------------------------------
Transposition average: 1491.31
Untransposition average: 1537.26

jarlve18:
Period row order: (transposition, untransposition)
--------------------------------------------------
Period rows 1: 2178, 2178 <---
Period rows 2: 1435, 1329
Period rows 3: 1425, 1241
Period rows 4: 1496, 1175
Period rows 5: 1175, 1496
Period rows 6: 1115, 1219
Period rows 7: 1241, 1425
Period rows 8: 1280, 1504
Period rows 9: 1357, 1308
Period rows 10: 1329, 1435
Period rows 11: 1349, 1358
Period rows 12: 1326, 1476
Period rows 13: 1322, 1718
Period rows 14: 1360, 1888
Period rows 15: 1540, 2037
Period rows 16: 1670, 1875
Period rows 17: 1779, 1899
Period rows 18: 1895, 1970
Period rows 19: 2010, 2112
--------------------------------------------------
Transposition average: 1488.52
Untransposition average: 1612.78
5. Period column order:
This changes the order of the columns according to the period operation. Again the 340 has a peak that is not at the main period, although the difference is small. The other ciphers show a drop in scores around the midrange periods but the 340 does not.
340:
Period column order: (transposition, untransposition)
--------------------------------------------------
Period columns 1: 2152, 2152
Period columns 2: 2097, 2177 <---
Period columns 3: 2126, 2116
Period columns 4: 2067, 2058
Period columns 5: 2023, 1955
Period columns 6: 2116, 2126
Period columns 7: 2073, 2049
Period columns 8: 2022, 1990
Period columns 9: 2177, 2097
Period columns 10: 2162, 2087
Period columns 11: 2167, 2036
Period columns 12: 2102, 2127
Period columns 13: 2107, 2088
Period columns 14: 2164, 2091
Period columns 15: 2154, 2034
Period columns 16: 2198, 2092 <---
--------------------------------------------------
Transposition average: 2119.18
Untransposition average: 2079.68

408:
Period column order: (transposition, untransposition)
--------------------------------------------------
Period columns 1: 2861, 2861 <---
Period columns 2: 2687, 2686
Period columns 3: 2612, 2685
Period columns 4: 2557, 2580
Period columns 5: 2580, 2587
Period columns 6: 2685, 2612
Period columns 7: 2703, 2627
Period columns 8: 2660, 2623
Period columns 9: 2686, 2687
Period columns 10: 2641, 2639
Period columns 11: 2558, 2650
Period columns 12: 2593, 2620
Period columns 13: 2644, 2596
Period columns 14: 2653, 2742
Period columns 15: 2661, 2756
Period columns 16: 2718, 2765
--------------------------------------------------
Transposition average: 2656.18
Untransposition average: 2669.75

smokie2:
Period column order: (transposition, untransposition)
--------------------------------------------------
Period columns 1: 2197, 2197 <---
Period columns 2: 2071, 1991
Period columns 3: 1982, 2005
Period columns 4: 1991, 1943
Period columns 5: 2031, 1917
Period columns 6: 2005, 1982
Period columns 7: 2058, 1932
Period columns 8: 2086, 1989
Period columns 9: 1991, 2071
Period columns 10: 1955, 2008
Period columns 11: 1956, 2080
Period columns 12: 1977, 2006
Period columns 13: 2003, 2021
Period columns 14: 2011, 2003
Period columns 15: 2022, 2042
Period columns 16: 2138, 2091
--------------------------------------------------
Transposition average: 2029.62
Untransposition average: 2017.37

jarlve18:
Period column order: (transposition, untransposition)
--------------------------------------------------
Period columns 1: 2178, 2178 <---
Period columns 2: 2030, 2060
Period columns 3: 1946, 2011
Period columns 4: 1972, 1984
Period columns 5: 2049, 2051
Period columns 6: 2011, 1946
Period columns 7: 1951, 1996
Period columns 8: 1992, 1924
Period columns 9: 2060, 2030
Period columns 10: 2048, 2030
Period columns 11: 2000, 2042
Period columns 12: 2013, 2034
Period columns 13: 2047, 2026
Period columns 14: 2038, 2119
Period columns 15: 2113, 2107
Period columns 16: 2136, 2094
--------------------------------------------------
Transposition average: 2036.5
Untransposition average: 2039.5
Conclusions:
In 4 out of 5 observations the 340 deviates in a consistent manner:
Behaviour 1: some of the diagonal scores are quite high.
Behaviour 2: the operations in general seem to suppress the encoding scores much less for the 340 than the other ciphers.
Behaviour 3: encoding score peaks are not always where they are supposed to be.
Are these behaviours connected somehow?
Hypothesis 1: the row order of the 340 has been altered after the encoding.
Hypothesis 2: the encoding direction in the 340 is not quite as we think it is and possibly follows a route that has a diagonal component to it.
Interesting results, Jarlve. The bump in encoding score for diagonals is curious. There’s something that "feels" diagonal about the pivots (namely, a pivot’s line of symmetry is a diagonal line). I also noticed that a "Snake" transposition (read line 1 from left to right, line 2 from right to left, line 3 from left to right, etc) has only a small effect on your cycle score (I measured it as 2080.25). I also tried to find operations in my old transposition experiments that produced high repeating bigrams AND high cycle scores. PeriodColumn(2) FlipVertical() Diagonal(1) produces 41 repeating bigrams but kills your cycle score (1164.80).
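In case the "Snake" reading order is unclear, here is a small sketch of it. The function name is made up, and the row width is a parameter (the 340 is laid out 17 columns wide).

```python
def snake(text, width):
    """Read row 1 left to right, row 2 right to left, row 3 left to right, etc."""
    rows = [text[i:i + width] for i in range(0, len(text), width)]
    return "".join(row if r % 2 == 0 else row[::-1] for r, row in enumerate(rows))

print(snake("ABCDEFGHI", 3))  # ABCFEDGHI
```

Applying `snake` twice with the same width gives back the original text, so the same function also undoes the operation.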
This goes through the typical directional transposition. It stands out that the 340 shows high scores for some of the diagonal transpositions (1627)
To get an idea of significance, I shuffled 10,000 times and ran my implementation of your cycle score:
Min: 809.8
Max: 1697.0
Mean: 1197.3
Std dev: 126.8
Z340’s cycle score is 2150.7 (it’s different from yours a little for some reason), which is 7.5 std deviations from the mean of shuffles.
Your diagonal cycle score of 1627 is 3.4 std deviations from the mean of shuffles.
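The two sigma figures follow directly from the shuffle statistics above. A quick check of the arithmetic (the helper name is mine):

```python
SHUFFLE_MEAN = 1197.3   # mean cycle score over the 10,000 shuffles
SHUFFLE_STD = 126.8     # standard deviation over the same shuffles

def sigmas_from_mean(score):
    """How many shuffle standard deviations a score sits above the shuffle mean."""
    return (score - SHUFFLE_MEAN) / SHUFFLE_STD

print(round(sigmas_from_mean(2150.7), 1))  # 7.5 (Z340 as measured here)
print(round(sigmas_from_mean(1627), 1))    # 3.4 (diagonal score)
```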
Here are the charts for increases of transposed periodic ngrams resulting from symbol merges, at this new link:
http://zodiackillerciphers.com/symbol-m … transpose/
The previous link still shows the untransposed periodic ngram data if you want to compare them.
Both links include 5-grams now.
Select "shuffle sigma" in the sort by dropdown to view the more significant spikes in the data.
Only one pair of symbols, "(W", produces an increase in transposed 5-grams when merged. Interestingly, the 5-grams appear in the same columns, at transposed period 68 which is a multiple of 17:
(I think that is the equivalent of untransposed period 5)
The same 5-grams appear at periods 69 and 70:
I’m still puzzled about the significance of this. Maybe these are just phantoms. It may be related to your earlier question, Jarlve — if we are already looking at bigram anomalies at certain periods, then formation of larger-order ngrams might be more likely since there are more building blocks lying around.
And, more generally: Wouldn’t we be seeing more convincing spikes if we were merging correctly selected pairs of symbols to form homophones? I.e., newly forming bigrams should be appearing at the anomalous periods. Maybe it’s not sufficient to merge a single set of symbol pairs together at a time. We might need to expand the experiment to merge more than two symbols together, or even to consider more than one homophone at once (k sets of symbols, where each set contains m symbols). I’m planning to move away from this exploration for the moment though because I need to work on the cipher type identifier.
Side note: Maybe you’ve done this already, but here’s a list of equivalencies between untransposed and transposed periods:
untransposed period 2 = transposed period 170
untransposed period 4 = transposed period 85
untransposed period 5 = transposed period 68
untransposed period 10 = transposed period 34
untransposed period 11 = transposed period 31
untransposed period 17 = transposed period 20
untransposed period 20 = transposed period 17
untransposed period 31 = transposed period 11
untransposed period 34 = transposed period 10
untransposed period 68 = transposed period 5
untransposed period 85 = transposed period 4
untransposed period 170 = transposed period 2
Interesting results, Jarlve. The bump in encoding score for diagonals is curious. There’s something that "feels" diagonal about the pivots (namely, a pivot’s line of symmetry is a diagonal line). I also noticed that a "Snake" transposition (read line 1 from left to right, line 2 from right to left, line 3 from left to right, etc) has only a small effect on your cycle score (I measured it as 2080.25). I also tried to find operations in my old transposition experiments that produced high repeating bigrams AND high cycle scores. PeriodColumn(2) FlipVertical() Diagonal(1) produces 41 repeating bigrams but kills your cycle score (1164.80).
Is PeriodColumn(2) a transposition/untransposition or transformation/untransformation? I want to take another look at it. I wonder about the discrepancy between our measurements. It seems that in the 340 there are a lot of short 2-symbol cycles. What do you think?
This goes through the typical directional transposition. It stands out that the 340 shows high scores for some of the diagonal transpositions (1627)
To get an idea of significance, I shuffled 10,000 times and ran my implementation of your cycle score:
Min: 809.8
Max: 1697.0
Mean: 1197.3
Std dev: 126.8
Z340’s cycle score is 2150.7 (it’s different from yours a little for some reason), which is 7.5 std deviations from the mean of shuffles.
Your diagonal cycle score of 1627 is 3.4 std deviations from the mean of shuffles.
Thanks doranchak,
I just ran a test with 10000 iterations that for every iteration encodes a plaintext with 18% randomization while matching 340 ioc. And then transposes diagonal 3 and measures. For the 340 this diagonal direction sits at 1627 while the test highest was 1540. I feel this must be something, but what?
Combinations processed: 10000/10000 Measurements: - Summed: 10043332.20806184 - Average: 1004.333220806184 - Lowest: 671.0121436408454 (Plaintext(1), Encode: homophonic substitution(3854), Diagonal(TP,3)) - Highest: 1540.956268463562 (Plaintext(1), Encode: homophonic substitution(499), Diagonal(TP,3))
I’m still puzzled about the significance of this. Maybe these are just phantoms. It may be related to your earlier question, Jarlve — if we are already looking at bigram anomalies at certain periods, then formation of larger-order ngrams might be more likely since there are more building blocks lying around.
Yes, I agree and thought the same. It is more likely for higher order ngrams to appear for periods which have higher repeats.
And, more generally: Wouldn’t we be seeing more convincing spikes if we were merging correctly selected pairs of symbols to form homophones? I.e., newly forming bigrams should be appearing at the anomalous periods. Maybe it’s not sufficient to merge a single set of symbol pairs together at a time. We might need to expand the experiment to merge more than two symbols together, or even to consider more than one homophone at once (k sets of symbols, where each set contains m symbols). I’m planning to move away from this exploration for the moment though because I need to work on the cipher type identifier.
I know that AZdecrypt can create text level ngram repeats at any direction you throw at it with a cipher like the 340. Perhaps the true plaintext direction would allow for a higher maximum formation of ngrams. I’ll see about testing this with AZdecrypt.
I’m thinking about scoring this way, a 2-gram repeat scores 1, a 3-gram repeat scores 2, a 4-gram repeat scores 4, a 5-gram repeat scores 8, etc. My reasoning here is that two 2-gram repeats can potentially form a 3-gram repeat, so the score of two 2-gram repeats should be equal to the score of one 3-gram repeats. The score of two 3-gram repeats equals the score of one 4-gram repeat and so forth. What do you think?
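The proposed weighting doubles with each step in n-gram size, i.e. a repeat of size n scores 2^(n-2). A minimal sketch, using the repeat counts from the earlier function output purely as sample input (the function name is mine):

```python
def weighted_repeat_score(counts):
    """counts[0] = 2-gram repeats, counts[1] = 3-gram repeats, and so on.

    A repeat of size n is weighted 2 ** (n - 2): 1, 2, 4, 8, ...
    """
    return sum(count * 2 ** i for i, count in enumerate(counts))

print(weighted_repeat_score([62, 11, 2]))        # 92  (counts reported for 408.txt)
print(weighted_repeat_score([191, 87, 41, 20]))  # 689 (counts reported for p1.txt)
```

Under this scheme two 2-gram repeats and one 3-gram repeat score the same, which is the equivalence described above.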
Side note: Maybe you’ve done this already, but here’s a list of equivalencies between untransposed and transposed periods:
untransposed period 2 = transposed period 170
untransposed period 4 = transposed period 85
untransposed period 5 = transposed period 68
untransposed period 10 = transposed period 34
untransposed period 11 = transposed period 31
untransposed period 17 = transposed period 20
untransposed period 20 = transposed period 17
untransposed period 31 = transposed period 11
untransposed period 34 = transposed period 10
untransposed period 68 = transposed period 5
untransposed period 85 = transposed period 4
untransposed period 170 = transposed period 2
Interesting, it correlates with the factors of 340. That was actually one of the first things I learned about periodical transposition, it is multiplicative. Stacking two period 5 transpositions is equal to a period 25 transposition etc.
340 / 2 = 170
340 / 4 = 85
340 / 5 = 68
340 / 10 = 34
340 / 11 = 30.909090…
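A quick way to see which of the listed pairs are exact factor pairs of 340 and which are only close (names here are made up for the sketch):

```python
LENGTH = 340
# (untransposed period, transposed period) pairs from the list above
PAIRS = [(2, 170), (4, 85), (5, 68), (10, 34), (11, 31), (17, 20)]

def check_pairs(pairs, length=LENGTH):
    """Return (untransposed, transposed, product, is_exact) for each pair."""
    return [(u, t, u * t, u * t == length) for u, t in pairs]

for u, t, product, exact in check_pairs(PAIRS):
    note = "exact" if exact else "off by " + str(product - LENGTH)
    print(f"{u} x {t} = {product} ({note})")
```

Every pair multiplies to 340 except 11 and 31, which give 341, matching the non-integer result of 340 / 11.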
Yes, simple multiplication and division going on here. I wonder if we could do fractional periods this way. Period transposition equals the multiplication operation and period untransposition would be division, then what would equal addition, subtraction, and square root operators? The square root of 340 is close to 19 (18.43).
I’ve added depth bigram stats to AZdecrypt. This looks at stacked combinations of periodic operations, and it turned up something interesting. With the 340 the highest peak occurs at untransposed period 181, while it should be at either 1 and/or 19 (assuming 19 is the true direction) given the multiplicative aspect of the operation. For the 340 it is then untransposed period 181 into untransposed period 38. 181 / 38 = untransposed period 4.76, which is close to transposed period 78. It seems that the 340 prefers a fractional period for some reason. Transposition misalignment? Wow, I’m loving this stuff.
doranchak, you’ll have to increase your periodic range to 339 since interesting stuff is also happening after 170.
Periodical: (transposition, untransposition)
--------------------------------------------------
Period 1: 37, 37
Period 2: 37, 30
Period 3: 34, 34
Period 4: 37, 30
Period 5: 33, 37
Period 6: 37, 30
Period 7: 33, 34
Period 8: 31, 32
Period 9: 37, 32
Period 10: 32, 32
Period 11: 34, 37
Period 12: 29, 31
Period 13: 37, 30
Period 14: 32, 31
Period 15: 33, 29
Period 16: 31, 35
Period 17: 36, 35
Period 18: 40, 36
Period 19: 33, 41
Period 20: 35, 36
Period 21: 34, 29
Period 22: 33, 32
Period 23: 31, 33
Period 24: 30, 33
Period 25: 31, 34
Period 26: 33, 33
Period 27: 31, 33
Period 28: 35, 30
Period 29: 30, 31
Period 30: 36, 35
Period 31: 37, 34
Period 32: 32, 34
Period 33: 32, 31
Period 34: 32, 32
Period 35: 33, 34
Period 36: 32, 31
Period 37: 40, 31
Period 38: 34, 37
Period 39: 30, 33
Period 40: 31, 31
Period 41: 29, 31
Period 42: 32, 29
Period 43: 31, 36
Period 44: 33, 30
Period 45: 31, 36
Period 46: 32, 34
Period 47: 30, 31
Period 48: 32, 31
Period 49: 32, 31
Period 50: 31, 36
Period 51: 30, 31
Period 52: 32, 31
Period 53: 31, 34
Period 54: 35, 30
Period 55: 33, 36
Period 56: 34, 34
Period 57: 32, 37
Period 58: 33, 30
Period 59: 32, 34
Period 60: 31, 29
Period 61: 32, 32
Period 62: 30, 34
Period 63: 33, 32
Period 64: 30, 30
Period 65: 32, 31
Period 66: 33, 32
Period 67: 33, 31
Period 68: 37, 33
Period 69: 34, 32
Period 70: 34, 31
Period 71: 35, 32
Period 72: 33, 32
Period 73: 33, 32
Period 74: 32, 33
Period 75: 33, 31
Period 76: 32, 32
Period 77: 34, 30
Period 78: 34, 29
Period 79: 32, 30
Period 80: 32, 30
Period 81: 33, 35
Period 82: 32, 32
Period 83: 33, 35
Period 84: 33, 34
Period 85: 30, 37
Period 86: 32, 31
Period 87: 35, 33
Period 88: 31, 29
Period 89: 32, 32
Period 90: 34, 31
Period 91: 34, 33
Period 92: 31, 33
Period 93: 31, 31
Period 94: 32, 32
Period 95: 31, 31
Period 96: 30, 33
Period 97: 32, 31
Period 98: 33, 32
Period 99: 33, 32
Period 100: 33, 30
Period 101: 33, 35
Period 102: 32, 38
Period 103: 29, 31
Period 104: 31, 35
Period 105: 34, 30
Period 106: 34, 31
Period 107: 33, 32
Period 108: 31, 33
Period 109: 34, 32
Period 110: 33, 33
Period 111: 34, 33
Period 112: 32, 35
Period 113: 33, 32
Period 114: 34, 37
Period 115: 35, 35
Period 116: 34, 35
Period 117: 32, 32
Period 118: 32, 32
Period 119: 33, 34
Period 120: 34, 33
Period 121: 34, 31
Period 122: 33, 33
Period 123: 33, 34
Period 124: 32, 35
Period 125: 31, 34
Period 126: 32, 33
Period 127: 31, 33
Period 128: 30, 32
Period 129: 30, 32
Period 130: 31, 31
Period 131: 30, 34
Period 132: 32, 29
Period 133: 32, 35
Period 134: 30, 29
Period 135: 35, 35
Period 136: 31, 32
Period 137: 34, 30
Period 138: 32, 30
Period 139: 33, 32
Period 140: 33, 33
Period 141: 34, 32
Period 142: 33, 30
Period 143: 32, 33
Period 144: 35, 33
Period 145: 35, 31
Period 146: 33, 36
Period 147: 34, 32
Period 148: 33, 31
Period 149: 35, 32
Period 150: 35, 33
Period 151: 34, 33
Period 152: 34, 33
Period 153: 34, 31
Period 154: 32, 34
Period 155: 36, 35
Period 156: 33, 34
Period 157: 32, 32
Period 158: 33, 39
Period 159: 33, 34
Period 160: 32, 33
Period 161: 35, 33
Period 162: 33, 31
Period 163: 32, 34
Period 164: 32, 35
Period 165: 30, 33
Period 166: 35, 33
Period 167: 33, 36
Period 168: 32, 38
Period 169: 34, 38
Period 170: 30, 37
Period 171: 32, 38
Period 172: 30, 38
Period 173: 33, 39
Period 174: 31, 40
Period 175: 32, 37
Period 176: 31, 38
Period 177: 30, 36
Period 178: 31, 38
Period 179: 30, 38
Period 180: 31, 38
Period 181: 29, 43 <---
Period 182: 31, 38
Period 183: 32, 36
Period 184: 30, 35
Period 185: 30, 36
Period 186: 30, 34
Period 187: 32, 35
Period 188: 30, 38
Period 189: 30, 35
Period 190: 32, 35
Period 191: 31, 33
Period 192: 34, 33
Period 193: 33, 38
Period 194: 36, 33
Period 195: 31, 34
Period 196: 33, 34
Period 197: 29, 34
Period 198: 34, 32
Period 199: 31, 32
Period 200: 34, 29
Period 201: 31, 35
Period 202: 30, 35
Period 203: 31, 33
Period 204: 31, 33
Period 205: 33, 32
Period 206: 29, 34
Period 207: 29, 32
Period 208: 30, 34
Period 209: 32, 34
Period 210: 31, 33
Period 211: 30, 
31 Period 212: 32, 31 Period 213: 31, 32 Period 214: 31, 33 Period 215: 31, 33 Period 216: 31, 34 Period 217: 32, 33 Period 218: 31, 30 Period 219: 31, 33 Period 220: 31, 32 Period 221: 32, 34 Period 222: 31, 31 Period 223: 31, 35 Period 224: 31, 32 Period 225: 30, 34 Period 226: 34, 30 Period 227: 31, 31 Period 228: 32, 31 Period 229: 31, 32 Period 230: 32, 32 Period 231: 31, 33 Period 232: 29, 35 Period 233: 32, 33 Period 234: 30, 33 Period 235: 31, 32 Period 236: 30, 35 Period 237: 30, 34 Period 238: 30, 35 Period 239: 30, 34 Period 240: 34, 35 Period 241: 33, 35 Period 242: 33, 35 Period 243: 30, 34 Period 244: 30, 34 Period 245: 31, 34 Period 246: 31, 32 Period 247: 34, 35 Period 248: 32, 35 Period 249: 29, 32 Period 250: 34, 32 Period 251: 34, 33 Period 252: 33, 33 Period 253: 35, 35 Period 254: 32, 33 Period 255: 36, 33 Period 256: 36, 34 Period 257: 39, 30 Period 258: 34, 31 Period 259: 34, 31 Period 260: 34, 33 Period 261: 33, 30 Period 262: 32, 31 Period 263: 32, 32 Period 264: 34, 32 Period 265: 38, 32 Period 266: 32, 31 Period 267: 32, 33 Period 268: 31, 33 Period 269: 32, 37 Period 270: 34, 32 Period 271: 32, 32 Period 272: 33, 32 Period 273: 36, 31 Period 274: 35, 33 Period 275: 33, 32 Period 276: 34, 35 Period 277: 34, 31 Period 278: 36, 33 Period 279: 34, 31 Period 280: 36, 31 Period 281: 33, 34 Period 282: 33, 32 Period 283: 32, 33 Period 284: 35, 34 Period 285: 35, 31 Period 286: 37, 39 Period 287: 36, 34 Period 288: 40, 36 Period 289: 38, 31 Period 290: 40, 38 Period 291: 36, 40 Period 292: 35, 34 Period 293: 38, 36 Period 294: 34, 36 Period 295: 35, 35 Period 296: 33, 31 Period 297: 35, 32 Period 298: 33, 33 Period 299: 36, 37 Period 300: 33, 33 Period 301: 36, 37 Period 302: 33, 39 Period 303: 34, 34 Period 304: 35, 34 Period 305: 38, 35 Period 306: 36, 38 Period 307: 39, 35 Period 308: 39, 37 Period 309: 39, 36 Period 310: 39, 36 Period 311: 40, 34 Period 312: 39, 35 Period 313: 42, 36 <--- Period 314: 38, 33 Period 315: 36, 37 Period 316: 38, 
32 Period 317: 37, 37 Period 318: 38, 34 Period 319: 40, 36 Period 320: 39, 32 Period 321: 36, 33 Period 322: 36, 34 Period 323: 35, 34 Period 324: 35, 34 Period 325: 36, 36 Period 326: 37, 35 Period 327: 39, 36 Period 328: 37, 36 Period 329: 37, 33 Period 330: 36, 34 Period 331: 38, 34 Period 332: 35, 37 Period 333: 36, 38 Period 334: 36, 38 Period 335: 36, 37 Period 336: 38, 40 Period 337: 39, 39 Period 338: 37, 38 Period 339: 38, 40 Period 340: 37, 37 -------------------------------------------------- Transposition average: 33.21 Untransposition average: 33.50
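For readers who want to reproduce a table like the one above, here is a minimal sketch of one way to count bigram repeats per period. It is only an assumption about the method, not the AZdecrypt code: the function names `transpose`, `untranspose` and `bigram_repeats` are hypothetical, and "transposition" is taken to mean reading every p-th symbol from each starting offset, with "untransposition" as its inverse.

```python
from collections import Counter

def transpose(text, period):
    # Assumed definition: read every `period`-th symbol, starting from
    # offsets 0, 1, ..., period-1, and concatenate the pieces.
    return "".join(text[start::period] for start in range(period))

def untranspose(text, period):
    # Inverse of transpose(): put each symbol back where it came from.
    order = [i for start in range(period) for i in range(start, len(text), period)]
    out = [""] * len(text)
    for src, dst in enumerate(order):
        out[dst] = text[src]
    return "".join(out)

def bigram_repeats(text):
    # Every occurrence of a bigram beyond its first counts as one repeat.
    counts = Counter(text[i:i + 2] for i in range(len(text) - 1))
    return sum(c - 1 for c in counts.values() if c > 1)

def period_table(text):
    # One (period, transposed repeats, untransposed repeats) row per period.
    return [(p, bigram_repeats(transpose(text, p)), bigram_repeats(untranspose(text, p)))
            for p in range(1, len(text) + 1)]
```

With these definitions period 1 leaves the text unchanged in both directions, which matches the identical counts shown for Period 1 above.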
Fractional periods do have my interest at the moment and I’m going to run a very big AZdecrypt test with stacked periods (transposed + untransposed) * (transposed + untransposed) for the normal, mirrored, flipped and reversed base directions. It’s going to take a while (459684 combinations per base direction) and I’ll get back with the stats and plots once it is finished.
I’m thinking about scoring this way: a 2-gram repeat scores 1, a 3-gram repeat scores 2, a 4-gram repeat scores 4, a 5-gram repeat scores 8, etc. My reasoning here is that two 2-gram repeats can potentially form a 3-gram repeat, so the score of two 2-gram repeats should be equal to the score of one 3-gram repeat. The score of two 3-gram repeats equals the score of one 4-gram repeat and so forth. What do you think?
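The doubling rule amounts to a score of 2^(n-2) per n-gram repeat. Below is a minimal sketch of that idea, not the actual AZdecrypt implementation: the function name `ngram_repeat_score` is hypothetical, and it assumes that every occurrence of an n-gram beyond its first counts as one repeat, so it may not reproduce the exact scores reported in this thread.

```python
from collections import Counter

def ngram_repeat_score(text, max_n=None):
    # Weighted repeat score: each n-gram repeat is worth 2**(n-2),
    # i.e. 2-gram = 1, 3-gram = 2, 4-gram = 4, 5-gram = 8, ...
    if max_n is None:
        # A later post sets the n-gram range to the cipher length minus 1.
        max_n = len(text) - 1
    score = 0
    for n in range(2, max_n + 1):
        counts = Counter(text[i:i + n] for i in range(len(text) - n + 1))
        # Assumption: occurrences beyond the first are the repeats.
        repeats = sum(c - 1 for c in counts.values() if c > 1)
        if repeats == 0:
            break  # if no n-gram repeats, no longer n-gram can repeat either
        score += repeats * 2 ** (n - 2)
    return score

print(ngram_repeat_score("ABCABC"))  # AB and BC repeat (1 each), ABC repeats (2)
```

Stopping at the first n with no repeats keeps the loop short even though the nominal range runs up to the full length minus 1.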
It seems to work well this way. So well that I’m going to include it with AZdecrypt under stats and also with the solver. Here’s an example: a 340-character part of the 408 scores 1077, while the 340 can only muster a score of 330. That way people have an additional measure to gauge the solutions they are getting and whether or not these solutions have long repeating fragments. I’ve set the ngram range equal to the cipher length minus 1.
Score: 23928.77 Ioc: 0.06533 Ngrams: 1077
ILIKEKILLINGPEOPL
EBECAUSEITISSOMUC
HFUNITIAMOREFUNTH
ANKILLINGWILDGAME
INTHEFORRESTBECAU
SEMANISTHEMOATDAN
GERTUEANAMALOFALL
TOKILLSOMETHINGGI
VESMETHEMOATTHRIL
LINGEXPERENCEITIS
EVENBETTERTHANGET
TINGYOURROCKSOFFW
ITHAGIRLTHEBESTPA
RTOFITIATHAEWHENI
DIEIWILLBEREBORNI
NPARADICEANDALLTH
EIHAVEKILLEDWILLB
ECOMEMYSLAVESIWIL
LNOTGIVEYOUMYNAME
BECAUSEYOUWILLTRY

Score: 20350.26 Ioc: 0.07520 Ngrams: 330
GSHISCENTOPERNATM
ESTELSILARLDEIAVO
EDVANTOCARLORSEGI
NSSPECTINGSITTHAT
SUNTERIMPORTTHEOF
STHERCISIMPORITAL
SOTABLYTOACANDERP
LATIVISITSELETORU
MENTATCHTREADYSEA
SECONTEITISPEREDS
OTHFORSPALESANNAI
TEACHIPIONENDTHER
ESUNDSTEPOREALYTH
CAREEVOTEADANERTT
DECEIVEUPSINOCOST
PADGEEALISEDVNBAT
HANTRESPETRCREPOR
TTORPERATINGNFLOS
PROMREPRESLIEISPA
INAGESONECITYPAYT
As a measure of plaintext direction it finds (Period 101: 24, 44) as highest, that is untransposed period 101. doranchak, I know this is one you found before with the 5-gram.
I got a solution for message 2, but it is not perfect. I think it had at least two misalignments, unless I was untransposing with the wrong column count. I have roughly fixed the misalignments for the time being. This is two 10 x 17 inscription rectangles. I want to report, however, that even when untransposing at 9 x 17, and to some extent even at 8 x 17, I got some of the words below. That tells me that if Zodiac used multiple inscription rectangles with either 15 or 19 rows, the 340 should be pretty easy to solve.
EDIT: Some weird stuff in red.
OUSOMUCHIC
HOSEHERFOR
SEVERALREA
SONSACTUAL
LYFIRSTOFF
ILOVEHERSH
EHASBEENTH
ROUGHSOMUC
HANDISONEO
FTHESTRONG
ESTFEMALEH
EROINESILO
VETHEFACTT
HATSHEHASD
ABBLEDONBO
THSIDESOFI
TWELLTHANY
EALWAYSGOE
SFORWHATSH
EINOWSISRI
GHTSHEWILL
ALWAYSBEAN
AVENGERIAL
SOCHOSEHER
BECAUSEIWA
NTEDACOSTU
MECHALLENG
EHERCOSTUM
EHASNEVERB
EENWELLUPD
ATEDTOTHEE
XTENTSHEDU
HTHELINEBS
SINTHEENDH