I found a pdf of the aforementioned book, and wrote a quick script looking for word sequences that would match mnemonic sequences in the cipher, and couldn’t find anything. It wasn’t exhaustive but nothing even matched the initial sequence "YBMF", so that can probably be ruled out.
Nice idea! I gave it a try too but for all 3 books in the trilogy the library book belongs to. Here are the top 10,000 matching sequences:
https://pastebin.com/raw/rEpEL3Bq
None are longer than 7 letters so I suspect the mnemonic doesn’t come from this source, or the text is not a mnemonic at all.
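For anyone who wants to repeat the mnemonic test, here is a minimal Python sketch of the idea as I understand it: reduce the book text to the first letter of every word, then look for the longest runs of cipher letters that also appear in that string of initials. The function names, the `min_len` cut-off and the sample sentence are my own illustrative assumptions, not the scripts actually used above.

```python
import re

def initials(text):
    """Return the upper-cased first letter of every word in text."""
    return "".join(w[0].upper() for w in re.findall(r"[A-Za-z][A-Za-z']*", text))

def shared_runs(cipher, book_initials, min_len=4):
    """For each cipher position, keep the longest run (>= min_len letters)
    that also occurs somewhere in the book's string of initials."""
    hits = []
    for start in range(len(cipher)):
        end = start + min_len
        best = None
        # Grow the fragment for as long as it still occurs in the initials.
        while end <= len(cipher) and cipher[start:end] in book_initials:
            best = cipher[start:end]
            end += 1
        if best:
            hits.append((len(best), start, best))
    return sorted(hits, reverse=True)

# Invented sentence, chosen only so that its initials happen to match a cipher run.
sample = "You better make friends, people tell us"
print(shared_runs("YBMFPTUSYS", initials(sample))[:3])
```

On a full book the plain `in` test is slow but workable for an ~850-symbol cipher; a suffix automaton would be the faster choice if it matters.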
What is Normor score?
It is a test devised by "THE RAT":
http://www.ackgame.com/Normor%20Revisited.pdf
It seems similar to calculating the distance between the observed and expected distributions of letters.
Thanks. It is indeed similar to calculating the distance between the observed and expected distributions of letters such as the chi-squared test.
Here is my code if anyone wants it:
function m_normor(frq() as short) as double

    'Normor by THE RAT
    'Code: Jarlve

    'The Rule:
    '----------------------------------------------------------------
    'Ciphers that tend to encipher high-frequency letters in whole
    'or part with themselves or other high-frequency letters
    'have low scores. Those that do not, have high scores.

    'frq() array:
    '--------------------------------
    'index 0 is frequency of letter A
    'index 1 is frequency of letter B
    'index 2 is frequency of letter C
    'etc...

    dim as integer i,j,k,n,score,nor(25),m=-1
    dim as string normalorder="ETAOINSRHLDUCMGFYPWBVKXJZQ"

    for i=0 to 25
        nor(i)=asc(normalorder,i+1)-65
    next i

    for i=0 to 25
        for j=0 to 25
            if frq(j)>m then
                k=j 'position frq
                m=frq(j)
            end if
        next j
        for j=0 to 25
            if nor(j)=k then
                n=j 'position nor
                exit for
            end if
        next j
        score+=abs(n-i)
        frq(k)=-1
        m=-1
    next i

    return score

end function
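In case FreeBASIC is not handy, here is a rough Python rendering of the same scoring rule. It is a sketch based on my reading of the routine above, not a checked port: the name `normor` and the choice to take raw text instead of a frequency array are my own assumptions. Ties in frequency are broken by alphabetical order, which is what the strict `>` comparison in the original appears to do.

```python
NORMAL_ORDER = "ETAOINSRHLDUCMGFYPWBVKXJZQ"

def normor(text):
    """Sum, over all 26 letters, of the distance between a letter's rank
    in the observed frequency order and its rank in NORMAL_ORDER."""
    counts = [0] * 26
    for ch in text.upper():
        if "A" <= ch <= "Z":
            counts[ord(ch) - 65] += 1
    # Rank letters by descending count; equal counts keep alphabetical order.
    ranked = sorted(range(26), key=lambda j: (-counts[j], j))
    return sum(abs(NORMAL_ORDER.index(chr(65 + k)) - i)
               for i, k in enumerate(ranked))

# A text whose letter counts follow NORMAL_ORDER exactly scores 0.
ideal = "".join(ch * (26 - i) for i, ch in enumerate(NORMAL_ORDER))
print(normor(ideal))
```

Unlike the FreeBASIC version, this one does not overwrite the caller's frequency array.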
As I mentioned earlier to doranchak there are some interesting "repeats" of the same form "FYAN", "CYAN", "FYCN", "TYIAN", "PYAN", "TYHN", etc.
That is a very interesting observation. Basically, there seem to be a lot of pairs of sequences that have short edit distances. For example, the edit distance between CYAN and PYAN is one because only one change is needed to turn CYAN into PYAN.
I looked for all such patterns, of length at least 4, having an edit distance of only one. Results:
https://docs.google.com/spreadsheets/d/ … sp=sharing
There are some interesting longer examples, like:
YNTTMI and YNTTMWI
ARYNTTM and AYNTTM
WSTAYS and WSTAS
etc.
The total list was about 500 patterns. But in a random shuffle, there were 400 similar patterns. So I’m not sure how significant this is. However, in the shuffle, there were fewer of the longer patterns.
It may be interesting to cluster all of them together rather than considering only pairs. Perhaps the clusters are more significant than the quantity of individual pairs.
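A search like the one described above could be sketched as follows. This is only my guess at a workable method, not the code behind the spreadsheet: the names `is_one_edit` and `one_edit_pairs`, the `max_len` limit and the rule of skipping overlapping occurrences (which would otherwise match trivially) are all assumptions.

```python
def is_one_edit(a, b):
    """True if a and b differ by exactly one substitution, insertion or deletion."""
    if a == b or abs(len(a) - len(b)) > 1:
        return False
    if len(a) > len(b):
        a, b = b, a                      # make a the shorter (or equal) string
    i = 0
    while i < len(a) and a[i] == b[i]:
        i += 1                           # skip the common prefix
    if len(a) == len(b):
        return a[i + 1:] == b[i + 1:]    # one substituted symbol
    return a[i:] == b[i + 1:]            # one extra symbol in b

def one_edit_pairs(text, min_len=4, max_len=7):
    """Collect pairs of non-overlapping fragments that are one edit apart."""
    found = set()
    for n in range(min_len, max_len + 1):
        for i in range(len(text) - n + 1):
            a = text[i:i + n]
            for m in (n, n + 1):         # same length, or one symbol longer
                for j in range(len(text) - m + 1):
                    if j < i + n and i < j + m:
                        continue         # overlapping occurrences match trivially
                    b = text[j:j + m]
                    if is_one_edit(a, b):
                        found.add(tuple(sorted((a, b))))
    return found

# Toy string built from fragments quoted in this thread, padded with filler.
demo = "XCYANQQQPYANZZYNTTMIKKYNTTMWI"
print(sorted(one_edit_pairs(demo), key=len))
```

Running the same function on shuffled copies of the text gives the baseline to compare against, as in the 500 versus 400 comparison above.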
Could you calculate some odds? My 5-gram fragment test said 1 in about 2750.
Also, for all the parts together with my transcription for the first part:
498 bigram repeats while the average for shuffles is about 490.
93 trigram repeats while the average for shuffles is about 69.5.
Normally repeats are more significant at the bigram level than the trigram level but with this thing it is the other way around. What does that tell us about its bigrams?
YBMFPTUSYSNADSBYAACPTM YSNEMTMOIYJGTEBYNIAJ APSNDETMYLGAIDISYMI HFBYTFAAWOTYCMCHNEFTY IHSAWOYYHNEAWOWIWLRDI HAJBSASTFYTMMFdOETTD ITWYGMTIPDTTOFYANRSETL OACYANWYAJAHCTDSIHND ONIPEICNCOALOHMOOOT YPMDISW+OKIEYWOP,R,A AOTWFFFYOTYOMIHLYD WICSTOTIhhNCFBYGNLAO WALYCWOWTDSOLIOMWWW YNTLTYMYFYCNAWNMWSSI NDIDCWGOOBOXYAYOFW GYKODCIAALACNSuFMSIAF OJBaNTYIANaWJSWDET OMYHABEODIASMGWIdh+ TOTYAHNdTIAPASYOAYDTS WBAAF,YFNSMTMOTOHYA NSUJDWMABTdEKMMSA IHOMLTPYANWADNWTSWIS AIHNGTSTBFANTDWL IAAYYAARYNTTMIFLYTIA BYWYEWGOTAYNTTMWITA COSSOCAYSLIYATMTTOTT NAAMDOTWYNSOETBRNBAH TMAYSIDKINADRHBOFTW TTFBWSYJSBYKIWBMATO GTDALAYANIATADWCBYD DLOAHWIAFHWSTAYSTF OHSELTODTNOAMaHMA HWCUALFMTFYM.YNWTGTO G.AHTCHASOD.ATRISWNB. WISIAAOHB,BYCSTSIPYAA FITIYWSTASIABBaagi WSYCIATOYSIAWyd,BHS AIIMYMMFUIOIYRYCH CLEFAOWNHOTSADIHIW YCTTACBIWLFYTTiiM,L PIDOE,GSSIiMaTCIDLS DMSFHATHTLTIFTNWIA TOYLMJOMATLOIMFJMW YTOOMdYLDBLPCEAT
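The repeat counts versus shuffles can be reproduced along these lines. Note the assumption: I am counting a "repeat" as every occurrence of an n-gram beyond its first, which may not be exactly the definition used for the 498 and 93 figures above. The function names, trial count and seed are likewise just illustrative.

```python
import random
from collections import Counter

def repeats(text, n):
    """Number of n-gram occurrences beyond the first occurrence of each n-gram."""
    counts = Counter(text[i:i + n] for i in range(len(text) - n + 1))
    return sum(c - 1 for c in counts.values())

def shuffled_average(text, n, trials=1000, seed=0):
    """Average repeat count over random shuffles of the same symbols."""
    rng = random.Random(seed)
    chars = list(text)
    total = 0
    for _ in range(trials):
        rng.shuffle(chars)
        total += repeats("".join(chars), n)
    return total / trials

sample = "FYANRSETLOACYANWYAJAHCTDSIHND"   # short excerpt from the transcription above
print(repeats(sample, 2), shuffled_average(sample, 2, trials=200))
```

Spaces and line breaks should be stripped from the transcription before counting, otherwise n-grams spanning a line end are miscounted.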
Double letter frequency is above average, imo, for English language.
Yep, also very interesting.
Doublets and triplets frequencies are about normal versus shuffles.
Could you calculate some odds? My 5-gram fragment test said 1 in about 2750.
I haven’t had time to do that but it made me think about whether or not "clusters" are significant.
I.e., CYAN pairs with RYAN, and RYAN pairs with RYBN. Each pairing is related by an edit distance of one.
A text can have many such pairs, and they might be arising by chance.
However, you can define a "cluster" as a pile of ngrams that have a path consisting of single edits.
For example, the above can be represented by a cluster: CYAN -> RYAN -> RYBN. A single edit at each step links them together.
I’m curious whether such large clusters are present. What is an efficient algorithm for finding the largest clusters? A brute force search to consider all clusters would probably be slow.
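One possible approach, offered as a sketch rather than a tested answer: treat every n-gram as a node, link two nodes when they are one edit apart, and take connected components with a union-find structure. The links can be found without comparing all pairs by bucketing each n-gram under "one symbol removed" keys. The function name `clusters` and the bucketing scheme are my own assumptions.

```python
from collections import defaultdict

def clusters(ngrams):
    """Group n-grams into clusters connected by chains of single edits."""
    words = sorted(set(ngrams))
    parent = {w: w for w in words}

    def find(w):
        while parent[w] != w:
            parent[w] = parent[parent[w]]   # path halving
            w = parent[w]
        return w

    def union(a, b):
        parent[find(a)] = find(b)

    buckets = defaultdict(list)
    for w in words:
        for i in range(len(w)):
            # Same prefix and suffix around position i: one substitution apart.
            buckets[(w[:i], w[i + 1:])].append(w)
            # Dropping symbol i yields another n-gram: one insertion/deletion apart.
            shorter = w[:i] + w[i + 1:]
            if shorter in parent:
                union(w, shorter)
    for group in buckets.values():
        for other in group[1:]:
            union(group[0], other)

    grouped = defaultdict(set)
    for w in words:
        grouped[find(w)].add(w)
    return sorted(grouped.values(), key=len, reverse=True)

print(clusters(["CYAN", "RYAN", "RYBN", "PYAN", "WSTAYS", "WSTAS", "ABCD"]))
```

The work grows roughly with the number of n-grams times their length, so feeding it every fragment of length 4 to 8 from an 850-symbol text should be quick.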
The librarian found another page where someone circled letters. She sent me this picture and set the book aside for me to look at tomorrow, otherwise it is out of circulation as "damaged" I think. If the librarians have it ready for me as arranged for, I will look at every page in the book and take photos of anything interesting. I would not encourage contacting them, however. They seem as though they may not want this project to become part of their already busy job for a lot of different people. That is the impression that I got, but I am appreciative of what they will do for me as a member of the library.
That’s cool smokie.
The sequence is then "IAAYYASCNT".
And in the cipher the second part starts with "IAAYYA".
This is odd.
Nice job smokie! This is an important clue. What possessed the person who checked out the book to do this??
very interesting
I went to the library and inspected every page of the book. There are no other pages with writing on them, but the circles around the letters on page 22 were made, in my opinion, with the same pen as the symbols. The ink was black, and of the same shade and width. I took a lot of photos: the cover, spine, back, and the same pages. It would be redundant, I suppose, to re-show the same material, since the photos already available are of the same or better quality.
The first page of the cryptogram comes after the story and epilogue, on page 438. The second, short part is on page 439 (edited; I first wrote 438), and the last is on page 440. The pages aren’t actually numbered, so I continued the count past the epilogue. Here is something that I noticed, which was much easier to see in person: whoever did it crossed out or wrote over a couple of symbols. This is close to the bottom of page 438, the first page of the cryptogram.
I will post other pictures tomorrow, just probably the covers for interest. Otherwise, no new information. The librarians were very interested and I showed them that the first several symbols on page 440 match the same letters circled on page 22.
Some observations:
1. The last 16 characters of the 1st part and the whole 2nd part are almost identical:
AIHNGTSTBFANdTNL AIHNGTSTBFANTDWL
2. The size of the 1st part is 445 and the 3rd part is 395. If one were to arrange these parts in a 2D grid, the only "perfect" way would be:
Part 1: 5×89 (and inversely).
Part 3: 5×79 (and inversely).
It feels to me like the forced separation by 5 (79 and 89 being primes) could be by design.
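The grid claim is easy to check: list every way a length can be split into full rows and columns. A small sketch (the function name `grid_shapes` is just illustrative):

```python
from math import isqrt

def grid_shapes(n):
    """All (rows, cols) with rows <= cols, rows >= 2 and rows * cols == n."""
    return [(r, n // r) for r in range(2, isqrt(n) + 1) if n % r == 0]

for size in (445, 395):
    print(size, grid_shapes(size))
```

Both lengths have exactly one non-trivial factorisation, 5×89 and 5×79, which matches the observation above.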
YMBFPTYSYSNADSBYAACPTMYSNEMTMOIYJGTEBYNIAJAPSNDETMYLGAIDISYMIHFBYTFAAWOTYCMCHNEFTYIHSAWOY
YHNEAWOWIWLTDIHAJBSASTFYTMMFDOETTDITWYGMTIPDTTOFYANRSITLOACYANWYAJAHCTDSIHNDONIPEICNCOACO
HMOOOTYPMDISWTOKIEYWOPRAAOTWFFFYOTYOMIHLYDWICSTOTIHHNCFBYHNLAOWALYCWOWTDSOLIOMWWWYNTLTYMY
FYLNAWNMWSSINDIDCWGOOBOYAYOFWGYKODLIAALACNSUFMSIATOJBAHTYIANAWJSWDETOMYHABEODIASMGWIHDTTO
TYHNDTIAPASYOAYDTSWBAAFYFNSMTMOTOHYANSYJDWMABTDEKMMSAIHOMLTPYANWADNWTSWISAIHNGTSTBFANTDWL
AIHNGTSTBFANdTNL
IAAYYAARYNTTMIFLYTIABYWYEWGOTAYNTTMWITACOSSOCAYSLIYATMTTOTTNAAMDOTWYNSOETBRNBAH
TMAYSIDKINADRHBOFTWTTFBWSYJSBYKIWBMATOGTDALAYANIATADWCBYDDLOAHWIAFHWSTAYSTFOHSE
LTODTNOAMaHMAHWCUALFMTFYM.YNWTGTOG.AHTCHASOD.ATRISWNB.WISIAAOHB,BYCSTSIPYAAFITI
YWSTASIABBaagiWSYCIATOYSIAWyd,BHSAIIMYMMFUIOIYRYCHCLEFAOWNHOTSADIHIWYCTTACBIWLF
YTTiiM,LPIDOE,GSSIiMaTCIDLSDMSFHATHTLTIFTNWIATOYLMJOMATLOIMFJMWYTOOMdYLDBLPCEAT
Some observations:
1. The last 16 characters of the 1st part and the whole 2nd part are almost identical:
AIHNGTSTBFANdTNL AIHNGTSTBFANTDWL
I’m not seeing this. Where did you find these?
YMBFPTYSYSNADSBYAACPTMYSNEMTMOIYJGTEBYNIAJAPSNDETMYLGAIDISYMIHFBYTFAAWOTYCMCHNEFTYIHSAWOY
YHNEAWOWIWLTDIHAJBSASTFYTMMFDOETTDITWYGMTIPDTTOFYANRSITLOACYANWYAJAHCTDSIHNDONIPEICNCOACO
HMOOOTYPMDISWTOKIEYWOPRAAOTWFFFYOTYOMIHLYDWICSTOTIHHNCFBYHNLAOWALYCWOWTDSOLIOMWWWYNTLTYMY
FYLNAWNMWSSINDIDCWGOOBOYAYOFWGYKODLIAALACNSUFMSIATOJBAHTYIANAWJSWDETOMYHABEODIASMGWIHDTTO
TYHNDTIAPASYOAYDTSWBAAFYFNSMTMOTOHYANSYJDWMABTDEKMMSAIHOMLTPYANWADNWTSWISAIHNGTSTBFANTDWL
AIHNGTSTBFANdTNL
IAAYYAARYNTTMIFLYTIABYWYEWGOTAYNTTMWITACOSSOCAYSLIYATMTTOTTNAAMDOTWYNSOETBRNBAH
TMAYSIDKINADRHBOFTWTTFBWSYJSBYKIWBMATOGTDALAYANIATADWCBYDDLOAHWIAFHWSTAYSTFOHSE
LTODTNOAMaHMAHWCUALFMTFYM.YNWTGTOG.AHTCHASOD.ATRISWNB.WISIAAOHB,BYCSTSIPYAAFITI
YWSTASIABBaagiWSYCIATOYSIAWyd,BHSAIIMYMMFUIOIYRYCHCLEFAOWNHOTSADIHIWYCTTACBIWLF
YTTiiM,LPIDOE,GSSIiMaTCIDLSDMSFHATHTLTIFTNWIATOYLMJOMATLOIMFJMWYTOOMdYLDBLPCEAT
Double letter frequency is above average, imo, for the English language. Also, no repeating -ING-like structure can be found following the double letters. The last line of the first part is almost equal to the single line (second part) of the cipher.
‘OTY’ and ‘TDI’ are the most-repeated trigrams, each occurring only twice, e.g. in the first part of the cipher (~450 symbols). That is not enough to account for "AND" or "THE" in the cleartext.
Surely not a simple substitution cipher.
QT
Not that it’d be important…
By the way: the letters Q, V, X and Z are not present in the cipher text. This would imply a 22-letter alphabet, or only 22 letters of the alphabet being used.
Instead, small letters (a, g, i, y, d) as well as , and . occur in the third part of the cipher.
Thus, the cipher consists of 29 different symbols, correct?
These often occur together, such as ‘aagi’, ‘ii’ or ‘yd,’. They could be numbers (if it were a substitution cipher).
The sections separated by the . symbol look like separate words, but are not (no common word pattern). It might therefore be assumed either that it is not a substitution cipher or that , is just a homophone like any other symbol of the cipher.
The 5-gram ‘YNTTM’ occurs twice, but the two occurrences are very close to each other (a name?).
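The symbol counts discussed in this post can be double-checked with a few lines of Python. This is just a sketch; the function name `symbol_inventory` and the returned field names are my own, and the result depends on which transcription is pasted in.

```python
import string

def symbol_inventory(ciphertext):
    """Summarise which distinct symbols occur, ignoring whitespace."""
    symbols = {ch for ch in ciphertext if not ch.isspace()}
    upper = sorted(s for s in symbols if s in string.ascii_uppercase)
    lower = sorted(s for s in symbols if s in string.ascii_lowercase)
    other = sorted(symbols - set(upper) - set(lower))
    return {
        "missing_upper": sorted(set(string.ascii_uppercase) - symbols),
        "upper": upper,
        "lower": lower,
        "other": other,
        "total": len(symbols),
    }

# Tiny made-up sample; paste the full transcription here to count the real thing.
print(symbol_inventory("YNTTM.ad,Y"))
```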
YMBFPTYSYSNADSBYAACPTMYSNEMTMOIYJGTEBYNIAJAPSNDETMYLGAIDISYMIHFBYTFAAWOTYCMCHNEFTYIHSAWOY
YHNEAWOWIWLTDIHAJBSASTFYTMMFDOETTDITWYGMTIPDTTOFYANRSITLOACYANWYAJAHCTDSIHNDONIPEICNCOACO
HMOOOTYPMDISWTOKIEYWOPRAAOTWFFFYOTYOMIHLYDWICSTOTIHHNCFBYHNLAOWALYCWOWTDSOLIOMWWWYNTLTYMY
FYLNAWNMWSSINDIDCWGOOBOYAYOFWGYKODLIAALACNSUFMSIATOJBAHTYIANAWJSWDETOMYHABEODIASMGWIHDTTO
TYHNDTIAPASYOAYDTSWBAAFYFNSMTMOTOHYANSYJDWMABTDEKMMSAIHOMLTPYANWADNWTSWISAIHNGTSTBFANTDWL
AIHNGTSTBFANdTNL
IAAYYAARYNTTMIFLYTIABYWYEWGOTAYNTTMWITACOSSOCAYSLIYATMTTOTTNAAMDOTWYNSOETBRNBAH
TMAYSIDKINADRHBOFTWTTFBWSYJSBYKIWBMATOGTDALAYANIATADWCBYDDLOAHWIAFHWSTAYSTFOHSE
LTODTNOAMaHMAHWCUALFMTFYM.YNWTGTOG.AHTCHASOD.ATRISWNB.WISIAAOHB,BYCSTSIPYAAFITI
YWSTASIABBaagiWSYCIATOYSIAWyd,BHSAIIMYMMFUIOIYRYCHCLEFAOWNHOTSADIHIWYCTTACBIWLF
YTTiiM,LPIDOE,GSSIiMaTCIDLSDMSFHATHTLTIFTNWIATOYLMJOMATLOIMFJMWYTOOMdYLDBLPCEAT
The trigram ‘YAN’ is the most frequent one (0.576%). If, in the case of a substitution cipher, that trigram stood for one of the most frequent cleartext trigrams (e.g. AND, THE, ING, ENT, ION, HER, FOR, THA, NTH, INT, ERE, TIO, TER, EST, ERS, ATI, HAT, ATE, ALL, ETH, HES, VER, HIS, OFT, ITH, FTH, STH, OTH, RES, ONT etc.), one could read the YNTTM section as follows, with YNTTM itself occurring at a frequency of around 0.23% (twice in the cipher text):
AD___
TE___
IG___
…
OT___
And here is the problem:
Besides the fact that a repeating 5-gram in a cipher text of this length is already a slight statistical outlier (even the most frequent 5-gram, ‘OFTHE’, would only be expected at 0.18% rather than 0.23%), none of the 30 most frequent 5-grams has a double letter in positions 3 and 4 (‘ynTTm’).
Also, from the perspective of the 5-gram
HATTH
CHOOL
OUTTH
UALLY
BUTTH
WILLB
SBEEN
TWEEN
THEEN
COLLE
SUPPO
EBEEN
FOLLO
DIFFE
MISSI
CALLE
MILLI
CALLY
NALLY
CURRE
SUCCE
IALLY
COMME
EALLY
COMMI
LITTL
ENTTO
BETTE
ANTTO
RESSI
as the most frequent ones (top 1,000 most frequent 5-grams; complete list for the YNTTM word pattern): none of AD___, TE___ or IG___ gives a match between the 5-grams and the frequent 3-grams… another symptom of this not being a substitution cipher, imo.
The ‘first’ (statistically) occurring match would be:
‘YAN’ = WHI and ‘YNTTM’ = WILL_.
I doubt, however, that the most frequent trigram of such a text could be WHI instead of AND, THE, ING etc., as WHI is usually only about the 60th most frequent trigram, not the most frequent one. Also, W does not usually make up ~10% of a cleartext… I bet my 50 cents that this one is not a substitution cipher.
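The cross-check between the trigram and 5-gram candidates can be automated. The sketch below uses only the two lists quoted in this post and assumes a simple one-to-one substitution; the function names `consistent` and `matches` are made up for illustration.

```python
TRIGRAMS = ("AND THE ING ENT ION HER FOR THA NTH INT ERE TIO TER EST ERS "
            "ATI HAT ATE ALL ETH HES VER HIS OFT ITH FTH STH OTH RES ONT").split()

FIVEGRAMS = ("HATTH CHOOL OUTTH UALLY BUTTH WILLB SBEEN TWEEN THEEN COLLE "
             "SUPPO EBEEN FOLLO DIFFE MISSI CALLE MILLI CALLY NALLY CURRE "
             "SUCCE IALLY COMME EALLY COMMI LITTL ENTTO BETTE ANTTO RESSI").split()

def consistent(trigram, fivegram):
    """True if YAN -> trigram and YNTTM -> fivegram fit one simple substitution."""
    mapping = {}
    for cipher, plain in (("YAN", trigram), ("YNTTM", fivegram)):
        for c, p in zip(cipher, plain):
            # Each cipher symbol must always map to the same plain letter.
            if mapping.setdefault(c, p) != p:
                return False
    # Different cipher symbols must map to different plain letters.
    return len(set(mapping.values())) == len(mapping)

def matches(trigrams, fivegrams):
    return [(t, f) for t in trigrams for f in fivegrams if consistent(t, f)]

print(matches(TRIGRAMS, FIVEGRAMS))
print(matches(TRIGRAMS + ["WHI"], FIVEGRAMS))
```

With the lists as quoted, no frequent trigram pairs with any of the 5-grams, and adding WHI yields only the WHI / WILLB pairing mentioned above.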
QT
http://practicalcryptography.com/crypta … equencies/
*ZODIACHRONOLOGY*