Composer Edward Elgar once sent a secret message to a reverend’s daughter. So far, it has remained unsolved for more than 100 years. Elgar was the composer of the wonderful ‘Enigma’ and other musical masterpieces. In one of his compositions, clues were found on how he had encrypted the message.
The cipher:
The method:
It appears that we are dealing with a simple substitution cipher, and the messages in his notes can be solved:
"Marco Elgar"
"A very old cipher"
"Do you go to London"
This is actually confirmed by his cleartext note "Do you go to London tomorrow", written next to it in his notes. Once understood, the method is both quite easy and efficient: assume eight directions, two vertical (up and down), two horizontal (left and right) and four diagonal. In each direction he can use a symbol consisting of one, two or three brackets, which gives 8 x 3 = 24 symbols. He then places the alphabet on this transformation pattern. In his notes he did so in alphabetical order, e.g. ABC. However, he also shows that the order may be reversed (e.g. CBA). So in fact he divides the alphabet into eight groups of three letters, each of which he may place on any direction of the pattern, either forward or backward. Thus we get 8x7x6x5x4x3x2x1 = 40,320 arrangements times 2^8 = 256 forward/backward settings, a total of 10,321,920 potential settings of the alphabet. Whoever knows the setting is able to read the cleartext.
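The count can be checked with a few lines of Python. This is only a sketch: the group strings assume the same 24-letter grouping (I=J, U=V) as the script posted later in this thread, and the helper name `build_alphabet` is made up for illustration.

```python
from math import factorial

# Eight letter groups of three (24-letter alphabet, I=J and U=V assumed)
GROUPS = ["ABC", "DEF", "GHI", "KLM", "NOP", "QRS", "TVW", "XYZ"]

def build_alphabet(order, flips):
    """Place the groups in the given order; a flip of 1 reverses that group."""
    return "".join(GROUPS[i][::-1] if flips[i] else GROUPS[i] for i in order)

orders = factorial(8)      # 8x7x6x5x4x3x2x1 ways to arrange the eight groups
directions = 2 ** 8        # forward or backward for each group
total = orders * directions
print(orders, directions, total)

# One example setting: groups in reverse order, group ABC written backward
example = build_alphabet((7, 6, 5, 4, 3, 2, 1, 0), (1, 0, 0, 0, 0, 0, 0, 0))
print(example)
```

Each of the 10,321,920 settings corresponds to one `(order, flips)` pair, so a brute-force run simply walks through all of them.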
Recently I wrote a script to solve the cipher. The script analyzes all potential settings and compares each result with a dictionary (Aho-Corasick again). The program was not so easy to write, as it is a two-step encryption. Running through all settings, however, only takes approximately two hours of CPU computation.
On the first run, I searched for at least 2 words of length >5, which should exist in a text of almost 90 letters. No result. On the second run I searched for only one word of length >5. It found some solutions, however no complete cleartext. Currently the PC is running a search for more than 4 words of length >3, which should occur in such a cleartext, too. If that won’t do it, the cipher
– is not encrypted as given in Elgar’s notes (e.g. the letters were shuffled at will, thus billions of billions of potential variations)
– is in a different language, e.g. Latin or French
The latter is not so unlikely, as the date written under the cipher is actually the anniversary date of the ‘storming of the Bastille’.
We’ll see if it works..if so, the Edward Elgar society has claimed to pay 5,000 British pounds for a solution. If not, I’ll try French and Latin; otherwise I will give this one up.
Example of a pattern, definition of the cipher’s alphabet (code):
*ZODIACHRONOLOGY*
Too bad the key was never found in the reverend’s daughter’s possession. She most certainly had it.
Guess that’s true..
..let’s play a small game:
WHAT WOULD YOU WRITE TO THE DAUGHTER OF A REVEREND, IF YOU KEPT THE MESSAGE SECRET FROM HER FATHER?
Sorry my dear..last Sunday I met your mother inside the confessional box..and we both could not resist..
*ZODIACHRONOLOGY*
TEMPTATION
Too bad the key was never found in the reverend’s daughter’s possession. She most certainly had it.
No, she died never knowing the cipher solution. This is a pretty famous unsolved cipher.
https://en.wikipedia.org/wiki/Dorabella_Cipher
-glurk
——————————–
I don’t believe in monsters.
That is interesting! Makes me wonder if she had it, but didn’t know it.
Guess her father took it and threw it away..
*ZODIACHRONOLOGY*
Here we go with the Python source code, including the Aho-Corasick algorithm (pattern search).
However, no solution was found in English or French; Italian is currently in progress. The program only finds some nonsense cleartexts with up to 5 words in them..however no lexically correct solution.
UPDATE:
No ‘immediate’ solution was found in English, French or Italian. Therefore one may wonder if Edward Elgar had shuffled the letters in a completely random way over the pattern. Then 24! variations are possible (620,448,401,733,239,439,360,000 variations..the Chinese supercomputer "Tianhe-2" would need approximately 2,000-4,000 years to solve this cipher..).
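To get a feeling for that number, here is a small sketch. The test rates in the loop are purely illustrative assumptions (not measured values for any real machine), and `years_needed` is a made-up helper name.

```python
from math import factorial

variations = factorial(24)   # free assignment of 24 letters to 24 symbols
print(f"{variations:,}")

def years_needed(n_variations, keys_per_second):
    """Years needed to test every variation at a constant rate."""
    seconds_per_year = 365.25 * 24 * 3600
    return n_variations / keys_per_second / seconds_per_year

# ASSUMPTION: these rates are illustrative only
for rate in (1e9, 1e12, 1e15):
    print(f"{rate:.0e} keys/s -> {years_needed(variations, rate):,.0f} years")
```

Whatever rate one assumes, a plain exhaustive search over 24! keys is out of reach for a home PC, which is why the later posts try to prune the search.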
from collections import deque
from itertools import permutations, product

## Dictionary
lex = ['...']  ## Enter your dictionary here in format: ['word1', 'word2', ...]

## Trigram variety for intermediary (rotary): each group forward and backward
A_ = ["ABC", "CBA"]
B_ = ["DEF", "FED"]
C_ = ["GHI", "IHG"]
D_ = ["KLM", "MLK"]
E_ = ["NOP", "PON"]
F_ = ["QRS", "SRQ"]
G_ = ["TVW", "WVT"]
H_ = ["XYZ", "ZYX"]
groups = {"A": A_, "B": B_, "C": C_, "D": D_, "E": E_, "F": F_, "G": G_, "H": H_}

## The 87 cipher symbols as positions (0-23) in the 24-letter pattern alphabet
cipher = [
    7, 20, 10, 8, 6, 13, 0, 8, 15, 4, 11, 22, 21, 9, 22, 14, 22, 22, 13, 20,
    20, 22, 9, 3, 4, 3, 12, 11, 23, 0, 22, 0, 13, 3, 8, 15, 16, 8, 10, 22,
    22, 10, 13, 12, 21, 0, 22, 11, 22, 13, 1, 23, 21, 9, 3, 15, 15, 3, 11, 23,
    10, 23, 13, 1, 23, 10, 9, 1, 2, 12, 23, 10, 22, 12, 0, 21, 23, 12, 8, 20,
    12, 23, 13, 8, 9, 3, 8,
]

## Definition of alphabet (preset: all groups forward, alphabetical order)
dmaster = "".join(groups[k][0] for k in "ABCDEFGH")

## Cleartext chain from alphabet
cleartext = "".join(dmaster[i] for i in cipher)

## Aho-Corasick multiple pattern search algorithm
def init_trie(keywords):
    """ creates a trie of keywords, then sets fail transitions """
    create_empty_trie()
    add_keywords(keywords)
    set_fail_transitions()

def create_empty_trie():
    """ initialize the root of the trie """
    AdjList.append({'value': '', 'next_states': [], 'fail_state': 0, 'output': []})

def add_keywords(keywords):
    """ add all keywords in list of keywords """
    for keyword in keywords:
        add_keyword(keyword)

def find_next_state(current_state, value):
    for node in AdjList[current_state]["next_states"]:
        if AdjList[node]["value"] == value:
            return node
    return None

def add_keyword(keyword):
    """ add a keyword to the trie and mark output at the last node """
    current_state = 0
    j = 0
    keyword = keyword.lower()
    child = find_next_state(current_state, keyword[j])
    while child is not None:
        current_state = child
        j = j + 1
        if j < len(keyword):
            child = find_next_state(current_state, keyword[j])
        else:
            break
    for i in range(j, len(keyword)):
        node = {'value': keyword[i], 'next_states': [], 'fail_state': 0, 'output': []}
        AdjList.append(node)
        AdjList[current_state]["next_states"].append(len(AdjList) - 1)
        current_state = len(AdjList) - 1
    AdjList[current_state]["output"].append(keyword)

def set_fail_transitions():
    q = deque()
    for node in AdjList[0]["next_states"]:
        q.append(node)
        AdjList[node]["fail_state"] = 0
    while q:
        r = q.popleft()
        for child in AdjList[r]["next_states"]:
            q.append(child)
            state = AdjList[r]["fail_state"]
            while find_next_state(state, AdjList[child]["value"]) is None and state != 0:
                state = AdjList[state]["fail_state"]
            AdjList[child]["fail_state"] = find_next_state(state, AdjList[child]["value"])
            if AdjList[child]["fail_state"] is None:
                AdjList[child]["fail_state"] = 0
            AdjList[child]["output"] = AdjList[child]["output"] + AdjList[AdjList[child]["fail_state"]]["output"]

def get_keywords_found(line):
    """ returns a list of all keywords found in line, each with its start index """
    line = line.lower()
    current_state = 0
    keywords_found = []
    for i in range(len(line)):
        while find_next_state(current_state, line[i]) is None and current_state != 0:
            current_state = AdjList[current_state]["fail_state"]
        current_state = find_next_state(current_state, line[i])
        if current_state is None:
            current_state = 0
        else:
            for j in AdjList[current_state]["output"]:
                keywords_found.append({"index": i - len(j) + 1, "word": j})
    return keywords_found

# Main program
AdjList = []
init_trie(lex)

## Permutation for rotary set-up (8! = 40,320 orders of the groups A..H)
liste = [''.join(x) for x in permutations("ABCDEFGH")]

## FCCP-List creation and Aho-Corasick analysis
print('FCCP-List')
for rotary in liste:
    ## forward/backward setting for each of the groups A..H (2^8 = 256)
    for flips in product([0, 1], repeat=8):
        direction = dict(zip("ABCDEFGH", flips))
        ## Rotary-dependent alphabet configuration
        dmaster_rot = "".join(groups[k][direction[k]] for k in rotary)
        cleartext_rot = "".join(dmaster_rot[i] for i in cipher)
        found = get_keywords_found(cleartext_rot)
        if len(found) > 3:  ## Number of keywords to be found in cleartext_rot
            print(rotary, dmaster_rot, cleartext_rot, found)
print('List complete')
*ZODIACHRONOLOGY*
Update on the Dorabella cipher..
I rewrote both the Python program and the dictionary. The latter is now based on word roots and thus runs faster (e.g. searching for ‘SELF’ only instead of ‘HIMSELF’, ‘HERSELF’, ‘MYSELF’ etc.). I am still assuming that the cipher was written in English.
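How such a root list could be built is sketched below. This is only an assumption about one possible way to do it (dropping every word that already contains a shorter listed word), not necessarily how the dictionary used here was made; `reduce_to_roots` is a hypothetical helper.

```python
def reduce_to_roots(words, min_len=4):
    """Drop every word that contains a shorter listed word as a substring."""
    # Shortest words first, so that roots are seen before their compounds
    ordered = sorted({w.upper() for w in words}, key=lambda w: (len(w), w))
    roots = []
    for w in ordered:
        if len(w) < min_len:
            continue
        if not any(r in w for r in roots):
            roots.append(w)
    return roots

sample = ["himself", "herself", "myself", "self", "wagon", "wagons"]
print(reduce_to_roots(sample))
```

A hit on ‘SELF’ then stands for all of its compounds, so the trie gets smaller and near-duplicate hits disappear.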
It is possible to create a computable string of 28 letters and check all possible variations. Symbols with a frequency >5% are rather to be placed in the top alphabet segment (ETAOINSHRDLC) and vice versa. One may consider a group of letters, e.g. ABC, being placed on one segment of the pigpen-cipher-style circle Elgar had used ( http://2.bp.blogspot.com/-CCrOaObtH2E/U … tebook.gif). According to Elgar’s notes there would exist 2^8 x 8! = 10,321,920 variations.
With Python I was able to compute all of those variations; however, no useful result came out. This can only have three reasons:
A – Elgar did not ‘group’ the letters in groups of three, as he did in his notes
B – Elgar had used a different language, e.g. French or Latin
C – Transcription of the cipher went wrong (two or three symbols are not 100% clear if they go e.g. ‘down’ or ‘down to the right’)
Based on the ‘knowledge’ so far that none of the 10,321,920 variations delivers any useful result, I started to check possibility ‘A’, meaning that Elgar had shuffled his letters indiscriminately without considering any groups of three letters. As four cipher symbols haven’t been used at all (‘left 1’, ‘left 2’, ‘lower left 3’ and ‘upper right 3’; the numbers represent the number of half rings), a total of 20 letters may have been used in the key. Thus 20! or 2,432,902,008,176,640,000 variations exist (the encryption is quite efficient, imo….).
This variety may be reduced by distinguishing symbols that occur often, moderately or rarely (top, middle or lower alphabet frequency). In addition, the search procedure can be built up in multiple levels (e.g. finding one word of length >4 in a string of 10 letters, finding the next one in a second string etc. and finally finding at least three words in a string of 28 letters, which is what I currently do).
In addition, one may assume that the symbol occurring with a frequency of approximately 12.5% (‘upper left 2’) represents the letter ‘E’; a second letter may be entered into the Python program on each individual ‘run’.
One run currently takes approximately 5-6 hours. So far I was able to set the letter E plus the letter T, which hasn’t provided any useful results (we are talking about approximately 500,000 results that deliver a minimum of e.g. 3 words in the 28-letter string; however, those don’t fit together, e.g. two words overlapping).
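Overlapping hits could be filtered out automatically. The following is a minimal sketch, assuming hits in the list-of-dicts format that `get_keywords_found` returns; the greedy left-to-right choice and the name `non_overlapping` are my own assumptions, not part of the program described here.

```python
def non_overlapping(hits):
    """Greedy left-to-right selection of hits that do not share a position."""
    chosen, next_free = [], 0
    # Earlier start first; for equal starts prefer the longer word
    for hit in sorted(hits, key=lambda h: (h["index"], -len(h["word"]))):
        if hit["index"] >= next_free:
            chosen.append(hit)
            next_free = hit["index"] + len(hit["word"])
    return chosen

# 'long' (14-17) and the second 'gold' (17-20) share position 17
hits = [{'index': 4, 'word': 'gold'},
        {'index': 14, 'word': 'long'},
        {'index': 17, 'word': 'gold'}]
kept = non_overlapping(hits)
print(len(kept), [h["word"] for h in kept])
```

Counting only the kept hits against the minimum-word threshold would remove results whose words cannot all stand in the text at the same time.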
There is one section reading somewhat like ‘ABBA’, which rather excludes potential cleartext combinations such as TWWT etc. (with the ‘B’ symbol representing a non-frequent letter..thus rather no vowel).
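Such patterns are easy to locate by program. A small sketch follows (the function name `find_abba` is made up; the sample string is one candidate of the kind the program prints):

```python
def find_abba(text):
    """Return (index, substring) for every 4-letter ABBA pattern with A != B."""
    found = []
    for i in range(len(text) - 3):
        a, b, c, d = text[i:i + 4]
        if a == d and b == c and a != b:
            found.append((i, text[i:i + 4]))
    return found

print(find_abba("EGENSODRAWWAGONONSONRSPOONEO"))
```

Candidates in which the doubled middle letter comes out as something implausible could then be discarded before the dictionary search.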
Results currently look like this. Based on the chosen method, it should be possible to find a solution sooner or later (if things go well):
EWENGOLDACCAWOLONGOLDGZOOLEO [{'index': 4, 'word': 'gold'}, {'index': 14, 'word': 'long'}, {'index': 17, 'word': 'gold'}]
EGENSODRAWWAGONONSONRSPOONEO [{'index': 6, 'word': 'draw'}, {'index': 10, 'word': 'wagon'}, {'index': 21, 'word': 'spoon'}]
EGENSODRAWWAGONONSONRSPHONEH [{'index': 6, 'word': 'draw'}, {'index': 10, 'word': 'wagon'}, {'index': 22, 'word': 'phone'}]
EGENRODRAWWAGONONRONRRPHONEH [{'index': 6, 'word': 'draw'}, {'index': 10, 'word': 'wagon'}, {'index': 22, 'word': 'phone'}]
EGENROLLAWWAGONONRONLRPHONEH [{'index': 2, 'word': 'enroll'}, {'index': 10, 'word': 'wagon'}, {'index': 22, 'word': 'phone'}]
EGENHODRAWWAGONONHONRHPHONEH [{'index': 6, 'word': 'draw'}, {'index': 10, 'word': 'wagon'}, {'index': 22, 'word': 'phone'}]
EGENHOURAWWAGONONHONRHPHONEH [{'index': 4, 'word': 'hour'}, {'index': 10, 'word': 'wagon'}, {'index': 22, 'word': 'phone'}]
EGENHOLDAWWAGONONHONDHPHONEH [{'index': 4, 'word': 'hold'}, {'index': 10, 'word': 'wagon'}, {'index': 22, 'word': 'phone'}]
EGENLOSSAWWAGONONLONSLPHONEH [{'index': 4, 'word': 'loss'}, {'index': 10, 'word': 'wagon'}, {'index': 22, 'word': 'phone'}]
EGENLODRAWWAGONONLONRLPHONEH [{'index': 6, 'word': 'draw'}, {'index': 10, 'word': 'wagon'}, {'index': 22, 'word': 'phone'}]
EGENLOSGAWWAGONONLONGLPHONEH [{'index': 10, 'word': 'wagon'}, {'index': 17, 'word': 'long'}, {'index': 22, 'word': 'phone'}]
EGENLORGAWWAGONONLONGLPHONEH [{'index': 10, 'word': 'wagon'}, {'index': 17, 'word': 'long'}, {'index': 22, 'word': 'phone'}]
EGENLOHGAWWAGONONLONGLPHONEH [{'index': 10, 'word': 'wagon'}, {'index': 17, 'word': 'long'}, {'index': 22, 'word': 'phone'}]
EGENLOLGAWWAGONONLONGLPHONEH [{'index': 10, 'word': 'wagon'}, {'index': 17, 'word': 'long'}, {'index': 22, 'word': 'phone'}]
EGENLODGAWWAGONONLONGLPHONEH [{'index': 10, 'word': 'wagon'}, {'index': 17, 'word': 'long'}, {'index': 22, 'word': 'phone'}]
EGENLOCGAWWAGONONLONGLPHONEH [{'index': 10, 'word': 'wagon'}, {'index': 17, 'word': 'long'}, {'index': 22, 'word': 'phone'}]
EGENLOUGAWWAGONONLONGLPHONEH [{'index': 10, 'word': 'wagon'}, {'index': 17, 'word': 'long'}, {'index': 22, 'word': 'phone'}]
EGENLOMGAWWAGONONLONGLPHONEH [{'index': 10, 'word': 'wagon'}, {'index': 17, 'word': 'long'}, {'index': 22, 'word': 'phone'}]
EGENLOFGAWWAGONONLONGLPHONEH [{'index': 10, 'word': 'wagon'}, {'index': 17, 'word': 'long'}, {'index': 22, 'word': 'phone'}]
EGENLOPGAWWAGONONLONGLPHONEH [{'index': 10, 'word': 'wagon'}, {'index': 17, 'word': 'long'}, {'index': 22, 'word': 'phone'}]
EGENLOGGAWWAGONONLONGLPHONEH [{'index': 10, 'word': 'wagon'}, {'index': 17, 'word': 'long'}, {'index': 22, 'word': 'phone'}]
EGENLOWGAWWAGONONLONGLPHONEH [{'index': 10, 'word': 'wagon'}, {'index': 17, 'word': 'long'}, {'index': 22, 'word': 'phone'}]
EGENDODRAWWAGONONDONRDPHONEH [{'index': 6, 'word': 'draw'}, {'index': 10, 'word': 'wagon'}, {'index': 22, 'word': 'phone'}]
EGENCODRAWWAGONONCONRCPHONEH [{'index': 6, 'word': 'draw'}, {'index': 10, 'word': 'wagon'}, {'index': 22, 'word': 'phone'}]
EGENUODRAWWAGONONUONRUPHONEH [{'index': 6, 'word': 'draw'}, {'index': 10, 'word': 'wagon'}, {'index': 22, 'word': 'phone'}]
EGENMODRAWWAGONONMONRMPHONEH [{'index': 6, 'word': 'draw'}, {'index': 10, 'word': 'wagon'}, {'index': 22, 'word': 'phone'}]
EGENFODRAWWAGONONFONRFPHONEH [{'index': 6, 'word': 'draw'}, {'index': 10, 'word': 'wagon'}, {'index': 22, 'word': 'phone'}]
EGENFOURAWWAGONONFONRFPHONEH [{'index': 4, 'word': 'four'}, {'index': 10, 'word': 'wagon'}, {'index': 22, 'word': 'phone'}]
EGENFORMAWWAGONONFONMFPHONEH [{'index': 4, 'word': 'form'}, {'index': 10, 'word': 'wagon'}, {'index': 22, 'word': 'phone'}]
EGENPODRAWWAGONONPONRPPHONEH [{'index': 6, 'word': 'draw'}, {'index': 10, 'word': 'wagon'}, {'index': 22, 'word': 'phone'}]
EGENGODRAWWAGONONGONRGPHONEH [{'index': 6, 'word': 'draw'}, {'index': 10, 'word': 'wagon'}, {'index': 22, 'word': 'phone'}]
EGENGOLDAWWAGONONGONDGPHONEH [{'index': 4, 'word': 'gold'}, {'index': 10, 'word': 'wagon'}, {'index': 22, 'word': 'phone'}]
EGENWODRAWWAGONONWONRWPHONEH [{'index': 6, 'word': 'draw'}, {'index': 10, 'word': 'wagon'}, {'index': 22, 'word': 'phone'}]
EGENWORDAWWAGONONWONDWPHONEH [{'index': 4, 'word': 'word'}, {'index': 10, 'word': 'wagon'}, {'index': 22, 'word': 'phone'}]
*ZODIACHRONOLOGY*
Sommmmmething new:
So far I had programmed a Python tool to analyze and hopefully solve the Dorabella cipher. The problem with it was that it worked sort of homophone style rather than strictly monoalphabetic (substitution cipher). This has led to multiple variations which in practice wouldn’t occur, and thus to more potential solutions, computation etc.
This is why I dug deeper into permutations (wasn’t that easy..) and grouped the symbols as occurring frequently, moderately or rarely. By doing so, the computation effort can at least be kept somewhat low.
The program is still running, with far fewer results than before (which is actually good, as we are looking for only one, the valid solution). I thought I would place the Python code here. It builds up an array of ‘chains’ according to which symbol sits at which position of the cipher and which letter from the three alphabetical groups (above) represents it. All chains are then checked for words from a pigpen dictionary (e.g. HAUE instead of HAVE). So far, no spectacular results have occurred. The 12.5% symbol is assumed to be represented by ‘E’; the second most frequent one is entered individually (currently ‘T’, may be modified in ‘alpha_top’). Symbols are ‘named’ after their direction, e.g. ro1 is German for ‘rechts oben eins’ (upper right one, would be ‘ur1’ in English).
The program covers the first 26 letters of the cipher, which should actually provide at least 4-6 ‘hits’ in the dictionary.
lex = ['HAUE', ...]  ## dictionary
alpha_top = ["A", "O", "I", "N", "S", "R", "H", "L", "D", "C"]  ## w/o E, ro1
alpha_mid = ["R", "H", "L", "D", "C", "U", "M", "F", "P", "G", "W"]
alpha_low = ["F", "P", "G", "W", "Y", "B", "K", "X", "Q", "Z"]
## oo1 = 0 .. lo3 = 0  (variable definitions)
## AHO CORASICK ... (place here Aho-Corasick or any other linguistic analyzer function)
import re
from itertools import permutations

AdjList = []
init_trie(lex)
...
print('FCCP-List')
lo2 = 'E'
ro1 = input("ro1: ")
for t in permutations(["A", "O", "I", "N", "S", "R", "H", "L"], 3):  ## Amount of different letters from 'top'
    for m in permutations(["H", "L", "D", "C", "U", "M", "F", "P", "G", "W"], 7):  ## Amount of different letters from 'mid'
        for l in permutations(["G", "W", "Y", "B", "K", "Z"], 3):  ## Amount of different letters from 'low'
            rr2 = l[0]
            ll3 = m[0]
            ru2 = t[0]
            rr3 = t[1]
            rr1 = l[1]
            uu2 = t[2]
            oo1 = m[1]
            lu1 = m[2]
            ro2 = m[3]
            ru3 = m[4]
            lo1 = m[5]
            ru1 = m[6]
            uu3 = l[2]
            chain = rr2+ll3+ru2+rr3+rr1+uu2+oo1+rr3+lu1+ro2+ru3+lo2+lo1+ru1+lo2+uu3+lo2+lo2+uu2+ll3+ll3+lo2+ru1+ro1+ro2+ro1
            if len(get_keywords_found(chain)) > 1:  ## Minimum Keywords
                print(chain, get_keywords_found(chain))
print('List complete')
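For orientation, the number of chains that the three nested permutation loops in the listing generate per run can be computed directly. A small sketch (`math.perm` needs Python 3.8 or newer):

```python
from math import perm

top = perm(8, 3)     # 3 letters drawn from the 8-letter 'top' list
mid = perm(10, 7)    # 7 letters drawn from the 10-letter 'mid' list
low = perm(6, 3)     # 3 letters drawn from the 6-letter 'low' list
chains = top * mid * low
print(f"{top} x {mid} x {low} = {chains:,} chains per run")
```

That is many orders of magnitude below 20!, which is the whole point of sorting the symbols into frequency groups.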
The program is now running quite fine; the latest result was e.g.
ZLAIKOWIUDGEPFEBEEOLLEFTDT [{'index': 7, 'word': 'iudge'}, {'index': 20, 'word': 'left'}]
which obviously is not the final solution yet. Figuring out the correct way of permutation was quite tough, as the number of distinct letters has to be known at the beginning although the chains are built later on..whatever, it now works. Based on the Dorabella pigpen cipher style, the word ‘judge’ is correctly found as ‘iudge’ (I=J and U=V), too.
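Converting an ordinary word list into this pigpen spelling is a one-liner. A minimal sketch (`to_pigpen` is a made-up name, and the three sample words are only for illustration):

```python
# J is written as I and V as U, as in 'iudge' and 'HAUE'
PIGPEN_MAP = str.maketrans({"J": "I", "V": "U"})

def to_pigpen(word):
    """Upper-case a word and merge J into I and V into U."""
    return word.upper().translate(PIGPEN_MAP)

lex = [to_pigpen(w) for w in ["judge", "have", "left"]]
print(lex)
```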
*ZODIACHRONOLOGY*
Some stuff regarding the Dorabella:
http://aplaceofbrightness.blogspot.com/ … poser.html
*ZODIACHRONOLOGY*
Dorabella cipher transcription:
QURAYONACWLEDHEZEEOUUEHSWSILTNENOSACMAREEROIDNELEOPTDHSCCSLTRTOPTRHTVITREOPDTIAUITOAHSA
(Positions 4, 7, 10, 23, 32, 34, 72 and 85 may vary, depending on how one reads the original)
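The symbol statistics of this transcription can be checked with a short sketch. Note that the letters are only labels for the cipher symbols, not cleartext:

```python
from collections import Counter

cipher = ("QURAYONACWLEDHEZEEOUUEHSWSILTNENOSACMAREEROIDNELEOPTDHSCC"
          "SLTRTOPTRHTVITREOPDTIAUITOAHSA")
counts = Counter(cipher)
print(len(cipher), "symbols,", len(counts), "distinct")
for symbol, n in counts.most_common(5):
    print(symbol, n, f"{100 * n / len(cipher):.2f}%")
```

The symbol labelled ‘E’ occurs 11 times in 87 positions, which is where the figure of roughly 12.6% comes from.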
The following is (working) Python code, which should actually solve the Dorabella cipher.
As shown here, the code is incomplete because some lists (e.g. ‘alphabet’), variables (e.g. R, H, etc.) as well as an appropriate word search algorithm (e.g. Aho-Corasick) have to be set up prior to the main code. A dictionary is required, too, also as a list. The dictionary is the hardest part, as e.g. near-duplicates (works, worked, working, worker etc.) distort the results found…so do many past tense forms: ‘play vs. played’ but not ‘pay vs. paid’.
The program assumes a substitution with 26 - 6 = 20 different letters being placed on the various symbols, i.e. 20! or about 2.4 x 10^18 variations (2.4 quintillion). Not as easy as Zodiac’s mother but it shows that a substitution cipher can still be hard to crack.
The program helps itself with a trick: it starts searching for any word in a sequence (e.g. ‘chain1’), a pre-selected part of the cipher. If one is found, all of its letters are deleted from the available alphabet (substitution..). Then the program continues to search for any other word in a second, different sequence of the cipher (e.g. ‘chain2’). By doing so, the 2.4 quintillion variations are dramatically reduced to only those where words have been found. All other combinations are automatically discarded. To set up the first chain, only approximately 200-300 trigrams plus four alphabetical letters are needed (est. 137m variations, which is computable). FCCP-method.
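The two figures can be reproduced as follows. This is a sketch only; it takes 300 as the upper end of the ‘200-300 trigrams’ and lets each of the four single letters range over all 26 letters:

```python
from math import factorial

n_trigrams = 300       # upper end of the 200-300 trigrams
free_symbols = 4       # R, H, V, I each range over A-Z
first_stage = n_trigrams * 26 ** free_symbols
full_space = factorial(20)

print(f"first stage: {first_stage:,}")
print(f"full space:  {full_space:,}")
print(f"ratio:       {full_space / first_stage:.2e}")
```

Only the configurations that survive the first stage are expanded further, so the later chains never see most of the full key space.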
The program indeed finds results in e.g. two or more different strings. So far, however, it does not deliver a final result with words in all of the strings (although I had expected such an outcome). This can have multiple reasons:
1. Language
If the cipher is in French, Latin or Swahili, the program doesn’t work unless a dictionary of that language is used. Also, Edward Elgar might have used different language than we expect (city names, old English etc.)
2. Dictionary
Currently a ~5,000 word list is used; most of the words are of length >4 letters. If Edward Elgar had used only shorter words in one chain, the result will be null.
3. Encryption method
If Edward Elgar had used an encryption method other than substitution, the program doesn’t work either.
4. Transcription
Transcription is not as easy as it looks..the handwriting of Edward Elgar was quite lousy, so especially in the third part of the cipher some symbols are not very easy to read. This is better in the rest of the cipher, but transcription errors will kill the program, too.
5. Pre-Settings
Choosing the wrong trigrams (I used the 200-300 most frequent ones) already leads to a dead end. So does the assumption that the symbol occurring with a frequency of 12.64% actually represents the letter ‘E’, which is the second pre-setting (assumption).
Any thoughts are welcome.
Pseudo-Code:
Lex = Dictionary
Cipher = QURAYONACWLEDHEZEEOUUEHSWSILTNENOSACMAREEROIDNELEOPTDHSCCSLTRTOPTRHTVITREOPDTIAUITOAHSA
Alphabet = A-Z
Trig = THE, AND, THO, WHE,...
chain1 = TRTOPTRHTVITR
R, H, V, I = A-Z
O, P, T = Trig(0,1,2)
Search chain1 for lex
If results > 0
...remove R,H,V,I,O,P,T from Alphabet
...chain2 = ......
Show results found
Real code (incomplete, for Pythonists):
## To be set up beforehand: alphabet, nonfrequent, trigrams (lists), E = 'E'
## and get_keywords_found() from the word search algorithm
print('FCCP-List')
for R, H, V, I, trig in [(R, H, V, I, trig) for R in alphabet for H in alphabet
                         for V in nonfrequent for I in alphabet for trig in trigrams]:
    ## cipher section TRTOPTRHTVITREOP with trig = O, P, T
    chain1 = trig[2]+R+trig[2]+trig+R+H+trig[2]+V+I+trig[2]+R+E+trig[0]+trig[1]  # 16 counts
    used1 = {R, H, V, I}
    ## R, H, V, I must differ from each other and from the trigram letters
    if len(used1) == 4 and not used1 & set(trig):
        if len(get_keywords_found(chain1)) > 0:  ## Minimum keywords in chain1
            #print(chain1, get_keywords_found(chain1))
            ## letters taken by chain1 are removed from the alphabet (not restored later)
            for letter in (R, H, V, I, trig[0], trig[1], trig[2]):
                if letter in alphabet:
                    alphabet.remove(letter)
            for D, N, L in [(D, N, L) for D in alphabet for N in alphabet for L in alphabet]:
                chain2 = R+E+E+R+trig[0]+I+D+N+E+L+E+trig+D+H  # 16 counts
                used2 = {D, N, L}
                taken2 = used1 | set(trig)
                if len(used2) == 3 and not used2 & taken2:
                    if len(get_keywords_found(chain2)) > 0:  ## Minimum keywords in chain2
                        #print(chain1, chain2, get_keywords_found(chain1), get_keywords_found(chain2))
                        for letter in (D, N, L):
                            if letter in alphabet:
                                alphabet.remove(letter)
                        for A, S, U in [(A, S, U) for A in alphabet for S in alphabet for U in nonfrequent]:
                            chain3 = trig[2]+R+E+trig[0]+trig[1]+D+trig[2]+I+A+U+I+trig[2]+trig[0]+A+H+S+A  # 17 counts
                            used3 = {A, S, U}
                            taken3 = taken2 | used2
                            if len(used3) == 3 and not used3 & taken3:
                                if len(get_keywords_found(chain3)) > 0:  ## Minimum keywords in chain3
                                    print(chain1, chain2, chain3, get_keywords_found(chain1),
                                          get_keywords_found(chain2), get_keywords_found(chain3))
                                    for letter in (A, S, U):
                                        if letter in alphabet:
                                            alphabet.remove(letter)
                                    for C, W, Z in [(C, W, Z) for C in alphabet for W in nonfrequent for Z in nonfrequent]:
                                        chain4 = trig[0]+N+A+C+W+L+E+D+H+E+Z+E+E+trig[0]+U+U+E+H+S+W+S+I+L+trig[2]+N+E+N+trig[0]+S+A+C  # 31 counts
                                        if len({C, W, Z}) == 3:
                                            if len(get_keywords_found(chain4)) > 0:  ## Minimum keywords in chain4
                                                chain5 = chain4+"_M_"+A+chain2+S+C+C+S+L+chain1+chain3
                                                print(chain5, get_keywords_found(chain4), get_keywords_found(chain2),
                                                      get_keywords_found(chain1), get_keywords_found(chain3))
*ZODIACHRONOLOGY*
Update:
A small error was found: it does matter whether the program first iterates over the (single) alphabet letters or over the trigrams.
This was significant because, in the first version, most of the trigrams were automatically ruled out before they were ever processed (just because one single letter occurs in them). If the iteration runs over the trigrams first, however, the remaining letters can still be placed afterwards (but, somehow, not vice versa). There are way more results now.
for trig, R, H, V, I in [(trig,R,H,V,I) for trig in trigrams for R in alphabet
instead of
for R, H, V, I, trig in [(R,H,V,I,trig) for R in alphabet for H in alphabet...
This leads to results like the following.
Please note that the order in which the letters occur is identical (!) to the selected sequences from the Dorabella cipher. Nevertheless, different words (potential cleartext) can be found: any word at any position of chain1 is cross-checked with any word at any position of chain2, and so on.
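One likely contributor to this order effect, assuming the in-place `alphabet.remove(...)` calls in the listing are responsible, is that every removal shrinks the shared letter pool for all later candidates. The toy sketch below is not taken from the script; the function names and the three sample trigrams are made up for illustration. It compares a shared list that is modified in place with a fresh pool computed per candidate.

```python
import string

def pools_with_shared_list(candidates):
    """Pool sizes when letters are removed from one shared list (hypothetical helper)."""
    alphabet = list(string.ascii_uppercase)
    sizes = []
    for cand in candidates:
        for ch in cand:
            if ch in alphabet:
                alphabet.remove(ch)  # permanent: later candidates see a smaller pool
        sizes.append(len(alphabet))
    return sizes

def pools_with_fresh_copy(candidates):
    """Pool sizes when each candidate gets its own filtered copy (hypothetical helper)."""
    alphabet = list(string.ascii_uppercase)
    sizes = []
    for cand in candidates:
        pool = [ch for ch in alphabet if ch not in cand]  # shared list stays intact
        sizes.append(len(pool))
    return sizes

print(pools_with_shared_list(["AND", "ING", "THE"]))  # [23, 21, 18]
print(pools_with_fresh_copy(["AND", "ING", "THE"]))   # [23, 23, 23]
```

With the shared list, the third trigram only has 18 letters left to combine with, although it uses just three letters itself; with a fresh copy the pool no longer depends on which candidates were tried before.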
QT
*ZODIACHRONOLOGY*
Another update: the program works. Approx. 7 trillion variations are done for the trigram ‘AND’, with ‘E’ as the most frequent symbol (12.6%). This gave a total of ~6,000 results, most of them near repetitions. The net results (counting identical word combinations only once) number no more than 10-15.
‘ING’ is next as the repeating trigram, with another 7 trillion variations. One trigram after another, in order of their frequency.
That makes about 15 cleartext combinations out of 7 trillion: words of length >4 in strings of length 16 or more (up to 30 letters in a row), where such a word, one out of 5,000, can be found in each string (or ‘part’ of the cipher).
Python is not as fast as other programming languages, but it is going nicely.
QT
*ZODIACHRONOLOGY*
Dorabella..
trying to crack it as a substitution cipher.
The last four letters of the cipher form a sort of A__A structure.
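As a minimal sketch of what such a structure means for candidate words: the first and fourth of the last four letters are the same, and, because different symbols stand for different letters in a substitution cipher, the two letters in between are assumed to differ from it. The function name `matches_axxa_ending` and the sample words are hypothetical and only serve as illustration.

```python
def matches_axxa_ending(word):
    """True if the last four letters look like A__A (hypothetical helper).

    Assumption: the two middle letters must differ from the outer letter,
    since they are written with different cipher symbols.
    """
    tail = word[-4:].lower()
    if len(tail) < 4:
        return False
    return tail[0] == tail[3] and tail[1] != tail[0] and tail[2] != tail[0]

candidates = ["barb", "that", "else", "noon", "width", "kayak"]
print([w for w in candidates if matches_axxa_ending(w)])
# ['barb', 'that', 'else', 'noon']
```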
Assuming that the most frequent symbol represents the letter ‘E’ (~12.6%) and that we deal with a word of length >3, only certain words match that pattern. We may then add a repeating trigram (from a list of ~300 common trigrams) plus four symbols, each ranging over A-Z. This gives a string of length 20 plus a blank plus the A__A structure. The idea is to search for at least two words of length >4 in such a 20-symbol string. If they are found, the configuration might be interesting. With a 4,500-word dictionary this is equal to
26x22x21x20x19x18x17x16x4500 = 100.6 Trillion
for each repeating trigram. With a pre-configured list of trigrams as well as a pre-configured (Cryptool) list of A__A word endings, Python is currently running through it. Under the given preconditions, this covers a total of 100.6 trillion attempts to get a two-word cleartext from the dictionary in that cipher section. It takes only approximately one hour for each trigram, because two words have to be found as well as the A__A word ending.
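The figure above can be checked with a few lines of Python. The factors are copied from the product as written; the variable names are only illustrative.

```python
from math import prod

# Factors exactly as given in the product above
letter_factors = [26, 22, 21, 20, 19, 18, 17, 16]
dictionary_size = 4500

letter_settings = prod(letter_factors)
total = letter_settings * dictionary_size

print(f"{letter_settings:,}")   # 22,348,085,760
print(f"{total:,}")             # 100,566,385,920,000
print(round(total / 1e12, 1))   # 100.6 (trillion)
```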
To exclude accidental findings (e.g. ‘EXCLUDE’ and ‘EXCLUDED’) I had to rework the entire dictionary. It took hours, but it was worth it (you may call it a word root dictionary). We will see if there is a result among the first few trigrams.
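One way such a word root dictionary could be built is sketched below. This is an assumption about the approach, not the procedure actually used here: a word is kept only if no shorter kept word is a prefix of it. The function name `root_dictionary` and the `min_len` parameter are hypothetical; `min_len=5` mirrors the "length >4" rule.

```python
def root_dictionary(words, min_len=5):
    """Keep only words that do not start with another kept word (hypothetical helper)."""
    unique = {w.lower() for w in words if len(w) >= min_len}
    roots = []
    # Shortest words first, so that a root is always seen before its longer forms
    for word in sorted(unique, key=lambda w: (len(w), w)):
        if not any(word.startswith(root) for root in roots):
            roots.append(word)
    return roots

sample = ["exclude", "excluded", "excludes", "match", "matches", "width", "at"]
print(root_dictionary(sample))
# ['match', 'width', 'exclude']
```

The quadratic prefix check is slow in principle, but for a dictionary of a few thousand words it should finish quickly.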
Not many useful results have been found so far, but they look like this:
TMTWITMATCHTMEWIDTHB_HBARB [{'Pos': 6, 'W': 'match'}, {'Pos': 14, 'W': 'width'}]
TPTWITPATCHTPEWIDTHB_HBARB [{'Pos': 6, 'W': 'patch'}, {'Pos': 14, 'W': 'width'}]
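Hits in this format can be reproduced with a very small stand-in for the keyword search. The sketch below uses plain `str.find` scanning rather than Aho-Corasick, so it is only suitable for checking single strings, not for trillions of candidates. The function name `find_keywords` and the three-word list are hypothetical.

```python
def find_keywords(text, dictionary):
    """Return every dictionary word found in text with its 0-based position.

    Naive stand-in for the real search: one str.find scan per word.
    """
    text = text.lower()
    hits = []
    for word in dictionary:
        start = text.find(word)
        while start != -1:
            hits.append({'Pos': start, 'W': word})
            start = text.find(word, start + 1)
    return sorted(hits, key=lambda hit: hit['Pos'])

words = ["match", "patch", "width"]
print(find_keywords("TMTWITMATCHTMEWIDTHB_HBARB", words))
# [{'Pos': 6, 'W': 'match'}, {'Pos': 14, 'W': 'width'}]
```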
QT
*ZODIACHRONOLOGY*