Some update..
The following are the last 50 symbols of the Dorabella cipher (transcription):
Assuming that the most frequent symbol (12.6% of the cipher) represents the letter ‘E’, I set up a list of candidates for the EOPT section of the cipher in advance. It contains approximately 300 values and looks roughly like this:
EAND
EING
ETIO
EATI
EFOR
ETHA
ECON
..
For each EOPT, additional letters were then added: A, H, S, then R, V, I, D and finally L, N. This leads to two longer ‘fictitiously created cipher plaintext’ (FCCP) strings. Those FCCP strings can then be searched for words (Aho-Corasick algorithm).
The Python program has been running for approximately one week, using a dictionary of approximately 5,000 word roots (e.g. containing ‘accept’ but not ‘accepted’, to avoid finding duplicates). The settings I’ve chosen were:
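As a rough illustration of the word search, here is a minimal Aho-Corasick sketch in plain Python. It is not the program used for the runs described here; the function names `build_automaton` and `find_words` are made up for this example, and the dictionary is reduced to four words.

```python
from collections import deque

def build_automaton(words):
    """Build a small Aho-Corasick automaton (goto table, failure links, outputs)."""
    goto = [{}]   # goto[state][char] -> next state
    fail = [0]    # failure link of each state
    out = [[]]    # words that end when a state is reached
    for word in words:
        state = 0
        for ch in word:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(word)
    # Breadth-first pass: set failure links and inherit outputs from them
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            out[nxt] = out[nxt] + out[fail[nxt]]
    return goto, fail, out

def find_words(text, automaton):
    """Return (start position, word) for every dictionary word found in text."""
    goto, fail, out = automaton
    state = 0
    hits = []
    for i, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for word in out[state]:
            hits.append((i - len(word) + 1, word))
    return hits

automaton = build_automaton(["NEVER", "MAGIC", "ALARM", "WHALE"])
hits = find_words("YLEELRHGNEVERMAGIC__CVALARMALIAWHALERMGAHY_HYICY", automaton)
print(hits)
```

The advantage over checking each word separately is that the string is scanned only once, no matter how many words the dictionary contains.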
– Most frequent symbol is ‘E’
– OPT is one of the 300 most frequent trigrams (repeating n-gram)
– Last word in the cipher has a length of >2 letters
– All other letters may be A-Z
>> Find minimum two words of length >4 letters in ‘chain 2’ and
>> Find minimum two words of length >4 letters in ‘chain 3’
Results, so far, have been promising, for example (blanks represent letters not yet added):
YLEELRHGNEVERMAGIC__CVALARMALIAWHALERMGAHY_HYICY
Although this is obviously not the correct solution, approximately 43,000 such results have been found. After deleting duplicates of the (combined) words found, a list of 219 potential cleartext phrases remains on the table.
To get those, a total of
200 x 23 x 22 x 19 x 18 x 17 x 16 x 15 x 14 x 13 = 36.1 Trillion
variations have been checked against the 5,000 word dictionary.
The computer has been running for approximately one week. Each word of the dictionary was checked at each position of each string. Thus, one can imagine what is currently going on inside the Intel i5 processor (2/4 threads)… I estimate the coverage at 5.4 quintillion trials to place any word of the dictionary at any position of the string. Nobody should ever say again that Python is ‘slow’. All this, of course, is only possible thanks to the ‘IF’ conditions in the main program, e.g. IF there are not at least two words found in ‘chain 2’, then ‘chain 3’ is not even checked at all.
After a quick first look at the results, there appears to be no valid cleartext. This may have different causes:
– cleartext word not present in the dictionary
– completely different cleartext language (e.g. French or Latin)
– wrong pre-settings (e.g. only shorter words present in the cleartext)
– transcription error
Searching for two words of length >4 in an 18-letter string, however, seems to be a bit ‘risky’. Therefore, the next run will search for only one word of length >4 in ‘chain 3’. Most likely, this will lead to many more results; we will see.
QT
*ZODIACHRONOLOGY*
The program runs quite well after adding another string and reducing the number of words to be found in one section of the cipher (now at least four words of length >4 have to be found).
Some results have come up; they look similar to this:
DEUCEMEETSSECHFHRDOLE______ANEENTRULEDETIOUCH__HDONOTIONCOPRONETIUORA_RACHA
‘deuce’, ‘meets’, ‘ruled’, ‘notion’, ‘prone’
66 symbols, 26 letters of ‘cleartext’.
However, no final result yet. Currently running on #16 of approximately 300 ‘EOPT’ combinations (repeating trigram plus E being the most frequent letter).
A solution could be found tonight, in five years, or never.
QT
*ZODIACHRONOLOGY*
The Python program has now been running since my last post (approx. 17 days, about 3-4 hours a day). So far, it has covered
28,496,491,114,291,200
different settings to replace the cipher’s symbols with letters.
The criteria:
– most frequent symbol = E
– 300 most frequent repeating trigrams (currently at #92)
– last word minimum 3 letters
– minimum of 5 words (total) to be found in three strings of length 21, 20 and 18 letters
Each of the approx. 28.4 quadrillion settings has been cross-checked with the complete ~4,500 word dictionary. Thus, each word of the dictionary has been screened at every potential position of the string shown previously.
197,383 results were found. Many of those are near-duplicates (e.g. the same 5 words are found while the other letters are iterated from A-Z, thus showing multiple results). When removing such duplicates, only
125 combinations of five words found
remained. Of those 125, a total of 81 contained at least two overlapping words (e.g. SPEAR and PEARL). Therefore, out of 28.4 quadrillion settings, 43 ‘valid’ combinations of ‘five word results’ remain, which are shown below:
DEUCEMEETSSECHFHRDOLE______ANEENTRULEDETIOUCH__HDONOTIONCOPRONETIUORA_RACHA deuce meets ruled notion prone
GEYLEMEETSSELACARGOHE______BNEENTRYHEGETIOYLA__AGONOTIONLOPRONETIYORB_RBLAB meets cargo entry notion prone
LEYBEMEETSSEBACARLOHE______DNEENTRYHELETIOYBA__ALONOTIONBOPRONETIYORD_RDBAD meets carlo entry notion prone
DEUHEMEETSSEHAZARDOLE______GNEENTRULEDETIOUHA__ADONOTIONHOPRONETIUORG_RGHAG meets hazard ruled notion prone
DEUHEMEETSSEHAZARDOCE______GNEENTRUCEDETIOUHA__ADONOTIONHOPRONETIUORG_RGHAG meets hazard truce notion prone
DEYHEMEETSSEHAZARDOLE______GNEENTRYLEDETIOYHA__ADONOTIONHOPRONETIYORG_RGHAG meets hazard entry notion prone
YEALEMEETSSELUXURYODE______GNEENTRADEYETIOALU__UYONOTIONLOPRONETIAORG_RGLUG meets luxury trade notion prone
YEALEMEETSSELUXURYOCE______GNEENTRACEYETIOALU__UYONOTIONLOPRONETIAORG_RGLUG meets luxury trace notion prone
KEABEMEETSSEBOWORKICE______YDEEDTRACEKETHIABO__OKIDITHIDBIPRIDETHAIRY_RYBOY meets work trace pride hairy
KEABEMEETSSEBOWORKIDE______YCEECTRADEKETHIABO__OKICITHICBIPRICETHAIRY_RYBOY meets work trade price hairy
KEABEMEETSSEBOWORKICE______YVEEVTRACEKETHIABO__OKIVITHIVBIDRIVETHAIRY_RYBOY meets work trace drive hairy
KEABEMEETSSEBOWORKIDE______YZEEZTRADEKETHIABO__OKIZITHIZBIPRIZETHAIRY_RYBOY meets work trade prize hairy
KEABEMEETSSEBOWORKICE______YZEEZTRACEKETHIABO__OKIZITHIZBIPRIZETHAIRY_RYBOY meets work trace prize hairy
NEACEMEETSSECOLORNIDE______YZEEZTRADENETHIACO__ONIZITHIZCIPRIZETHAIRY_RYCOY meets color trade prize hairy
BEACEMEETSSECOFORBIDE______YZEEZTRADEBETHIACO__OBIZITHIZCIPRIZETHAIRY_RYCOY meets forbid trade prize hairy
KEAJEMEETSSEJOWORKICE______YDEEDTRACEKETHIAJO__OKIDITHIDJIBRIDETHAIRY_RYJOY meets work trace bride hairy
BEAJEMEETSSEJOFORBIDE______YCEECTRADEBETHIAJO__OBICITHICJIPRICETHAIRY_RYJOY meets forbid trade price hairy
DEALEREESBBELOWONDIME______PFEEFSNAMEDESTIALO__ODIFISTIFLIKNIFESTAINP_NPLOP dealer below named knife stain
HEACEMEESUUECOLORHINE______BDEEDSRANEHESTIACO__OHIDISTIDCIPRIDESTAIRB_RBCOB color rhine deeds pride rides stair
HEACEMEESUUECOLORHINE______GDEEDSRANEHESTIACO__OHIDISTIDCIBRIDESTAIRG_RGCOG color rhine deeds bride rides stair
PEACEUEESHHECOLORPINE______GDEEDSRANEPESTIACO__OPIDISTIDCIBRIDESTAIRG_RGCOG peace color deeds bride rides stair
HEALECEESBBELOWORHINE______GDEEDSRANEHESTIALO__OHIDISTIDLIPRIDESTAIRG_RGLOG below rhine deeds pride rides stair
KEALEHEESBBELOWORKINE______GDEEDSRANEKESTIALO__OKIDISTIDLIPRIDESTAIRG_RGLOG below work deeds pride rides stair
HEAVEUEESCCEVOLORHINE______WDEEDSRANEHESTIAVO__OHIDISTIDVIPRIDESTAIRW_RWVOW heave rhine deeds pride rides stair
HEAVEUEESCCEVOLORHINE______WDEEDSRANEHESTIAVO__OHIDISTIDVIBRIDESTAIRW_RWVOW heave rhine deeds bride rides stair
LEYREDEEMHHERUSUALNBE______TOEEOMAYBELEMINYRU__ULNONMINORNCANOEMIYNAT_ATRUT redeem usual maybe minor canoe
WEYIEHEELSSEIRCROWNJE______KDEEDLOYJEWELANYIR__RWNDNLANDINGONDELAYNOK_OKIRK heels crown jewel landing delay
WEYIEFEELSSEIRCROWNJE______KDEEDLOYJEWELANYIR__RWNDNLANDINGONDELAYNOK_OKIRK feels crown jewel landing delay
WEYIEHEELSSEIRFROWNJE______KDEEDLOYJEWELANYIR__RWNDNLANDINGONDELAYNOK_OKIRK heels frown jewel landing delay
WEYIEHEELSSEIRBROWNJE______KDEEDLOYJEWELANYIR__RWNDNLANDINGONDELAYNOK_OKIRK heels brown jewel landing delay
WEYIEFEELSSEIRBROWNJE______KDEEDLOYJEWELANYIR__RWNDNLANDINGONDELAYNOK_OKIRK feels brown jewel landing delay
SEALEDEECBBELOWONSIME______RTEETCNAMESECHIALO__OSITICHITLIUNITECHAINR_NRLOR sealed below names unite chain
SEALEDEECBBELOWONSIME______RFEEFCNAMESECHIALO__OSIFICHIFLIKNIFECHAINR_NRLOR sealed below names knife chain
DEALEREECBBELOWONDIME______PTEETCNAMEDECHIALO__ODITICHITLIUNITECHAINP_NPLOP dealer below named unite chain
DEALEREECBBELOWONDIKE______PTEETCNAKEDECHIALO__ODITICHITLIUNITECHAINP_NPLOP dealer below naked unite chain
DEALEREECBBELOWONDIME______PFEEFCNAMEDECHIALO__ODIFICHIFLIKNIFECHAINP_NPLOP dealer below named knife chain
SEALEDEECBBELOWORSITE______GMEEMCRATESECHIALO__OSIMICHIMLIPRIMECHAIRG_RGLOG sealed below rates prime chair
SEALEDEECBBELOWORSITE______GZEEZCRATESECHIALO__OSIZICHIZLIPRIZECHAIRG_RGLOG sealed below rates prize chair
VEPIECEERBBEINGNHVAFE______TLEELRHPFEVERMAPIN__NVALARMALIAWHALERMPAHT_HTINT piece being fever alarm whale
GENIECEERDDEITYTHGAOE______SLEELRHNOEGERMANIT__TGALARMALIAWHALERMNAHS_HSITS niece deity german alarm whale
VENIECEERDDEITYTHVAFE______SLEELRHNFEVERMANIT__TVALARMALIAWHALERMNAHS_HSITS niece deity fever alarm whale
VEPIECEERDDEITYTHVANE______SLEELRHPNEVERMAPIT__TVALARMALIAWHALERMPAHS_HSITS piece deity never alarm whale
VEPIECEERDDEITYTHVAFE______SLEELRHPFEVERMAPIT__TVALARMALIAWHALERMPAHS_HSITS piece deity fever alarm whale
Please note that only the words are fixed in the string; the other letters may vary (although they may not repeat if assuming a monoalphabetic substitution). So far, with 14 letters in a row, ‘LANDINGONDELAY’ is the longest cleartext string found.
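The removal of results with overlapping words (such as SPEAR and PEARL) can be sketched as follows. This is only an illustration; the helper name `has_overlap` and the representation of a hit as a (start position, word) pair are assumptions made for this example.

```python
def has_overlap(hits):
    """Return True if any two (position, word) hits share a letter position."""
    # Turn each hit into a half-open span [start, end) and sort by start
    spans = sorted((pos, pos + len(word)) for pos, word in hits)
    # After sorting, checking neighbouring spans is sufficient
    return any(prev_end > start
               for (_, prev_end), (start, _) in zip(spans, spans[1:]))

# SPEAR at position 3 and PEARL at position 4 share four letters
print(has_overlap([(3, "SPEAR"), (4, "PEARL")]))
# DEUCE at 0 and MEETS at 5 touch but do not overlap
print(has_overlap([(0, "DEUCE"), (5, "MEETS")]))
```

Results for which `has_overlap` is true would be discarded, since the same letters are counted for two words at once.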
QT
*ZODIACHRONOLOGY*
I regret that I have to be the one to pour cold water on this endeavour regarding the
‘Dorabella Cipher’, but if you view this webpage;
http://scienceblogs.de/klausis-krypto-k … ent-906800
then I have to say that the poster ‘Thomas Ernst’ makes a good case for the ‘Dorabella Cipher’
being a fake.
Thanks for the link.
Thomas Ernst claims the cipher to be fake. However, in doing so, he does not immediately explain why the cipher should be fake. Only later on does he bring up some points, which I address below:
make as many symbols different from the previous, before repeating one.
This is incorrect. There are double symbols in the cipher, too.
The third, sagging line appears to have been added later, because it has a different symbol distribution from lines one and two.
Well, it is obvious that some letters may occur only in certain sections of the cipher, e.g. a non-frequent letter being present only in the third line.
While Elgar’s date of 1897 looks authentic, it was topped with the cipher at a later date.
No reason to assume that. To me, the pen/feather used looks quite authentic.
I have positively no doubt that DP created the “cipher” herself, and added the “Marco” and “Liszt” pieces either in the museum, while going “from room to room”, or she had them prepared as exhibits before the museum was opened.
No reason to assume that either.
Elgar’s beloved spaniel Marco was born in 1924.
Good point regarding the Marco notes. However, nobody ever said that those had been written earlier than 1924. The Dorabella cipher could have been (and most likely was) written in 1897 or 1927. Also, the Dorabella cipher does not necessarily contain anything about Edward Elgar’s dog Marco.
Why would Elgar re-invent a cipher after 1924 when he – supposedly – had already used it in 1897
The Marco notes do not necessarily represent the invention of the cipher. My father used his own number system over decades (he tried to hide prices, thus the number 4,600 was actually sort of ‘FSZZ’). Besides that, Elgar might have known that his cipher had not yet been solved. The Marco notes could easily represent an enhancement of his encryption method (by using a ‘key’ such as ‘Do you go to London tomorrow’). Over the years, Elgar could have understood that his encryption method was good not only for personal purposes but also for transmitting messages, by sending the cipher and the key separately. I still think it is a very nice encryption method, btw. One could also develop it further by using not just one key but e.g. a Caesar-style matrix of such a key, thereby making the encryption method polyalphabetic. Then, a second key would be necessary (how many letters to move the key sentence to the left or right). It then represents an excellent polyalphabetic encryption method, only to be solved with both the correct key sentence and the series of numbers (two-step encryption).
Why is the date of the “Dorabella Cipher” below and not above the text, as was EE’s custom?
Don’t know if that was Elgar’s custom but not enough reason to believe it should be fake, either.
Did Elgar write the “Dorabella Cipher”, or the “Marco”- and “Liszt” scribblings – definitely not!
Another assumption.
Should like to add that both the 1886 Liszt-program as well as the “notebook” cipher-dabblings indeed are housed in the Elgar Birthplace Museum. Which opened – as mentioned above – in 1934, and not, as stated by DP, in 1938. Post-dating the opening of the museum would authenticate DP’s 1937 forgery.
This might be a good reason if the date written on the cipher was 1937. However I do not see any 1937 written on the cipher. To me it looks like ‘July 14, 97’. It could be ‘July 14, 27’, too.
The following plaintext “DO YOU GO TO LONDON TOMORROW?” is NOT enciphered; it rather serves for a frequency count of the vowel “O”, listed as “9” underneath.
This shows that Thomas Ernst has only partially understood the encryption method. The ‘O’s actually define whether the ABC, DEF, GHI, KLM etc. settings for each degree should be reversed or not. In the Marco case, this would e.g. be ABC, FED, GHI, MLK, as the second and fourth letters of the keyword are an ‘O’ while the first and the third are not. Dorabella is more complicated as there are no such ‘groups’ of letters at all.
the fact that MARCO ELGAR is the first enciphered text
Dorabella cipher was first, IMO.
While the Liszt and Dorabella “ciphers” definitely are frauds, the “Marco”-pages may be authentic
To me, the handwriting as well as the symbols seem to be very similar, thus authentic.
Three possible vowels out of eighteen letters equal – rounded up – 17%.
Don’t think Thomas Ernst knows which symbol represents a vowel and which represents a consonant.
To sum up, I’d say that Thomas Ernst believes the cipher to be fake, mainly because he thinks that the date written on the cipher is ’37 or something like that. I can find no valid reason to believe that the cipher is indeed a fake. Simply because someone can’t crack the cipher doesn’t make it a fake. Typical amateur behaviour, imo, but nothing wrong with Thomas Ernst’s efforts, either.
UPDATE:
My program has been running for almost a month (!) now. As described above, I used some pre-settings regarding the letter ‘E’ as well as a group of the most frequent trigrams. So far, out of 166 such trigrams, 145 have been calculated. No really good solution found yet. That is not very much compared with the theoretical 17,576 (26^3) trigrams. Nevertheless, so far a total of
26 x 145 x 22 x 21 x 20 x 19 x 18 x 17 x 16 x 15 x 14 = 680,499,211,392,000
or 680.5 trillion combinations have already been cross-checked against an English dictionary (with some additional conditions, of course, e.g. finding at least two words in a string of length 21, or the last word having at least three letters). The good thing about doing so is that we can now be fairly sure that, with regard to those settings, there is no valid solution.
All other combinations are still to be calculated (thus the settings have to be modified, e.g. the dictionary or other trigrams or other encryption method etc.).
As you can see below, words can be found (but no complete solution..most results occur because of (still) ‘overlapping’ words in the dictionary). Some E + trigram combinations don’t even deliver any partial results at all:
QT
*ZODIACHRONOLOGY*
The (one-month) run has now finished, with only two groups of results matching the criteria so far. Neither appears to be a valid solution.
WEPIECEELBBEINGNRWOJE______TVEEVLRPJEWELSOPIN__NWOVOLSOVIODROVELSPORT_RTINT piece being jewel drove sport
TELDEREEVNNEDOCOATSUE______GHEEHVALUETEVISLDO__OTSHSVISHDSMASHEVILSAG_AGDOG elder coats value smash evils
As one can see, the previous setting required some ‘gaps’ in the cleartext. The next run will be a different one, without such gaps. It should be possible to cover a string like the following:
R+E+E+R+O+I+D+N+E+L+E+O+P+T+D+H+S+C+C+S+L+T+R+T+O+P+T+R+H+T+V+I+T+R+E+O+P+D+T+I
thus covering a string of 40 cleartext letters. ‘Task’ is to find at least three words of length >4 inside that string.
Update
Following the modification described above, the program is running again. Results also look ‘better’ now:
EAND
MEEMATSIEGEANDSKULLUGDMDANDMKDDTDMEANSDT [{'Pos': 6, 'W': 'siege'}, {'Pos': 14, 'W': 'skull'}, {'Pos': 33, 'W': 'means'}]
MEEMATSIEGEANDSKULLUGDMDANDMKDCTDMEANSDT [{'Pos': 6, 'W': 'siege'}, {'Pos': 14, 'W': 'skull'}, {'Pos': 33, 'W': 'means'}]
MEEMATSIEGEANDSKULLUGDMDANDMKDFTDMEANSDT [{'Pos': 6, 'W': 'siege'}, {'Pos': 14, 'W': 'skull'}, {'Pos': 33, 'W': 'means'}]
MEEMATSIEGEANDSKULLUGDMDANDMKDPTDMEANSDT [{'Pos': 6, 'W': 'siege'}, {'Pos': 14, 'W': 'skull'}, {'Pos': 33, 'W': 'means'}]
MEEMATSIEGEANDSKULLUGDMDANDMKDWTDMEANSDT [{'Pos': 6, 'W': 'siege'}, {'Pos': 14, 'W': 'skull'}, {'Pos': 33, 'W': 'means'}]
MEEMATSIEGEANDSKULLUGDMDANDMKDYTDMEANSDT [{'Pos': 6, 'W': 'siege'}, {'Pos': 14, 'W': 'skull'}, {'Pos': 33, 'W': 'means'}]
MEEMATSIEGEANDSKULLUGDMDANDMKDBTDMEANSDT [{'Pos': 6, 'W': 'siege'}, {'Pos': 14, 'W': 'skull'}, {'Pos': 33, 'W': 'means'}]
MEEMATSIEGEANDSKULLUGDMDANDMKDVTDMEANSDT [{'Pos': 6, 'W': 'siege'}, {'Pos': 14, 'W': 'skull'}, {'Pos': 33, 'W': 'means'}]
MEEMATSIEGEANDSKULLUGDMDANDMKDXTDMEANSDT [{'Pos': 6, 'W': 'siege'}, {'Pos': 14, 'W': 'skull'}, {'Pos': 33, 'W': 'means'}]
MEEMATSIEGEANDSKULLUGDMDANDMKDJTDMEANSDT [{'Pos': 6, 'W': 'siege'}, {'Pos': 14, 'W': 'skull'}, {'Pos': 33, 'W': 'means'}]
MEEMATSIEGEANDSKULLUGDMDANDMKDQTDMEANSDT [{'Pos': 6, 'W': 'siege'}, {'Pos': 14, 'W': 'skull'}, {'Pos': 33, 'W': 'means'}]
MEEMATSIEGEANDSKULLUGDMDANDMKDZTDMEANSDT [{'Pos': 6, 'W': 'siege'}, {'Pos': 14, 'W': 'skull'}, {'Pos': 33, 'W': 'means'}]
Even a cleartext phrase, "at siege and skull", was found, although a complete solution is still open.
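Output like the above differs only in a single free letter while the words found stay the same. One way to collapse such repeats, sketched here under the assumption that each result is available as a (string, list of hit dictionaries) pair with the 'Pos' and 'W' keys shown in the output, is to key the results by their set of hits. The function name `unique_hit_sets` is made up for this example.

```python
def unique_hit_sets(results):
    """Keep the first string seen for each distinct set of (position, word) hits."""
    seen = {}
    for text, hits in results:
        key = frozenset((h['Pos'], h['W']) for h in hits)
        seen.setdefault(key, text)  # later strings with the same hits are dropped
    return seen

hits = [{'Pos': 6, 'W': 'siege'}, {'Pos': 14, 'W': 'skull'}, {'Pos': 33, 'W': 'means'}]
results = [
    ("MEEMATSIEGEANDSKULLUGDMDANDMKDDTDMEANSDT", hits),
    ("MEEMATSIEGEANDSKULLUGDMDANDMKDCTDMEANSDT", hits),
]
unique = unique_hit_sets(results)
print(len(unique))
```

Both example strings carry the same three words at the same positions, so only one entry survives.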
QT
*ZODIACHRONOLOGY*
On the right hand side you see "9 O’s". As it seems that he recognized the matter of repeats. Under that there is "1 2 3 4". It may hint at some kind of polyalphabetism. If periodic it could be detectable after assuming the correct hypothesis.
Start with the second line on the right side of his notes…amount of ‘circles’:
First three circles go like this: 1, 2, 3 circles. ("L")
Second three circles go like this: 3, 2, 1 circles. ("O")
Third three go like 1, 2, 3 again ("N")
Fourth goes 1, 2, 3 ("D")
Fifth goes 3, 2, 1 ("O")
..and so on. Each time there is an "O", the sequence starts with three and ends up with one and vice versa.
The diagram simply shows the directions in which the circles are actually heading (e.g. 45° or 180°). One can therefore place the letters on the diagram. Anybody who receives such a diagram with all the letters written into it would be able to solve the encryption.
Assuming Edward Elgar had left such a (unique) diagram as a ‘key’ somewhere, the cipher would be solved quite easily. Without such a key, Python or similar becomes necessary. However, as you can’t calculate all orderings of e.g. 24 alphabetical letters at once (memory error), a step-by-step decryption procedure (here the 40 letter string) is required.
QT
*ZODIACHRONOLOGY*
..and so on. Each time there is an "O", the sequence starts with three and ends up with one and vice versa.
Say that 123 is L and 321 is O then this is the pattern:
LOLOLOLOLOLLOLOLLOL DOYOUGOTOLONDONTOMORROW LOLOLOLOLOLLOLOLLOL DOYOUGOTOLONDONTOMORROW LOLOLOLOLOLLOLOLLOL DOYOUGOTOLONDONTOMORROW LOLOLOLOLOLLOLOLLOL DOYOUGOTOLONDONTOMORROW LOLOLOLOLOLLOLOLLOL DOYOUGOTOLONDONTOMORROW
Slide it all you want it does not line up with the text.
The ‘L’ and the ‘O’ are not real cleartext letters. They only represent the key, sort of a step in the middle between cipher and cleartext. Elgar distinguishes between consonants and the (only) vowel ‘O’.
With a consonant, a group of three cleartext letters is e.g. ABC
With an O, the same group of cleartext letters would be e.g. CBA.
All of this is of course transcribed onto the symbols: the first letter of a group consists of one open ‘circle’, the second of two, the third of three. Thus, depending on the keyword, the letter A can be represented by one open circle (if the first letter of the keyword IS NOT an O) or by three open circles (if the first letter of the keyword IS an O).
Please scroll down the picture to see that according to this, the alphabet has been shuffled (ABCFED etc…NOT ABCDEF…!)
The keyword needs to have a length of at least 8 key letters (8×3=24 letters).
The only part of EE’s notes that I cannot really follow is the note "1234 4". No idea what that means, anybody an idea?
QT
*ZODIACHRONOLOGY*
With a consonant, a group of three cleartext letters is e.g. ABC
With an O, the same group of cleartext letters would be e.g. CBA.
I just showed you that the pattern does not match. It seems to me that you are making this rule up.
Even in your example picture you can see this: D O Y O U G O T O L ? ? D. The question marks are mismatched by your rule.
Elgar distinguishes between consonants and the (only) vowel ‘O’.
Questionable. There is the letter "U" and "Y" also. And he clearly states 9 O’s.
With a consonant, a group of three cleartext letters is e.g. ABC
With an O, the same group of cleartext letters would be e.g. CBA.
I just showed you that the pattern does not match. It seems to me that you are making this rule up.
Even in your example picture you can see this: D O Y O U G O T O L ? ? D. The question marks are mismatched by your rule.
First I thought that, too. However to define his 24 letter alphabet consisting of groups of three, he only needs to define the first 8 letters of his key "DOYOUGOT..". Thus, the rest doesn’t matter. The first eight letters of his key are a perfect match to his ABC vs. CBA settings. Leading to the complete alphabet being defined:
D >> ABC
O >> FED
Y >> GHIJ
O >> MLK
U >> NOP
G >> QRS
O >> WUVT
T >> XYZ
.
.
In this example, however, he only uses the letter ‘O’ as a marker to reverse the group of three letters.
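The rule can be written down in a few lines. This is only a sketch of the rule as described in this post: the 24-letter alphabet with merged I/J and U/V (hence the group TUW), as well as the names `GROUPS`, `keyed_groups` and `circle_count`, are assumptions for illustration.

```python
# Assumed 24-letter alphabet in eight groups of three (I/J and U/V merged)
GROUPS = ["ABC", "DEF", "GHI", "KLM", "NOP", "QRS", "TUW", "XYZ"]

def keyed_groups(key, groups=GROUPS, marker="O"):
    """Reverse a group whenever the key letter at that position is the marker."""
    return [g[::-1] if k == marker else g for k, g in zip(key, groups)]

def circle_count(letter, key):
    """Number of open circles (1-3) that stands for a letter under a given key."""
    for group in keyed_groups(key):
        if letter in group:
            return group.index(letter) + 1
    raise ValueError(f"{letter!r} is not in the assumed alphabet")

print(keyed_groups("DOYOUGOT"))
print(circle_count("A", "DOYOUGOT"))
```

Only the first eight key letters are consumed, since `zip` stops once all eight groups are used up.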
I have already checked all possibilities of how he could have placed the groups of three letters onto the diagram. No solution. IMO he shuffled all letters wildly on the diagram, without using any ‘groups’ of three letters at all.
Doing so leads to 24! potential variations how he could have set the key (diagram).
24! = 620,448,401,733,239,439,360,000
out of which only one setting is correct to transform the cipher into cleartext. Giving the diagram to some reader, however, would have made it very easy to decrypt the cipher.
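To get a feeling for that number: the count itself is easy to compute, but the list of all keys can never be held in memory. A small sketch, assuming a 24-letter alphabet without J and V (the variable names are illustrative only):

```python
import math
from itertools import islice, permutations

ALPHABET = "ABCDEFGHIKLMNOPQRSTUWXYZ"  # assumed 24-letter alphabet (no J, no V)

key_count = math.factorial(len(ALPHABET))
print(f"{key_count:,}")

# permutations() is lazy, so single keys can be generated one at a time;
# building the full list with list(permutations(ALPHABET)) is not feasible.
first_keys = ["".join(p) for p in islice(permutations(ALPHABET), 3)]
print(first_keys)
```

This is why any practical attack has to fix a few letters first and extend the assignment step by step.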
QT
*ZODIACHRONOLOGY*
That is a possibility. I wonder what complicates the cipher. If it were a simple substitution cipher then it would have been solved by now. Nulls, polyalphabetism or transposition?
I think that the left part of the image depicts some simple substitution examples of the system while the right part actually details a complication.
First, I’d say most of the people who see the cipher don’t even read it. Nor do they make any transcription of the symbols into ‘readable’ letters. Also, the cipher text is not very long. It also contains few, if any, structures (‘patterns’) from which you could guess words. The words are also not separated at all. Finally, the text appears to have some sort of creativity in it, e.g. lyrical language: some of the symbols occur more often in the first part, others in the second part of it. Sukhotin’s vowel/consonant method also leads to no solid results. Decryption tools, after all, fail, too. Altogether it might be a monoalphabetic one, but it could still be something completely different (nonsense, fake, Latin language etc.). Not that easy to crack, imo, but possible if it is monoalphabetic & English text.
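For reference, the vowel/consonant separation mentioned above can be sketched as below. This follows the commonly described form of Sukhotin's algorithm and is not the tool that was actually used on the cipher; the function name `sukhotin_vowels` is an assumption.

```python
from collections import defaultdict

def sukhotin_vowels(text):
    """Guess which symbols act as vowels, based on adjacency counts only."""
    symbols = sorted(set(text))
    adj = defaultdict(int)  # symmetric adjacency counts, diagonal left at zero
    for a, b in zip(text, text[1:]):
        if a != b:
            adj[a, b] += 1
            adj[b, a] += 1
    sums = {s: sum(adj[s, t] for t in symbols) for s in symbols}
    vowels = set()
    while True:
        candidates = {s: v for s, v in sums.items() if s not in vowels}
        if not candidates:
            break
        best = max(candidates, key=candidates.get)
        if candidates[best] <= 0:
            break  # no remaining symbol looks like a vowel
        vowels.add(best)
        # Contacts with the new vowel now count against being a vowel
        for s in symbols:
            if s not in vowels:
                sums[s] -= 2 * adj[s, best]
    return vowels

print(sukhotin_vowels("banana"))
```

On a text as short as the Dorabella cipher the adjacency counts are small, so the classification is unstable.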
QT
*ZODIACHRONOLOGY*
Update regarding the Python attack on Dorabella:
Still focussing on the partial string (transcribed) from the Dorabella cipher:
R+E+E+R+O+I+D+N+E+L+E+O+P+T+D+H+S+C+C+S+L+T+R+T+O+P+T+R+H+T+V+I+T+R+E+O+P+D+T+I
(homophones)
the program has now been modified to become faster. Previously, every homophone was tried with every letter of the alphabet, and only afterwards were assignments excluded in which two homophones received the same letter (substitution). Now, letters are removed in advance (!) from the alphabets used, according to the steps below. Thus, the program works with partial alphabets only, depending on which homophones have been substituted before:
1. step: Setting ‘E’ as the most frequent letter for the most frequent homophone
2. step: Selecting 750 most frequent trigrams for the repeating homophone sequence (EOP)
3. step: Iterating five letters from the remaining alphabet (after deleting the letter ‘E’ as well as the letters from the EOP-trigram)
4. step: Checking for any English word of length >4 letters in the following string: T+R+T+O+P+T+R+H+T+V+I+T+R+E+O+P+D+T+I (second part of the string above)
5. step: Iterating four additional letters from the – now – remaining alphabet (thus ‘E’, EOP plus five additional letters deleted from the alphabet)
6. step: Checking for any second English word of length >4 letters in the other part of the string: R+E+E+R+O+I+D+N+E+L+E+O+P+T+D+H+S+C+C+S+L
7. step: Checking if a total of three English words of length >4 letters are available in the string above
8. step: Print of the string + words found in the string.
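The steps above can be sketched on a toy example. This is not the original program: the symbol names (digits), the tiny alphabet, the two-word dictionary and all function names are invented for illustration, and a plain substring search stands in for the dictionary search.

```python
from itertools import permutations

def decode(segment, mapping):
    """Replace each cipher symbol by its assigned letter, '_' if unassigned."""
    return "".join(mapping.get(sym, "_") for sym in segment)

def words_in(text, words, min_len=5):
    """Plain substring search for words of at least min_len letters."""
    return [w for w in words if len(w) >= min_len and w in text]

def staged_search(seg_first, seg_second, fixed, stage1_syms, stage2_syms,
                  words, alphabet):
    """Assign letters in two stages; stage 2 only runs if stage 1 found a word."""
    free = [c for c in alphabet if c not in fixed.values()]
    for letters1 in permutations(free, len(stage1_syms)):
        mapping = {**fixed, **dict(zip(stage1_syms, letters1))}
        found_second = words_in(decode(seg_second, mapping), words)
        if not found_second:
            continue  # prune: no word in the second part, skip everything below
        remaining = [c for c in free if c not in letters1]
        for letters2 in permutations(remaining, len(stage2_syms)):
            full = {**mapping, **dict(zip(stage2_syms, letters2))}
            found_first = words_in(decode(seg_first, full), words)
            if found_first:
                yield decode(seg_first + seg_second, full), found_first + found_second

# Toy cipher: symbols are digits, symbol "2" is fixed to E in advance
results = list(staged_search("12324", "56547", {"2": "E"}, "4567", "13",
                             ["NEVER", "ALARM"], "AELMNRVST"))
print(results)
```

Because each stage draws only from the letters still unused, no two symbols can ever receive the same letter, and most branches are abandoned before the second loop is entered.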
For each trigram this procedure is equal to:
22x21x20x19x18x17x16x15x14 or 180.5 billion
different variations that are tried out by the program. With regard to a dictionary of approximately 4,500 words on approximately 35 potential positions in the string, this ends up somewhere at
180.5 billion x 4,500 x 35 or 28.4 quadrillion (10^15)
possibilities of how a cleartext word can be found inside that encrypted string. This, of course, for each trigram will take some time.
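The product 22 x 21 x ... x 14 is the number of ways to assign nine distinct letters out of 22, so the figures can be checked quickly with `math.perm` (Python 3.8 or later). The variable names are illustrative only.

```python
import math

# Nine letters chosen in order from the 22 letters left after E and the trigram
letter_assignments = math.perm(22, 9)

# Each assignment is tested with ~4,500 words on ~35 positions
placements = letter_assignments * 4500 * 35

print(f"{letter_assignments:,}")  # 180,503,769,600
print(f"{placements:,}")          # 28,429,343,712,000,000
```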
However: IF step 4 is ‘not successful’ (thus no word found in the latter part of the string at all), same program handles ‘only’ approximately 14.2 trillion possibilities to search any word in that specific part of the string.
Currently, the Python program does this in less than 25 seconds (checking a word 14.2 trillion times to be placed into the encrypted string, considering substitution, EOP, E etc.).
Nevertheless, chances are good that we could get quite old with that one. But IF the repeating trigram is actually one of the most frequent ones (out of the 750 different), a solution could be found over the next days, weeks, months, too.
Currently the cleartext trigram ‘TIO’ is in computation (already finding multiple potential cleartexts in the latter part of the string), while ‘AND’ had been without any result beyond step 4. Also, we may be excited about the most frequent trigrams such as ‘FOR’ or ‘THA’ (if we ever get there..). Under the previous assumptions, ‘THE’ is no option for the OPT sequence due to the separate assumption of ‘E’ representing the most frequent cleartext letter E; it would violate the ‘idea’ of the cipher being a monoalphabetic substitution.
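The exclusion of ‘THE’ follows from a simple filter on the trigram list, sketched below. The function name `usable_trigrams` and the short sample list are assumptions; the real list contains the 750 most frequent trigrams.

```python
def usable_trigrams(trigrams, fixed_letter="E"):
    """Keep trigrams with three distinct letters, none equal to the fixed letter."""
    # O, P and T are three different symbols, so under a monoalphabetic
    # substitution they need three different letters, and none of them can
    # be the letter already given to the most frequent symbol.
    return [t for t in trigrams if fixed_letter not in t and len(set(t)) == 3]

print(usable_trigrams(["THE", "AND", "ING", "TIO", "ERE", "ALL"]))
```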
Regarding the computation time: To calculate one trigram completely (!) actually would approximately take 2.5 years. Not a joke, reality. However there are potential improvements which are possible:
– reduce the dictionary to e.g. the most frequent 1,000 words
– implementing repeating bigrams (if available)
– differentiation of frequent, nonfrequent homophones
– shorten the (partial) strings used
– improved computation e.g. with ‘Intel Distribution for Python’ or different languages/technology
Because of the multi-step procedure of ‘adding’ potential letters to the string, however, the computation time is dramatically reduced, as larger parts of the computation are only carried out if a word is found in the second part of the string (corrected).
QT
*ZODIACHRONOLOGY*