Out of curiosity, what is the goal of producing higher-order n-gram models? Are you observing higher-quality solves using those models, or are there other benefits to them? I was theorizing that if you use n-gram models of too high an order, the hill climber has trouble converging. I haven’t proven that by any means, but I did some experimentation with LSTM models using Keras. You can train models of any input size, so theoretically you could run the equivalent of a 340-gram model against the 340 with a very low memory footprint, although at that high an order it’s basically useless. I’ll do a little more research on that, but I hoped to gain some insight from you guys if possible.
EDIT: Here’s a link to the project where I implement the LSTM model. I hope to make this a sister project of Zenith, but it’s not ready to be released yet. Just wanted to share in case anyone has Python/Keras experience. https://bitbucket.org/beldenge/zenith
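To give a concrete idea of the approach, here is a minimal sketch of what I mean by a character-level LSTM scorer in Keras. The window size, layer sizes and the scoring helper are placeholders for illustration only, not the actual Zenith code, and training is omitted:

```python
# Minimal character-level LSTM language model sketch (hypothetical
# hyperparameters, training omitted).
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense

VOCAB_SIZE = 27   # a-z plus space
WINDOW = 25       # context length; in principle this could be 339 for a "340-gram"

model = Sequential([
    Embedding(VOCAB_SIZE, 16),
    LSTM(128),
    Dense(VOCAB_SIZE, activation="softmax"),  # distribution over the next character
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

def score(text, char_to_idx):
    """Sum of log-probabilities of each character given the preceding WINDOW
    characters; usable as a plaintext fitness score in a hill climber."""
    ids = [char_to_idx[c] for c in text]
    total = 0.0
    for i in range(WINDOW, len(ids)):
        context = np.array([ids[i - WINDOW:i]])
        probs = model.predict(context, verbose=0)[0]
        total += np.log(probs[ids[i]] + 1e-12)
    return total
```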
Funny you should mention Keras – I’ve been using exactly that for my talk coming up this week at the Symposium on Cryptologic History. I’ve been playing around with Keras + TensorFlow, trying to train it to classify hundreds of thousands of homophonic ciphers where some kind of step is performed before homophonic encipherment. Results are very rudimentary at the moment (strong signs that Z340 is not gibberish, and possible indications of a route or transposition applied before homophonic encipherment), but I think this approach has a lot of potential. Glad to hear you are using it too!
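The general shape of such a cipher-type classifier, just as a rough sketch (the class labels, symbol counts and layer sizes below are placeholders, not the actual network), would be something like:

```python
# Rough sketch of a cipher-type classifier over fixed-length symbol sequences.
# Class labels and all hyperparameters are placeholders.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, Conv1D, GlobalMaxPooling1D, Dense

NUM_SYMBOLS = 63   # distinct cipher symbols (Z340 uses 63)
LENGTH = 340       # cipher length
CLASSES = ["homophonic", "transposition+homophonic", "gibberish"]  # hypothetical

model = Sequential([
    Embedding(NUM_SYMBOLS, 32),
    Conv1D(64, 5, activation="relu"),   # local patterns in the symbol stream
    GlobalMaxPooling1D(),
    Dense(64, activation="relu"),
    Dense(len(CLASSES), activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Training data: X of shape (num_ciphers, LENGTH) with integer symbol ids,
# y of shape (num_ciphers,) with class indices.
# model.fit(X, y, epochs=10, validation_split=0.1)
```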
Is it simply that the model is built with a larger corpus of data, so the "good" n-grams have more samples?
Yes. For example, I am using a 500 GB corpus at the moment and it is not enough for good 8-grams. beijinghouse is using much more data: his 8-grams have 3,631,818,052 unique n-gram items, whereas mine have only 991,781,102.
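To put those counts in perspective, some back-of-the-envelope arithmetic (assuming a 26-letter alphabet with spaces stripped):

```python
# Fraction of all possible letter 8-grams actually observed in each model.
possible = 26 ** 8                   # 208,827,064,576 possible 8-grams over A-Z
print(3_631_818_052 / possible)      # ~0.017 -> beijinghouse's 8-grams cover ~1.7%
print(991_781_102 / possible)        # ~0.005 -> the 500 GB corpus covers ~0.5%
```

Even the larger model sees only a small fraction of the possible 8-grams, which is why corpus size matters so much at this order.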
Good luck with your talk doranchak! What you say is very interesting and chimes in with our work and thoughts over the years. Great stuff.
Wishing you the best with your talk. That’s exciting stuff! Do you happen to know if it will be shared online? I would be very interested to listen.
Thanks. I plan to record the audio and will try to put together a video with the slides on YouTube.
According to https://arxiv.org/pdf/1208.6109.pdf, the average word length in 20th-century American English is slightly below 5 characters, and in British English slightly above. Hence, I would expect an n-gram of length 5 to cover, in many cases, the end of one word and the beginning of the next. The shorter the n-grams, the more often an n-gram sits inside a single word; the longer the n-grams, the more often one spans parts of two or even more words. Longer n-grams should therefore give an algorithmic preference to word sequences that are meaningful. I consider the preference for n-grams of length 5 to be more than accidental: it matches the average word length. (A rough check of the boundary-spanning claim is sketched at the end of this post.)
With these thoughts in mind, and if one wants to make better use of more powerful modern computers, would it not be more beneficial to choose the n-gram lengths as 1 plus the primes? When choosing 1, 2, 3, 5, 7, the pattern would not overlap until 2*3*5*7 = 210. That is already 61.7% of the length of the 340.
Adding 8 as an n-gram length would, I expect, not add as much value as 7 does, because 8 is a multiple of 2 and 4, so its effect would already be partially covered by those, while being more expensive computationally.
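A rough way to check the boundary-spanning claim on any sample text (just a sketch, with a made-up helper name):

```python
# Fraction of overlapping n-grams that straddle a word boundary (contain a space).
def boundary_fraction(text, n):
    grams = [text[i:i + n] for i in range(len(text) - n + 1)]
    return sum(" " in g for g in grams) / len(grams)

sample = "i like killing people because it is so much fun"
for n in (2, 3, 5, 7):
    print(n, round(boundary_fraction(sample, n), 2))
# The fraction grows with n: longer n-grams span two or more words more often.
```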
I do not see how the prime pattern relates to the n-grams, since every position in the cipher except the tail is considered. Example: EUROSPAR = EUROS, UROSP, ROSPA and OSPAR.
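In code it is just a sliding window over the text (a minimal sketch, not the actual AZdecrypt internals):

```python
# Sliding-window n-gram extraction: every position except the tail starts an n-gram.
def ngrams(text, n):
    return [text[i:i + n] for i in range(len(text) - n + 1)]

print(ngrams("EUROSPAR", 5))
# ['EUROS', 'UROSP', 'ROSPA', 'OSPAR']
```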
5-grams hit a performance sweet spot: 4-grams are not twice as quick as 5-grams, while 6-grams are more than twice as slow.
AZdecrypt did this (using beijinghouse’s 8-grams): http://scienceblogs.de/klausis-krypto-k … nt-1572972
A bit of a tour de force, as it was achieved with a light modification to the substitution solver and it’s a good example of how powerful the current package is.
Wow, great story, well done.
Hey Jarl & beijinghouse,
I just read the blog post on Klausis Krypto Kolumne. Congratulations on the world record and this absolutely great work! That is really impressive! This motivates me a lot to continue working on z340 or other ciphers.
Thanks a lot, Mr lowe and Largo. Glad to hear from you, Largo.
Hi Jarlve, I saw that AZd comes with a polyphonic cipher solver. Since I work on polyphonic number ciphers mainly used in 16th-century Italy (to give you an idea of this sort of cipher, here is an expert’s site: http://cryptiana.web.fc2.com/code/polyphonic.htm ), I wonder what kinds of polyphonic ciphers can be broken with AZd?
Wow.
Congratulations, this is very cool. I am always following your great work!
https://zodiacode1933.blogspot.com/
This motivates me a lot to continue working on z340 or other ciphers.
I feel the same way. Congratulations to both of you for this excellent result!
_pi
Thanks Marclean and _pi! Much appreciated.