A post from a while back was an encrypted message.
Entry tags: gzdq
Thirteen
EZLZJ IQAFZ TQOCX QGMKC MYOSY DOYRX CGZDQ VYCXT QLCXY FCQQO GZORF MCJBM CYZOE LZJGM OOZVF QMDSL EQGFQ CKZEC E CZRQC XQFVQ VYBBF JBQCX QVZFB D
I had just finished a re-read of Cryptonomicon a while back, and there was an appendix written by Bruce Schneier about a cipher used in the book. It mentioned he was the author of Applied Cryptography, which I noticed was on my co-worker’s shelf (we work in the secure payment industry, so this stuff comes up now and again). I borrowed it and read through the first few chapters, which gave a very high-level view of cryptography. Fun stuff.
Both Cryptonomicon and Applied Cryptography talk about techniques of cryptanalysis, the act of breaking encryption. So I’ll pretend I don’t know the message and apply one. (Actually, I don’t really remember the message anymore.)
Assuming this is a substitution cypher of English “cleartext” (unencrypted text), we can apply a letter frequency analysis. We also have a small burst of encrypted text where we know what the underlying word is, thanks to
As
Using a text analyzer tool, the frequency of letters in my message is:
Q - 15
C - 13
Z - 11
F, O - 8
Y - 7
M, X - 6
B, E, G, V - 5
D, J, L - 4
R - 3
S, T, K - 2
I, A - 1
N, P, H, U, W - 0
Letter frequency in the English language, from most to least common: e t a o i n s r h l d c u m f p g w y b v k x j q z
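The tally above is easy to reproduce. The following is only a sketch in Python, not the text analyzer tool itself; `CIPHERTEXT` and `letter_counts` are names made up for the example.

```python
from collections import Counter

# The encrypted message, copied from the top of the post.
CIPHERTEXT = (
    "EZLZJ IQAFZ TQOCX QGMKC MYOSY DOYRX CGZDQ VYCXT QLCXY FCQQO "
    "GZORF MCJBM CYZOE LZJGM OOZVF QMDSL EQGFQ CKZEC E "
    "CZRQC XQFVQ VYBBF JBQCX QVZFB D"
)

def letter_counts(text):
    """Count letters only, ignoring spaces and anything else."""
    return Counter(ch for ch in text.upper() if ch.isalpha())

counts = letter_counts(CIPHERTEXT)

# Print the letters from most to least frequent.
for letter, n in counts.most_common():
    print(letter, n)
```

Letters that never occur (N, P, H, U, W here) simply do not show up in the output; `counts["N"]` still returns 0 if asked directly.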
So Q=E makes sense. If we also assume that C=T and Z=A and know L=Y & E=S, the message looks like:
Q=E, C=T, Z=A, L=Y, E=S
EZLZJ IQAFZ TQOCX QGMKC MYOSY DOYRX CGZDQ VYCXT QLCXY FCQQO
SAYA_ _E__A _E_T_ E___T _____ _____ T_A_E __T__ EYT__ _TEE_
GZORF MCJBM CYZOE LZJGM OOZVF QMDSL EQGFQ CKZEC E
_A___ _T___ T_A_S YA___ __A__ E___Y SE__E T_AST S
CZRQC XQFVQ VYBBF JBQCX QVZFB D
TA_ET _E__E _____ __ET_ E_A__ _
Doesn’t look like anything yet.
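Filling in a partial key by hand gets tedious, so here is a small sketch of that step. The function name `apply_key` and the underscore used for still-unknown letters are choices made for this illustration only.

```python
def apply_key(ciphertext, key, unknown="_"):
    """Replace each cipher letter that is in `key`; mark the rest as unknown.

    Spaces and other non-letters are passed through unchanged.
    """
    out = []
    for ch in ciphertext:
        if ch.isalpha():
            out.append(key.get(ch, unknown))
        else:
            out.append(ch)
    return "".join(out)

# The guesses so far: cipher letter -> assumed plaintext letter.
key = {"Q": "E", "C": "T", "Z": "A", "L": "Y", "E": "S"}

print(apply_key("EZLZJ IQAFZ TQOCX", key))  # prints SAYA_ _E__A _E_T_
```

Each new guess is then just one more entry in `key`, and a wrong guess is undone by deleting or overwriting its entry.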
So, on to digraphs. From this site again:
These letters often go together. These are known as digraphs.
th, he, at, st, an, in, ea, nd, er, en, re, nt, to, es, on, ed, is, ti
I know C=T, so I’ll attack those digraphs.
These might be TH, TO or TI
CX 5
CM 1
CG 1
CQ 1 (TE)
CJ 1
CY 1
CK 1
CE 1 (TS)
CZ 1
So, assume CX = TH, meaning X=H
EZLZJIQAFZTQOCXQGMKCMYOSYDOYRXCGZDQVYCXTQLCXYFCQQO
SAYA__E__A_E_THE___T_________HT_A_E__TH_EYTH__TEE_
GZORFMCJBMCYZOELZJGMOOZVFQMDSLEQGFQCKZECE
_A____T___T_A_SYA_____A__E___YSE__ET_ASTS
CZRQCXQFVQVYBBFJBQCXQVZFBD
TA_ETHE__E_______ETHE_A___
Q=E, C=T, Z=A, L=Y, E=S, X=H
We see some “THE”s! That’s something. Maybe.
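Digraph tallies like the one for C can be generated rather than counted by eye. This is a sketch under one assumption: the three lines of the message are treated as a single continuous string, so a pair may straddle a line break. The function names are invented for the example.

```python
from collections import Counter

# The three ciphertext lines joined into one string (an assumption; counting
# each line separately would change a few totals).
CIPHERTEXT = (
    "EZLZJIQAFZTQOCXQGMKCMYOSYDOYRXCGZDQVYCXTQLCXYFCQQO"
    "GZORFMCJBMCYZOELZJGMOOZVFQMDSLEQGFQCKZECE"
    "CZRQCXQFVQVYBBFJBQCXQVZFBD"
)

def digraphs_starting_with(text, first):
    """Count overlapping two-letter pairs whose first letter is `first`."""
    return Counter(text[i:i + 2] for i in range(len(text) - 1) if text[i] == first)

def digraphs_ending_with(text, last):
    """Count overlapping two-letter pairs whose second letter is `last`."""
    return Counter(text[i:i + 2] for i in range(len(text) - 1) if text[i + 1] == last)

# Pairs that begin with C (assumed to be T), most frequent first.
for pair, n in digraphs_starting_with(CIPHERTEXT, "C").most_common():
    print(pair, n)
```

The same two helpers cover the later steps too: `digraphs_ending_with(CIPHERTEXT, "C")` lists what comes before T, and swapping in "Q" does the same for E.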
Now the other T Digraphs…
These might be AT, ST, or NT
QC 3 (ET)
MC 2
EC 2 (ST)
OC 1
KC 1
XC 1
YC 1
LC 1 (YT)
FC 1
M is common in the two digraphs (CM and MC). Given the letter frequency of M, I’d say it’s either I, N, or R. N seems to be the best first guess (because MC is frequent, and ST and AT are known), so I’ll try M=N.
EZLZJIQAFZTQOCXQGMKCMYOSYDOYRXCGZDQVYCXTQLCXYFCQQO
SAYA__E__A_E_THE_N_TN________HT_A_E__TH_EYTH__TEE_
GZORFMCJBMCYZOELZJGMOOZVFQMDSLEQGFQCKZECE
_A___NT__NT_A_SYA__N__A__EN__YSE__ET_ASTS
CZRQCXQFVQVYBBFJBQCXQVZFBD
TA_ETHE__E_______ETHE_A___
Q=E, C=T, Z=A, L=Y, E=S, X=H, M=N
I also know that Q=E, so another batch of digraphs:
These might be HE or RE
XQ 3 (HE)
TQ 2
FQ 2
IQ 1
DQ 1
CQ 1 (TE)
QQ 1 (EE)
EQ 1 (SE)
RQ 1
VQ 1
BQ 1
Probably TQ or FQ is RE. F has the closest letter frequency to R, so I’ll try F=R.
EZLZJIQAFZTQOCXQGMKCMYOSYDOYRXCGZDQVYCXTQLCXYFCQQO
SAYA__E_RA_E_THE_N_TN________HT_A_E__TH_EYTH_RTEE_
GZORFMCJBMCYZOELZJGMOOZVFQMDSLEQGFQCKZECE
_A__RNT__NT_A_SYA__N__A_REN__YSE_RET_ASTS
CZRQCXQFVQVYBBFJBQCXQVZFBD
TA_ETHER_E_______ETHE_AR__
Q=E, C=T, Z=A, L=Y, E=S, X=H, M=N, F=R
Ah-hah! At the end of the first line, if Y=I and O=N, then that word is THIRTEEN, which was the subject of the message. Except, we said M=N… let’s pull that out and try the different substitutions…
EZLZJIQAFZTQOCXQGMKCMYOSYDOYRXCGZDQVYCXTQLCXYFCQQO
SAYA__E_RA_ENTHE___T_IN_I_NI_HT_A_E_ITH_EYTHIRTEEN
GZORFMCJBMCYZOELZJGMOOZVFQMDSLEQGFQCKZECE
_AN_R_T___TIANSYA___NNA_RE___YSE_RET_ASTS
CZRQCXQFVQVYBBFJBQCXQVZFBD
TA_ETHER_E_I_____ETHE_AR__
Q=E, C=T, Z=A, L=Y, E=S, X=H, F=R, Y=I, O=N
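The M=N versus O=N clash is the kind of thing a quick check can catch: in a simple substitution, two cipher letters should never decode to the same plaintext letter. A minimal sketch follows, with `find_conflicts` as a made-up helper name.

```python
def find_conflicts(key):
    """Return plaintext letters that more than one cipher letter maps to."""
    by_plain = {}
    for cipher_letter, plain_letter in key.items():
        by_plain.setdefault(plain_letter, []).append(cipher_letter)
    return {p: cs for p, cs in by_plain.items() if len(cs) > 1}

# The key as it stood before M=N was pulled out, plus the two new guesses.
key = {"Q": "E", "C": "T", "Z": "A", "L": "Y", "E": "S",
       "X": "H", "M": "N", "F": "R", "Y": "I", "O": "N"}

print(find_conflicts(key))  # prints {'N': ['M', 'O']}
```

An empty result does not prove the key is right, only that it is still a possible substitution.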
More E digraphs:
These might be EA, ER, EN, ES, or ED
QC 3 (ET)
QV 3
QO 2 (EN)
QG 2
QL 1
QQ 1 (EE)
QM 1
QF 1 (ER)
QA 1
Since we know EA, ER, EN, and ES, perhaps QV=ED. The letter frequency looks good for that match.
EZLZJIQAFZTQOCXQGMKCMYOSYDOYRXCGZDQVYCXTQLCXYFCQQO
SAYA__E_RA_ENTHE___T_IN_I_NI_HT_A_EDITH_EYTHIRTEEN
GZORFMCJBMCYZOELZJGMOOZVFQMDSLEQGFQCKZECE
_AN_R_T___TIANSYA___NNADRE___YSE_RET_ASTS
CZRQCXQFVQVYBBFJBQCXQVZFBD
TA_ETHERDEDI_____ETHEDAR__
Q=E, C=T, Z=A, L=Y, E=S, X=H, F=R, Y=I, O=N, V=D
Does nothing for me.
Letters that are often doubled, as in sniff: ll, tt, ss, ee, pp, oo, rr, ff, cc, dd, nn
Consider the BB in the last line. Judging from the letter frequency (and eliminating the letters we know), it could be ll, cc, ff. ll is most common, so we’ll try B=L.
EZLZJIQAFZTQOCXQGMKCMYOSYDOYRXCGZDQVYCXTQLCXYFCQQO
SAYA__E_RA_ENTHE___T_IN_I_NI_HT_A_EDITH_EYTHIRTEEN
GZORFMCJBMCYZOELZJGMOOZVFQMDSLEQGFQCKZECE
_AN_R_T_L_TIANSYA___NNADRE___YSE_RET_ASTS
CZRQCXQFVQVYBBFJBQCXQVZFBD
TA_ETHERDEDILL__LETHEDAR__
Q=E, C=T, Z=A, L=Y, E=S, X=H, F=R, Y=I, O=N, V=D, B=L
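Doubled letters are quick to list mechanically. A small sketch, where `LINES` and `doubled_letters` are names chosen for the example and each line of the message is scanned on its own:

```python
# The three ciphertext lines, without the five-letter grouping.
LINES = [
    "EZLZJIQAFZTQOCXQGMKCMYOSYDOYRXCGZDQVYCXTQLCXYFCQQO",
    "GZORFMCJBMCYZOELZJGMOOZVFQMDSLEQGFQCKZECE",
    "CZRQCXQFVQVYBBFJBQCXQVZFBD",
]

def doubled_letters(line):
    """Return every pair of identical adjacent letters, in order."""
    return [a + b for a, b in zip(line, line[1:]) if a == b]

doubles = [pair for line in LINES for pair in doubled_letters(line)]
print(doubles)
```

Any double whose letter is already in the key can be crossed off, which leaves only the unexplained ones to guess at.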
In the second line, towards the end, SE RET could be SECRET, so we’ll try G=C.
EZLZJIQAFZTQOCXQGMKCMYOSYDOYRXCGZDQVYCXTQLCXYFCQQO
SAYA__E_RA_ENTHEC__T_IN_I_NI_HTCA_EDITH_EYTHIRTEEN
GZORFMCJBMCYZOELZJGMOOZVFQMDSLEQGFQCKZECE
CAN_R_T_L_TIANSYA_C_NNADRE___YSECRET_ASTS
CZRQCXQFVQVYBBFJBQCXQVZFBD
TA_ETHERDEDILL__LETHEDAR__
Q=E, C=T, Z=A, L=Y, E=S, X=H, F=R, Y=I, O=N, V=D, B=L, G=C
I’m getting a sense that there are mistakes in there.
Let’s try some trigraphs.
Trigraphs are much like digraphs, but for three letters. Here are the most often seen trigraphs:
the, and, tha, hat, ent, ion, for, tio, has, edt, tis, ers, res, ter, con, ing, men, tho
I feel confident about all the letters in THIRTEEN, so THE should be solved. Perhaps AND? We have guesses for all those letters, but I don’t feel as solid about A and D. If we look for three letter combinations with N in the middle, perhaps we’ll notice a different possibility for A or D.
QOC (ENT)
YOS (INx)
DOY (xNI)
ZOR (ANx)
ZOE (ANS)
MOO (xNN)
OOZ (NNx)
No duplicates, so not much to work with in terms of frequency.
Perhaps THA?
Keying off TH, which I have confidence in…
CXQ 3 (THE)
CXY 1 (THI)
CXT 1 (THx)
T=A is less likely than Z=A, by letter frequency.
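Both trigraph searches above (N in the middle, and anything starting with TH) come down to counting three-letter windows and filtering them. A sketch, assuming each line of the message is scanned separately; `ngram_counts` is an illustrative name.

```python
from collections import Counter

LINES = [
    "EZLZJIQAFZTQOCXQGMKCMYOSYDOYRXCGZDQVYCXTQLCXYFCQQO",
    "GZORFMCJBMCYZOELZJGMOOZVFQMDSLEQGFQCKZECE",
    "CZRQCXQFVQVYBBFJBQCXQVZFBD",
]

def ngram_counts(lines, n):
    """Count overlapping n-letter windows, never crossing a line break."""
    counts = Counter()
    for line in lines:
        for i in range(len(line) - n + 1):
            counts[line[i:i + n]] += 1
    return counts

trigraphs = ngram_counts(LINES, 3)

# Trigraphs with O (assumed N) in the middle.
n_in_middle = sorted(t for t in trigraphs if t[1] == "O")

# Trigraphs starting with CX (assumed TH), with their counts.
th_prefix = {t: c for t, c in trigraphs.items() if t.startswith("CX")}

print(n_in_middle)
print(th_prefix)
```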
If Z=A were wrong, it would be more likely that F, O or Y = A, because of letter frequency. I’ve said F=R, O=N and Y=I, so something is probably wrong here. I feel confident about F=R, O=N and Y=I (because of THIRTEEN). M=A or X=A is possible. What if we change Z=A to M=A?
EZLZJIQAFZTQOCXQGMKCMYOSYDOYRXCGZDQVYCXTQLCXYFCQQO
S_Y___E_R__ENTHECA_TAIN_I_NI_HTC__EDITH_EYTHIRTEEN
GZORFMCJBMCYZOELZJGMOOZVFQMDSLEQGFQCKZECE
C_N_RAT_LATI_NSY__CANN_DREA__YSECRET__STS
CZRQCXQFVQVYBBFJBQCXQVZFBD
T__ETHERDEDILL__LETHED_R__
Q=E, C=T, L=Y, E=S, X=H, F=R, Y=I, O=N, V=D, B=L, G=C, M=A
What would Z decrypt to, then? Z=O is the next possibility by letter frequency…
EZLZJIQAFZTQOCXQGMKCMYOSYDOYRXCGZDQVYCXTQLCXYFCQQO
SOYO__E_RO_ENTHECA_TAIN_I_NI_HTCO_EDITH_EYTHIRTEEN
GZORFMCJBMCYZOELZJGMOOZVFQMDSLEQGFQCKZECE
CON_RAT_LATIONSYO_CANNODREA__YSECRET_OSTS
CZRQCXQFVQVYBBFJBQCXQVZFBD
TO_ETHERDEDILL__LETHEDOR__
Q=E, C=T, L=Y, E=S, X=H, F=R, Y=I, O=N, V=D, B=L, G=C, M=A, Z=O
A-ha! I see CONGRATULATIONS there, if R=G and J=U.
EZLZJIQAFZTQOCXQGMKCMYOSYDOYRXCGZDQVYCXTQLCXYFCQQO
SOYOU_E_RO_ENTHECA_TAIN_I_NIGHTCO_EDITH_EYTHIRTEEN
GZORFMCJBMCYZOELZJGMOOZVFQMDSLEQGFQCKZECE
CONGRATULATIONSYOUCANNODREA__YSECRET_OSTS
CZRQCXQFVQVYBBFJBQCXQVZFBD
TOGETHERDEDILL_ULETHEDOR__
Q=E, C=T, L=Y, E=S, X=H, F=R, Y=I, O=N, V=D, B=L, G=C, M=A, Z=O, R=G, J=U
I just realized I missed an F=R at the bottom.
EZLZJIQAFZTQOCXQGMKCMYOSYDOYRXCGZDQVYCXTQLCXYFCQQO
SOYOU_E_RO_ENTHECA_TAIN_I_NIGHTCO_EDITH_EYTHIRTEEN
GZORFMCJBMCYZOELZJGMOOZVFQMDSLEQGFQCKZECE
CONGRATULATIONSYOUCANNODREA__YSECRET_OSTS
CZRQCXQFVQVYBBFJBQCXQVZFBD
TOGETHERDEDILLRULETHEDOR__
Q=E, C=T, L=Y, E=S, X=H, F=R, Y=I, O=N, V=D, B=L, G=C, M=A, Z=O, R=G, J=U
Given that the bottom line reads TOGETHER DE DILL RULE THE DOR__, I’m guessing V does not equal D, but rather W.
EZLZJIQAFZTQOCXQGMKCMYOSYDOYRXCGZDQVYCXTQLCXYFCQQO
SOYOU_E_RO_ENTHECA_TAIN_I_NIGHTCO_EWITH_EYTHIRTEEN
GZORFMCJBMCYZOELZJGMOOZVFQMDSLEQGFQCKZECE
CONGRATULATIONSYOUCANNOWREA__YSECRET_OSTS
CZRQCXQFVQVYBBFJBQCXQVZFBD
TOGETHERWEWILLRULETHEWOR__
Q=E, C=T, L=Y, E=S, X=H, F=R, Y=I, O=N, B=L, G=C, M=A, Z=O, R=G, J=U, V=W
And WOR__ is obviously WORLD, so B=L, D=D. And, shoot, I already knew B=L, but missed it. Crap.
EZLZJIQAFZTQOCXQGMKCMYOSYDOYRXCGZDQVYCXTQLCXYFCQQO
SOYOU_E_RO_ENTHECA_TAIN_IDNIGHTCODEWITH_EYTHIRTEEN
GZORFMCJBMCYZOELZJGMOOZVFQMDSLEQGFQCKZECE
CONGRATULATIONSYOUCANNOWREAD_YSECRET_OSTS
CZRQCXQFVQVYBBFJBQCXQVZFBD
TOGETHERWEWILLRULETHEWORLD
Q=E, C=T, L=Y, E=S, X=H, F=R, Y=I, O=N, B=L, G=C, M=A, Z=O, R=G, J=U, V=W, D=D
Some guesswork leads to: I=V, A=B, T=K, S=M, K=P.
EZLZJIQAFZTQOCXQGMKCMYOSYDOYRXCGZDQVYCXTQLCXYFCQQO
SOYOUVEBROKENTHECAPTAINMIDNIGHTCODEWITHKEYTHIRTEEN
GZORFMCJBMCYZOELZJGMOOZVFQMDSLEQGFQCKZECE
CONGRATULATIONSYOUCANNOWREADMYSECRETPOSTS
CZRQCXQFVQVYBBFJBQCXQVZFBD
TOGETHERWEWILLRULETHEWORLD
Q=E, C=T, L=Y, E=S, X=H, F=R, Y=I, O=N, B=L, G=C, M=A, Z=O, R=G, J=U, V=W, D=D, I=V, A=B, T=K, S=M, K=P
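With the key complete, the whole message can be decrypted in one pass. This sketch uses Python's built-in `str.maketrans` and `str.translate`; `KEY` and `LINES` are just names for the final key and the three ciphertext lines shown above.

```python
# Final key: cipher letter -> plaintext letter.
KEY = {
    "Q": "E", "C": "T", "L": "Y", "E": "S", "X": "H", "F": "R", "Y": "I",
    "O": "N", "B": "L", "G": "C", "M": "A", "Z": "O", "R": "G", "J": "U",
    "V": "W", "D": "D", "I": "V", "A": "B", "T": "K", "S": "M", "K": "P",
}

LINES = [
    "EZLZJIQAFZTQOCXQGMKCMYOSYDOYRXCGZDQVYCXTQLCXYFCQQO",
    "GZORFMCJBMCYZOELZJGMOOZVFQMDSLEQGFQCKZECE",
    "CZRQCXQFVQVYBBFJBQCXQVZFBD",
]

# Build a translation table once, then apply it to every line.
table = str.maketrans(KEY)
plaintext = [line.translate(table) for line in LINES]

for line in plaintext:
    print(line)
```

Note that `translate` leaves any character missing from the table untouched, so this shortcut is only safe once every cipher letter in the message has an entry in the key.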
The Rot, with key 13.
PLAINTEXT ABCDEFGHIJKLMNOPQRSTUVWXYZ
CYPHER    MAGDQ?RXY?TBSOZK?FECJIV?L?
We can assume that the codewheel was rotated 13 times, and can be thus unrotated to get the base position.
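The cypher row of that table can be rebuilt from the recovered key and then shifted by 13 places. This is a sketch only: `KEY`, `cipher_row` and `rotate` are illustrative names, it is not the C code from the web, and treating "unrotating" as a plain 13-place shift of the row is an assumption about how the codewheel works.

```python
import string

# Final key from the analysis: cipher letter -> plaintext letter.
KEY = {
    "Q": "E", "C": "T", "L": "Y", "E": "S", "X": "H", "F": "R", "Y": "I",
    "O": "N", "B": "L", "G": "C", "M": "A", "Z": "O", "R": "G", "J": "U",
    "V": "W", "D": "D", "I": "V", "A": "B", "T": "K", "S": "M", "K": "P",
}

# Invert it (plaintext -> cipher) and lay it out in A..Z order,
# with "?" for plaintext letters the message never used.
inverse = {plain: cipher for cipher, plain in KEY.items()}
cipher_row = "".join(inverse.get(p, "?") for p in string.ascii_uppercase)

def rotate(row, n):
    """Shift a string left by n places, wrapping around."""
    n %= len(row)
    return row[n:] + row[:n]

print(cipher_row)
print(rotate(cipher_row, 13))
```

On a 26-letter wheel a shift of 13 is its own inverse, so it does not matter here whether the row is rotated left or right.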
The C code I came across on the web to do this automatically is here. I modified it to spit out letters instead of numbers.
What I didn’t know when I wrote the message was that the real Cap’n Midnight decoder ring was actually in alphabetical order, just shifted (“rot”). The folks that wrote the Captain Midnight C code must have been too appalled at the simplicity to match that.
Well, that was a pointless use of several lunch hours.
Huh. I was guessing someone might be with child.
I’m sure that somewhere, someone is with child.
I can think of one or two on my block, at least.