Caesar Cipher
Introduction
There are myriad ways to encrypt text. One of the simplest and easiest to understand is the Caesar cipher. It’s extremely easy to crack but it’s a great place to start for the purposes of introducing ciphers.
A Bit of Terminology
The setup is pretty simple. You start with a message you want to codify so no one else can read it. Say the message is I hope you cannot read this
. This is called the plaintext. Now we need to apply some algorithm to our text so the output is incoherent. For example, the output may be O nuvk eua igttuz xkgj znoy
. This is called the ciphertext. Mapping the plaintext to ciphertext is called encryption. Mapping the ciphertext back to plaintext is called decryption. The algorithm used to encrypt or decrypt is called a cipher.
Caesar Cipher: How it Works
Mapping I hope you cannot read this
to O nuvk eua igttuz xkgj znoy
with the Caesar cipher works like this. First, you start by deciding how much you want to shift the alphabet. Say you choose a shift of six so A becomes G, B becomes H, C becomes I, and so on until you get to the end where Z becomes F. Now you have a way to map any plaintext character to ciphertext. In fact, that’s exactly how I encoded this message:
plaintext: I hope you cannot read this.
ciphertext: O nuvk eua igttuz xkgj znoy.
Here’s a gif that shows the various mappings:
The outer circle represents plaintext letters while the inner circle represents the ciphertext equivalent.
Hopefully you can see right away why this particular cipher is very easy to crack. Just mapping the plaintext to ciphertext while maintaining word lengths and spaces makes the process fairly easy. By converting all the text to lowercase and removing all spaces and punctuation, we can make it a bit more challenging. But just barely. There are only 25 different ways to shift the letters, which means a brute force attack is trivial.
Let’s see what this looks like in code.
The Code
We’ll create a class called CaesarCipher that can encrypt or decrypt text.
class CaesarCipher:
def _clean_text(self, text):
'''converts text to lowercase, removes spaces, and removes punctuation.'''
import string
assert type(text) == str, 'input needs to be a string!'
text = text.lower()
text = text.replace(' ', '')
self.clean_text = "".join(character for character in text
if character not in string.punctuation)
return self.clean_text
def _string2characters(self, text):
'''converts a string to individual characters.'''
assert type(text) == str, 'input needs to be a string!'
self.str2char = list(text)
return self.str2char
def _chars2nums(self, characters):
'''converts individual characters to integers.'''
assert type(characters) == list, 'input needs to be a list of characters!'
codebook = {'a':0, 'b':1, 'c':2, 'd':3, 'e':4, 'f':5, 'g':6, 'h':7, 'i':8, 'j':9,
'k':10, 'l':11, 'm':12, 'n':13, 'o':14, 'p':15, 'q':16, 'r':17, 's':18,
't':19, 'u':20, 'v':21, 'w':22, 'x':23, 'y':24, 'z':25}
for i, char in enumerate(characters):
try:
characters[i] = codebook[char]
except:
pass
self.char2num = characters
return self.char2num
def _nums2chars(self, numbers):
'''converts individual integers to characters .'''
assert type(numbers) == list, 'input needs to be a list of numbers!'
codebook = {0:'a', 1:'b', 2:'c', 3:'d', 4:'e', 5:'f', 6:'g', 7:'h', 8:'i', 9:'j',
10:'k', 11:'l', 12:'m', 13:'n', 14:'o', 15:'p', 16:'q', 17:'r', 18:'s',
19:'t', 20:'u', 21:'v', 22:'w', 23:'x', 24:'y', 25:'z'}
for i, num in enumerate(numbers):
try:
numbers[i] = codebook[num]
except:
pass
self.num2chars = numbers
return self.num2chars
def _preprocessing(self, text):
''''''
clean_text = self._clean_text(text)
list_of_chars = self._string2characters(clean_text)
list_of_nums = self._chars2nums(list_of_chars)
return list_of_nums
def encrypt(self, text, shift=3):
'''return text that is shifted according to user's input.'''
import numpy as np
preprocess = self._preprocessing(text)
nums_shifted = list((np.array(preprocess) + shift) % 26)
return ''.join(self._nums2chars(nums_shifted))
def decrypt(self, text, shift=3):
'''returns text shifted by user-defined shift length.'''
import numpy as np
preprocess = self._preprocessing(text)
nums = self._chars2nums(preprocess)
num_shift = list((np.array(nums) - shift) % 26)
return ''.join(self._nums2chars(num_shift))
Code Breakdown
The CaesarCipher class contains a number of methods. The first is a method called _clean_text which converts all letters to lower case and removes spaces and punctuation. The second, third, and fourth methods called _string2characters, _chars2nums, and _nums2chars should be self-explanatory. The _preprocessing method is a meta-function that incorporates and applies all the aforementioned methods in one sequential process. The last two methods are the most interesting: encrypt and decrypt. They perform as advertised.
Setup
Great, now let’s instantiate our class and put it through its paces.
To instantiate, we’re merely type cc = CaesarCipher()
.
Now to encrypt a message: print(cc.encrypt('I hope you cannot read this.', shift=6))
.
The shift parameter tells the class by how much to shift the letters to encrypt the plaintext. In this case I arbitrarily chose 6. The output is onuvkeuaigttuzxkgjznoy
. That sure doesn’t look like anything I can make out.
Let’s try another one for fun. This one will showcase the preprocessing method in all its glory.
text = 'the QuIcK brown fox jumps over the lazy dog!'
encrypted = cc.encrypt(text, shift=5)
print(encrypted)
The output is ymjvznhpgwtbsktcozruxtajwymjqfeditl
.
Discussion
Now if you’ve given this a little thought, you should see ways to crack this cipher wide open.
The English language is replete with structure. Certain letters appear far more frequently than others. The letter e, for example, is the most common letter in the English language. Therefore, using letter frequencies is a very effective strategy. Another giveaway is double letters of which only so pairings exist. So given longer snippets of text, you can deduce plaintext-to-ciphertext letter mappings with ease.
If all else fails or you just want to find the answer quickly, a brute force search will expose the plaintext.
Let’s see how that works.
# show all decryption possibilities
for i in range(1,26):
print('shift{:2}: {}'.format(i, cc.decrypt(encrypted, shift=i)))
print('')
Which outputs:
shift 1: xliuymgofvsarjsbnyqtwszivxlipedchsk
shift 2: wkhtxlfneurzqiramxpsvryhuwkhodcbgrj
shift 3: vjgswkemdtqyphqzlworuqxgtvjgncbafqi
shift 4: uifrvjdlcspxogpykvnqtpwfsuifmbazeph
shift 5: thequickbrownfoxjumpsoverthelazydog
shift 6: sgdpthbjaqnvmenwitlornudqsgdkzyxcnf
shift 7: rfcosgaizpmuldmvhsknqmtcprfcjyxwbme
shift 8: qebnrfzhyoltkclugrjmplsboqebixwvald
shift 9: pdamqeygxnksjbktfqilokranpdahwvuzkc
shift10: oczlpdxfwmjriajsephknjqzmoczgvutyjb
shift11: nbykocwevliqhzirdogjmipylnbyfutsxia
shift12: maxjnbvdukhpgyhqcnfilhoxkmaxetsrwhz
shift13: lzwimauctjgofxgpbmehkgnwjlzwdsrqvgy
shift14: kyvhlztbsifnewfoaldgjfmvikyvcrqpufx
shift15: jxugkysarhemdvenzkcfieluhjxubqpotew
shift16: iwtfjxrzqgdlcudmyjbehdktgiwtaponsdv
shift17: hvseiwqypfckbtclxiadgcjsfhvszonmrcu
shift18: gurdhvpxoebjasbkwhzcfbiregurynmlqbt
shift19: ftqcguowndaizrajvgybeahqdftqxmlkpas
shift20: espbftnvmczhyqziufxadzgpcespwlkjozr
shift21: droaesmulbygxpyhtewzcyfobdrovkjinyq
shift22: cqnzdrltkaxfwoxgsdvybxenacqnujihmxp
shift23: bpmycqksjzwevnwfrcuxawdmzbpmtihglwo
shift24: aolxbpjriyvdumveqbtwzvclyaolshgfkvn
shift25: znkwaoiqhxuctludpasvyubkxznkrgfejum
A quick scan gives away the plaintext: shift 5: thequickbrownfoxjumpsoverthelazydog
.
Wrap Up
Hopefully you found this a fun introduction to cryptography. It’s a rich and rewarding field with endless applications.
Next time, we’ll build upon what we learned here as we explore a more challenging cipher known as the Vigenere cipher.