Cryptanalysis of AES-ECB (Part-1)
AES is the most commonly used symmetric key encryption scheme throughout the world.Here I am going to list the short comings of the AES-ECB mode of encryption stating some of the ways in which it can be exploited by an attacker.
First I’ll brief you through the working of the ECB mode of encryption:
The Electronic Code Book popularly known as ECB is the simplest encryption mode in AES. AES is a block cipher and hence the encryption process takes place on each block seperately where the block size is usually 16.
The above image gives us an idea of how the process of encryption takes place in ECB mode.
Here the Block Cipher Encryption refers to the AES encryption of which I will be giving another detailed blog post.But for now just know that it is a combination of several complicated mathematical operations.So when we pass in the plaintext along with the key(it is essential to base the encryption operation on a key so that the ciphertext can be decrypted to get back the plain text by reversing the same operations using the key) the encryption box performs a standard encryption technique(here AES) on the plain text returning the ciphertext.
As evident from the figure the plaintext is split into blocks and the encryption is performed on each block seperately giving out corresponding ciphertext blocks.Now what if the plaintext cannot be divided into exact block size?
For example we have a plaintext of length 35 and we want to divide it into blocks of length 16.
We can surely have two blocks (16+16=32) but what will we do with the extra 3 characters? Here is where padding comes into play.As the name suggests padding is the extra stuff we give in to adjust the plaintext size so that it can be perfectly divided into blocks of required size.But during decryption a problem might arise.How will we differentiate the padding from the actual plaintext.
Suppose Alice and Bob shares the same key Alice encrypts the message“send_me_rupees_1000” along with some zeroes as padding.So when bob decrypts the message what he gets will be “send_me_rupees_10000000000000000”. We don’t want this to happen so what we usually do is use the character of difference between the blocklength and length of characters present in the block already.In this case (16–3=13). So the padded text will be: “send_me_rupees_1000”+char(13)*13.
Now Bob sees the pad and can remove the required number of padding characters to get back the original text after decryption.
So in a nutshell the encryption process is :encryption(message+padding) and decryption is decryption(message)-padding.
Another important thing to keep in mind is that if the plaintext fits in the blocksize then the padding will cover the upcoming block.For example suppose our block length is 16 and we have a string “yellow submarine” which has the same length as the block(16). In this scenario we’ll append a new block containing the padding characters.So the resulting plaintext will be “yellow submarine”+16*chr(16).
I hope you got atleast a rough idea of how the ECB mode works so now let us jump into the attacks.
ECB byte-at-a-time attack
Consider an oracle which takes an user input and appends a secret text to it and encrypts the whole text using AES-ECB.An attacker can retrieve the secret from such an oracle with minimal effort.Here we’ll take the secret text to be “secret” for the ease of explaining.So if the user inputs a character “a”, the oracle will return encryption(“a”+”secret”+chr(9)*9). Suppose we give sufficient number of input characters, we’ll end up with a scenario where input+”secret” makes up a string of size equal to the block length(16) and we’ll also have a new block filled with padding characters. Now the question arises how will one calculate this “sufficient number of input characters”? This is pretty evident given the fact when input+”secret” fills a block completely a new block of padding characters is created.Thus as we keep giving input, at a particular input the size of the ciphertext increases by one block(this increase is pretty much evident). In this case when we give “a”*10 as input this phenomenon comes into play and we have a block with just the given input and the secret string and another block full of padding characters.
First block:
Second block:
NOTE: \x10 is chr(16)
We can easily calculate the length of the secret by keeping track of the number of input characters and finally subtracting it from the block length. Here our input is “a”*10 thus the total length of the input string is 10 on subtracting this from the block length(16) we get the length of the secret text as 6 which is indeed the length of the text “secret”. But that is not enough we need to find out the secret text, so what we do is give one more character as input,now the structure of the blocks will be:
First block:
Second block:
NOTE: \x0f is chr(15)
Here the second block contains the character “t” along with the padding characters. Does that ring a bell? Exactly!! it is the same as encryption(“t”+padding). So now you have the encryption of the last character of the secret along with the padding character.Now we do a simple bruteforce trying out all possible characters and once we reach the same character which is present at the end of the secret text(here “t”) we notice that the encrypted text matches with the ciphertext of the second block.Please note that since the padding is standardised we need not worry about getting it wrong.Here is an image to get more clarity on it
This is what happens while you try out all characters from a-z and when u reach “t” you get this:
NOTE:All the representations are of the plaintext to make things clear for you.We compare the ciphertexts of these plaintexts
Which is exactly the case with the last character of our secret text.Thus we can conclude the last character is “t”. So now we have the last character of the ciphertext and it is time to hunt more of those.Upon increasing the input length by adding one more “a” we get:
First block:
Second block:
NOTE:\x0e is chr(14)
Here since you already extracted the final character(here “t”) you can keep it constant and brute force the second last character.
NOTE:Remember the characters can be numbers and symbols too in case of which you need to include them while bruteforcing.
Continue the above mentioned process until you find out the entire secret text.This is pretty easy as you already know the length of the secret text(Mentioned above).
To get more insight into the concept please try out the cryptopals challenge: https://cryptopals.com/sets/2/challenges/12
Also you can refer to the exploit written by me: https://github.com/SarangDileep/Cryptopals/blob/master/Byte_at_a_time/Byte_at_a_time.py
If you have any doubts or queries please feel free to contact me at: sarangdileep2771@gmail.com