The Beauty (RC4) and The BEAST (TLS)
Hard times for Information Security and for the authentication models it had been built upon. The inglorious falls of SecureID and Certification Authority Authentication models were not enough in this troubled 2011 and now it looks like the last authentication bastion was breached after Thai Duong and Juliano Rizzo unleashed their BEAST (Browser Exploit Against SSL/TLS) attack.
The attack exploits a well known vulnerability on CBC mode encryption algorythms (such as AES and 3DES) which affects SSL and TLS 1.0. CBC mode encryption divides the plaintext in fixed size blocks (usually 128 bits). In this mode of operation each block of ciphertext is not directly encrypted, rather, before undergoing the operation, is XORed with the previous Ciphertext. Of course the first block of the message may not be XORed with any previous Ciphertext and for this reason an hard-guessable random vector, called IV or Inizialization Vector is chosen to inizialize the encryption process.
During an encryption session (think for instance to an HTTPS session) several TLS messages are transmitted inside the same encryption channel and here come the troubles: unfortunately TLS 1.0 implementation does not use a new IV for each TLS message, that is the ciphertext of the last block of the previous message is used as the Inizialization Vector of the new message. Unfortunately this approach limits the unpredictability of the Inizialization vector: an attacker could in theory try to guess some plaintext somewhere in the encryption stream and inject a crafted plaintext so that if the encrypted output of that block corresponds exactly to the ciphertext of the block in which the guessed original message was encrypted, this means that the attacker’s guess was right. The attack is made possible in theory just because the CBC mode use the output of the previous block as the IV for the next plaintext block.
From a more formal point of view a nice and very clear description is reported at this link which I report in the following lines:
Consider the case where we have a connection between Alice and Bob. You observe a record which you know contains Alice’s password in block i, i.e., Mi is Alice’s password. Say you have a guess for Alice’s password: you think it might be P. Now, if you know that the next record will be encrypted with IV X, and you can inject a chosen record, you inject:
X ⊕ Ci-1 ⊕ P
When this gets encrypted, X get XORed in, with the result that the plaintext block fed to the encryption algorithm is:
Ci-1 ⊕ P
If P == Mi, then the new ciphertext block will be the same as Ci, which reveals that your guess is right.
The question then becomes how the attacker would know the next IV to be used. However, because the IV for record j is the CBC residue of record j-1 all the attacker needs to do is observe the traffic on the wire and then make sure that the data they inject is encrypted as the next record, using the previous record’s CBC residue as the IV.
So apparently nothing new under the sun, except the fact the attack scenario is higly unlikely since the attacker should find a way to guess some patterns and inject some well known patterns inside the encrypted channel unless…
Unless the attacker could inject a large amount of known malicious data at a time (in order to limit the guessable plaintext in each block) and use a Web server side method to inject them.
This is exactly where the two main features of the BEAST attack rely: what if an attacker could guess where the encrypted password is located inside the encrypted channel, and split the original block in several 16 bytes blocks in which a single byte contains the original character of the password and the remaining 15 bytes contain the malicious known padding? Quite Easy! The attacker should try “only” 2^8 (256) possible values in order to guess the first character and obtain the same encrypted output than the crafted plaintext. Once guessed the first character, he could obtain the IV for the next block from the ciphertext, and guess the next character of the password in the next block with the same method: the first byte is known to be the first character of the password, the second byte is the unknown quantity and the other 14 bytes contain the malicious known padding. Shifting up to the last block the attacker could obtain the password.
Of course in theory there is still a big issue consisting in the injection of the known pattern in the encryption channel. In order to overcome it the attackers used a method (for which so far few details were disclosed) leveraging Web Sockets, a technology which provides for bi-directional, full-duplex communications channels, over a single TCP Socket. In a meshed-up world, Web Sockets are used for instance when a Web Server redirects a browser to another server to get a certain content (for instance an embedded Image). In Web Socket models, the browser handshakes directly with the remote server and verify if the connection is ok from the first server (origin based consent).
The same article mentioned above delineates how Web Sockets may be exploited to perpetrate the attack:
Say the attacker wants to recover the cookie for
https://www.google.com/. He stands up a page with any origin he controls (e.g.,
http://www.attacker.com/. This page hosts JS that initiates a WebSockets connection to
https://www.google.com/. Because WebSockets allows cross-origin requests, he can initiate a HTTPS connection to the target server if the target server allows it (e.g., because it wants to allow mash-ups). Because the URL is provided by the attacker, he can make it arbitrarily long and therefore put the Cookie wherever he wants it with respect to the CBC block boundary. Once he has captured the encrypted block with the cookie, he can then send arbitrary new packets via WebSockets with his appropriately constructed plaintext blocks as described above. There are a few small obstacles to do with the framing, but Rizzo and Duong claim that these can be overcome and those claims seem plausible.
Although TLS 1.1 and 1.2 introduce a randomizaton of the IV for each message, the dramatic thing is that TLS 1.1 has been published in 2006 but it is far from being commonly adopted. The funny thing is that in order to mitigate the attack Web Servers should use a cipher which does not involve CBC mode, as for instance RC4 (back to the future).
Google servers already use RC4 while Chrome developers are testing a workaround. Will RC4 be enough to save the infosec world from the fall of authentication?