This article is just for fun as a result of some of my Stanford classes and is related to the One Time Pad cryptographic encryption and related issues with two time pad and recommended solutions to these issues.
As stated in Wikipedia
In cryptography, the one-time pad (OTP) is an encryption technique that cannot be cracked if used correctly. In this technique, a plaintext is paired with a random secret key (also referred to as a one-time pad). Then, each bit or character of the plaintext is encrypted by combining it with the corresponding bit or character from the pad using modular addition. If the key is truly random, is at least as long as the plaintext, is never reused in whole or in part, and is kept completely secret, then the resulting ciphertext will be impossible to decrypt or break
..note that ^ is the XOR operator.
Breaking the Two Time Pad
OK so now that we have some basis example we can actually see what is the issue with the same key being used more than once. Imagine we will encrypt two different messages with the same key:
The problem is that by definition if you XOR the two ciphers you will in fact get the XOR of both plain texts: e(a) ⊕ e(b) = a ⊕ b
So in order to exploit the ciphers you basically need to guess a possible word and XOR it with the XOR of two ciphers. Now, how to guess a word? The best is to use the Frequency Analysis and use statistically most common words. In the English language the most used word is not surprisingly "the". Actually a nice list of words can be found on Wikipedia. This is exactly an idea behind the guessDictionary in my example:
..hence if we XOR the XOR of two cipher texts with our dictionary we will learn plain text message.
..and the result:
As you can see we have just exploited "This" to be contained within one of the messages.
So can we possibly achieve the secure ciphering with the same key and if so how? The answer is Nonce Based Encryption also known as Many Time Key. Well known application of this method is e.g. Cipher Block Chaining (CBC):
The key thing to notice here is the Initialization Vector, so called "IV". This acts as a random nonce for the same key. It can be either an arbitrary counter or some random number. If it is a counter you just need to make sure it's not predictable (which you also have to ensure for random number in fact). One drawback of CBC is that it is sequential. Means you can't use parallel processing. To be able to use parallel processing use the CBC In Counter Mode instead. The counter mode turns the CBC into Stream Cipher.
I'm not an expert on cryptography but it is clear that cryptography can be very fragile. Whether it was VENONA project where Soviets did use the same keys for multiple messages or TLS 1.0 bug with issue of predictability of IV or many other cases. Cryptography needs a proper implementation and should always be implemented by existing/official algorithms. Do not reinvent a wheel.
CONFIDENTIALITY vs. Integrity
The last but not least - never implement these mechanism alone. While these are very important concept they don't protect you against tampering the ciphers. As such they provide a great confidentiality of a message but no integrity. To provide an integrity you have to introduce a message authentication (MAC) or introduce a public key cryptography concept. This will perhaps be a topic for my next article related to the cryptography.
Blog about my programming experiments, tests and so on.