What are we told about the One Time Pad? - Is it a Myth or is it Reality?

  • Home OTP v. Permutation Download PDF File Conclusion & Comments

  • Some Mathematics and a bit of Logic

    "In Principio erat Verbum and after that we invented numbers."

    This page contains some mathematics and a logical approach to the system we developed and suggest. So, readers that have a dislike for mathematics can skip this page because everything here explained is contained in the previous pages and the accompanying PDF file without going into the mathematics too much. Here the focus is placed on the fact that a different permutation cipher for each single character of a plaintext message during encryption operates the same way as the OTP. In doing so, it will provide the same level of security but only require the exchange of an IV (initialisation value). The key distribution problem that is thrown into the discussion when addressing the OTP is reduced to a one time event instead of sending key and cipher via separate secure channels with each new message.

    Our contact email:wandeethaweetham@gmail.com


    Uncertainty - Entropy:

    When Shannon discussed with a friend the problem which word he should choose for 'uncertainty' in the paper he was going to publish, his friend advised him to use 'entropy' since nobody anyway knew the meaning. The absence of information would give the entropy of a system system or the increase in information would reduce entropy (uncertainty). The expression or term entropy is borrowed from thermodynamics where it is used to describe disorder and providing a numerical measure for the state of disorder. For example, if we take five cards containing on their faces the numbers one to five and shuffle them very well and place them face down on a desk and ask an observer to give us the value number three (card) contains, the probability that the observer gets it right will be 1/5. We can repeat that process as often as we want but the probability for an observer doesn't change. Entropy will always as it highest and the sequence of the cards on the desk doesn't matter because predictability or randomness for the observer doesn't change. If we turn one card around we have reduced entropy and randomness for the observer.


    We know that cipher systems can be called perfectly secure if ciphertext and plaintext are statistically independent or in other words the mutual information between ciphertext and plaintext, without having access to the key results in nought (zero). If an attacker gains no information regardless how much ciphertext he has intercepted and analysed, these systems are called ideally secure. So our task was to develop a system that would not allow known cryptanalytic techniques to be applied, like frequency counts etc., meanings that the generated key was unpredictable and indistinguishable like the key in the original OTP. Before we go into our system let us first clear up the question if the OTP is a stream cipher or something that stands apart from all other ciphers. The first observation is that without any padding the length of the output has to match the length of the input. The second observation is that each output n (bit, bits, byte or character) only depends on the key regardless of previous or following input n. The third observation is that the process used to create the output cipher has to be unpredictable; and in that sense the OTP is a stream cipher.

    The operation of a stream cipher in mathematical terms can be shown by designating X, Y, Z as plaintext alphabet (X), ciphertext alphabet (Y) and the key-stream alphabet as (Z). We define K as the key space and S as the internal state space. Our plaintext symbol becomes xi, the ciphertext symbol yi and key-steam symbol as zi and the internal state as i. F will be the next state function and f the output function that gives us k Є K.


    The resulting key-stream (sequence) is {zi = f (k, si ) : i ≥ 1}, with the internal state being the same all the time. Here let us recall what we said about the permutation cipher; that a single character encrypted (using the English alphabet) by mapping plaintext alphabet against ciphertext alphabet always provides a cipher character that offers a 1/26 probability when trying to guess the plaintext character. By using one unique permutation for each single plaintext character we ensure that ℓ = 1 (length) and that the internal state always matches i ≥ 1. The encryption process in the OTP achieves the same by relying on the human operator and here too we find i ≥ 1. The operator always matches a plaintext character with a randomly selected key character (next state function - F) and uses modular arithmetic to achieve k = c (output function - f) leading to the equation that c = k ⊕ m. As in the permutation cipher, which uses the plaintext as next state function - F -, the OTP uses human intervention to fulfil that function. But what is important is the fact that in both encryption methods a key is always selected from a full set of a ciphertext alphabet (26 characters in the English language language - see Image 1, page 2 - the_one_time_pad.html) ; otherwise he would fall short of the definition coined by Claude Shannon when describing the operation of the OTP. To comply with the first condition requested by the OTP (the key must be truly random) the operator picks the key character and in that moment randomisation takes place.

    OTP - Permutation?

    In the example above we demonstrate why we maintain that the OTP is nothing else than a permutation cipher, that encrypts each plaintext character using a single unique permutation. We look at the first two plaintext characters 'HE' and how the key characters 'XM' are matched with them (knowing that the pool of characters in the English alphabet from which we draw our key characters has to be always 26) it also becomes clear why we need two transmissions. Each new permutation created is a random event and none of the permutations is created by using an IV. The sequences of the remaining characters in the alphabet from which the operator choose the random character are of no consequence. But it requires transmitting each key (character) and each cipher (character) to enable the recipient to decrypt a message (plaintext). The system we developed and suggest uses an IV that requires one initial exchange. After the exchange parties can transmit messages and only require transmitting the cipher; removing the risk of a second transmission and its interception, which would expose our message. Readers have suggested to us that the exchange of the IV (and here the sequence matters) would theoretically make it possible to break our ciphers. We will move to the next part and see if that statement holds true or if we still maintain perfect secrecy as defined by Claude Shannon.


    Now let's take a look at the system we suggest. We know that we have a stream cipher based on permutations that change during the encryption process. We know the mathematics applied when using a stream cipher (see above) . We know that the next state function - F in the OTP is invoked by the actions of a human operator to ensure that entropy and randomness are maintained to grant perfect secrecy; in our system plaintext and randomised strings (permutations) fulfil that function. The IV exchanged between sender and recipient is build around two strings of a modified ASCII set. An adversary would not know the length of these two strings or the modifications in the character sets. The options we have are to extend the ASCII strings above the length of 256; reduce the length if we don't use plaintext that uses the complete ASC set or we can leave it at a length of 256. The important point is that an adversary gains no knowledge of the length of these two ASCII strings. Without having knowledge of the length of the two strings a brute-force attack (meaning testing all possible permutations) on the ciphers will fail. The reason is a simple one, since mathematically it wouldn't be possible to go through all possible permutations. The length of our ASCII strings would always be defined by n with n being any natural number. One of the comments we received was:

    The pre-shared key. Just like a traditional OTP, your system is symmetrical and requires a pre-shared key. With a normal OTP, this key is equal or greater to the size of the plaintext so it cannot be brute-forced (or rather, brute-forcing would result in every possible result). In your system it is not necessary for the key to be as large as the plaintext. No matter how good your permutation cipher is, it does open up the possibility of brute-forcing. I can't say exactly how this would work, but brute-forcing goes from mathematically impossible to mathematically possible. Even if it would take longer than the age of the universe to compute, it looks like it would still be theoretically possible.

    Three short answers in reply to the comment. 1) We don't have a pre-shared key but our two ASCII strings are IVs (initialisation values) that combined with the plaintext create the key during encryption. 2) The cipher we create will also reflect the length of the key (our key is mapped from plaintext alphabet to ciphertext alphabet) since each single plaintext character is encrypted by matching it with a cipher character - plaintext character, key and cipher character always have the same length. 3) If it takes longer than the age of the universe to compute, it will be theoretically and practically impossible to compute. Brute-force means going through all possible combinations and if the universe comes to an end so comes computing, and not being sarcastic yet, mathematics too; at least logic should tell us that.
    Here let us take a short look at the mathematics involved an adversary would face when applying a brute-force attack. An adversary not knowing the length of our two strings would have for a starter to choose n (length for each string), which might not be the length we selected (and that will be the most likely scenario). If, for the sake of argument let us assume we haven't increased the size of our strings, the length of both strings contains 256 characters, an adversary would face for each string P256! permutations which he would need to check. (P256! = 8.5781777534284265411908227168123e+506) An adversary would have to check the first permutation of string number one against all P256! permutations of string number two and that process would need to be repeated with all permutations string one holds. This on its own would require a tremendous amount of computing power and time (Professor Keith Martin -
    Will superfast 'quantum' computers mean the end of unbreakable encryption?) But this is not the end of the story as readers might conclude that have read our PDF file. There we explained that the ASCII extended set contains control characters (control of peripherals), which we might replace with characters that are used during our encryption. An adversary without knowledge about the amount of characters we have replaced would also have to test all possible permutations. For example let us assume we have replaced one control character with the letter A. Certainly will the change have an impact on the encryption process (depending on the length of our plaintext and the length of our exchanged IV strings). But the more important point is the fact that changing one character created a new string and each of the P256! permutations will not match any of the permutations in the unaltered string. Let's assume now that the pool of characters we could replace is 12 (there are more control characters that could be replaced) and the pool of available characters to replace them with is about 100 (that would cover the English alphabet, numbers, punctuation marks etc.) the amount of possible permutations becomes astronomical. We have still not reached the end of our story since the length of our two IV strings isn't bound to be 256 but could be less or more. String one for example could have a length of 256+n and n could be 1, 100 or any rational number. We would have to repeat the same procedure mentioned above to break our cipher. In doing so an adversary would end up with all possible solutions and that would not only include text but images, media files etc, which leads us to the last part. Here we will take a look what other options an adversary has to break our cipher.

    From the OTP we know that C = M ⊕ K because message character (m) and key character (k) generate cipher character (c) using modular arithmetic (mod26 when applied at the English alphabet). We also know that every written language holds statistical properties that cryptanalysts and linguists might use to break a cipher (frequency of character, pattern of characters like 'th' 'er' in the English language etc.). Readers that are interested in the mathematics should look at the papers Claude Shannon published, use Wikipedia or books on mathematics that deal with cryptology. Here we will spare us to reiterate it and focus on a solution that will make it impossible to use these tools to break our cipher. The OTP achieves this by following the first three rules (1. The key must be truly random - 2. The key must be as large as the plaintext [or larger] - 3. The key can never be reused in whole or part) which need to be observed strictly to grant perfect secrecy. Any letter that might jeopardise the even distribution or be part of a pattern can be replaced by human intervention (operator). The system we developed leaves it to chance and the randomisation of strings, depending on the plaintext. Since we know that chance doesn't make a reliable bedfellow we apply a fundamental change to the way we record our cipher characters. Instead of recording characters or the hex number they represent we record the difference between these characters they hold on our strings.

    Difference between characters

    Looking at the example above it becomes clear that linguistic tools will not help to break a cipher created by our system. In the alphabet ordered a to z it is easy to break the cipher but in the permutation below the numbers become meaningless for an adversary. Each number holds the same entropy as a cipher character created with an OTP and the possible solutions to the complete cipher are also the same. If we use the permutation for a second, third or forth time we would lay the foundation for an adversary to gather enough sample data that would allow using linguistic tools. But from what we outlined above and how our permutations work (see image to the right and creating new permutations during the encryption) we know that an adversary can't collect these sample data. It will be easier understood if we take a look at another scientific discipline, which is astrophysics. Until not so long ago we were told that there is only one universe - ours - and beyond that there was nothing. In the last few decades that has changed and now one theory toyed with is the idea of the multiverse. In some of these universes the conditions might be different when looking at the speed of light, gravitation or how atoms interact.
    If we now say that one of the conditions different in these universes is the alphabet (assuming that they all have the same length) and looking at our examples in the above image we understand that an adversary can't break the ciphers two and three. We are using the same language, the same characters and only the distribution of the characters on the string has changed. One minor change that makes the two ciphers resulting from the randomised strings as strong as the modular arithmetic applied to the plaintext and key character in an OTP. As said before we are using the same language and characters, but by introducing the ASCII extended set, with the possibility of added characters and replaced control characters, we change everything. Now the cipher can be any written language, a number, a postcode and depending on the input length of our plaintext could be an image, media file etc.

    We will now take a look at a comment made by one moderator on www.reddit.com who suggested to us that we might not fully understand how powerful language and statistical tools are, used in trying breaking ciphers. As it is true that these tools are powerful and useful when analysing ciphers that in their entirety are based on alphabets and characters (the adversary trying to break the cipher needs to have an inkling about the language he is dealing with), but it seems a bit ambiguous to suggest that these tools can be used on a cipher, resulting from a plaintext that has been created using one of today's word processing packages. Claude Shannon in his papers explained the entropy a language holds (using the English language and its alphabet) and how much text might be needed and intercepted to break one of these ciphers. But the important part is that it was based on a 26 character alphabet. Looking at our website and page we obviously notice that we have punctuation marks, upper and lower case characters, numbers, brackets, italics, bold and underlined text etc.; these are the visible differences when comparing it with a text that solely is based on an alphabet. But there also characters we don't see but are part of the document and they are encrypted too. These characters, we can call them codes, are used to format the document and they are not visible and certainly have nothing to do with a language but depend on the word processing package we use and might differ from package to package. The distance between two points (that's what our cipher characters have become) can not be calculated when the plane on which these points are situated is randomly changing (random strings) all the time.


    Some personal thoughts

    There is no such thing as perfect secrecy in cryptology as long as there is still one person that knows the secret; in that case it becomes an oxymoron. One might say that a secret shared between three people requires two people to be dead to stay secret. But even at that point an adversary determined to gain access to a password, a key etc. will use other means of aquiring the information if a brute force attack fails and the means might involve bribery, theft of computer equipment or the use of physical viloance. Still the world is moving into an era where it seems prudent to have a certain level of privacy that comes close to what is described as perfect secrecy, when exchanging personal opinions with family members or friends. Democracy and the definitions of the values that our ancestors had in mind when giving their blood and lives for enabling us to enjoy these privileges, are slowly dismantled under that dark blanket that is called national security. The idea that the highest duty of the state was to defend these privileges has been turned upside down into a law that requests from us to give up these privileges to ensure the survival of the state. By applying this maxim our democracies are turning into the same political systems that our political leaders tell us we need to be protected from; selling us the same brand of political system, only in a different box with a different label. If history has taught us one lesson it is, that systems that rely on their survival by suppressing the rights and privileges we have in democracies, only have a limited shelf life. The rot from within will one day cause these systems to fail no matter how much effort is taken by their leaders to disable the spread of information that might endanger their survival.


    Page Selection

    Copyright (c) 2014/2017 - Wandee Thaweetham, Chanthaburi, Thailand - Last update 10th January 2017