In this article, we will briefly mention some of the encryption algorithms that are in wide usage today. We will also discuss the key sizes that are required to make these algorithms secure.
Ciphers or encryption algorithms can be classified into secret key and public key categories. Encryption schemes have been available through the ages and have all been what are known as secret-key algorithms. Here, the communicating parties (Alice and Bob in Fig. 1) share a secret key that they use to encrypt any communication between themselves. Usually, the encryption and decryption algorithms use the same key; hence, such algorithms are also called symmetric key algorithms.
Block ciphers such as the AES also fall under this category.
Figure 1 illustrates a schematic of a conventional encryption scheme. The opponent Oscar has access to the insecure channel and, thus, the ciphertext. However, he has no knowledge of the secret key k shared by Alice and Bob.
Secret-key algorithms such as AES are based on two principles: confusion and diffusion. The former introduces a layer of scrambling that creates confusion as to what exactly might be the transmitted message. The latter creates a randomness whereby the effect of changing a small part of the plaintext message will result in changing half of the encrypted ciphertext.
This eliminates matching patterns or frequencies of occurrence of messages. Most secret-key algorithms are thus unbreakable except by brute force. If the length of the key of a secret-key algorithm is n bits, at least 2n-1 steps are required to break the encryption. Today, a key length of 80 bits is considered to be sufficiently safe from brute-force attacks, even though a key size of 128 bits is usually recommended.
DES was the secret-key encryption standard for over 20 years. NIST examined proposals for the AES in 1998. Of the five candidate algorithms, NIST selected Rijndael as the algorithm for AES in October 2000. A variety of factors were considered by NIST to determine the suitability of the algorithm for a standard. Security (resistance to cryptanalysis, mathematical soundness, randomness of the algorithm output), cost (licensing requirements, computational efficiency on various platforms, memory requirements), and algorithm implementation characteristics (ability to handle variable key sizes and block lengths, implementation as stream ciphers and hash functions, hardware and software implementations, and algorithm simplicity) are three categories used for evaluation of these algorithms.
In addition to these standards, several freeware and other secret-key algorithms are available, such as IDEA, RC4 and Blowfish. RC4, in particular, has been widely employed in web browsers, as well as in wireless networks like IEEE 802.11.
Public-key encryption is a radical shift in the way data is encrypted. Diffie and Hellman (DH) introduced the concept in 1977. With secret-key algorithms, we have a situation that is similar to having a locked mailbox for each pair of users. Both users associated with a mailbox share a key that can unlock or lock the mailbox. Consider Fig. 2. Here, if Alice desires to communicate with Dan, she unlocks the mailbox shared between her and Dan, deposits the message, and locks the mailbox again. The message is now accessible only to Alice and Dan, who also has an identical key.
Clearly, the number of mailboxes required for N users like Alice and Dan is N(N-1)/2. For example, we have six mailboxes for four users, as shown in Fig.2.
The situation described above is not the natural way in which we employ mailboxes. Mailboxes are associated with individuals, not pairs of communicating parties. The natural way to employ a mailbox is as described in the following example. Alice owns a mailbox. Only she has a key to lock or unlock the mailbox (i.e. only Alice has complete control over the mailbox). Any other person who wishes to communicate with Alice will deposit the message through a slot in the mailbox. Once the message is deposited in the slot, only Alice has access to it. Even the originator of the message cannot retrieve it, although they may regenerate the message from knowledge of the contents.
Public-key algorithms are similar to this example. Each individual has a pair of keys: the public key and the private key. As the name suggests, everyone knows the public key. So, anyone can employ the public key to encrypt a message intended for the owner of the key.
The public key is like the slot in the mailbox. Only the owner knows the private key. As a result, once the message is encrypted using the public key of the owner, only he can decrypt the message. Not even the originator of the message can decrypt it once the message has been encrypted.
Figure 3 shows the schematic of a public-key encryption scheme. Note that there is no longer a need for secure transfer of the key. Alice encrypts a message intended for Bob with Bob’s public key Kpub,bob. The ciphertext is decrypted by Bob via his private key.
The design criterion for public-key algorithms is as follows. Given a function f(k, x), the following properties always hold:
- it is extremely easy to compute y=f(kpub, x);
- given kpub and y, it is computationally not feasible to determine x=f-1(kpub, y);
- with a knowledge of kprv that is related to kpub, it is easy to determine x=f-1(kpub, y).
Functions that have the above properties are called trapdoor one-way functions. Examples are the factorization problem and the discrete logarithm (DL) problem. The former is based on the fact that it is easy to multiply prime factors to arrive at a composite number (e.g. it is easy to find 7x17x109x151=195.821, but it is quite a hard task to split 30.616.693 into its prime number factors). The latter is based on the fact that it is easy to determine what 223 mod 109 is (the answer is 77). It is quite hard to find out what u is given 2u mod 109=68.
Note that with real number arithmetic, it would have been trivial to determine u, as u=log268. The modulo function, which reduces the operations to be on set of numbers that are nonnegative integers less than 109, makes this problem very hard to solve. Integer factorization is employed in RSA (from Rivest, Shamir, and Adleman) and the DL problem is used in the DH key exchange protocol and digital signatures.
DH key exchange protocol
The DH key exchange protocol is based on the DL problem previously discussed. Let us suppose that Alice wishes to exchange a session key with Bob without sharing any secret with him. Alice chooses a base α and a large prime number p that are publicly known. She only chooses a random private number a. She computes kpubA=αa mod p, which she sends to Bob. Note that given kpubA, α and p, it is computationally impossible to determine a. Similarly, Bob chooses a private random number b and computes kpubB= αb mod p, which he transmits to Alice. Once again, it is extremely difficult to determine b. After obtaining the public keys of each other, Alice and Bob raise these public keys to the exponent corresponding to their own private numbers a and b respectively. That is, Alice will compute
ks = kapubBmod p = αabmod p;
Bob computes
ks= kbpubAmod p = αabmod p
This way, both Alice and Bob have generated a common session key. An adversary Oscar cannot determine this key without solving the DL problem. At least, there is no known solution for obtaining the session key other than by solving the DL problem.
RSA has been the most popular public-key algorithm. It employs integer factorization. The DH key exchange protocol based on DLs is also very popular in wireless networks. This protocol is commonly employed for key exchange for web transactions, e-commerce, and IP security. The digital signature standard (DSS) is also based on DLs. Signature schemes based on RSA are also widely employed.
However, in the case of public-key algorithms, Oscar, the opponent, is aware of Bob’s public key and this adds an additional parameter to the problem. Since public-key algorithms are based on mathematical structures, for small key sizes there are well-known results or tables that can be employed to break the encryption. As such, the key sizes are extremely large compared with secret-key algorithms. Today, for good security, public-key algorithms need keys that are 3 to 15 times larger than their secret-key counterparts.
Because the mathematical bases on which public-key algorithms work are well known, they are susceptible to analytical attacks and require much larger key sizes than secret-key algorithms. The mathematics of elliptic curves is also being employed in encryption schemes, since they need smaller key lengths than RSA.
Table 1 presents the key lengths and the time required to break some of the wellknown public and secret-key algorithms. The values in this table are based on the assumption that $10 million is available for computer hardware. The key sizes in each row are equivalent.
The mathematical operations for public-key algorithms are quite computationally intensive. Consequently, the encryption rates are quite small and public-key algorithms are rarely used for bulk data transfer. Instead, they are employed to exchange a session key between a pair of communicating entities who will then use the session key with secret-key algorithms for the duration of that communication (bulk data encryption). This ensures that a new session key is employed each time a communication is initiated, thereby reducing the possibility of an adversary breaking the encryption scheme.
Although the public key of an honest party like Alice can be made public, its authenticity needs to be verified, since Oscar can claim to be Alice and publish his key as hers. It is common to use digital certificates signed by one of a few trusted certification authorities to verify the authenticity of the public key. This approach is used in modern web browsers for e-commerce applications.