This chapter gets a little tricky and technical, but it really is important in order to understand later material. We’re going to talk about security as it applies to information and computers.
What does it really mean to be “secure”?
Security is the… hmm. Well… I can’t think of any short way of saying this.
I’m sure that most people desire and appreciate security, but outside of a gut instinct I doubt that the average Joe could tell you exactly what is or is not secure—especially when it comes to computer stuff. Part of this difficulty comes from Security being quite an overloaded term.
If something is “secure”, we often think of it as “safe”—but safe in what way?
It could be safe from prying eyes.
Confidentiality is about keeping information secret. This is the type of security that I think most people have in mind when they think about computers. Confidentiality is especially important with the Internet since every message must pass through several intermediate computers before reaching the recipient.
It could be safe from damage or modification.
Integrity is about ensuring that information cannot be secretly modified or modified without authorization. Integrity can be just as important as confidentiality.
Suppose you make a confidential credit card purchase—so no one can steal your credit card information—but the purchase lacks integrity. A third party might not be able to read your credit card number but they could substitute a different one. Since the transaction does not have integrity, the store you purchased from would be unaware of the change and would charge the wrong person or refuse the transaction.
It could be safe from forgery.
Closely related to integrity, authenticity is about ensuring that the source of data is correct.
Suppose you make a confidential credit card purchase with integrity, but you have no authenticity from the site you purchased from. How do you know that the site actually belongs to the store that will ship you your product and not to some scammers who will take your money and run?
It could be safe from disruption.
What use is information if it isn’t available when you need it? If a hacker were able to shut down your bank’s computer systems, it would still be a big problem for you even if none of your money or information was stolen.
It could be safe from deniability.
Unlike the other components of security, non-repudiation concerns the people who are supposed to have access to the information. It means that the parties involved cannot later deny their involvement.
For example, after making a credit card purchase you could attempt to claim that you did not authorize the purchase and should be refunded. Alternatively, the store could claim that they never received the money and so you must make the payment again. Non-repudiation seeks to prevent both kinds of situtations.
While those principles are good for conceptually thinking about security, we need a way to implement them for computers. That is largely done with a single versatile tool: ciphers.
A cipher is basically another format for encoding information. Back when we first started learning about the Internet, I compared the act of formatting data into a packet to putting a letter into a package. Ciphers are similar, but the package is a metal safe with a lock on it. To open it you need a little something extra: a key. For a locked safe, that key could be numbers for a combination lock or a physical key for a padlock, but for a cipher the key is extra information.
A cipher is a data format that can only be decoded using additional (usually secret) information called a key. Ciphers are often called codes by the general public, but this can be confusing to a superuser since “code” has several other technical meanings.
Ciphers might just be a different type of data format, but there are lots of cool and fancy words for talking about them.
Plaintext is just the regular, understandable information that will be used with a cipher. The “text” part is a misnomer since most digital ciphers can store any sort of information: text, image, audio, video, etc.
Ciphertext is the result of using a cipher with plaintext. This is the locked safe.
Encryption is the process of producing ciphertext from plaintext. The reverse process of getting the plaintext back from the ciphertext (usually using the key) is decryption. Ciphertext is often referred to as encrypted information.
While cipher and ciphertext are proper terms for discussing this topic, you won’t hear them very much among regular folk. They are mostly used by those who study ciphers professionally.
Caesar’s cipher is a classic example from the BC years. It was used extensively by Julius Caesar himself to relay battle orders. It is very simple to use even without a computer. First you choose a key, which can be any number. We’ll pick 5. Then you take your message
ATTACK AT DAWN
and encrypt it by replacing each letter with the 5th letter after it in the alphabet, wrapping around from Z back to A if necessary. For example A becomes F, D becomes I, X becomes C, etc. This gives us the ciphertext
FYYFHP FY IFBS
There’s our confidentiality. If our foes intercept our message they will have no idea what is actually being said. But since the true recipient of our message knows that we’re using the Caesar cipher with a key of 5, they can decrypt the message by shifting the letters in the opposite direction by that amount.
Note that we also get a bit of information integrity for free. Someone could try to intercept our message and add false information to it, but without knowing the key there is little chance that what they add would make sense once the message is decrypted.
But Caesar’s cipher—as you might expect for something over 2,000 years old—isn’t very strong. Consider the fact that using a key of 26 is the same as using a key of 0 i.e. nothing gets shifted and your ciphertext would be identical to your plaintext. Similarly using a key of 27 is the same as a key of 1, using 28 is the same as 2, etc. In other words, there are really only 26 different key values and one of them (0) is useless since it does nothing at all. Why is this a problem?
Well, suppose our foes do intercept our ciphertext. If they know that we’re fond of the Caesar cipher, they can guess that we probably didn’t use the key 0 so they try the other 25 keys:
EXXEGO EX HEAR DWWDFN DW GDZQ CVVCEM CV FCYP BUUBDL BU EBXO ATTACK AT DAWN ... IBBIKS IB LIEV HAAHJR HA KHDU GZZGIQ GZ JGCT
The key guess of 1 produces a pretty compelling “EXXEGO EX HEAR”, but I think they might pick out “ATTACK AT DAWN” which—unlike every other key guess— produces three actual English words. If the message were longer, the interceptors could be even more confident of their guess since it would be even less likely that the wrong key would produce a legible phrase. The interceptors now have our plaintext and the key. We’ve been cryptanalyzed!
Cryptanalysis is the study of ciphers, often for the purpose of extracting the key or plaintext from ciphertext.
With the key in hand, the interceptors have completely compromised the security of our cipher. They can read the message or even change it to “FYYFHP FY STTS” to ruin our battle plan.
Caesar’s cipher turned out to be a bust, but we can do better. We can do much better. In fact there is a perfect cipher—one that is completely unbreakable. The key, however, needs to be much larger. Rather than picking just one number for the key, now we pick a random number for every letter in our plaintext. “ATTACK AT DAWN” has 12 letters, so after a little work with a 26-sided die, we get
6 4 5 5 9 12 6 1 0 16 16 0
Note that while a key of zero was useless for the Caesar cipher, now zeros in our key are okay.
Encryption is similar to the Caesar cipher but now each letter is shifted by its corresponding number in the key. The first A is shifted 6 to become G, the first T shifts 4 to become X, the next T shifts 5 to become Y, and so on. The result is
GXYFLW GU DQMN
Why is this suddenly unbreakable? After all, our adversaries can once again try every key as they did before. Well their first challenge is that there are now 2612 = 95,428,956,661,682,176 keys to try. Trying that many keys would be time-consuming even for a computer, but even ignoring the practical details there is a much more problematic obstacle. Suppose our adversaries guess
key: 8 1 0 20 10 17 0 19 16 0 15 2 plaintext: YWYLBF GB NQXL
This is clearly gibberish, so the key is probably wrong. They try some more keys and eventually get
key: 6 4 5 5 9 12 6 1 0 16 16 0 plaintext: ATTACK AT DAWN
Ah ha! This looks legitimate! But then they keep going and get
key: 6 4 5 5 9 12 6 1 16 2 24 0 plaintext: ATTACK AT NOON
or is it
key: 6 4 5 5 9 12 20 22 18 16 16 0 plaintext: ATTACK MY LAWN
and we really can’t rule out
key: 1 9 11 2 0 18 20 22 2 22 19 20 plaintext: FONDLE MY BUTT
The point is that there is now a key for every possible combination of letters—that includes every single grammatically correct English sentence with three words of those lengths. There is no way for our adversaries to know which phrase we meant!
This type of encryption has some downsides, however. For one, you need a whole lot of key. Suppose you wanted to encrypt an entire book—that would require an awful lot of dice rolling. And even if you had that much key, you would need to somehow secretly get the key to the recipient ahead of time without it being intercepted. Lastly, the security of the cipher relies on the fact that you never reuse a key. So even if you go through all of that effort to generate a lot of key and get it to the recipient secretly, after one message you have to throw it out.
As for why one should not reuse key with this cipher… that’s a little tricky.
Suppose you sent a long message each day with this cipher but reused the same key each time. A patient adversary could collect all of your messages and compare them. If they figure out that you’re using the same key each day, then they know that the first letter of each message is shifted by the same amount. If any of your messages start with a one-letter word, then your adversaries would immediately know that the shift needs to be something to give a plaintext of A or I in those messages (since these are the only common one-letter words). Using those two shifts, they can guess the first letters of the messages starting with two-letter words. Since there are only so many two-letter words, this helps them guess the shift for the second letter in the message. And so on.
The crucial difference is this: with only one random key and one plaintext, correctly guessing part of the key gives you no insight to the rest of the key (e.g. “ATTACK MY LAWN”). When you reuse key across multiple messages, it creates patterns among the ciphertext that a clever adversary could detect.
Because of these limitations, before the age of computers this cipher was implemented using mass-produced pads of paper with sheet after sheet of random numbers on them. For example, the pads would be distributed to naval vessels before a long voyage with instructions about which page of the pad was to be used for key each day. This allowed for secure communication so long as the pads never fell into enemy hands. For this reason many pads were designed to quickly dissolve in water, so that they would not be recoverable if the vessel were to sink. Even when used today in digital communications, this cipher is often referred to as a one-time pad.
When computers do encryption they follow very similar steps to those we just learned about: take your plaintext, add key to it, get your ciphertext. But of course now everything needs to be in binary.
Let’s start with the simplest possible binary plaintext: a single bit. I pick 1. Previously we had one number in our key for each letter in our plaintext, so in binary there’s only one sensible choice: we’ll have one random bit of key for each bit of plaintext. Suppose our computer flips a coin and gets the random bit 1 for our key.
Now we have to add them together. Previously that meant shifting letters in the alphabet, but again things are simpler now that it’s all binary. We can add the bits together mathematically.
1 + 1 = ?
Now, in binary, that would normally be 10 but we’re going to add in a weird sort of way. To keep things simple we’ll ignore the 1 that is carried over, leaving just the 0. For the other possible values of binary addition we don’t have to carry a digit, so the addition occurs normally. To recap, here are all the cases for this weird addition:
0 + 0 = 0 0 + 1 = 1 1 + 0 = 1 1 + 1 = 0
There are a few different names for this kind of addition. Sometimes it’s called “addition modulo 2” (or mod 2 addition), which means that we force the answer to always be smaller than 2. It is also called “exclusive or” (or XOR; pronounced “exor”) since A + B equals 1 if A is 1 or B is 1 but not both. As part of an equation, you will often see it written as a plus sign in a circle or as a caret e.g. 1 ⊕ 1 = 0, 1 ^ 1 = 0.
If you think way back to the chapter on adding machines, you might recognize this as a sort of overflow. We have only one wheel with 0 and 1 on it, so when adding 1 to 1 we loop back to 0 and there are no others wheels to pick up the extra 1.
Using XOR, it’s very easy to add even long binary numbers by hand since you never have to keep track of carried digits. You can look at each pair of digits on their own.
In our case, our plaintext is 1, our key is 1, so our ciphertext is 0. As before, we have perfect secrecy using this method. Our random key could have been 0 or 1, so our adversaries have no idea whether our plaintext was 0 or 1. But how do we decrypt? Once again, this proves even easier than before. Observe the following:
0 ⊕ 0 = 0 1 ⊕ 1 = 0 0 ⊕ 0 = 0 1 ⊕ 0 = 1
In other words, for XOR we have the two rules
X ⊕ X = 0 X ⊕ 0 = X
Now recall how encryption worked:
plaintext ⊕ key = ciphertext
Now let’s monkey around with that encryption equation a bit:
plaintext ⊕ key ⊕ key = ciphertext ⊕ key plaintext ⊕ (key ⊕ key) = ciphertext ⊕ key plaintext ⊕ 0 = ciphertext ⊕ key plaintext = ciphertext ⊕ key
The process for encryption is the exact same as for decryption. Add key to plaintext and you get ciphertext, add key to ciphertext and you get plaintext. Cool, huh?
Now perhaps you aren’t impressed by any of this since all we encrypted was a measely bit, but what is binary data other than a string of bits? All you need is a binary format for your data and random bits of key for each of your plaintext bits and you’re good to go: encrypt each bit individually as we just did. Using ASCII to encode “ATTACK” as binary, and using 48 random bits of key, we can encrypt text:
01000001 01010100 01010100 01000001 01000011 01001011 (plaintext) 01000000 11101110 11100111 11110101 01101110 00000011 (key) ----------------------------------------------------- 00000001 10111010 10110011 10110100 00101101 01001000 (ciphertext)
This encryption method has a more general name:
A cipher that encrypts each bit independantly—where each is bit unaffected by other bits in the plaintext or key—is called a stream cipher. A stream cipher doesn’t need to be encrypted or decrypted all at once; each bit of the plaintext and key can pass through the cipher bit-by-bit (like water flowing through a stream).
Notice how a stream cipher doesn’t mention how you come up with your key. So a one-time pad is just a special kind of stream cipher.
The nature of stream ciphers can be a disadvantage in some cases. For example, if an attacker somehow gets part of your key, they can read part of your plaintext—which might be all they need. A totally different class of ciphers called block ciphers solves this by scrambling up big chunks of plaintext all at once using the key.
Block ciphers are sort of the opposite of stream ciphers: every bit of the ciphertext should be dependent on every other bit—ensuring that the entire message must decrypted all at once or not at all. Unfortunately they are much more complicated than stream ciphers, so we won’t discuss them in this chapter.
Now that we know how to do perfect encryption in binary, what more could there be to discuss? It turns out that there’s a lot, because one-time pads are extremely impractical when it comes to digital encryption. Remember that you need just as much key as plaintext, and to maintain security of the one-time pad you cannot ever reuse key. Computers regularly encrypt megabytes if not gigabytes of information, meaning that you’ll need millions or billions of random bits every day. Why is that a problem? Recall that computers are machines that compute functions: algorithms that give you the same result every time you provide the same input. That is exactly what we don’t want when it comes to randomness! We want the computer to be able to generate different key bits every time it encrypts.
Fortunately, clever engineers have found a few tricks for getting random numbers out of computers—such as timing user input or reading information from the various devices plugged into the computer. However these methods are very slow compared to normal computation. For example, on the laptop I’m writing this with, my computer can only generate about 3-4 bytes of randomness per second during normal use. If I stop using it, it quickly runs out of randomness since I’ve deprived it of a major source (my input). One way computers get around this limitation is that they’re always generating randomness, and if you don’t need it at that moment, they save it for later in something called an entropy pool.
Randomness is sometimes called entropy—especially when measuring how random something is.
For example, suppose we generate 16 random bits using an unfair coin that lands on heads 70% of the time. The result will still be “random”, but intuitively it will be less random than if we were to use a fair coin. In fact, using some fancy math we can calculate that the 16 bits from the unfair coin have only 14 bits of entropy. In other words, they have the same “amount” of randomness as 14 bits produced by a completely fair coin.
Even with an entropy pool, computers can’t produce enough randomness to keep up with the demand. Instead they usually rely on pseudorandom numbers.
Pseudorandom numbers are numbers that “look” random. A function that produces pseudorandom numbers is called a pseudorandom number generator or PRNG.
One of the first PRNGs was the linear-feedback shift register. Don’t worry about the fancy name: they are surprisingly simple. All you need in order to make one is to choose the register size in bits, N, and then choose several “taps” between 0 and N−1.
Operating the shift register is also simple: you choose a starting value called the “seed” which has N bits. Then you XOR the bits indicated by the taps, put the resulting bit on the end, and shift everything over by one. Let’s see how this works in an exmaple.
We’ll operate a 4 bit shift register with taps 2 and 3 using the seed 0100.
taps: .. register: 0100
Here I’ve marked the 2nd and 3rd bits with dots since these are the taps (the leftmost bit is the 0th). We XOR the bits under the taps and put the result on the end. 0 ⊕ 0 = 0, so…
taps: .. register: 00100 ^
Now we shift. You can think of this as the register shifting to the right or the taps shifting to the left. Either one works:
taps: .. register: 00100
The next step will add 1 ⊕ 0 = 1, so the result is
taps: .. register: 100100
Do you have the hang of it? Try to follow along for the next few steps:
taps: .. register: 1100100 taps: .. register: 01100100 taps: .. register: 101100100 taps: .. register: 0101100100 taps: .. register: 10101100100 taps: .. register: 110101100100 taps: .. register: 1110101100100 taps: .. register: 11110101100100
There are a few key observations we need to make about shift registers.
- This process can continue forever! We have an infinite number of pseudorandom bits now.
- The first four bits are all that matter in determining what comes next—after a bit is shifted past the last tap, it no longer has any effect on later bits.
- Since only the first four bits matter, there are only 24=16 different sequences that can be produced by this PRNG.
- Most importantly, 1, 2, and 3 together tell us that for every seed, the PRNG will eventually start repeating itself.
For our last example, the full sequence we get is
Notice how the first four bits are the same as the last four bits. If we continue generating bits, it will repeat this sequence over and over again. However, look at what we’ve gained: starting with just 4 bits, we’ve produced over 16 bits that look pretty random. This is the trick of PRNGs: we use a true random source (like an entropy pool) to slowly generate a small seed, then the PRNG quickly spits out a whole bunch of randomish numbers for us to use as key. The larger the shift register, the better: with an 18-bit register and taps 0 and 11, we get over 250 thousand psuedorandom bits out!
Before we get too excited, let’s make sure this is still secure. PRNGs aren’t secure like one-time pads: they don’t have enough key. Suppose we encrypt 250 thousand bits using that 18-bit shift register. From our previous observations, we know that there are only 218 different key sequences we can generate. Therefore, if an adversary intercepts our message and knows what sort of PRNG we’re using they can try all 218 until they get plaintext that looks legitimate. Using a computer, this sort of trial-and-error approach would be nearly instantaneous. This is the same problem we had with Caesar’s cipher.
The crucial point is this: a one-time pad is perfectly secure because for N bits of plaintext, you have N bits of entropy. But PRNGs can’t produce entropy. When you make 250 thousands bits from an 18-bit shift register, they still have only 18 bits of entropy among them.
So then all hope is lost? PRNGs are useless? No! Computers can only try keys by trial-and-error so quickly. The trick is to pick a key large enough that it is impractical to guess.
A secure key is one with so many possible values that even if an adversary could guess keys very quickly using a computer, it would take so long to find the right one that the information they could decrypt with the key would no longer be useful by the time they figure it out.
This definition of “secure” can be a little surprising the first time you hear it. It means that nothing is secure forever with computers—a sufficiently determined attacker can always keep guessing keys until they stumble on the right one. For example, if your credit card number is secured with a key so large that it would take 5 years on average for a computer to guess it, that’s probably “secure enough”. In 5 years you probably aren’t going to have the same credit card number, so by then it won’t matter if it’s guessed. At the time of this writing, 80 bits is considered about the bare minimum key size you need to be considered secure for most common purposes—but that will increase as computers get faster. Most ciphers play it safe by using even larger keys. So PRNGs still have a place: by using one with a key that is just big enough to be secure, you can quickly encrypt plaintext that is many times larger than your key.
I used linear-feedback shift registers as an example in this section because they are very easy to explain. But among PRNGs, they aren’t very good: most shift registers can be quickly broken using some clever math.
There’s a rather large gap in our cipher discussion so far: whether you’re using a one-time pad with a massive key or a PRNG with a tiny key, you still need a way to get the key to the recipient. To figure out this problem, let’s go back to the analogy of sending actual packages in the mail.
I mentioned earlier that encryption is like locking your package in a safe. Now suppose Alice wants to send Bob a package securely. If she locks it in a safe, it will certainly be secure from prying eyes, but how will Bob open it? If Alice and Bob can meet in person ahead of time, they could exchange the key then. But this has obvious limitations:
- What if Bob loses the key? If someone else finds the key, the security of their communication is compromised. Then they will need to meet again to exchange a new key, which is inconvenient.
- What if Alice and Bob have never met in person? Shouldn’t there still be a way to securely send messages to each other?
The solution is clever and simple. Instead of Alice supplying the key and safe, Bob does. Then Bob mails Alice the unlocked safe, which she places her package in and locks. When Bob receives the safe he can unlock it because he had the key to begin with.
This works because of an important difference between real-world locks and the ciphers we’ve talked about so far. For a real lock, you usually don’t need a key to lock it—you only need one to unlock it—but for our ciphers we encrypted and decrypted using the same key.
Ciphers that encrypt and decrypt using the same key are called symmetric-key ciphers.
So now we need a digital lock: a cipher that anyone can use to encrypt, but only the key holder can decrypt. Another way to think about it is for the cipher to have two keys: one key for encrypting and a different key for decrypting. The encryption key can be distributed publicly—it’s like the unlocked safe that Bob sends in the mail. The decryption key is kept secret by the recipient.
Ciphers that use different keys for encryption and decryption are often called public-key ciphers since the encryption key does not need to be kept secret and can be shared publicly.
How do public-key ciphers work? So far, they all rely on tricky math problems. In particular, they use math problems that are very hard to solve if you only have a little information, but are very easy to solve if you are given some extra information.
For example, suppose I ask you to find the two integers X and Y greater than 1 such that X×Y=43,047,409. This is a rather tricky problem: no one knows a faster way of solving it than just trying every possible value for X and Y until you stumble on the right ones. Now suppose I tell you that X=10,079—suddenly the extremely difficult problem is extremely easy: you can just divide to get Y. In this example, X×Y is kind of like a public encryption key and X is like the private decryption key: as long as we keep X secret, we can share X×Y freely and everyone will have a hard time finding X and Y.
You might have noticed me try to sneak something by you just then: “find the two integers”. How do I know that there are only two numbers that multiply to that value? It’s because 43,047,409 is the product of two prime numbers. A result of the fundamental theorem of arithmetic is that the product of two prime numbers cannot be the product of any other integers (greater than 1).
Public-key ciphers are often used to perform a “key exchange” for a symmetric-key cipher. If you want to communicate to someone with a symmetric-key cipher, you can choose a random key and send it to them via a public-key cipher. This gives you the best of both worlds of public- and symmetric-key ciphers.
Unfortunately the actual encryption/decryption process is more complicated with public-key ciphers (due to the tricky math) and goes beyond the scope of this book. If you want to know more, I recommend reading about RSA, one of the first public-key ciphers.
- Decrypt “QZGYQDMFQ NK OAXGYZ” using Caesar’s cipher with a key of 12.
- Assuming “UJSUCWV” was encrypted using Caesar’s cipher, recover the key and plaintext.
Caesar’s cipher is no good, but here’s a slight improvement: rather than shifting every letter by the same amount, how about we scramble up the whole alphabet? So our key might look like this:
Which means that every A is replaced with a G, every B is replaced with E, every C with U, etc. This is called a substitution cipher. Use the above key to decrypt the ciphertext “WHSAA EDKLTM”.
This substitution cipher has over 4 septillion possible keys—that’s greater than the “minimum” of 280 that I mentioned earlier with PRNGs, and of course much, much greater than the 26 keys we get with Caesar’s cipher. But in fact substitution ciphers are only marginally more secure than Caesar’s cipher.
Can you think of how you might be able to crack a subtitution cipher without trying every single key? If you skipped it earlier, read the blurb about not reusing one-time pads for inspiration. Here’s a short paragraph encrypted using a substitution cipher if you want to take a crack at it:
KWEW AU S EWSUZXSOPD PZXL BSESLESBK. AC UKZJPR OW PZXL WXZJLK CKSC DZJ QSX QESQG CKW QABKWE OD WISYAXAXL CKW EWPSCAVW TEWNJWXQD ZT CKW PWCCWEU SXR YSGAXL S TWM WRJQSCWR LJWUUWU. SU DZJ LJWUU YZEW PWCCWEU AC OWQZYWU WSUAWE CZ YSGW TJECKWE LJWUUWU OD QZYBSEAXL CKW QABKWECWIC CZ GXZMX WXLPAUK MZERU!
- Compute 0xB3C7⊕0xAB8F.
- Compute the resulting pseudorandom stream from the 5-bit shift register with taps 2 and 4 and a starting value of 0b11001.