The storage1/attackme.cc
file contains a particularly dangerous kind of
undefined behavior, a buffer
overflow. In a buffer
overflow, an untrusted input is transferred into a buffer, such as an array
of characters on the stack, without checking to see whether the buffer is big
enough for the untrusted input.
Historically buffer overflows have caused some of the worst, and most consequential, C and C++ security holes known. They are particularly bad when the untrusted input derives from the network: then breaking into a network can be as simple as sending a packet.
The buffer overflow in attackme.cc
derives from a checksum function.
Our simple
checksum
takes in a pointer to the buffer, then copies that buffer to a
local variable, buf
, and processes the copy. Unfortunately, checksum
doesn’t verify that the input fits its buffer, and if the user supplies too
big an argument, checksum
’s buffer will overflow!
Buffer overflows cause undefined behavior—it’s illegal to access memory outside any object. Undefined behavior is always bad, but some is worse than others; and this particular buffer overflow allows the attacker to completely own the victim program, causing it to execute any shell command!
The attack works as follows.
The function’s entry sequence allocates local variable space with
subq $112, %rsp
. Examining the assembly we can infer thatbuf
is stored at this%rsp
.The function’s return address is stored at
112(%rsp)
, which isbuf.c + 112
.Any input
arg
withstrlen(arg) > 99
will overflow the buffer. (The 100th character is the null character that ends the string). Butarg
s with 100 to 111 characters won’t cause problems, because the remaining 12 bytes of local variable space are unused (they’re present to ensure stack frame alignment).arg
s of 112 or more characters, however, will run into the stack slot reserved forchecksum
’s return address! An attacker can put any value they want in that location, as long as the value contains no null characters (sincechecksum
’s buffer copy stops when it encounters the end of the input string).Overflowing the return address slot causes harm when
checksum
returns. Examining the state during the exit sequence, we see that the immediately previous instructions load%rdi
withbuf
and callfinish_checksum
. But what a coincidence—finish_checksum
does not modify%rdi
! Thus, whenchecksum
returns,%rdi
will be loaded with the address ofbuf
.Our attacker therefore supplies a carefully-chosen string that overwrites
checksum
’s return address with the address of therun_shell
function. This will causerun_shell
to run the programs defined in the initial portion of the string!
Thus, the attacker has taken control of the victim by returning to an unexpected library function, with carefully-chosen arguments. This attack is called a return-to-libc attack.
A version of
checksum
’s copy-to-aligned-buffer technique is actually useful in practice, but real code would use a smaller buffer, processed one slice at a time.
Defending against return-to-libc attacks and buffer overflows
Return-to-libc attacks used to be pretty trivial to find and execute, but
recent years have seen large improvements in the robustness of typical C and
C++ programs to attack. Here are just a few of the defenses we had to disable
for the simple attackme
attack to work.
Register arguments: In older architectures, function arguments were all
passed on the stack. In those architectures attackers could easily determine
not only the function returned to, but also its arguments (a longer buffer
overflow would overwrite the stack region interpreted by the “return-callee”
as arguments). attackme
only works because finish_checksum
happens not to
modify its argument.
Modern operating systems and architectures support position-independent
code and position-independent executables. These features make programs
work independent of specific addresses for functions, and the operating system
can put a program’s functions at different addresses each time the program
runs. When a program is position-independent, the address of the attacker’s
desired target function can’t be predicted, so the attacker has to get lucky.
(The x86-64 designers were smart to add %rip
-relative addressing, since
that’s what enables efficient position-independent executables!)
Finally, modern compilers use a technique called stack protection or
stack canaries to
detect buffer overflows and stop retq
from returning to a corrupted address.
This is a super useful technique.
It’s called a “canary” in reference to the phrase “canary in a coal mine”.
The operating system reserves a special tiny memory segment when the program starts running. This tiny memory segment is used for special purposes, such as thread-local storage. Although this segment can be addressed normally—it exists in memory—the operating system also ensures it can be accessed through special instruction formats like this:
movq %fs:0x28, %rax
The
%fs:
prefix tells the processor that0x28
(a memory offset) is to be measured relative to the start of the thread-local storage segment.A specific memory word in thread-local storage is reserved for a canary value. The operating system sets this value to a random word when starting the program.
The compiler adds code to function entry and exit sequences that use this value. Specifically, something like this is added to entry:
movq %fs:0x28, REG # load true canary to register pushq REG # push canary to stack
And something like this is added to exit:
popq REG # pop canary from stack xorq %fs:0x28, REG # xor with true canary jne fail # fail if they differ
where the
fail
branch calls a library-provided function such as__stack_chk_fail
.
The pushed canary is located between the function’s buffers (its local variables) and the function’s return address. Given that position, any buffer overflow that reaches the return address will also overwrite the canary! And the attacker can’t easily guess the canary or overwrite the true canary’s memory location, since both are random.
If the stack canary and the true canary don’t match, the function can infer that it executed some undefined behavior. Maybe the return address was corrupted and maybe it wasn’t; either way, executing undefined behavior means the program is broken, and it is safe to exit immediately.
These techniques, and others like them, are incredibly useful, important, fun, and good to understand. But they don’t make C programming safe: attackers are persistent and creative. (Check out return-oriented programming.) Memory-unsafe languages like C and C++ make programming inherently risky; only programs with no undefined behavior are safe. (And they’re only safe if the underlying libraries and operating system have no undefined behavior either!) Good C and C++ programmers need a healthy respect for—one might say fear of—their tools. Code cleanly, use standard libraries where possible, and test extensively under sanitizers to catch bugs like buffer overflows.
Or listen to cranky people like Trevor Jim, who says that “C is bad and you should feel bad if you don’t say it is bad” and also “Legacy C/C++ code is a nuclear waste nightmare that will make you WannaCry”.
On checksums
A checksum is a short, usually fixed-length summary of a data buffer. Checksums have two important properties:
- If two buffers contain the same data (the same bytes in the same order), then their checksums must equal. (This property is mandatory.)
- If two buffers contain different data, then their checksums should not be equal. (This property is optional: it is OK for distinct data to have equal checksums, though in a good checksum function, this should be rare.)
Checksums have many uses, but they are often used to detect errors in data transcription. For example, most network technologies use checksums to detect link errors (like bursts of noise). To send data over a network, the sender first computes a checksum of the data. Then it transmits both data and checksum over the possibly-noisy network channel. The receiver computes its own checksum of the received data, and compares that with the sender’s checksum. If the checksums match, that’s a good sign that the data was transmitted correctly. If the checksums don’t match, then the data can’t be trusted and the receiver throws it away.
Most checksum functions map from a large domain to a smaller range—for instance, from the set of all variable-length buffers to the set of 64-bit numbers. Such functions must sometimes map unequal data to equal checksums, so some errors must go undetected; but a good checksum function detects common corruption errors all, or almost all, the time. For example, some checksum functions can always detect a single flipped bit in the buffer.
The requirements on checksums are the same as the requirements for hash codes: equal buffers have equal checksums (hash codes), and distinct buffers should have distinct checksums (hash codes). A good hash code can be a good checksum and vice versa.
The most important characteristics of a checksum function are speed and
strength. Fast checksums are inexpensive to compute, but easy to break,
meaning that many errors in the data aren’t detected by the checksum function.
Widely-used fast checksum functions include the IPv4/TCP/UDP
checksum and cyclic
redundancy checks; our
checksum
function is a lot like the IPv4 checksum. Strong checksums, also
known as cryptographic hash
functions, are
hard to break. Ideally they are infeasible to break, meaning that no one
knows a procedure for finding two buffers that have the same checksum.
Widely-used strong checksum functions include SHA-256,
SHA-512, and
SHA-3. There are also many
formerly-strong checksum functions, such as
SHA-1, that have been broken—meaning
that procedures are known for finding two buffers with the same checksum—but
are still stronger in practice than fast checksums.
References: Checksums on Wikipedia; CS 225 explores some related theory.