Overview
We explore undefined behavior in more depth and discuss bitwise arithmetic.
Readings on undefined behavior
Undefined behavior and abstract machine
- A specification imposes requirements on an implementation and on the user of that implementation
- C and C++ have a specification—the language standard
- The user’s job: Write a program that obeys the specification’s rules
- The user writes primarily for the abstract machine defined by the standard
- The compiler’s job: Translate any program that obeys the specification’s
rules into an executable that performs the corresponding computation on the
actual hardware
- The compiler must understand the abstract machine and the real machine
- The compiler may assume the user did their job
Broad-brush model of programmer psychology
People want their code to:
- Work relatively correctly on most inputs
- Run fast
- Work exactly correctly on all inputs
Why?
- If code is wrong on most inputs, its speed doesn’t matter
- If code is too slow or expensive to use, its correctness doesn’t matter
- Sometimes exact correctness is difficult to define
- Of course you should aim for correctness and well-defined specifications though!
Compiler quality
- People want their code to go fast
- Compiler writers are people and they get excited about producing faster code
- Compilers take advantage of many opportunities to speed up code by removing redundancy
The as-if rule
- “conforming implementations are required to emulate (only) the observable
behavior of the abstract machine”
- For instance, data written to files
- If the compiler can figure out that a function call is redundant, it can eliminate it!
- If the compiler can figure out that a computation is redundant, it can
eliminate it!
- Including dynamic memory allocation!
- If the compiler can figure out a cheaper way to perform a computation, it can implement it!
- Examples:
datarep5/opt1.cc
,datarep5/opt2.cc
Undefined behavior
- The C and C++ standards define certain errors as undefined behavior
- If a program executes undefined behavior:
Anything at all can happen; the Standard imposes no requirements. The program may fail to compile, or it may execute incorrectly (either crashing or silently generating incorrect results), or it may fortuitously do exactly what the programmer intended. (ref)
- Undefined behavior is not the same as implementation-defined behavior
- Implementation-defined behavior: Computing
sizeof(int)
- Undefined behavior: Dereferencing a null pointer
- Implementation-defined behavior: Computing
How undefined behavior and optimizing compilers collide
- Compilers may assume that their input programs never execute undefined behavior
- And they may optimize based on this assumption
Example 1: Signed integer overflow
int x_incr = x + 1;
assert(x_incr > x);
printf("%d > %d - assertion passed\n", x_incr, x);
Example 2: Null pointer dereference
static void __devexit agnx_pci_remove (struct pci_dev *pdev)
{
struct ieee80211_hw *dev = pci_get_drvdata(pdev);
struct agnx_priv *priv = dev->priv;
if (!dev) return;
- This caused a serious Linux bug
Example 3: Signed integer overflow in a loop
long sum = 0;
for (int i = n1; i <= n2; ++i) {
sum += f(i);
}
return sum;
Example 4: Out-of-bounds access
int awesome[4] = {0, 1, 2, 3};
bool is_awesome(int v) {
for (int i = 0; i <= 4; ++i) {
if (awesome[i] == v)
return true;
}
return false;
}
More examples of undefined behaviors
- Dividing an integer by 0
- Using an uninitialized variable
- Violating the live object rule
- Modifying a
const
object - Comparing pointers to unrelated objects using
< <= > >=
- Accessing an invalid iterator
- Accessing an object using a pointer of incompatible type
- Executing an infinite loop that doesn’t have any observable behavior (!)
What to do?
- Sanitize
- Our makefiles allow you to compile with
make SAN=1
- This turns on sanitizers, which are efficient checkers that catch many undefined behaviors
- Other powerful tools: valgrind, tis-interpreter, etc.
Bitwise operators
- A bitwise operator is one that operates on the bits of its input(s) independently
- C and C++ bitwise operators:
~
(unary): complement&
: bitwise and|
: bitwise or^
: bitwise exclusive or<<
: left shift>>
: right shift
Truth tables
~ |
output |
---|---|
0 | 1 |
1 | 0 |
& |
0 | 1 |
---|---|---|
0 | 0 | 0 |
1 | 0 | 1 |
| |
0 | 1 |
---|---|---|
0 | 0 | 1 |
1 | 1 | 1 |
^ |
0 | 1 |
---|---|---|
0 | 0 | 1 |
1 | 1 | 0 |
Beware precedence
- C did not pick the right precedence for bitwise operators
x & 1 != 0
is parsed asx & (1 != 0)
- Most uses of bitwise operators require parentheses
Some mathematical properties
&
,|
, and^
are commutative:(a&b)==(b&a)
&
,|
, and^
are associative:(a&(b&c))==((a&b)&c)
&
and|
are distributive:((a&b)|(a&c)) == (a&(b|c))
,((a|b)&(a|c)) == (a|(b&c))
&
distributes over^
:((a&b)^(a&c)) == (a&(b^c))
0
is the identity for|
and^
:(x|0)==x
,(x^0)==x
-1
(=~0
) is the identity for&
:(x&-1)==x
0
is the absorbing or annihilating element for&
:(x&0)==0
-1
is the absorbing element for|
:(x|-1)==-1
Left and right shift
a << i
: Move the bits ina
left byi
places, shifting in 0s- Like
a = a * 2i
- Like
a >> i
: Move the bits ina
right byi
places- If
a
is unsigned, shifting in 0s - If
a
is signed, shifting in copies of the sign bit - Like
a = a / 2i
rounded down
- If
- On many computers, left and right shift are much faster than multiplication and (especially) division
i
must be nonnegative and less than8 * sizeof(a)
Aside: Sized integer types
- Normal C++ integers, like
int
,long
, etc., have variable size - The standard also defines a set of sized types in
#include <cstdint>
uint8_t
: Unsigned 8-bit integerint32_t
: Signed 16-bit integer- Also
16
,64
uintptr_t
: Unsigned integer big enough to represent any pointerptrdiff_t
: Signed integer big enough to represent any difference between pointerssize_t
: Unsigned integer big enough to hold the size of any representable objectssize_t
: Signed version ofsize_t
Fun with bits
- Can you define unary negation
-x
using other operators (bitwise and+
)? - What does
(x & 1) == 0
mean? - What does
x & 15
equal? - What can you learn about
x
from the valuex & (x - 1)
? - What does
x & -x
equal?
Fun with bits (2)
- Can you define unary negation
-x
using other operators (bitwise and+
)?-x
≡~x + 1
- What does
(x & 1) == 0
mean?- It tests whether
x
is even
- It tests whether
- What does
x & 15
equal?x % 16
- In general,
(x & (2i - 1))
≡x
mod2i
, but is much faster to compute
- What can you learn about
x
from the valuex & (x - 1)
?- Whether
x
is a power of two.(x & (x - 1)) == 0
iffx
is 0 or a power of 2
- Whether
- What does
x & -x
equal?- The value of the lowest 1 bit in
x
(5 & -5) == 1
;(10 & -10) == 2
;(8 & -8) == 8
- (This also equals the gcd (greatest common divisor) of
x
and-x
!)
- The value of the lowest 1 bit in
Using bits
- Bitwise operators allow us to efficiently represent small sets
- A set is a collection of elements, each of which might be present or absent
- An integer is a collection of bits, each of which might be 1 or 0
- Set operations, such as union, intersection, and membership-test, correspond to bitwise operations
Bitset example
- Consider sets taken from the universe {
e
,f
,g
,h
} - Assign each element
m
a distinct bit value ℬ(m
) (which must be a power of two)- For example, ℬ(
e
) =1
, ℬ(f
) =2
, ℬ(g
) =4
, ℬ(h
) =8
- For example, ℬ(
- A set of elements is represented as the sum of the ℬ values
- ℬ({}) =
0
- ℬ({
e
,g
}) = ℬ(e
) + ℬ(g
) =5
- ℬ({
e
,f
,g
,h
}) =15
- ℬ({}) =
- Union and intersection map to
|
and&
- ℬ(X ∪ Y) = (ℬ(X)
|
ℬ(Y)) - ℬ(X ∩ Y) = (ℬ(X)
&
ℬ(Y)) - For example, {
e
,f
} ∪ {e
,h
} = {e
,f
,h
}, and indeed(3 | 9) == 11
- ℬ(X ∪ Y) = (ℬ(X)
- Set difference maps to
&
with~
- ℬ(X ∖ Y) = (ℬ(X)
&
~
ℬ(Y))
- ℬ(X ∖ Y) = (ℬ(X)
- Membership testing maps to
&
m
∈ X ⟺ (ℬ(m
)&
ℬ(X))!= 0
WTF corner
(You don’t need to understand this, but it is fun!)
unsigned long wtf(unsigned long x) {
unsigned long y = x & -x;
unsigned long c = x + y;
return (((x ^ c) >> 2) / y) | c;
}