Exercise P: Pointers
Here are some exercises that may help you master C pointers.
Principles
C pointer arithmetic behavior follows logically from several principles.
Array layout
Arrays in C are laid out in memory using contiguous allocation: element n+1 is placed in memory immediately after element n.
The sizeof operator gives an object’s size in bytes, and that size is the same whether or not the object is an array element. Let array be an array of objects of type T (e.g., T array[100];), and assume element 0 of the array is located at address ((uintptr_t) X). Then array element n is located at address ((uintptr_t) X) + n*sizeof(T).
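The layout formula can be checked directly. The sketch below is illustrative only: the helper names element_address and check_layout are made up for this example, the element type is chosen to be double, and it assumes that converting a pointer to uintptr_t yields its numeric address (true on ordinary flat-address machines, but implementation-defined in general).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical helper: address of element n, given the address x of
// element 0 and the element size in bytes (the formula from the text).
uintptr_t element_address(uintptr_t x, size_t n, size_t elem_size) {
    assert(elem_size > 0);
    return x + n * elem_size;
}

// Hypothetical check: every element of a 100-element array of double
// sits exactly where the formula predicts. Returns the number of
// elements checked.
size_t check_layout(void) {
    double array[100];
    uintptr_t x = (uintptr_t) &array[0];
    size_t checked = 0;
    for (size_t n = 0; n < 100; ++n) {
        assert((uintptr_t) &array[n] == element_address(x, n, sizeof(double)));
        ++checked;
    }
    return checked;
}
```

Calling check_layout() from a test program should return 100 without tripping any assertion.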
Pointer–array equivalence
C pointers can behave like array positions in expressions. Given the following declarations:
T* p; // pointer to type T
T array[100]; // array of T objects
You’d expect this to work, and it does:
p = &array[0]; // p now points to the first element of the array
*p = 3; // assigns 3 to array[0]
assert(array[0] == 3); // OK
But C also lets you use array notation on the pointer. For instance:
p[0] = 4; // assigns 4 to array[0]
assert(array[0] == 4); // OK
// Note: When p is a pointer, p[0] and *p ALWAYS mean the same thing.
p[1] = 5; // assigns 5 to array[1]
assert(array[1] == 5); // OK
This works even in the middle of the array.
p = &array[5]; // p now points to the sixth element of the array
p[0] = 10;
assert(array[5] == 10); // OK
p[-1] = 9;
assert(array[4] == 9); // OK
In fact, array variables in expressions behave like pointers. The assignment p = array is valid, and has the same meaning as p = &array[0]:
p = array;
assert(p == &array[0]); // OK
The equivalence isn’t total. A C pointer behaves like an array position, not an array. You can’t assign an array variable from a pointer:
array = p; // error: incompatible types when assigning to type ‘int[100]’ from type ‘int *’
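As a rough illustration of both halves of this section, the sketch below takes T to be int and uses two made-up function names. The first function checks that subscripting and pointer arithmetic name the same elements; the second shows one more place where the equivalence breaks down, which is an addition to the text above: sizeof applied to an array variable reports the size of the whole array, not the size of a pointer.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical check: p[i], *(p + i), and array[i] all refer to the
// same element. Returns the number of indexes checked.
size_t check_equivalence(void) {
    int array[100];
    int* p = array;                  // same as p = &array[0]
    size_t checked = 0;
    for (size_t i = 0; i < 100; ++i) {
        p[i] = (int) i;              // array notation on a pointer
        assert(array[i] == (int) i); // wrote through to the array
        assert(&p[i] == p + i);      // p[i] means *(p + i)
        assert(&array[i] == array + i);
        ++checked;
    }
    return checked;
}

// Where an array is not a pointer: sizeof sees all 100 elements.
size_t whole_array_size(void) {
    int array[100];
    return sizeof(array);            // 100 * sizeof(int)
}
```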
Array position arithmetic
C allows you to compare pointers into the same array. The results are what you’d expect.
assert(&array[5] > &array[4]); // 5 > 4, so &array[5] > &array[4]
assert(&array[0] == &array[0]);
assert(&array[9] < &array[10]);
Now, since &array[5] > &array[4], the laws of arithmetic say that &array[5] - &array[4] > 0. This is true in C. Subtraction on array positions is equivalent to subtraction on the corresponding array indexes. Thus:
ptrdiff_t i;
i = &array[5] - &array[4];
assert(i == 1); // == 5 - 4
i = &array[0] - &array[0];
assert(i == 0); // == 0 - 0
i = &array[9] - &array[10];
assert(i == -1); // == 9 - 10
(See C Patterns for more on ptrdiff_t.)
Now, since &array[5] - &array[4] == 1, we should expect that &array[4] + 1 == &array[5]. This is true too.
assert(&array[0] + 6 == &array[6]);
assert(6 + &array[0] == &array[6]);
assert(&array[4] - 1 == &array[3]);
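Comparison and addition on array positions are what make the common "walk from one position to another" loop work. A minimal sketch, assuming int elements; sum_range is a name invented for this example, and last may point one past the final element, which C allows you to form and compare but not dereference.

```c
#include <assert.h>

// Hypothetical helper: sums the elements at positions [first, last).
// Both pointers must refer to positions in the same array.
long sum_range(const int* first, const int* last) {
    assert(first <= last);           // position comparison
    long total = 0;
    for (const int* it = first; it < last; ++it) {
        total += *it;                // ++it moves to the next position
    }
    return total;
}
```

For an array declared as int a[5], the call sum_range(a, a + 5) covers all five elements, and sum_range(&a[1], &a[3]) covers only a[1] and a[2].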
Putting it together
Since C pointers behave like array positions, it must logically follow that C pointer arithmetic behaves like array position arithmetic. And it does.
p = &array[0];
T* q = &array[4];
assert(q - p == 4);
assert(q == p + 4);
assert((q + 1) - array == 5);
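One practical use of these rules is turning a pointer back into an index by subtracting the start of the array. The following is only a sketch with int elements; index_of is a hypothetical helper, not something defined in these exercises.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical helper: index of the first element equal to value,
// or -1 if no element matches.
ptrdiff_t index_of(const int* array, size_t n, int value) {
    assert(array != NULL);
    const int* end = array + n;      // position one past the last element
    for (const int* q = array; q != end; ++q) {
        if (*q == value) {
            return q - array;        // position difference == index difference
        }
    }
    return -1;
}
```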
The sizeof wrinkle
These principles fit together quite logically, but one consequence endlessly surprises students: Calculations on C array positions and pointers usually return different results from calculations on addresses. The reason is contiguous layout.
For instance, consider this:
int iarr[10]; // Assume the compiler puts this at address 0x1000.
printf("%td\n", &iarr[2] - &iarr[1]);
// Prints “1”
int* p1 = &iarr[1];
int* p2 = &iarr[2];
printf("%p %p\n", (void*) p1, (void*) p2); // Prints “0x1004 0x1008”
printf("%td\n", p2 - p1); // Prints “1”
Every step in this sequence is logical. Array position arithmetic is the same as array index arithmetic. Array elements are laid out contiguously in memory, and ints are 4 bytes big (sizeof(int) == 4), so the pointer to element 2 is four bytes after the pointer to element 1. Pointers behave like array positions. But admittedly the result is a little funky. We subtract 0x1008 - 0x1004 and get 1, not 4! The compiler has divided the difference in addresses, namely 4, by sizeof(int), which is also 4, to get 1. This will become second nature, but if it’s not, it still makes sense if you go step by step.
This also means that the result of a pointer arithmetic expression depends heavily on the types of the pointer arguments.
printf("%td\n", (char*) p2 - (char*) p1); // Prints “4”: 4/sizeof(char) == 4/1 == 4
printf("%td\n", (short*) p2 - (short*) p1); // Prints “2”: 4/sizeof(short) == 4/2 == 2
printf("%td\n", (double*) p2 - (double*) p1); // Probably prints “0”
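The relationship between the two kinds of distance can be written down as a pair of small functions. This is a sketch, not part of the original exercise: the names element_distance and byte_distance are invented here, and nothing in it depends on sizeof(int) being 4.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical helper: distance in int elements between two positions
// in the same int array.
ptrdiff_t element_distance(const int* from, const int* to) {
    return to - from;
}

// Hypothetical helper: distance in bytes between the same two positions.
// Casting to char* makes the compiler divide by sizeof(char), which is 1.
ptrdiff_t byte_distance(const int* from, const int* to) {
    ptrdiff_t bytes = (const char*) to - (const char*) from;
    // The byte distance is the element distance scaled by the element size.
    assert(bytes == (to - from) * (ptrdiff_t) sizeof(int));
    return bytes;
}
```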
Rule of thumb
We find the following discipline useful for avoiding pointer errors.
- Use sizeof in pointer arithmetic expressions only when the pointer has type char* or unsigned char*.
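One way to read this rule, sketched with two made-up helpers that fetch the same array element: when working through a byte pointer, you scale by sizeof yourself; when working through a typed pointer, the compiler scales for you, so sizeof does not appear.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

// Hypothetical helper: reads element n through a byte pointer.
// sizeof is appropriate here because arithmetic on unsigned char*
// counts bytes.
int read_via_bytes(const int* array, size_t n) {
    assert(array != NULL);
    const unsigned char* bytes = (const unsigned char*) array;
    int value;
    memcpy(&value, bytes + n * sizeof(int), sizeof(int));
    return value;
}

// Hypothetical helper: reads element n through a typed pointer.
// No sizeof: array + n already advances by n elements.
int read_via_ints(const int* array, size_t n) {
    assert(array != NULL);
    return *(array + n);
}
```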
Exercises
On each line, write the type and the numeric value of the expression on the left. For pointer types, write the numeric address (what you would get from printf("%p")). Assume an x86-like machine (with 32-bit addresses and integers and little-endian storage). We did line 0 for you.
| | | Type | Numeric value |
|---|---|---|---|
| | char* p = (char*) 0x1000; | | |
| 0. | p | char* | 0x1000 |
| 1. | &p[1] | _________________ | _________________ |
| 2. | &p[-1] | _________________ | _________________ |
| 3. | &p[0] | _________________ | _________________ |
| 4. | &p[1] - &p[0] | _________________ | _________________ |
| 5. | &p[16] | _________________ | _________________ |
| 6. | (p + 1) - p | _________________ | _________________ |
| 7. | &p[16] - p | _________________ | _________________ |
| 8. | q - p | _________________ | _________________ |
| 9. | sizeof(p) | _________________ | _________________ |
| 10. | sizeof(*p) | _________________ | _________________ |
| | int* ip = (int*) p; | | |
| 11. | &ip[0] | _________________ | _________________ |
| 12. | &ip[1] | _________________ | _________________ |
| 13. | &ip[1] - &ip[0] | _________________ | _________________ |
| 14. | (char*) &ip[1] - p | _________________ | _________________ |
| 15. | sizeof(ip) | _________________ | _________________ |
| 16. | sizeof(*ip) | _________________ | _________________ |
| 17. | &ip[sizeof(int)] | _________________ | _________________ |
| 18. | ip + sizeof(int) | _________________ | _________________ |
| 19. | ip + 1 | _________________ | _________________ |
| 20. | p + sizeof(int) | _________________ | _________________ |
| | int* iq = (int*) q; | | |
| 21. | iq - ip | _________________ | _________________ |
| 22. | &iq[-1] - ip | _________________ | _________________ |
| | p[0] = p[1] = p[2] = p[3] = 0; | | |
| 23. | *ip | _________________ | _________________ |
| | *(char*) ip = 1; | | |
| 24. | *ip | _________________ | _________________ |
| | *((char*) ip + 1) = 1; | | |
| 25. | p[1] | _________________ | _________________ |
| 26. | *ip | _________________ | _________________ |
| | *((char*) ip) = 2; | | |
| 27. | *((char*) ip) | _________________ | _________________ |
| 28. | *ip | _________________ | _________________ |
Solutions
When you’re ready for solutions, go here.