2013/ExerciseP

Computer Science 61 and E61
Systems Programming and Machine Organization
This is the 2013 version of the course.

Exercise P: Pointers

Here are some exercises that may help you master C pointers.

Principles

C pointer arithmetic behavior follows logically from several principles.

Array layout

Arrays in C are laid out in memory using contiguous allocation: element n+1 is placed in memory immediately after element n.

The sizeof operator gives an object’s size in bytes, and that size is the same whether or not the object is part of an array. Let array be an array of objects of type T (e.g., T array[100];), and assume element 0 of the array is located at address ((uintptr_t) X). Then array element n is located at address ((uintptr_t) X) + n*sizeof(T).
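The layout formula can be checked directly. Here is a minimal sketch, assuming a machine with a flat address space where converting a pointer to uintptr_t yields its numeric address (true on x86, but not guaranteed by the C standard). The names element_address and layout_matches are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical helper: the address the layout rule predicts for element n,
// given the address of element 0 and the element size in bytes.
uintptr_t element_address(uintptr_t base, size_t n, size_t elem_size) {
    return base + n * elem_size;
}

// Returns 1 if every element of a local double[100] sits at the predicted
// address, 0 otherwise. Assumes a flat address space (see the text above).
int layout_matches(void) {
    double array[100];
    uintptr_t x = (uintptr_t) &array[0];
    for (size_t n = 0; n < 100; ++n) {
        if ((uintptr_t) &array[n] != element_address(x, n, sizeof(double))) {
            return 0;
        }
    }
    return 1;
}

// Self-check: the pure arithmetic, then the real array.
void demo_layout(void) {
    assert(element_address(0x1000, 3, 4) == 0x100C);
    assert(layout_matches());
}
```

Calling demo_layout() should pass silently on such a machine.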

Pointer–array equivalence

C pointers can behave like array positions in expressions. Given the following declarations:

T *p;                   // pointer to type T
T array[100];           // array of T objects

You’d expect this to work, and it does:

p = &array[0];          // p now points to the first element of the array
*p = 3;                 // assigns 3 to array[0]
assert(array[0] == 3);  // OK

But C also lets you use array notation on the pointer. For instance:

p[0] = 4;               // assigns 4 to array[0]
assert(array[0] == 4);  // OK
                        // Note: When p is a pointer, p[0] and *p ALWAYS mean the same thing.
p[1] = 5;               // assigns 5 to array[1]
assert(array[1] == 5);  // OK

This works even in the middle of the array.

p = &array[5];          // p now points to the sixth element of the array
p[0] = 10;
assert(array[5] == 10); // OK
p[-1] = 9;
assert(array[4] == 9);  // OK

In fact, array variables in expressions behave like pointers. The assignment p = array is valid, and has the same meaning as p = &array[0]:

p = array;
assert(p == &array[0]);  // OK

The equivalence isn’t total. A C pointer behaves like an array position, not an array. You can’t assign an array variable from a pointer:

array = p;      // error (shown for T = int): incompatible types when assigning to type ‘int[100]’ from type ‘int *’
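Another place the difference shows up is sizeof. A rough sketch, with T fixed to int and helper names invented for this example: sizeof on the array variable measures the whole array, while sizeof on a pointer into it measures only the pointer.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical helper: element count of a 100-element int array,
// computed from sizeof on the array variable itself.
size_t array_element_count(void) {
    int array[100];
    return sizeof(array) / sizeof(array[0]);
}

// Hypothetical helper: sizeof applied to a pointer into the same array.
// This is the size of the pointer object, not of the array it points into.
size_t pointer_object_size(void) {
    int array[100];
    int *p = array;
    return sizeof(p);
}

void demo_array_vs_pointer(void) {
    assert(array_element_count() == 100);
    assert(pointer_object_size() == sizeof(int *));
}
```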

Array position arithmetic

C allows you to compare pointers into the same array. The results are what you’d expect.

assert(&array[5] > &array[4]);      // 5 > 4, so &array[5] > &array[4]
assert(&array[0] == &array[0]);
assert(&array[9] < &array[10]);

Now, since &array[5] > &array[4], the laws of arithmetic say that &array[5] - &array[4] > 0. This is true in C. Subtraction on array positions is equivalent to subtraction on the corresponding array indexes. Thus:

ptrdiff_t i;
i = &array[5] - &array[4];
assert(i == 1);                   // == 5 - 4
i = &array[0] - &array[0];
assert(i == 0);                   // == 0 - 0
i = &array[9] - &array[10];
assert(i == -1);                  // == 9 - 10

(See C Patterns for more on ptrdiff_t.)

Now, since &array[5] - &array[4] == 1, we should expect that &array[4] + 1 == &array[5]. This is true too.

assert(&array[0] + 6 == &array[6]);
assert(6 + &array[0] == &array[6]);
assert(&array[4] - 1 == &array[3]);

Putting it together

Since C pointers behave like array positions, it must logically follow that C pointer arithmetic behaves like array position arithmetic. And it does.

p = &array[0];
T *q = &array[4];
assert(q - p == 4);
assert(q == p + 4);
assert((q + 1) - array == 5);
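These rules are what make pointer-based loops work. The following is one possible sketch, not code from the course; sum_ints and index_of are hypothetical names. It walks an int array with a pointer and recovers an index by subtracting pointers.

```c
#include <assert.h>
#include <stddef.h>

// Sums n ints by walking a pointer from the first element up to,
// but not including, the position one past the last element.
long sum_ints(const int *first, size_t n) {
    assert(first != NULL || n == 0);
    const int *last = first + n;      // one-past-the-end position
    long total = 0;
    for (const int *p = first; p != last; ++p) {
        total += *p;
    }
    return total;
}

// Returns the index of the first element equal to value, or -1 if none.
// The index is just the difference of two array positions.
ptrdiff_t index_of(const int *first, size_t n, int value) {
    assert(first != NULL || n == 0);
    for (const int *p = first; p != first + n; ++p) {
        if (*p == value) {
            return p - first;
        }
    }
    return -1;
}
```

For example, with int a[5] = {3, 1, 4, 1, 5}, sum_ints(a, 5) is 14 and index_of(a, 5, 4) is 2.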

The sizeof wrinkle

These principles fit together quite logically, but one consequence endlessly surprises students: Calculations on C array positions and pointers usually return different results from calculations on addresses. The reason is contiguous layout.

For instance, consider this:

int iarr[10];                  // Assume the compiler puts this at address 0x1000.
printf("%td\n", &iarr[2] - &iarr[1]);   // %td is the format for ptrdiff_t
                               // Prints “1”

int *p1 = &iarr[1];
int *p2 = &iarr[2];
printf("%p %p\n", (void *) p1, (void *) p2);   // Prints “0x1004 0x1008”
printf("%td\n", p2 - p1);      // Prints “1”

Every step in this sequence is logical. Array position arithmetic is the same as array index arithmetic. Array elements are laid out contiguously in memory, and ints are 4 bytes big (sizeof(int) == 4), so the pointer to element 2 is four bytes after the pointer to element 1. Pointers behave like array positions. But admittedly the result is a little funky. We subtract 0x1008 - 0x1004 and get 1, not 4! The compiler has divided the difference in addresses, namely 4, by sizeof(int), which is also 4, to get 1. This will become second nature, but until it does, it still makes sense if you go step by step.

This also means that the result of a pointer arithmetic expression depends heavily on the types of the pointer arguments.

printf("%td\n", (char *) p2 - (char *) p1);     // Prints “4”: 4/sizeof(char) == 4/1 == 4
printf("%td\n", (short *) p2 - (short *) p1);   // Prints “2”: 4/sizeof(short) == 4/2 == 2
printf("%td\n", (double *) p2 - (double *) p1); // Probably prints “0”: 4/sizeof(double) == 4/8 rounds down to 0
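To keep the two kinds of distance straight, it can help to compute both for the same pair of positions. A small sketch follows; element_distance and byte_distance are names made up here.

```c
#include <assert.h>
#include <stddef.h>

// Distance measured in elements: plain int * subtraction.
ptrdiff_t element_distance(const int *a, const int *b) {
    return b - a;
}

// Distance measured in bytes: cast to char * first, since sizeof(char) == 1.
ptrdiff_t byte_distance(const int *a, const int *b) {
    return (const char *) b - (const char *) a;
}

void demo_distances(void) {
    int iarr[10];
    assert(element_distance(&iarr[1], &iarr[2]) == 1);
    assert(byte_distance(&iarr[1], &iarr[2]) == (ptrdiff_t) sizeof(int));
    // In general: bytes == elements * sizeof(element type).
    assert(byte_distance(&iarr[0], &iarr[5])
           == element_distance(&iarr[0], &iarr[5]) * (ptrdiff_t) sizeof(int));
}
```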

Rule of thumb

We find the following discipline useful for avoiding pointer errors.

  • Use sizeof in pointer arithmetic expressions only when the pointer has type char * or unsigned char *.
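As an illustration of the rule, here is a sketch with two hypothetical helpers. nth_element works on bytes, so it multiplies by the element size explicitly. nth_int works on int *, so the compiler already scales by sizeof(int) and writing sizeof there would be a bug.

```c
#include <assert.h>
#include <stddef.h>

// Byte-level arithmetic: base is treated as char *, so sizeof-style
// scaling (n * elem_size) is written out by hand.
void *nth_element(void *base, size_t n, size_t elem_size) {
    assert(base != NULL);
    return (char *) base + n * elem_size;
}

// Typed arithmetic: no sizeof. base + n already means "n ints later".
int *nth_int(int *base, size_t n) {
    assert(base != NULL);
    return base + n;
}

void demo_rule_of_thumb(void) {
    int a[4] = {10, 20, 30, 40};
    assert(nth_element(a, 2, sizeof(int)) == (void *) &a[2]);
    assert(nth_int(a, 2) == &a[2]);
    assert(*(int *) nth_element(a, 3, sizeof(a[0])) == 40);
}
```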

Exercises

On each numbered line, write the type and the numeric value of the expression on the left. For pointer types, write the numeric address (what you would get from printf("%p")). Assume an x86-like machine (with 32-bit addresses and integers and little-endian storage). The unnumbered lines are statements that execute in order between the numbered expressions. We did line 0 for you.

                            Type               Numeric value
char *p = (char *) 0x1000;
char *q = (char *) 0x1010;
0. p                        char *             0x1000
1. &p[1]                    _________________  _________________
2. &p[-1]                   _________________  _________________
3. &p[0]                    _________________  _________________
4. &p[1] - &p[0]            _________________  _________________
5. &p[16]                   _________________  _________________
6. (p + 1) - p              _________________  _________________
7. &p[16] - p               _________________  _________________
8. q - p                    _________________  _________________
9. sizeof(p)                _________________  _________________
10. sizeof(*p)              _________________  _________________
int *ip = (int *) p;
11. &ip[0]                  _________________  _________________
12. &ip[1]                  _________________  _________________
13. &ip[1] - &ip[0]         _________________  _________________
14. (char *) &ip[1] - p     _________________  _________________
15. sizeof(ip)              _________________  _________________
16. sizeof(*ip)             _________________  _________________
17. &ip[sizeof(int)]        _________________  _________________
18. ip + sizeof(int)        _________________  _________________
19. ip + 1                  _________________  _________________
20. p + sizeof(int)         _________________  _________________
int *iq = (int *) q;
21. iq - ip                 _________________  _________________
22. &iq[-1] - ip            _________________  _________________
p[0] = p[1] = p[2] = p[3] = 0;
23. *ip                     _________________  _________________
*(char *) ip = 1;
24. *ip                     _________________  _________________
*((char *) ip + 1) = 1;
25. p[1]                    _________________  _________________
26. *ip                     _________________  _________________
*((char *) ip) = 2;
27. *((char *) ip)          _________________  _________________
28. *ip                     _________________  _________________

Solutions

When you’re ready for solutions, go here.