CS 1355
Introduction to Programming in C
Monday 2006.10.23 (Week 7)
Lecture notes (at http://r638-2.cs.nthu.edu.tw/C/notes/13.txt)
Chapter 6: Arrays
- Arrays
- Initializer
- sizeof, array bounds
- string
- sorting
What is an array?
- a consecutive sequence of data elements
- the elements can be "indexed" (picked out) by an integer
(index => "subscript")
- the elements are of the same type.
- lets you declare a whole group of variables,
rather than one at a time
Example: histogram
(count how many students are of each grade)
#include
int main() {
int count[11]; /* declare ten elements, count[0]..count[10]*/
int i, score;
for (i = 0; i< 11; i++) {
count[i] = 0; /* initialize everybody to 0 */
}
while (scanf("%d", &score) != EOF) {
/* compute the index to increment!
* 0..9 => count[0]++, 10..19 => count[1]++, etc.
* so, this is just a score / 10 operation.
*/
count[score / 10]++;
}
/* now print the histogram */
for (i = 0; i < 10; i++) {
printf("%2d - %2d: %d\n", i*10, i*10+9, count[i]);
}
printf("%7d: %d\n", 100, count[10]); /* print 100 separately */
}
====
Run it:
% ./a.out
10 20 10 90 80 30 70 22 55 100 100 100 90 30 77 79 88 89 99
^D
0 - 9: 0
10 - 19: 2
20 - 29: 2
30 - 39: 2
40 - 49: 0
50 - 59: 1
60 - 69: 0
70 - 79: 3
80 - 89: 3
90 - 99: 3
100: 3
====
This way, you don't need to handle each case separately like
while(scanf("%d", &score) != EOF) {
switch(score / 10) {
case 0: /* 0-9 */ count0++; break;
case 1: /* 10-19 */ count1++; break;
case 2: /* 20-29 */ count2++; break;
... etc
}
}
Array initialization:
- it is possible to initialize an array using { } list.
- example:
int count[11] = { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 };
Since 0 is a special case, could just say
int count[11] = { 0 };
They could be different values.
- Depending on if the variable is static or auto:
- static (global or local static): initialized at compile time
- auto variables: initialized at runtime
- Actually, if you have an initializer, then you can leave out
the size of the array!
- Example
int a[ ] = { 15, 30, 40, 45, 20, 19, 7 };
This is equivalent to declaring
int a[7] = { 15, 30, 40, ... }; /* seven elements */
and initializing. The missing dimension is automatically figured out
by the compiler.
- Reason: it is easy to make mistakes if you hardwire array sizes.
Better to use symbolic constants or expressions.
You can also say
sizeof(a) / sizeof(int)
To determine how many elements there are.
Try it in Ch:
> int a [ 400 ];
> sizeof(a)
1600
> sizeof(int)
4
>
Another way is to define macros
#define SIZE 10
int a[SIZE];
To draw histogram graphically, just print * instead of printing %d
#include
void printBar(int n) {
int i;
for (i = 0; i < n; i++) {
printf("*");
}
}
int main() {
int count[11]; /* declare ten elements, count[0]..count[10]*/
int i, score;
for (i = 0; i< 11; i++) {
count[i] = 0; /* initialize everybody to 0 */
}
while (scanf("%d", &score) != EOF) {
/* compute the index to increment!
* 0..9 => count[0]++, 10..19 => count[1]++, etc.
* so, this is just a score / 10 operation.
*/
count[score / 10]++;
}
/* now print the histogram */
for (i = 0; i < 10; i++) {
printf("%2d - %2d: ", i*10, i*10+9);
printBar(count[i]); printf("\n");
}
printf("%7d: ", 100);
printBar(count[10]); printf("\n");
}
---
run it
% ./a.out
10 20 10 90 80 30 70 22 55 100 100 100 90 30 77 79 88 89 99
0 - 9:
10 - 19: **
20 - 29: **
30 - 39: **
40 - 49:
50 - 59: *
60 - 69:
70 - 79: ***
80 - 89: ***
90 - 99: ***
100: ***
-------------
Careful: array out-of-bound!
what if we enter a score like 200?
- it might crash!
- error message may say
"Bus Error"
"Segmentation Fault (core dumped)"
etc
if you have an array declared as
int a[11]
- valid index values are 0...10.
- Anything outside the range is out of bound.
To be safe, we should check the array bound to make sure it is
within range.
Question: how to handle it when the array is out of bound?
This is called an exceptional condition.
- in other languages (e.g., C++, Java), it might be handled by
"throwing an exception"
- simple way: print the error message to tell the user
Strings:
- a string is an array of characters
char s[20];
more specifically, C strings are null-terminated
- Two types of strings
- string literals (constants): immutable (cannot be changed)
- string buffers (variables): mutable (can be changed)
String initialization:
char s[ ] = "hello";
This is the same as
char s[ ] = { 'h', 'e', 'l', 'l', 'o', '\0' };
The '\0' is the null character.
The null character is really included. Can try this in Ch:
> sizeof("hello")
6
Arrays are passed as pointers to a function call!
the elements are not copied.
Experiment 1: print the "address" vs "string" of s
#include
int main() {
char s[ ] = "abc";
printf("%d\n", s); /* this prints the "address of" s !!*/
printf("%s\n", s); /* this prints the "string" of s */
}
Experiment 2: pointer passing
#include
void modifyStringBuffer(char t[ ]) {
/* testing: sets the [0] character to '-'
t[0] = '-'
}
int main() {
char s[ ] = "hello world";
printf("address of s is %d\n", &s);
printf("parameter s is actually also the address of s: %d\n", s);
modifyStringBuffer(s);
printf("%s\n", s); /* prints -ello world */
printf("address of s is %d\n", s); /* s itself should be the same */
/* even though the buffer's content changed */
}
Experiment 3: parameter
#include
void modifyLocal(char t[ ]) {
t = "goodbye";
}
int main() {
char s[ ] = "hello world";
printf("address of s is %d\n", s);
modifyLocal(s);
printf("address of s is %d\n", s); /* s is not changed!! */
/* you could also print it as %p (for "pointer") */
printf("the string s is %s\n", s); /* content is not changed, either*/
}
Standard input: scanf() with %s format
%s means a blank-delimited string
Example: ask the user for last name, first name, separated by spaces
1 #include
2 int main() {
3 char firstname[20], lastname[20];
4 printf("enter firstname lastname: ");
5 scanf("%s%s", firstname, lastname);
/* same as &firstname, &lastname */
6 printf("hi %s, your last name is %s\n", firstname, lastname);
7 }
% ./a.out
enter firstname lastname: George Washington
hi George, your last name is Washington
- scanf() will look for non-blank words for each %s.
- line 5, pass firstname, lastname
- Because they are arrays, they are exactly the same as
&firstname, &lastname
- also the same as &firstname[0], &lastname[0]
- need to make sure the buffers are large enough.
=> it can be error if you enter a name longer than 19 characters
(plus one '\0' for the terminating character)
For example, if the user types a really long name,
(longer than 19 characters), then it can cause a "buffer overflow"!
=> scanf() doesn't know the size of the buffers (firstname, lastname)
and doesn't know where to stop.
=> this is very dangerous!! so, use scanf() %s format with caution!
(the very first virus was caused by such type of buffer overflow)
Function return values:
- do not return arrays if they are local auto variables!
reason: the space gets de-allocated after the function returns,
so it is no longer valid!
Solution:
1) use a "static" local variable,
2) use malloc() to allocate memory explicitly.
#include
char* getstring() {
char s[ ] = "hello world\n";
return s;
}
int main() {
printf("%s\n", getstring());
}
=> You may get a warning for "function returns address of local variable"
scanf("%s", sbuf);
=> do not use &sbuf, since sbuf itself is already a pointer expression
=> this is potentially dangerous, bounds not checked!
----
Using arrays: read input into memory and perform operations
#include
#define SIZE 100
int main() {
int db[SIZE];
int i, score;
for (i = 0; i < SIZE; i++) {
if (scanf("%d", &score) != 1) {
break; /* end of input */
}
db[i] = score;
}
/* now we have the entire array in memory */
/* would like to print statistics */
/* including min, max, average (mean), median */
printf("max = %d, min = %d, mean = %.2f, median = %d",
max(db, i), min(db, i), mean(db, i), median(db, i));
}
int max(int d[], int size) {
/* maintain a "current maximum" variable */
/* set it to the highest seen so far */
int currentMax = 0;
int i;
for (i = 0; i < size; i++) {
if (d[i] > currentMax) { currentMax = d[i]; }
}
return currentMax;
}
float mean(int d[], int size) {
int sum = 0;
int i;
for (i = 0; i < size; i++) {
sum += d[i];
}
return (float)sum / size;
}
Question: how to implement median??
- what is a median?
example: eleven numbers: 1 2 2 2 3 [4] 10 13 15 18 19
4 is median because it is in the "middle" in a sorted sequence
But scores are not always sorted!
One solution: histogram technique
int median(int d[], int size) {
int hist[101] = { 0 }; /* one entry for each score */
int i;
int medianPosition = size / 2; /* example: 11 / 2 = 5, which is the 6th */
int rank = 0;
for (i = 0; i < size; i++) {
hist[d[i]]++;
}
/* after this for loop, hist[ ] contains count of each score, */
/* like a histogram */
/* next, find the score that contains the median position */
for (i = 0; i < 101; i++ ) {
rank += hist[i];
if (rank >= medianPosition) { break; }
}
return i;
}
Another solution: perform sorting, so that
max(db) is simply db[SIZE-1],
min(db) is simply db[0],
median(db) is db[SIZE/2]
Many Sorting algorithms exist
* in unix command,
% sort -n
23
15
44
100
6
^D
6
15
23
44
100
%
So... we could actually call unix sort on the input!!
Run it as
% sort -n | ./a.out
Another way, perform sorting in the C program!
- Bubblesort:
idea: divide the array into two partitions:
- one partition contains elements in sorted positions
- each iteration, move another element from the unsorted partition
to the sorted partition.
- Once it moves, the ith element will be in its correct (absolutely
sorted.
Primitive:
void
bubbleUp(int a[], int i) {
/*
purpose: compare a[i] and a[i+1],
switch them if a[i] > a[i+1]
assumption: a has at least i+2 elements
*/
if (a[i] > a[i+1]) {
int temp = a[i];
a[i] = a[i+1];
a[i+1] = temp;
}
}
Example: array a[ ] = { 7, 6, 5, 8, 4, 1, 9, 22, 3 };
if we call
bubble(a, 0);
then after the call, array a looks like
{ 6, 7, 5, 8, 4, 1, 9, 22, 3 }
(i.e., switched a[0], a[1])
if we do
for (j = 0; j < SIZE-1; j++) {
bubbleUp(a, j);
}
then, the array looks like
{ 6, 5, 7, 4, 1, 8, 9, 3, **22** }
(i.e., largest element (22) is at the end)
Next time, if we go
for (j = 0; j < SIZE-2; j++) {
bubbleUp(a, j);
}
then the array looks like
{ 5, 6, 4, 1, 7, 8, 3, **9**, 22 }
(i.e., the second largest element (9) is at the second-to-last position)
To generalize,
for (i = SIZE-1; i > 0; i--) {
for (j = 0; j < i; j++) {
bubbleUp(a, j);
}
}
------
here is the complete program
#include
int a[] = { 7, 6, 5, 8, 4, 1, 9, 22, 3 };
#define SIZE 9
void
bubbleUp(int a[], int i) {
/*
purpose: compare a[i] and a[i+1],
switch them if a[i] > a[i+1]
assumption: a has at least i+2 elements
*/
if (a[i] > a[i+1]) {
int temp = a[i];
a[i] = a[i+1];
a[i+1] = temp;
}
}
void
printArray(int a[]) {
int i;
for (i=0; i< SIZE; i++) {
printf("%d ", a[i]);
}
printf("\n");
}
int main() {
int i, j;
printArray(a);
bubbleUp(a, 0);
printf("after bubble(a, 0):\n");
printArray(a);
for (j = 0; j < SIZE-1; j++) {
bubbleUp(a, j);
}
printf("after for j=0; j printArray(a);
for (i = SIZE-1; i > 0; i--) {
for (j = 0; j < i; j++) {
bubbleUp(a, j);
}
}
printf("after complete bubblesort:\n");
printArray(a);
}