One of the features of computers is that numbers are generally stored in a memory space of fixed size. For instance an int may be in 16 bits on some architectures. An attempt to compute a value that can only be represented in 17 or more bits will overflow the variable, resulting in rather odd behavior.
This is complicated by the fact that most variables are signed, which is to say that the variable can be either positive or negative. In twos compliment arithmetic, by far the most common way to store signed int, the highest order bit is the sign bit and is not available to indicate magnitude. However, arithmetic which overflows into the sign bit may cause the result to be negative. Let's look at this in more detail, but using 8 bit variables for convenience. Also for convenience, we will only consider unsigned numbers.
| Hex | 0f |
| Binary | 00001111 |
| Decimal | 15 |
Now add one to that number.
| Hex | 10 |
| Binary | 00010000 |
| Decimal | 16 |
So far so good. But what if we deal with larger numbers.
| Hex | f0 |
| Binary | 11110000 |
| Decimal | 240 |
Now add 16 to that number, truncating after the 8th bit.
| Hex | not 100, but 00 |
| Binary | Not 10000000, but 00000000 |
| Decimal | Not 256, but 0 |
Recall that the sign bit is stored in the highest order bit. That means that in the example above, the decimal value of f0 would not be 240, but a negative number.
When you are debugging your software you will occasionally encounter cases where simple math results in numbers which are wildly wrong. If you were to graph the results of s linear sequence of such calculations you would see major discontinuities in the results. When you see results like this you should consider that you may be dealing with integer overflow.
When you discover integer overflow in your code there are several ways to resolve it. Any of these may be sufficient. But each involves tradeoffs.
Consider the calculation of
a=b*c/d;This calculation can overflow in the calculation of b*c, even tho d would have brought the result back into range. We can make use of this by forcing the calculation to be
a=(b/d)*c;Now the intermediate result, as well as the final result, will not overflow. There are two problems with this solution:
Lets plug some actual numbers into our equation:
a=13*3/2; -> 19.5 which is truncated to 19But if we do the division first the result is different:
a=(13/2)*3; -> 18Notice that in the second calculation the intermediate calculation drops the fractional part. That fractional error is then multiplied by 3, generally increasing the error.
This is an important point. When doing integer calculations the fractional parts are dropped at every intermediate calculation.
The C compiler is given considerable latitude for optimization. The intermediate calculations in an equation can be done in any order, and are not in your control.
To force the compiler to perform the division before the multiply, you must break up the calculation into two statements.
e=b/d;
a=e*c;
Done this way, the compiler will perform the calculations in the order you have specified.
Having seen the downsides to this approach, you may be tempted to use the other solutions. But bear in mind that they also have downsides.
There are several integer data types in C. In order of (sometimes) increasing size these are:
| Type | Type | Type | Minimum | Note |
| char | unsigned char | signed char | 8 bits | The char type, when unqualified by signed or unsigned may or may not include a sign bit. The C compiler is free to choose. |
| short int | unsigned short int | signed short int | 16 bits | |
| int | unsigned int | signed int | 16 bits | int is the natural integer type of the CPU. If the CPU naturally works in 32 bits, then int will be 32 bits. |
| long int | unsigned long int | signed long int | 32 bits |
OK, so how do we make use of this information? Assuming your variables are all int, the following modification may fix the problem.
a=(long)b*(long)c/(long)d;What this does is to promote each of the variables to a long before performing the calculation. When the result is then stored in a, it is truncated back to int. But by then hopefully the result will fit into an int.
Disadvantages to this approach.
This approach is similar to the previous one, but instead of promoting the variables to the larger size in the middle of the calculation, it is done in the declaration. So we would change from
int a, b, c, d;
to
long a, b, c, d;
This also has disadvantages.
Integer overflow is the ugly underbelly of the C language. If you are not experienced with it the results can be maddeningly illogical. Resolving the issue involves tradeoffs. There is no one-size-fits-all solution.
Now that you are aware of this problem you may be asking "why did the designers of C allow such ugliness into the language"? In fact they had good reasons. They could have specified that the compiler would detect all such problems and abort the program. But doing that would add overhead to all programs, making them larger and slower. Instead the C language is designed as a language that allows you to get excellent performance out of the computer. The downside is that you, as the programmer, must ensure that your code doesn't run into problems of this sort.