Originally published by Robert Beisert at fortcollinsprogram.robert-beisert.com

CS 101 – Integer Representation

Apologies for the title, but I can’t think of a better way to put it.

Unsigned

The simplest way to handle integers in binary is to treat every number as a positive number. This is called the unsigned integer representation. With the exception of subtraction, studied earlier, we have always used the unsigned integer representation before.

Sign-Magnitude

Naturally, if we need to deal with negative numbers, we must have a way to tell the difference between them and positive numbers. The simplest way to do this is to always use the first bit as a sign. This means that to turn the number 2 (00000010) into negative two (-2), we flip the first bit (10000010).

There are a few problems with this method, however.

First and most obvious, we have two different values for zero in this scenario. The number 00000000 and 10000000 are both equal to 0, but they are positive and negative. Since the number zero has no positive or negative value, this causes some complications.

Second, this form is completely useless for math. If we want to add a positive number to a negative number, we have to add several stages of complexity to our circuits.

Finally, this limits the number of values we can store. In a single byte, we can only store up to +/- 127. Surely we can do better.

One’s Complement

The next reasonable step is to employ one’s complement representation, where we flip every bit from the positive number. This means that the number two (00000010) becomes negative two (11111101).

This format is not substantially better than the sign-magnitude format, as we haven’t increased the range of values we can store. We also still have two zero values, which is not good.

One’s complement is mostly useful because it leads us to…

Two’s Complement

This is where it starts to get good. Two’s complement begins like one’s complement, only we add 1 to the final value. This was covered previously (when we discussed subtraction), but the following steps convert 113 (01101001) to the two’s complement form of -133:

01101001         <- Initial value

10010110      <- One’s Complement

10010111        <- Two’s Complement

What good is this?

First, we get one more value in the range. While we still only store up to +127, we can now store -128 (11111111).

Second, we finally only have one value for zero. 00000000 is the only zero in the system!

Finally, as we learned in binary subtraction, we can perform subtraction using addition. There is nothing not to love about that.

Note: We usually employ two’s complement, for the reasons provided. If you have a computer that uses any of the others, I’m genuinely surprised.

Ranges

If you’re a math buff, you know how range notation works. For the rest of us, everything inside of square braces [] is a range, separated by commas.

For N bits:

Unsigned                                 [0 , 2^N]

Sign-Magnitude                     [-(2^(N -1) -1) , (2^(N-1)-1)]

One’s Complement                [-(2^(N -1) -1) , (2^(N-1)-1)]

Two’s Complement               [-2^(N -1) , (2^(N-1)-1)]

Size

While we’re talking about ranges, I think it important to note that the standard integer size tends to be 32 bits. This is the ANSI C convention, and it carries over to most languages. This is not universal. Consult any manuals you have before assuming the size of your integers.

Or, if you’re using C, use the library <stdint.h> and use the provided integers. These integers are defined by their size, which can have many useful applications.

For example, an unsigned 8-bit integer is referenced by uint8_t.