Data Representation
Data in a computer is stored in the form of "bits." A bit is something that can be either zero or one. This web page shows eight interpretations of the same 32 bits. You can edit any of the interpretations, and the others will change to match it. For a more detailed explanation, see the rest of this page.
[Interactive input boxes: Binary (32 bits), Graphical, Hexadecimal, Unsigned Decimal, Signed Decimal, Real Number, 8-bit Characters, 16-bit Characters]
About the Representations
In a computer, items of data are represented in the form of bits, that is, as zeros and ones. More accurately, they are stored using physical components that can be in two states, such as a wire that can be at high voltage or low voltage, or a capacitor that can either be charged or not. These components represent bits, with one state used to mean "zero" and the other to mean "one." To be stored in a computer, a data item must be coded as a sequence of such zeros and ones. But a given sequence of zeros and ones has no built-in meaning; it only gets meaning from how it is used to represent data.
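To make the last point concrete, here is a minimal Python sketch that lists every four-bit pattern next to three readings of it: as an unsigned integer, as a signed (two's complement) integer, and as a hexadecimal digit. The function name `four_bit_table` and the choice of columns are assumptions made for illustration; they are not taken from the page's own table.

```python
def four_bit_table():
    """Return (bits, unsigned, signed, hex_digit) for every 4-bit pattern."""
    rows = []
    for value in range(16):
        bits = format(value, "04b")  # e.g. 11 -> "1011"
        # Two's complement reading: a leading 1 means the value is value - 16.
        signed = value - 16 if value >= 8 else value
        rows.append((bits, value, signed, format(value, "X")))
    return rows


for bits, unsigned, signed, hex_digit in four_bit_table():
    print(f"{bits}  unsigned={unsigned:2d}  signed={signed:3d}  hex={hex_digit}")
```

The same four bits, 1011, come out as 11, as -5, or as B, depending only on which column you read.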
The table at the right shows some possible interpretations of four bits. This web page shows some possible interpretations of 32 bits. Here is more information about the eight interpretations:
- 32 Bits — The "32 Bits" input box shows each of the 32 bits as a zero or one. You can type anywhere from 1 to 32 zeros and ones in the input box; the bits you enter will be padded on the left with zeros to bring the total up to 32. You can think of this as a 32-bit "base-2 number" or "binary number," but really, saying that it is a "number" adds a level of interpretation that is not built into the bits themselves. (As an example, the base-2 number 1011 represents the integer 1 × 2³ + 0 × 2² + 1 × 2¹ + 1 × 2⁰. This is similar to the way the base-10 number 3475 represents 3 × 10³ + 4 × 10² + 7 × 10¹ + 5 × 10⁰.) In a modern computer's memory, the bits would be stored in four groups of eight bits each. Eight bits make up a "byte," so we are looking at four bytes of memory.
- Graphical — Instead of representing a bit as a zero or one, we can represent a bit as a square that is colored white (for zero) or black (for one). The 32 bits are shown here as a grid of such squares. You can click on a square to change its color. The squares are arranged in four rows of eight, so each row represents one byte. Note that there are two ways that the four bytes could be arranged in memory: with the high-order (leftmost) byte first or with the low-order byte first. These two byte orders are referred to as "big endian" and "little endian." The big-endian byte order is used here.
- Hexadecimal — Hexadecimal notation uses the characters 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, and F. (Lower case letters are also OK when entering the value into the "Hexadecimal" input box.) Each of these characters corresponds to a group of four bits, so a 32-bit value corresponds to 8 hexadecimal characters. A string of hexadecimal characters can also be interpreted as a base-16 number, and the individual characters 0 through F correspond to the ordinary base-10 numbers 0 through 15.
- Unsigned Decimal — We usually represent integers as "decimal," or "base-10," numbers, rather than binary (base-2) or hexadecimal (base-16). Using just 4 bits, we could represent integers in the range 0 to 15. Using 32 bits, we can represent non-negative integers from 0 up to 2³² minus 1. In terms of base-10 numbers, that means from 0 to 4,294,967,295. The "Unsigned Decimal" input box shows the base-10 equivalent of the 32-bit binary number. You can enter the digits 0 through 9 in this box (but no commas).
- Signed Decimal — Often, we want to use negative as well as positive integers. When we only have a certain number of bits to work with, we can use half of the available values to represent negative integers and half to represent positive integers and zero. In the signed decimal interpretation of 32-bit values, a bit pattern that has a 1 in the first (leftmost) position represents a negative number. The representation used for negative numbers (called the two's complement representation) is not the most obvious. If we had only four bits to work with, we could represent signed decimal values from -8 to 7, as shown in the table. With 32 bits, the legal signed decimal values are -2,147,483,648 to 2,147,483,647. When entering values, you can type the digits 0 through 9, with an optional plus or minus sign. Note that for the integers 0 through 2,147,483,647, the signed decimal and unsigned decimal interpretations are identical.
- Real Number — A real number can have a decimal point, with an integer part before the decimal point and a fractional part after the decimal point. Examples are: 3.141592654, -1.25, 17.0, and -0.000012334. Real numbers can be written using scientific notation, such as 1.23 × 10⁻⁷. In the "Real Number" input box, this would be written 1.23e-7. These examples use base 10; the base 2 versions would use only zeros and ones. A real number can have infinitely many digits after the decimal point. When we are limited to 32 bits, most real numbers can only be represented approximately, with just 7 or 8 significant digits. The encoding that is used for real numbers is IEEE 754. The first (leftmost) bit is a sign bit, which tells whether the number is positive or negative. The next eight bits encode the exponent, for scientific notation. The remaining 23 bits encode the significant bits of the number, referred to as the "mantissa." There are special bit patterns that represent positive and negative infinity. And there are many bit patterns that represent so-called NaN, or "not-a-number" values. The encoding is quite complicated! For the "Real Number" input box, you can type integers, decimal numbers, and scientific notation. You can also enter the special values infinity, -infinity, and nan. (There are many different not-a-number values, but in this web app they are all shown as "NaN".) When you leave the input box, your input will be converted into a standard form. For example, if you enter 3.141592654, it will be changed to 3.1415927, since your input has more significant digits than can be represented in a 32-bit number. And 17.42e100 will change to infinity, since the number you entered is too big for a 32-bit number. (Note, by the way, that real numbers on a computer are more properly referred to as "floating point" numbers.)
- 8-bit Characters — Text is another kind of data that has to be represented in binary form to be stored on a computer. ASCII code uses seven bits per character, to represent the English alphabet, digits, punctuation, and certain "unprintable" characters that don't have a visual representation. ASCII can be extended to eight bits in various ways, allowing for 256 possible characters, numbered from 0 to 255. The "8-bit Characters" input box uses the first 256 characters of the 16-bit character set that is actually used in web apps. If you type characters outside that range into the input box, you'll get an error. The unprintable characters are represented using a notation such as <#26> for the character with code number 26. In fact, you can enter any character in this format, but when you leave the input box, it will be converted to a single character. For example, <#169> will show up as © and <#65> as A. The binary number consisting of 32 zeros shows up in this box as <#0><#0><#0><#0>.
- 16-bit Characters — Unicode text encoding allows the use of 16 bits for each character, with 65536 possible characters, allowing it to represent text from many of the world's languages as well as mathematical and other symbols. (Even 65536 characters is not enough: some characters are actually represented using two or more 16-bit numbers, but that's not supported in this web app.) As with 8-bit characters, notations such as <#65534> can be used in the "16-bit Characters" input box.
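The interpretations above can be sketched in a few lines of Python using the standard `struct` module. This is an illustrative approximation, not the web app's actual code: the function name `interpret` and the dictionary keys are assumptions, and unprintable characters are returned as-is rather than in the <#26> notation.

```python
import struct


def interpret(bits: str) -> dict:
    """Interpret a string of 1 to 32 zeros and ones in several ways."""
    if not bits or len(bits) > 32 or set(bits) - {"0", "1"}:
        raise ValueError("expected 1 to 32 characters, each 0 or 1")
    unsigned = int(bits, 2)            # short input acts as if left-padded with zeros
    raw = struct.pack(">I", unsigned)  # four bytes, big-endian (high-order byte first)
    return {
        "binary": format(unsigned, "032b"),
        "hex": format(unsigned, "08X"),
        "unsigned": unsigned,
        "signed": struct.unpack(">i", raw)[0],   # two's complement
        "real": struct.unpack(">f", raw)[0],     # IEEE 754 single precision
        "chars8": raw.decode("latin-1"),         # one character per byte
        "chars16": "".join(chr(u) for u in struct.unpack(">2H", raw)),
    }


result = interpret("01000001010010000000000000000000")
print(result["hex"])       # 41480000
print(result["unsigned"])  # 1095237632
print(result["real"])      # 12.5
```

The `latin-1` codec is used for the 8-bit case because it maps byte values 0 through 255 directly onto the first 256 Unicode characters, which matches the description of the "8-bit Characters" box.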
As a final remark, note that you can't really ask for something like "the binary representation of 17", any more than you can ask for the meaning of the binary number "11000100110111". You can type "17" into six of the seven input boxes in this web app, and you will get five different binary representations! The meaning of "17" depends on the interpretation.
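The count of five can be checked with a short Python sketch that reads the text "17" under each of the six interpretations that accept it and reports the resulting 32-bit pattern as eight hexadecimal digits. The function name `patterns_for_17` is hypothetical, and the code assumes that short character input is padded on the left with zero bits, as the page describes for the binary box; the web app may pad differently, although that would not change the count.

```python
import struct


def patterns_for_17() -> dict:
    """Return the 32-bit pattern (8 hex digits) for "17" read six ways."""
    text = "17"
    values = {
        "Hexadecimal": int(text, 16),  # hexadecimal 17 is decimal 23
        "Unsigned Decimal": int(text),
        "Signed Decimal": struct.unpack(">I", struct.pack(">i", int(text)))[0],
        "Real Number": struct.unpack(">I", struct.pack(">f", float(text)))[0],
        # Assumption: characters fill the low-order bytes, zeros on the left.
        "8-bit Characters": int.from_bytes(text.encode("latin-1"), "big"),
        "16-bit Characters": int.from_bytes(text.encode("utf-16-be"), "big"),
    }
    return {name: format(value, "08X") for name, value in values.items()}


for name, pattern in patterns_for_17().items():
    print(f"{name:18s} {pattern}")
```

The unsigned and signed readings agree (00000011), which is why six input boxes yield only five distinct patterns. The binary box is the seventh and rejects "17" because 7 is not a binary digit.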