The size of information in the computer is measured in kilobytes, megabytes, and gigabytes. In this section, we'll look at common sizes you would see in real life.

Character building: any thinking person today should have a rough idea of what KB, MB and GB are.

Kilobyte or KB

One kilobyte (KB) is a collection of about 1000 bytes. A page of ordinary Roman-alphabet text takes about 2 kilobytes to store (about one byte per letter). A typical short email would also take up just 1 or 2 kilobytes. Text is one of the most naturally compact types of data, at about one byte required to store each letter. In non-Roman writing systems, such as Kanji, the storage takes up 2 or 4 bytes per "letter", which is still pretty compact compared to audio and images.
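The bytes-per-letter figures can be checked with a few lines of Python. This is a minimal sketch, assuming the text is stored in one of the common Unicode encodings (UTF-8, UTF-16, UTF-32); the helper name bytes_needed and the sample strings are made up for illustration.

```python
def bytes_needed(text, encoding):
    """Return how many bytes `text` occupies in the given encoding."""
    return len(text.encode(encoding))


# One Roman letter and one Kanji character (illustrative samples).
for label, ch in [("Roman letter", "A"), ("Kanji character", "漢")]:
    for encoding in ("utf-8", "utf-16-le", "utf-32-le"):
        print(label, encoding, bytes_needed(ch, encoding), "byte(s)")

# A hypothetical page of about 2000 Roman letters, one byte each in UTF-8.
page = "x" * 2000
print("page:", bytes_needed(page, "utf-8") / 1000, "KB")
```

The exact count depends on the encoding chosen: the Kanji character takes 2 bytes in UTF-16, 3 bytes in UTF-8 and 4 bytes in UTF-32, while the Roman letter takes 1 byte in UTF-8.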

Megabyte or MB

One megabyte (MB) is about 1 million bytes (or about 1000 kilobytes). An MP3 audio file of a few minutes or a 10 million pixel image from a digital camera would typically take up a few megabytes. The rule of thumb for MP3 audio is that 1 minute of audio takes up about 1 megabyte. Audio, image, and video data are typically stored in "compressed" form, MP3 being an example. We'll talk about how compression works later. A data CD stores about 700 MB. The audio on a CD is not compressed, which is why it takes so much more space than the MP3. The series of bits is represented as a spiral path of tiny pits in the silver material in the disk. Imagine that each pit is interpreted as a 0, and the lack of a pit is a 1 as the spiral sequence is read. Fun fact: the whole spiral on a CD is over 5 km long.
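To get a feel for how much space uncompressed audio takes, here is a rough Python sketch. It assumes the standard CD audio format (44,100 samples per second, 2 bytes per sample, 2 stereo channels), which is not stated above, and it uses the 1 MB per minute rule of thumb for MP3. The function and constant names are made up for illustration.

```python
MB = 1_000_000  # "about a million bytes"

# Assumed CD audio parameters (not taken from the text above).
SAMPLES_PER_SECOND = 44_100
BYTES_PER_SAMPLE = 2
CHANNELS = 2

MP3_MB_PER_MINUTE = 1  # rule of thumb from the text


def cd_audio_mb_per_minute():
    """Uncompressed CD audio size for one minute, in MB."""
    bytes_per_second = SAMPLES_PER_SECOND * BYTES_PER_SAMPLE * CHANNELS
    return bytes_per_second * 60 / MB


song_minutes = 4  # hypothetical song length
print("uncompressed:", cd_audio_mb_per_minute() * song_minutes, "MB")
print("MP3 estimate:", MP3_MB_PER_MINUTE * song_minutes, "MB")
```

Under these assumptions, uncompressed audio comes out at a bit over 10 MB per minute, roughly ten times the MP3 rule of thumb.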

Gigabyte or GB

One gigabyte (GB) is about 1 billion bytes, or 1 thousand megabytes. A computer might have 4 GB of RAM. A flash memory card used in a camera might store 8 GB. A DVD movie is roughly 4-8 GB.

Terabyte or TB

One terabyte (TB) is about 1000 gigabytes, or roughly 1 trillion bytes. You can buy 1 TB and 2 TB hard drives today, so we are just beginning the time when this term comes into common use. Gigabyte used to be an exotic term too, until Moore's law made it common.

Gigahertz - Speed, not Bytes

One gigahertz (GHz) is 1 billion cycles per second (a megahertz is a million cycles per second). Gigahertz is a measure of speed: very roughly, the number of times per second that a CPU can do its simplest operation. Gigahertz does not precisely tell you how quickly a CPU gets work done, but it is roughly correlated. Higher gigahertz CPUs also tend to be more expensive to produce, and they use more power (and as a result give off more heat), which is a challenge for putting fast CPUs in small devices like phones. The ARM company is famous for producing chips that are very productive with minimal power and heat. Almost all cell phones currently use ARM CPUs.
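One way to make "1 billion cycles per second" concrete is to flip it around and ask how long a single cycle lasts. A small Python sketch, where the function name cycle_time_ns is made up for illustration:

```python
def cycle_time_ns(gigahertz):
    """Length of one clock cycle in nanoseconds.

    There are 1 billion nanoseconds in a second and a 1 GHz clock does
    1 billion cycles per second, so the billions cancel out.
    """
    return 1 / gigahertz


for ghz in (1, 2, 4):
    print(ghz, "GHz ->", cycle_time_ns(ghz), "ns per cycle")
```

So a 1 GHz clock ticks once per nanosecond, and a 2 GHz clock ticks once every half nanosecond. As noted above, this is only roughly correlated with how quickly the CPU gets work done.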

Kilobyte / Megabyte / Gigabyte Word Problems

You should be comfortable doing simple arithmetic to figure MB / GB sizes, just as you should be able to do basic computations with inches and pounds (or meters and kilos).

Alice has 600 MB of data. Bob has 700 MB of data. Will it all fit on Alice's 2 GB thumb drive?
Yes it fits: 600 MB + 700 MB is 1300 MB. 1300 MB is 1.3 GB, so it will fit on the 2 GB drive no problem. Equivalently we could say that the 2 GB drive has space for 2000 MB, so the 1300 MB fits.
Alice has 100 small images, each of which is 500 KB. How much space do they take up overall in MB?
100 times 500 KB is 50000 KB, which is 50 MB.
Your ghost hunting group is recording the sound inside a haunted Stanford classroom for 20 hours as MP3 audio files. About how much data will that be, expressed in GB?
MP3 audio takes up about 1 MB per minute. 20 hours, 60 minutes/hour, 20 * 60 yields 1200 minutes. So that's about 1200 MB, which is 1.2 GB.
Suppose we have an image which is 800 pixels by 600 pixels. Each pixel has its own red, green, and blue values, each stored in 1 byte. How many bytes are required to store the whole image in RAM?
800 x 600 is 480,000 pixels. Each pixel takes 3 bytes (one byte each for red/green/blue), so 480,000 * 3 is 1,440,000 bytes overall, i.e. about 1.4 MB, which is the space required for the image in RAM. On disk, you will notice that .jpg files take up much less space than that; this is due to "compression", which is a very effective space-reduction technique for image and audio data (a future topic).
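The same four word problems can be worked in Python as a check on the arithmetic. This is just a sketch using the "about 1000" meanings of KB, MB and GB from this section; the variable names are made up for illustration.

```python
KB = 1_000
MB = 1_000 * KB
GB = 1_000 * MB

# Problem 1: do 600 MB + 700 MB fit on a 2 GB thumb drive?
total_bytes = 600 * MB + 700 * MB
fits_on_drive = total_bytes <= 2 * GB
print("total:", total_bytes / GB, "GB, fits:", fits_on_drive)

# Problem 2: 100 images of 500 KB each, expressed in MB.
images_mb = 100 * 500 * KB / MB
print("images:", images_mb, "MB")

# Problem 3: 20 hours of MP3 audio at about 1 MB per minute, in GB.
audio_minutes = 20 * 60
audio_gb = audio_minutes * 1 * MB / GB
print("audio:", audio_gb, "GB")

# Problem 4: 800 x 600 image, 3 bytes per pixel (red, green, blue).
image_bytes = 800 * 600 * 3
print("image:", image_bytes, "bytes, about", image_bytes / MB, "MB")
```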

Alternate Terminology: Kibibyte Mebibyte Gibibyte Tebibyte

It's convenient within the computer to organize things in groups of powers of 2. For example, 2^10 is 1024, and so a program might group 1024 items together, as a sort of "round" number of things within the computer. The term "kilobyte" above can refer to this group size of 1024 bytes. However, people also group things by thousands: 1 thousand or 1 million items.

There's this problem with the word "megabyte": does it mean 1024 * 1024 bytes, i.e. 2^20, which is 1,048,576, or does it mean exactly 1 million, 1000 * 1000? It's just a 5% difference, but marketers tend to prefer the 1 million interpretation, since it makes their hard drives etc. appear to hold a little bit more. Also, the difference grows larger and larger for the gigabyte and terabyte sizes. In an attempt to fix this, the terms "kibibyte", "mebibyte", "gibibyte", and "tebibyte" have been introduced to specifically mean the 1024-based units (see the Wikipedia kibibyte article). These terms do not seem to have caught on very strongly thus far. If nothing else, remember that terms like "megabyte" have this little wiggle room in them between the 1024-based and 1000-based meanings. We will never grade off for this distinction: "about a million" will be our close-enough interpretation for "megabyte". The "error" at the megabyte level is about 5%. At the terabyte level the error is about 10%.
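The size of the wiggle room at each level can be computed directly. A short Python sketch, where the function name binary_vs_decimal_gap is made up for illustration:

```python
def binary_vs_decimal_gap(power):
    """Fraction by which 1024**power exceeds 1000**power."""
    return 1024 ** power / 1000 ** power - 1


levels = [("kilobyte", 1), ("megabyte", 2), ("gigabyte", 3), ("terabyte", 4)]
for name, power in levels:
    gap_percent = binary_vs_decimal_gap(power) * 100
    print(f"{name}: 1024-based unit is {gap_percent:.1f}% larger")
```

The gap is about 2.4% for kilobytes, 4.9% for megabytes, 7.4% for gigabytes and 10.0% for terabytes, which matches the "about 5%" and "about 10%" figures above.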