Digital Media

Digital Compression - Lossless

Compression adjusts how data is stored to use fewer bytes
Lossless compression keeps all the original detail (vs. "lossy")
Suppose we have this series of sample numbers:
12000, 12002, 12006, 12007, 12010, 12006, 12005
The numbers are quite near to each other - typical for signals
What if we just record the first number
Then record the differences, one sample to the next (aka delta)
12000, +2, +4, +1, +3, -4, -1
The resulting numbers are much smaller, requiring fewer bytes to store
Compression: rearrange the data to use fewer bytes
This compression is lossless, recreating signal perfectly
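
The delta idea above can be sketched in a few lines of Python. This is a minimal illustration only; the function names delta_encode and delta_decode are made up for this example and do not come from any real audio format.

```python
def delta_encode(samples):
    """Keep the first sample, then store each sample's change from the previous one."""
    if not samples:
        return []
    encoded = [samples[0]]
    for prev, cur in zip(samples, samples[1:]):
        encoded.append(cur - prev)
    return encoded


def delta_decode(encoded):
    """Rebuild the original samples by adding up the recorded changes."""
    samples = []
    total = 0
    for i, value in enumerate(encoded):
        # The first value is the starting sample; every later value is a change.
        total = value if i == 0 else total + value
        samples.append(total)
    return samples


samples = [12000, 12002, 12006, 12007, 12010, 12006, 12005]
encoded = delta_encode(samples)
print(encoded)                           # [12000, 2, 4, 1, 3, -4, -1]
print(delta_decode(encoded) == samples)  # True: nothing was lost
```

Decoding gives back exactly the numbers we started with, which is what makes this scheme lossless.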

Digital Compression - Lossy

Lossy compression
Compress by discarding some detail
Example lossy scheme for the audio data:
-Discard every other number
-On playback, estimate the missing numbers as the average of the surrounding numbers
-This uses about half the space
-A "lossy" strategy, data comes back a little different
original (7 numbers):
12000, 12002, 12006, 12007, 12010, 12006, 12005
compressed (4 numbers):
12000, (xxx), 12006, (xxx), 12010, (xxx), 12005
playback (7 numbers):
12000, (12003), 12006, (12008), 12010, (12007), 12005
The compressed version does an ok lossy job, using about half the space
MP3 audio compression .. very effective lossy compression, about 10x less space
As a practical matter, MP3 sounds great
Lossy compression works very well with media data: images, sound, video
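
Here is a rough Python sketch of the "discard every other number" scheme above. The function names are invented for this example, and two details are assumptions not spelled out in the notes: averages are rounded down, and a discarded sample with no right-hand neighbor just repeats the sample to its left.

```python
def lossy_compress(samples):
    """Keep the samples at positions 0, 2, 4, ... and discard the rest."""
    return samples[::2]


def lossy_playback(kept, original_length):
    """Rebuild a full-length series, estimating each discarded sample."""
    playback = []
    for i in range(original_length):
        left = i // 2
        if i % 2 == 0:
            # This sample was kept, so use it as-is.
            playback.append(kept[left])
        elif left + 1 < len(kept):
            # Discarded sample: average of its two kept neighbors, rounded down.
            playback.append((kept[left] + kept[left + 1]) // 2)
        else:
            # Assumption: no right neighbor available, so repeat the left one.
            playback.append(kept[left])
    return playback


original = [12000, 12002, 12006, 12007, 12010, 12006, 12005]
kept = lossy_compress(original)
restored = lossy_playback(kept, len(original))
print(kept)      # [12000, 12006, 12010, 12005]
print(restored)  # [12000, 12003, 12006, 12008, 12010, 12007, 12005]
print([a - b for a, b in zip(original, restored)])  # [0, -1, 0, -1, 0, -1, 0]
```

The last line shows the "loss": every estimated sample is off by one, while the kept samples come back exactly.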

With this picture of a signal digitized into a series of sample numbers, you can understand a bit how compression works. For audio, the samples tend to look like this: 12000, 12002, 12006, 12007, 12010, 12006, 12005.

As a practical matter, each sample number tends to be very close to the sample numbers that come just before and just after it in time. So one way to "compress" the audio, so it takes up less space, is to record just the changes -- for each sample, record how much it changed from the previous sample. So the above looks like 12000, +2, +4, +1, +3, -4, -1. These change numbers tend to be small, so it turns out they can be recorded more compactly (requiring fewer 0's and 1's). This is a nice example of digital compression -- recording the data in a way which takes up less space, but you can still recreate the original signal. In this case, the compression is lossless.
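
To get a feel for "fewer 0's and 1's", here is a back-of-the-envelope Python sketch. It assumes a deliberately simple storage scheme that is not taken from any real format: each raw sample uses a fixed 16 bits, and every change number uses the same small fixed width, just big enough for the largest change. The names SAMPLE_BITS, signed_bits and fixed_width_sizes are made up for this example.

```python
SAMPLE_BITS = 16  # assumed fixed width of one raw sample


def signed_bits(value):
    """Bits in the smallest two's-complement field that can hold value."""
    if value >= 0:
        return value.bit_length() + 1
    return (-value - 1).bit_length() + 1


def fixed_width_sizes(samples):
    """Return (raw bits, delta-coded bits, bits per delta); needs 2+ samples."""
    deltas = [cur - prev for prev, cur in zip(samples, samples[1:])]
    delta_width = max(signed_bits(d) for d in deltas)
    raw_bits = len(samples) * SAMPLE_BITS
    # One full-width starting sample, then one narrow field per change.
    delta_bits = SAMPLE_BITS + len(deltas) * delta_width
    return raw_bits, delta_bits, delta_width


samples = [12000, 12002, 12006, 12007, 12010, 12006, 12005]
print(fixed_width_sizes(samples))  # (112, 40, 4)
```

Under these assumptions the seven samples shrink from 112 bits to 40 bits. Real formats are smarter than this, but the reason they win is the same: small numbers need fewer bits.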

Having translated the audio into the digital domain -- a series of sample numbers -- we open the data up to all sorts of computer manipulations, since computers are cheap and effective at manipulating numbers. MP3 is another example of audio compression. MP3 is complicated, reducing the space required by 10x, and also it is lossy, so it discards little bits of the original signal in a way which the human auditory system tends not to notice.

JPEG Standard

JPEG is a free and open image format standard (wikipedia)
Printers, phones, computers, tablets, imaging software ... all interoperate via the JPEG standard
e.g. a phone takes a picture, sends it to a web page, and it is later sent to a printer
Each stage works, because its engineers refer to the JPEG standard
Typical standard success story
Compatibility between many players
No patents, no payments, no permission required
Standards are key to the Internet in the same way

Images - How Many Bytes?

What is the "raw" uncompressed size of an image?
flowers.jpg is 457 x 360
That's 164,520 pixels
Q: How many bytes per pixel?
A: 3! - one byte for each of red/green/blue
So total bytes is
164,520 * 3 = 493,560 bytes, i.e. about 493 KB
There are other formats with more bytes per pixel, but 3 is very common
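
The arithmetic above, as a small Python sketch. The function name raw_image_bytes is made up for this example, and the default of 3 bytes per pixel is the common red/green/blue case described above.

```python
def raw_image_bytes(width, height, bytes_per_pixel=3):
    """Uncompressed size: one group of bytes for every pixel."""
    return width * height * bytes_per_pixel


width, height = 457, 360
print(width * height)                  # 164520 pixels
print(raw_image_bytes(width, height))  # 493560 bytes
```

Passing a different bytes_per_pixel value covers the other formats, for example 4 for an image that also stores a transparency value per pixel.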

JPEG Images

JPEG is lossy compression
JPEG quality levels and compression q1-q100 (sometimes called q1-q10)
Thinking about compression vs. quality vs. size can be confusing
As one goes up, another goes down
quality q100 = highest quality, most bytes, least compression
quality q1 = lower quality (artifacts), fewest bytes, most compression
quality q70 is a typical camera setting (generally not documented)
q70 gives up some imperceptible detail but saves about 10x in bytes

JPEG is a free and open standard for storing digital images, such as you would take with a digital camera. JPEG is a "lossy" compression format, detailed below, allowing an image to be adjusted, losing some detail but requiring less space in the process.

JPEG is an incredibly successful standard, allowing computers, phones, printers, TVs, email, blogs, .. to exchange image files and understand each other. Some commonly understood standard format for what is "an image" is needed, and JPEG is the most widely used one.

JPEG stands for the Joint Photographic Experts Group, a technical committee which drafted the standard, originally in 1992. I doubt it was possible at the time to understand how widespread and critical this format would become.

JPEG is a "lossy" format, meaning that the level of detail preserved when a JPEG is saved is adjustable. Say the quality levels are in the range q10, q20, .. q100, with q100 corresponding to very little compression and high visual quality and q10 corresponding to very aggressive compression with lower visual quality. In reality the scale terminology is not exact across systems, sometimes described as 0-10, or 1-100. An image saved with q100 saves the maximum detail, but the resulting file takes up the most space. An image can be saved with a lower quality level, causing it to lose some detail, but take up less space. Or in other words, q10 is more compressed, and q100 is less compressed. JPEG is very smart about the way it loses detail, so saving at something like q70 is a normal thing to do without losing appreciable detail.

How many bytes does an image take up? The main issue is just plain size -- how many pixels. The 457 x 360 flowers image has 164520 pixels. Say each pixel takes three bytes (one for each color channel), that's 493560 bytes or about 493 KB.

Flowers JPEG Examples

Here are versions of the flowers.jpg image with different compression levels...

Here is the image as it originally came out of my camera. I believe the camera uses about q70 compression internally. This image takes up 48 KB. The "raw" form of the image takes up 493 KB, so q70 is saving us about 10x space. Basically, this shows JPEG works quite well: giving up tiny amounts of detail for a 10x space savings.
flowers at q70

Here is the image compressed at q50, taking up 29 KB, 60% of the size of the q70 version. I cannot see obvious differences between this version and the one above, although there must be some tiny differences.
flowers at q50

Here is the same image compressed very aggressively at q10, taking up 14KB, or about 29% of the size of the q70 version. Generally you don't want to compress this much. If you zoom way in, you can see the results of the compression in this version:
flowers at q10

There are two things you notice in JPEGs as they shed detail:

Block artifacts -- JPEG divides the image into 8x8 blocks. If the compression is very high, you can see the block boundaries. You can see this clearly in the upper-left flower if you zoom your browser in. What's most amazing is how these blocks are not noticeable if you glance at the image normally.
Edge artifacts/noise -- JPEG has a hard time with crisp edges between two colors. In a more-compressed version, little "noise" speckles or distortions can appear to either side of the hard edge. Look at the left edge of the flower which is halfway down vertically, or at the very upper left flower.
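
As a rough sense of scale for the 8x8 blocks, this Python sketch counts how many blocks it takes to tile the 457 x 360 flowers image. It assumes a partial block at the right or bottom edge counts as a whole block, and it ignores how the color channels are handled inside a real JPEG file. The function name blocks_needed is made up for this example.

```python
BLOCK = 8  # JPEG works on 8x8 pixel blocks


def blocks_needed(width, height, block=BLOCK):
    """Return (blocks across, blocks down, total), rounding partial blocks up."""
    across = -(-width // block)  # ceiling division
    down = -(-height // block)
    return across, down, across * down


print(blocks_needed(457, 360))  # (58, 45, 2610)
```

So the block artifacts you see when zoomed in are the seams between a couple of thousand little tiles.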

Considering the q10 version takes about 3.4x fewer bytes than the q70 version, JPEG does a good job keeping the basic look of the scene when asked to use less space.
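
The size comparisons for the three flowers files can be checked with a little arithmetic. This Python sketch uses only the file sizes quoted above (493 KB raw, then 48, 29 and 14 KB); the names are made up for this example.

```python
RAW_KB = 493
SIZES_KB = {"q70": 48, "q50": 29, "q10": 14}


def times_smaller_than_raw(size_kb):
    """How many times smaller a file is than the raw image."""
    return RAW_KB / size_kb


def percent_of_q70(size_kb):
    """File size as a percentage of the q70 version."""
    return 100 * size_kb / SIZES_KB["q70"]


for name, size_kb in SIZES_KB.items():
    print(name,
          round(times_smaller_than_raw(size_kb), 1), "x smaller than raw,",
          round(percent_of_q70(size_kb)), "% of q70")
```

This works out to roughly 10x, 17x and 35x smaller than raw, with the q50 and q10 files at about 60% and 29% of the q70 size.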

GIF and PNG Images

GIF and PNG (Portable Network Graphics) are "lossless" image formats, recording every pixel exactly. They are used for non-photographic images, like little solid color icons. GIF is older and used to be patented. PNG is newer and performs a little better. Most recently, a form of GIF has been used for short, no-audio video clips.

Audio Formats

MP3 Audio - lossy compression like JPEG
A patent means that the patent holder can require a fee to use the technology
MP3 is patented .. $1-$2 fee paid to patent holders
MP3 patents expire in 2017
Compresses (lossy) audio down to about 1 MB per minute
Other audio formats: AAC, Ogg Vorbis (a free and open standard)
Aside: Nerds have a tragic penchant for choosing pun names for things that make terrible product names. I suspect Ogg Vorbis would be more widespread if they had given it a more marketable name like ClearSound. ("LibreOffice" is maybe another example of this error.)

MP3 is the dominant audio format (good example of a "network effect", a later topic). MP3 is lossy, like JPEG. Raw CD audio takes up about 10 MB per minute (this is how it is stored on an audio CD .. no compression). MP3 gets that down to about 1 MB per minute while still sounding pretty good. As with JPEG, you can choose the level of compression, say 2 MB per minute to keep more detail, or 512 KB per minute if space is at a premium.
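
Where do "10 MB per minute" and "1 MB per minute" come from? Here is a Python sketch of the arithmetic. The CD figures are the standard ones (44,100 samples per second, 2 bytes per sample, 2 channels for stereo). The 128 kilobits per second MP3 rate is an assumption chosen for illustration, since the notes above do not name a bitrate.

```python
SAMPLE_RATE = 44_100   # samples per second, per channel (CD audio)
BYTES_PER_SAMPLE = 2   # 16-bit samples
CHANNELS = 2           # stereo


def cd_bytes_per_minute():
    """Uncompressed CD audio: every sample of both channels is stored."""
    return SAMPLE_RATE * BYTES_PER_SAMPLE * CHANNELS * 60


def mp3_bytes_per_minute(kilobits_per_second):
    """An MP3 at a constant bitrate; 8 bits per byte, 60 seconds per minute."""
    return kilobits_per_second * 1000 // 8 * 60


print(cd_bytes_per_minute())      # 10584000, about 10 MB
print(mp3_bytes_per_minute(128))  # 960000, about 1 MB (128 kbps is assumed)
```

Doubling the assumed bitrate to 256 gives about 2 MB per minute, which is the "keep more detail" end of the trade-off.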

MP3 is patented, and legitimately so. (Nick's opinion) Many modern software patents are ridiculous, just patenting obvious solutions. However, MP3 is legitimate: it uses complex and non-obvious techniques to get its excellent 10x compression while still sounding good. If a device or software plays or produces MP3s, a license fee is due to the patent holders, on the order of $1-$2 per instance. With each MP3 device you have owned over the years .. you have in effect paid this fee each time. Licensing of devices which can play video is similar.

Video Formats

Video is still images 20-60x a second + audio
Compressed video takes up roughly 30 MB/minute, or about 2 GB/hour
- vs. 1 MB/minute for audio
MPEG-2 used on DVDs (old)
MPEG-4/h.264 is ubiquitous, used by phones, cameras, Blu-ray discs
MPEG-4/h.264 is high quality and heavily patented
Like MP3, you have paid an MPEG-4/h.264 fee with each device purchased

A video is basically a series of images -- 20 to 60 per second, plus an audio "track". Video data takes up a lot of bytes, but computers have now become powerful enough to handle video. Very roughly speaking, say compressed video of about DVD quality takes about 2 GB per hour (roughly 30 MB per minute). In reality, there is a very large range of video sizes -- HD video takes more space, smaller YouTube video takes less space. Video compression is complicated and the techniques are heavily patented.
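
To see why video needs compression so badly, here is a rough Python estimate. The 30 MB per minute figure comes from the text above. The raw numbers are assumptions for illustration: a 720 x 480 frame (a common DVD resolution), 3 bytes per pixel, 30 frames per second, and no audio track. The function name is made up for this example.

```python
def raw_video_bytes_per_minute(width, height, fps, bytes_per_pixel=3):
    """Uncompressed video: a full raw image for every frame, audio ignored."""
    return width * height * bytes_per_pixel * fps * 60


COMPRESSED_MB_PER_MINUTE = 30  # rough figure from the text

# Compressed: 30 MB/minute for 60 minutes, in GB.
print(COMPRESSED_MB_PER_MINUTE * 60 / 1000)          # 1.8

# Raw: assumed 720 x 480 at 30 frames per second.
raw_mb = raw_video_bytes_per_minute(720, 480, 30) / 1_000_000
print(round(raw_mb))                                 # 1866 MB per minute
print(round(raw_mb / COMPRESSED_MB_PER_MINUTE))      # about 62x smaller
```

Under these assumptions, video compression is saving something like 60x, far more than the roughly 10x we saw for JPEG and MP3.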

MPEG (Moving Picture Experts Group) standardizes some video formats in the industry, and the MPEG-LA (Licensing Authority) handles collecting patent royalties, which are significant.

MPEG-2, originally released in 1995, is used in DVDs and some satellite TV systems. Compression techniques have gotten significantly better since then.

MPEG-4, and in particular the "h.264" compression system, is very good at producing good looking video with the minimum bytes. Most digital video cameras, phones, and Blu-ray discs use h.264 internally to compress and store the video data. Patent fees are paid by the manufacturer to produce an encoder or decoder in hardware or software.

h.264 Obnoxious Licensing Terms

One surprising thing about the h.264 licensing is that it does not come with an unrestricted right to distribute your own video. You have properly bought a video camera (paying for the patents), and produced your video and stored it on your hard drive. However, if you want to make a web site or whatever that distributes the video to many people, you may have to pay additional royalties for each minute of your video that is distributed. There may be exceptions if your video is distributed for free; however, these terms have changed over time, so really you have to consult a lawyer to see what you are permitted to do with your video. These license restrictions strike me as unusually obnoxious and certainly out of step with how you usually think of having your own data file. My guess is that MPEG did not have a lot of competition, resulting in these one-sided terms.

Open Video Format - WebM

In support of an open internet, Mozilla, Google and others have been working on a free and open "WebM" video compression scheme to compete with h.264 -- free to encode or decode video, and free to distribute the video however its owner wishes. Those are the sorts of terms under which the internet has thrived. See the WebM project.

Firefox, Chrome, and Microsoft Edge support WebM. Apple is the holdout. (Heh, maybe Apple is doing so well financially, they feel they need to behave obnoxiously. Actually I suspect Apple wants complex patent schemes to inhibit competition. Not a strategy to be proud of.)

If you were working on a project that took in video and re-distributed it ... you could run afoul of h.264 licensing terms. That's why WebM exists. Wikipedia uses WebM for video. YouTube supports WebM.