Analog-3 - Digital Media

JPEG Images

JPEG is a free and open standard for storing digital images, such as you would take with a digital camera. JPEG is a "lossy" compression format, detailed below, allowing an image to be adjusted, losing some detail but requiring less space in the process.

JPEG is an incredibly successful standard, allowing computers, phones, printers, TVs, email, blogs, .. to exchange image files and understand each other. Some commonly understood standard format for what is "an image" is needed, and JPEG is the most widely used one.

JPEG stands for the Joint Photographic Experts Group, a technical committee which drafted the standard, originally in 1992. I doubt it was possible at the time to understand how widespread and critical this format would become.

JPEG is a "lossy" format, meaning that the level of detail preserved when a JPEG is saved is adjustable. Say the quality levels are in the range q10, q20, .. q100, with q100 corresponding to very little compression and high visual quality, and q10 corresponding to very aggressive compression with lower visual quality. In reality the scale terminology is not exact across systems, sometimes described as 0-10, or 1-100. An image saved with q100 saves the maximum detail, but the resulting file takes up the most space. An image can be saved with a lower quality level, causing it to lose some detail, but take up less space. Or in other words, q10 is more compressed, and q100 is less compressed. JPEG is very smart about the way it loses detail, so saving at something like q70 is a normal thing to do without losing appreciable detail.

Here are three versions of the flowers.jpg image

Here is the image as it originally came out of my camera. I believe the camera uses about q70 internally. This image takes up 48 KB. The "raw" form of the image takes up about 490 KB, so q70 is saving us about 10x space. Basically, this shows JPEG works quite well for a 10x space savings.
flowers at q70

Here is the image saved at q50, taking up 29 KB. I cannot see obvious differences between this version and the one above, although there must be some tiny differences.
flowers at q50

Here is the same image, saved at q10, taking up 14 KB (roughly 3.4x fewer bytes than the 48 KB q70 version). Generally you don't want to compress this much. If you zoom way in, you can see the results of the compression in this version:
flowers at q10

There are two things you notice in JPEGs as they shed detail:

Considering that the q10 version takes roughly 3.4x fewer bytes than the q70 version (14 KB vs. 48 KB), JPEG does a good job keeping the basic look of the scene when asked to use less space.
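To make the savings concrete, here is a small Python sketch that works out the ratios from the file sizes quoted in this section (490 KB raw, then 48, 29 and 14 KB). The names `RAW_KB`, `JPEG_KB` and `savings` are just made up for this illustration.

```python
# File sizes quoted in this section, in KB.
RAW_KB = 490
JPEG_KB = {"q70": 48, "q50": 29, "q10": 14}


def savings(bigger_kb, smaller_kb):
    """Return how many times smaller the second size is than the first."""
    return bigger_kb / smaller_kb


for quality, size_kb in JPEG_KB.items():
    vs_raw = savings(RAW_KB, size_kb)
    vs_q70 = savings(JPEG_KB["q70"], size_kb)
    print(f"{quality}: {size_kb} KB, {vs_raw:.1f}x smaller than raw, "
          f"{vs_q70:.1f}x smaller than q70")
```

The q70 line comes out at about 10x smaller than raw, matching the figure above.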

Images - How Many Bytes?

How many bytes does an image take up? The main issue is just plain size -- how many pixels. The 457 x 360 flowers image has 164520 pixels. If each pixel takes three bytes (one for each color channel), that's 493560 bytes, or about 493 KB.

Note that the q70 flowers image up there takes up 48 KB (10x smaller than 493 KB), so even at relatively high quality levels, JPEG saves a huge amount of space -- that's why cameras use it internally.

My cheap little "12 megapixel" digital camera takes images which are 4000 by 3000 pixels, stored with JPEG compression of about q70. Each resulting image takes up about 3 MB. Without JPEG each image would take up: 12 million pixels x 3-byte-per-pixel = 36 MB.
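The arithmetic in this section can be sketched in a few lines of Python. The function name `raw_image_bytes` is made up for this example, and it assumes 1 KB = 1000 bytes and 1 MB = 1,000,000 bytes, which is how the numbers above are rounded.

```python
def raw_image_bytes(width, height, bytes_per_pixel=3):
    """Uncompressed size: one byte per color channel for each pixel."""
    return width * height * bytes_per_pixel


# The two images discussed in this section.
flowers_raw = raw_image_bytes(457, 360)    # 493560 bytes
camera_raw = raw_image_bytes(4000, 3000)   # 36000000 bytes

# JPEG sizes quoted in the text, converted to bytes (assuming 1 KB = 1000 bytes).
flowers_jpeg = 48 * 1000
camera_jpeg = 3 * 1000 * 1000

print("flowers:", flowers_raw, "bytes raw,",
      round(flowers_raw / flowers_jpeg, 1), "x larger than the JPEG")
print("camera: ", camera_raw, "bytes raw,",
      round(camera_raw / camera_jpeg, 1), "x larger than the JPEG")
```

Both images land in the same ballpark: the raw form is roughly 10x to 12x bigger than the q70 JPEG.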

GIF and PNG Images

GIF and PNG (Portable Network Graphics) are "lossless" image formats, recording every pixel exactly. They are used for non-photographic images, like little solid color icons. GIF is older and used to be patented. PNG is newer and performs a little better. Most recently, a form of GIF has been used for short, no-audio video clips.
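Here is a rough Python sketch of what "lossless" means. PNG's compression is built on the DEFLATE algorithm, which Python's standard `zlib` module also implements, so `zlib` stands in for it here. This is not a PNG encoder (a real PNG adds filtering and file headers), and the 32 x 32 solid-color "icon" is a made-up example.

```python
import zlib

# A hypothetical 32 x 32 icon filled with one solid color (RGB, 3 bytes per pixel).
WIDTH, HEIGHT = 32, 32
orange = bytes([255, 128, 0])
raw_pixels = orange * (WIDTH * HEIGHT)

# Compress, then decompress again.
compressed = zlib.compress(raw_pixels)
restored = zlib.decompress(compressed)

print(len(raw_pixels), "bytes raw")
print(len(compressed), "bytes compressed")
print("identical after round trip:", restored == raw_pixels)
```

The restored bytes match the original exactly, pixel for pixel, and the solid-color data shrinks a lot. That is why lossless formats suit icons and flat graphics better than photographs.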

Audio Formats

MP3 is the dominant audio format (good example of a "network effect", a later topic). MP3 is lossy, like JPEG. Raw CD audio takes up about 10 MB per minute (this is how it is stored on an audio CD .. no compression). MP3 gets that down to about 1 MB per minute while still sounding pretty good. As with JPEG, you can choose the level of compression, say 2 MB per minute to keep more detail, or 512 KB per minute if space is at a premium.
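The "10 MB per minute" figure can be checked with a little arithmetic. The sketch below uses the standard CD audio parameters (44,100 samples per second, 16 bits per sample, stereo), which are not stated in the text above. The 128 kilobits-per-second MP3 bit rate is an assumption, picked as a common setting that lands near 1 MB per minute.

```python
# CD audio parameters (standard values, not stated in the text above).
SAMPLES_PER_SECOND = 44_100
BYTES_PER_SAMPLE = 2   # 16 bits
CHANNELS = 2           # stereo


def cd_bytes_per_minute():
    """Raw, uncompressed CD audio for one minute."""
    return SAMPLES_PER_SECOND * BYTES_PER_SAMPLE * CHANNELS * 60


def mp3_bytes_per_minute(kilobits_per_second):
    """Bit rate is in thousands of bits per second; 8 bits per byte."""
    return kilobits_per_second * 1000 // 8 * 60


cd = cd_bytes_per_minute()        # 10584000 bytes, about 10 MB
mp3 = mp3_bytes_per_minute(128)   # 960000 bytes, about 1 MB (assumed bit rate)
print(cd, mp3, round(cd / mp3, 1))
```

Doubling the assumed bit rate to 256 gives roughly the "2 MB per minute" higher-quality setting mentioned above.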

MP3 is patented, and legitimately so. (Nick's opinion) Many modern software patents are ridiculous, just patenting obvious solutions. However, MP3 is legitimate: it uses complex and non-obvious techniques to get its excellent 10x compression while still sounding good. If a device or software plays or produces MP3s, a license fee is due to the patent holders, on the order of $1-$2 per instance. With each MP3 device you have owned over the years .. you have in effect paid this fee each time. Licensing of devices which can play video is similar.

AAC is a more modern form of MP3 used by Apple with their dominant iPod/iTunes system. AAC files can also carry DRM (Digital Rights Management) restrictions, which allow the original content owner to control how the purchaser can use the content they have "bought".

Video Formats

A video is basically a series of images -- 20 to 60 per second, plus an audio "track". Video data takes up a lot of bytes, but computers have now become powerful enough to handle video. Very roughly speaking, say compressed video of about DVD quality takes about 2 GB per hour (roughly 30 MB per minute, or 0.5 MB per second). In reality, there is a very large range of video sizes -- HD video takes more space, smaller YouTube video takes less space. Video compression is complicated and the techniques are heavily patented.
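Here is the same back-of-the-envelope arithmetic in Python. The 2 GB per hour figure is from the paragraph above. The uncompressed comparison assumes a DVD-sized frame of 720 x 480 pixels, 3 bytes per pixel, and 30 frames per second, which are illustrative values not taken from the text, and it ignores the audio track.

```python
BYTES_PER_HOUR = 2 * 1000**3   # "about 2 GB per hour" of compressed video

per_minute_mb = BYTES_PER_HOUR / 60 / 1000**2
per_second_mb = BYTES_PER_HOUR / 3600 / 1000**2

# Assumed raw video: 720 x 480 pixels, 3 bytes per pixel, 30 frames per second.
raw_per_second_mb = 720 * 480 * 3 * 30 / 1000**2

print("compressed:", round(per_minute_mb, 1), "MB per minute,",
      round(per_second_mb, 2), "MB per second")
print("raw (assumed frame size):", round(raw_per_second_mb, 1), "MB per second")
print("compression ratio: about", round(raw_per_second_mb / per_second_mb), "x")
```

Under these assumptions, video compression squeezes much harder than the roughly 10x we saw for JPEG and MP3. Part of the reason is that neighboring frames in a video tend to look very similar.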

MPEG (Moving Picture Experts Group) standardizes some video formats in the industry, and the MPEG-LA (Licensing Authority) handles collecting patent royalties, which are significant.

MPEG-2 is used in DVDs and some satellite TV systems, originally released in 1995. Compression techniques have gotten significantly better since then.

MPEG-4, and in particular the "h.264" compression system, is very good at producing good looking video with the minimum bytes. Most digital video cameras, phones, and Blu-ray discs use h.264 internally to compress and store the video data. Patent fees are paid by the manufacturer to produce an encoder or decoder in hardware or software.

h.264 Obnoxious Licensing Terms

One surprising thing about the h.264 licensing is that it does not come with an unrestricted right to distribute your own video. Say you have properly bought a video camera (paying for the patents), produced your video, and stored it on your hard drive. However, if you want to make a web site or whatever that distributes the video to many people, you may have to pay additional royalties for each minute of your video that is distributed. There may be exceptions if your video is distributed for free, but these terms have changed over time, so really you have to consult a lawyer to see what you are permitted to do with your video. These license restrictions strike me as unusually obnoxious and certainly out of step with how you usually think of having your own data file. My guess is that MPEG did not have a lot of competition, resulting in these one-sided terms.

Open Video Format - WebM

In support of an open internet, Mozilla, Google and others have been working on a free and open "WebM" video compression scheme to compete with h.264 -- free to encode or decode video, and free to distribute the video however its owner wishes. Those are the sorts of terms under which the internet has thrived. See the WebM project.

Bottom line: if you convert a video to .webm format, then you don't have to worry about the licensing for yourself or your audience. Chrome, Firefox, and Microsoft Edge support it out of the box. Apple is the holdout, and should be embarrassed to be so clearly on the wrong side of internet history. (Aside: as a company gets more powerful, does this tend to make it less customer friendly? Maybe this is an example.)

Old Way: Flash Video

HTML5 Video Tag

Here's our earlier hard-drive example video embedded with a <video> tag as follows (the video is kind of big). Note how similar this is to putting in an image tag.
<video src="how-a-hard-drive-works.webm" controls width="750"></video>

The video is not hosted by YouTube or something, it's a direct part of the page. This is the better way to do it.