Today in Tedium: I fully admit it—I stretch images. I also intentionally wash images out, remove as many colors as possible, and save the images in formats that actively degrade the final result. This is a crime against imagery on the internet, an active disregard for the integrity of the original picture, but I kind of see it as having some artistic advantages. Degradation, you see, is a tenet of the modern internet, something images have to endure to flow through the wires more quickly. Gradually, the wires got fast enough that nearly any still image could be delivered through them in a reasonable amount of time. But the artifacts still matter. The degradation still matters. The JPEG was the puzzle piece that made the visual internet work. With that in mind, today’s Tedium considers how the JPEG came to life. — Ernie @ Tedium
Today’s GIF comes from a Computer Chronicles episode on file compression. Enjoy, nerds.
The GIF was a de facto standard. The JPEG was an actual one
I always thought it disappointing that the one time Steve Wilhite truly addressed his audience of admirers in the modern day, he attempted to explain how the file format he invented was pronounced. And it didn’t go over particularly well.
I remember it well. Back in 2013, he claimed it was pronounced with a soft G, like the brand of peanut butter. I posted the quote on ShortFormBlog, and it got nearly 5,000 “notes” on Tumblr. Many commenters felt steamed that this random guy had emerged after a quarter-century to tell them how their word was supposed to be pronounced. I’m convinced this post unwittingly turned the tide against Wilhite on the GIF’s favorite platform, despite the fact that I personally agreed with him.
The Frogman, a key innovator of the animated GIF form, put it as such: “It’s like someone trying to tell you ‘Sun’ is actually pronounced wombatnards.”
But in many ways, the situation illustrates how Wilhite, who died in 2022, did not develop his format by committee. He could say it sounded like “JIF” because he literally built it himself. It was not the creation of a huge group of people from different parts of the corporate world. He was handed the project as a CompuServe employee in 1987. He produced the object, and that was that. The initial document describing how it works? Dead simple. 37 years later, we’re still using the GIF.
The JPEG, which formally emerged about five years later, was very much not that situation. Far from it, in fact—it’s the difference between a de facto standard and an actual one.
Built with input from dozens of stakeholders, the goal of the Joint Photographic Experts Group was ultimately to create a format that fit everyone’s needs. And when the format was finally unleashed on the world, it was the subject of a 600-plus-page book.
And that book, not going to lie, has a killer cover:
JPEG: Still Image Data Compression Standard, written by IBM employees and JPEG organization stakeholders William B. Pennebaker and Joan L. Mitchell, describes a landscape of multimedia imagery held back by the lack of a way to balance the need for photorealistic images with immediacy:
JPEG now stands at the threshold of widespread use in diverse applications. Many new technologies are converging to help make this happen. High-quality continuous-tone color displays are now a part of most personal computing systems. Most of these systems measure their storage in megabytes, and the processing power at the desk is approaching that of mainframes of just a few years ago. Communication over telephone lines is now routinely at 9,600 baud, and with each year modem capabilities improve. LANs are now in widespread use. CD-ROM and other mass-storage devices are opening up the era of electronic books. Multimedia applications promise to use vast numbers of images and digital cameras are already commercially available.
These technology trends are opening up both a capability and a need for digital continuous-tone color images. However, until JPEG compression came upon the scene, the massive storage requirement for large numbers of high-quality images was a technical impediment to widespread use of images. The problem was not so much the lack of algorithms for image compression (as there is a long history of technical work in this area), but, rather, the lack of a standard algorithm—one which would allow an interchange of images between diverse applications. JPEG has provided a high-quality yet very practical and simple solution to this problem.
And honestly, they were absolutely right. For more than 30 years, JPEG has made high-quality, high-resolution photography accessible in operating systems far and wide. Although we no longer need to compress every JPEG file to within an inch of its life, having that capability helped enable the modern internet.
(The book, which tries to explain the way JPEG works both for the layperson and through in-depth mathematical equations, is on the Internet Archive for one-hour checkout, by the way, but its layout is completely messed up, sadly.)
As the book notes, Mitchell and Pennebaker were given IBM’s support to follow through on this research and work with the JPEG committee, and that support led them to develop many of the JPEG format’s foundational patents. One of the first patents filed by Mitchell and Pennebaker around image compression, filed in 1988 and granted in 1990, described an “apparatus and method for compressing and de-compressing binary decision data by arithmetic coding and decoding wherein the estimated probability Qe of the less probable of the two decision events, or outcomes, adapts as decisions are successively encoded.” Another, also tied to Pennebaker and Mitchell, described an “apparatus and method for adapting the estimated probability of either the less likely or more likely outcome (event) of a binary decision in a sequence of binary decisions involves the updating of the estimated probability in response to the renormalization of an augend A.”
That likely reads like gibberish to you, but essentially, IBM and other members of the JPEG standards committee, such as AT&T and Canon, were developing ways to use compression to make high-quality images easier to deliver in confined settings.
Each brought their own needs to the process. Canon, obviously, was more focused on printers and photography, while AT&T’s interests were tied to data transmission. Together, the companies left behind a standard that has more than stood the test of time.
All this means, funnily enough, that the first place a program capable of using JPEG compression appeared was not Mac OS or Windows, but OS/2, which supported the underlying technology of JPEG as early as 1990 through the OS/2 Image Support application. (The feature went under the radar, announced merely as “Image compression and decompression capability for color and gray images in addition to bilevel images,” but Pennebaker and Mitchell make clear in their book that this coding appeared in OS/2 Image Support first.)
Hearing that there was a “first application” associated with JPEG brought me down a rabbit hole. I did a long search for this application yesterday, trying to find as much info as possible about it. My process involved setting up an OS/2 VM and a modern web browser, so I could run any OS/2 applications related to this.
It was all for naught, though it did lead to an entertaining Mastodon thread. Unfortunately, what I thought would bring me a step closer to the application instead led me to a text file describing it.
Any IBM employees with a copy of OS/2 Image Support lying around? You’re holding the starting point of modern-day computerized photography.
“The purpose of image compression is to represent images with less data in order to save storage cost or transmission time and costs. Obviously, the less data required to represent the image, the better, provided there is no penalty in obtaining a greater reduction. However, the most effective compression is achieved by approximating the original image (rather than reproducing it exactly), and the greater the compression, the more approximate (‘lossy’) the rendition is likely to be.”
— A description of the goals of the JPEG format, according to JPEG: Still Image Data Compression Standard. In many ways, the JPEG was intended to be a format that could be perfect when it needed to be, but good enough when the circumstances didn’t allow for perfection.
What a JPEG does when you heavily compress it
The thing that differentiates a JPEG file from a PNG or a GIF is the nature of its compression. The goal for a JPEG image is to still look like a photo when all is said and done, even if some compression is necessary to make it all work at a reasonable size. The idea is to make it so that you can display something that looks close to the original image in fewer bytes.
Central to this is a compression process called discrete cosine transform (DCT), a lossy form of compression encoding heavily used in all sorts of compressed formats, most notably in digital audio and signal processing. Essentially, it delivers a lower-quality product by discarding the finest, highest-frequency details, while still keeping the heart of the original product through approximation. The more aggressively the DCT coefficients are cut down, the more compressed the final result.
The algorithm, developed by researchers Nasir Ahmed, T. Natarajan, and K. R. Rao in the 1970s, essentially takes a grid of data and treats it as if you’re controlling its frequency with a knob. The data comes out like a faucet, or like a volume control. The more data you want, the higher the setting. DCT allows a trickle of data to still come out even in highly constrained situations, even if it means a somewhat compromised result. In other words, you may not keep all the data when you compress it, but DCT lets you keep the heart of it.
That explanation is dumbed down significantly, because we are not a technical publication. However, if you want a more technical but still somewhat easy-to-follow description of DCT, I recommend this clip from Computerphile, featuring computer imaging researcher Mike Pound, who uses the wales on the jumper he’s wearing to break down how the cosine transform functions.
DCT is everywhere. If you have ever seen a streaming video or an online radio stream that degraded in quality because your bandwidth suddenly declined, you are witnessing DCT being utilized in real time.
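Here’s what that knob looks like in code. This is a minimal sketch in Python with NumPy, not anything from an actual encoder: it runs a made-up 8x8 block through the 2D DCT, keeps only a corner of low-frequency coefficients, and transforms back. Real JPEG encoders quantize coefficients against a table rather than hard-truncating them, but the effect is the same.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II basis: row k is the k-th cosine basis vector."""
    k = np.arange(n).reshape(-1, 1)
    x = np.arange(n).reshape(1, -1)
    basis = np.cos(np.pi * (2 * x + 1) * k / (2 * n)) * np.sqrt(2 / n)
    basis[0, :] /= np.sqrt(2)  # the DC row gets the 1/sqrt(2) normalization
    return basis

D = dct_matrix(8)

# A made-up 8x8 block of pixel values: a smooth diagonal gradient, the kind
# of low-activity region JPEG compresses extremely well.
block = np.add.outer(np.arange(8.0), np.arange(8.0)) * 16

# Forward 2D DCT: the block's energy piles up in the top-left
# (low-frequency) corner of the coefficient grid.
coeffs = D @ block @ D.T

# The "knob": keep only a keep x keep corner of coefficients, zero the rest.
keep = 3
mask = np.zeros_like(coeffs)
mask[:keep, :keep] = 1

# Inverse 2D DCT of the truncated coefficients gives the approximation.
approx = D.T @ (coeffs * mask) @ D

print(f"kept {keep * keep} of 64 coefficients, "
      f"max pixel error: {np.abs(block - approx).max():.2f}")
```

Turn `keep` down to 1 and you get a single flat average; turn it up to 8 and the block comes back exactly. That is the trade the DCT lets an encoder make, coefficient by coefficient.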
A JPEG file doesn’t have to leverage the DCT in just one way, as JPEG: Still Image Data Compression Standard explains:
The JPEG standard describes a family of large image compression techniques, rather than a single compression technique. It provides a “tool kit” of compression techniques from which applications can select elements that satisfy their particular requirements.
The toolkit has four modes, which work in these ways:
- Sequential DCT, which displays the compressed image in order, like a window shade slowly being rolled down
- Progressive DCT, which displays the full image in the lowest-resolution format, then adds detail as more information rolls in
- Sequential lossless, which uses the window shade format but doesn’t compress the image
- Hierarchical mode, which combines the prior three modes—so maybe it starts with a progressive mode, then loads DCT compression slowly, but then reaches a lossless final result
At the time the JPEG was being created, modems were extremely common, and that meant images loaded slowly, making Progressive DCT the most fitting format for the early internet. Over time, the progressive DCT mode has become less common, as many computers can simply load the sequential DCT in one fell swoop.
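If you want to compare the two DCT modes yourself, Pillow, the everyday Python imaging library, can write both. A minimal sketch, with placeholder file names, assuming a Pillow install:

```python
from PIL import Image  # pip install Pillow

img = Image.open("photo.png").convert("RGB")  # placeholder input image

# Sequential (baseline) DCT: the default, decoded top to bottom
# like a window shade being rolled down.
img.save("sequential.jpg", quality=85)

# Progressive DCT: a coarse full-frame image arrives first,
# then sharpens as later scans are decoded.
img.save("progressive.jpg", quality=85, progressive=True)
```

(The lossless and hierarchical modes never saw much everyday adoption, and as far as I can tell, common tooling like Pillow doesn’t expose them.)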
When an image is compressed with DCT, the compression tends to be less noticeable in areas of the image where there’s a lot of activity going on; those busy areas are harder to compress, which means they keep their integrity longer. It tends to be more noticeable with solid colors or in areas where the image sharply changes from one color to another—you know, like text on a page. (That’s why, if you have a picture of text, you shouldn’t share it as a JPG unless it is high resolution or you can live with the degradation.)
Other formats, like PNG, do better with text, because their compression is lossless. (Notably, PNG’s compression format, DEFLATE, was designed by Phil Katz, who also created the ZIP format. The PNG format uses it in part because it was a license-free compression format. So it turns out the brilliant coder with the sad life story improved the internet in more ways than one before his untimely passing. How is there not a dramatic movie about Phil Katz?)
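You can get a feel for why DEFLATE thrives on flat colors and crisp edges with Python’s built-in zlib module, which implements the same algorithm. A quick sketch with made-up pixel rows:

```python
import random
import zlib

# One row of a flat-color image: 1,000 identical byte-valued "pixels".
flat_row = bytes([200]) * 1000

# One row of noisy, photo-like data, with no repetition to exploit.
random.seed(42)
noisy_row = bytes(random.randrange(256) for _ in range(1000))

flat_c = zlib.compress(flat_row)
noisy_c = zlib.compress(noisy_row)

print(len(flat_c), "bytes for the flat row")    # a handful of bytes
print(len(noisy_c), "bytes for the noisy row")  # barely smaller than the input

# And it's lossless: the round trip returns the exact original bytes.
assert zlib.decompress(flat_c) == flat_row
assert zlib.decompress(noisy_c) == noisy_row
```

JPEG makes the opposite bet: it happily mangles exact pixel values to shrink busy photographic regions, which is precisely where DEFLATE gives up.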
In many ways, the JPEG is one tool in our image-making toolkit. Despite its age and maturity, it remains one of our best options for sharing photos on the internet. But it is not a tool for every setting—despite the fact that, like a wrench sometimes used as a hammer, we often leverage it that way.
NO
The answer to the question, “Did NCSA Mosaic initially support inline JPEG files?” It’s surprising today, given the absolute ubiquity of the JPG format, but the browser that started the visual internet did not initially support JPG files without the use of an external reader. (It supported inline GIF files, however, along with the largely forgotten XBitMap format.) Support came in 1995, but by that point, Netscape Navigator had come out—explicitly promoting inline JPEG support as a marquee feature.
How a so-called patent troll was able to make bank off the JPEG in the early 2000s
If you’re a patent holder, the best kind of patent to hold is one that has been largely forgotten about, but is the linchpin of a common piece of technology already used by millions of people.
This is arguably what happened in 1986, when Compression Labs employees Wen-Hsiung Chen and Daniel J. Klenke filed what became U.S. patent 4,698,672, “Coding system for reducing redundancy,” which dealt with a way to improve signal processing for motion graphics, so they took up less space in distribution. This arguably overlapped with what the JPEG format was doing. They had created a ticking time bomb for the computer industry. Someone just needed to find it.
And find it they did. In 1997, a company named Forgent Networks acquired Compression Labs, and in 2002, Forgent claimed this patent effectively gave them partial ownership of the JPEG format in various settings, including digital cameras. They started filing patent lawsuits—and winning, big.
"The patent, in some respects, is a lottery ticket," Forgent Chief Financial Officer Jay Peterson told CNET in 2005. "If you told me five years ago that 'You have the patent for JPEG,' I wouldn't have believed it."
Now, if this situation sounds familiar, it’s because a better-known company, Unisys, had done the exact same thing nearly a decade prior, except with the GIF format. The company began threatening CompuServe and others at a time when the GIF was the internet’s favorite file format. Unisys apparently had no qualms about being unpopular with internet users of the era, and charged website owners $5,000 to use GIFs. Admittedly, that company had a more cut-and-dried case for doing so, as the firm directly owned the Lempel–Ziv–Welch (LZW) compression format that GIFs used, which was created by employees of its predecessor company, Sperry. (This led to the creation of the patent-free PNG format in 1995.)
But Forgent, despite having a far more tenuous claim on its rights ownership to the JPEG compression algorithm, was nonetheless much more successful in drawing money from patent lawsuits against JPEG users, earning more than $100 million from digital camera makers during the early 2000s before the patent finally ran out of steam around 2007. The company also attempted to convince PC makers to give them a billion dollars, before being talked down to a mere $8 million.
As Forgent tried to squeeze cash from an old patent, its claims grew increasingly controversial. Eventually, the patent was narrowed in scope to only motion-based uses, i.e., video. On top of that, evidence of prior art was uncovered because patent-troll critics were understandably pissed off when Forgent started suing in 2004.
(The company tried expanding its patent-trolly horizons during this period. It began threatening DVR makers over a separate patent that described recording TV shows to a computer.)
Forgent Networks no longer exists under that name. In 2007, just as the compression patent expired, the company renamed itself Asure Software, which specializes in payroll and HR solutions. It used its patent money to get out of the patent-trolling game, which I guess is somewhat noble.
200M
The estimated number of images that the Library of Congress has inventoried in JPEG 2000 format, a successor standard to the JPEG first released in 2001. The flexible update of the original JPEG format added better compression performance, but required more computational power. The original JPEG format is much more popular, but JPEG 2000 has found success in numerous niches.
The JPEG file format has served us well, and it’s been difficult to remove it from its perch. The JPEG 2000 format, for example, was intended to supplant it by offering more lossless options and better performance. However, it ended up less an end-user format than a specialized one.
JP2s are harder to find on the open web—one of the few places online that I see them happens to be the Internet Archive. (Which means the Internet Archive served images from that JPEG book in JP2 format.)
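If you want to produce a JP2 yourself, Pillow can write them, assuming it was built with OpenJPEG support. A minimal sketch, with placeholder file names; the save options below are my reading of Pillow’s JPEG 2000 parameters:

```python
from PIL import Image  # needs Pillow built with OpenJPEG for JPEG 2000

img = Image.open("photo.png").convert("RGB")  # placeholder input

# Classic JPEG: always lossy.
img.save("photo.jpg", quality=85)

# JPEG 2000 with the reversible wavelet transform: fully lossless.
img.save("photo_lossless.jp2", irreversible=False)

# Lossy JPEG 2000, targeting a 20:1 compression ratio.
img.save("photo_lossy.jp2", quality_mode="rates", quality_layers=[20])
```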
Other image technologies have had somewhat more luck getting past the JPG format. The Google-supported WebP format has proven popular with website developers (if controversial for the folks actually saving images), and the AVIF and HEIC formats, each developed by standards bodies, have largely outpaced both JPEG and JPEG 2000.
The JPEG will be difficult to kill at this juncture. These days, the format is in the same position as the MP3 file or the ZIP format—two legacy formats too popular to kill. Formats that compress better and work more efficiently are out there, but it’s difficult to topple a format with a 30-year head start.
Shaking off the JPG is easier said than done. I think most people will be fine keeping it around.