I first discovered my passion for computers on the day my dad brought home our first family computer nearly 20 years ago. While it would be a few more years before that passion turned into an obsession bordering on insanity, the desire to create things out of intangible bits and bytes was irresistible. Little did I know, this same instinct was one that fueled the personal computing revolution that had already been raging for the past decade (and more). In those previous ten years, the world went from the first laptops and the World Wide Web to pocket computers and Wi-Fi, and while Microsoft and Apple duked it out over whose feature-packed operating system was better, there was a third contender that was built for—and by—computing hobbyists with that same drive to create. It was years ahead of the curve.
Not Just a Hobby
On August 25th, 1991, Linus Torvalds changed the world as we knew it when he announced the project that would become the Linux operating system. In a short message to the comp.os.minix newsgroup on Usenet, Torvalds described his then-unnamed hobby project, which he was developing for Intel 386 computers. It was inspired by the MINIX operating system, but with a more generous license and better, if initially limited, functionality.
Less than a month later, version 0.01 of the Linux kernel was uploaded to an FTP server on the Finnish University and Research Network for the world to use. Over the next two years, it became one of the first operating systems to be distributed primarily in digital form, but where it excelled in functionality and distribution, it severely lacked in usability.
Distro Fever
A hobbyist operating system through and through, Linux was considered incredibly difficult for novices to start using. So much so, in fact, that the first “user-friendly” Linux distribution (commonly referred to as a “distro”) still required the user to edit the master boot record with a hex editor just to get it to boot from a hard drive. Out of these difficulties sprang a new set of distros aimed specifically at making Linux as user-friendly as possible. From 1993 to 1995, projects like Slackware, Debian, and Red Hat were launched with a multitude of useful features, such as graphical user interfaces, advanced networking capabilities, and even package managers.
Enter the Package Manager
Patch distribution in Linux differs greatly from that of its proprietary counterparts. Thanks to the open source revolution, Linux-based patches can be applied directly from the patch’s original source code, which has guided the design and implementation of its package management systems over the years. While there are countless package management solutions available for Linux today (another consequence of the open source ecosystem), many of them have been built on the patterns established by only a handful of package managers created near the dawn of the Linux era.
Package Management System (PMS), originally released with the BOGUS Linux distro in 1994, is generally regarded as the first package manager ever released for the Linux operating system. While the proprietary operating systems distributed by Apple and Microsoft were updated using closed-source methods that were specific to the target environment, the open source nature of the Linux operating system allowed PMS to compile packages and operating system patches directly from the original unmodified source code. This approach became known as “pristine sources”: any changes required for a package to be built by PMS were patched in at compile time, which allowed the original developers to release a single “pristine” version of their source code.
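To make the “pristine sources” idea a little more concrete, here is a minimal sketch in Python. It is not how PMS itself worked internally; the Patch class, the apply_patches function, and the sample source text are all hypothetical names invented for illustration, and a patch is simplified to a single exact-text substitution rather than a real diff. The point is only that the upstream source stays untouched while distro-specific changes are layered on at build time.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Patch:
    """Hypothetical, simplified patch: swap one exact snippet for another."""
    description: str
    old: str
    new: str


def apply_patches(pristine_source: str, patches: Sequence[Patch]) -> str:
    """Return a patched copy of the source; the pristine text is never modified."""
    patched = pristine_source
    for patch in patches:
        if patch.old not in patched:
            # Refuse to build if a patch no longer matches the upstream source.
            raise ValueError(f"patch does not apply: {patch.description}")
        patched = patched.replace(patch.old, patch.new, 1)
    return patched


# The upstream developer ships exactly one version of the source (assumed example).
PRISTINE = 'INSTALL_DIR = "/usr/local"\nprint("hello")\n'

# The distro keeps its changes separately and applies them at build time.
DISTRO_PATCHES = [
    Patch("use the distro install prefix",
          'INSTALL_DIR = "/usr/local"',
          'INSTALL_DIR = "/usr"'),
]

build_input = apply_patches(PRISTINE, DISTRO_PATCHES)
```

Because the patches live apart from the source, a new upstream release only requires checking that each patch still applies, rather than redoing the changes by hand.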
At about the same time PMS was created, Red Hat was working on its own bundled package manager, Red Hat Software Program Packages (RPP). While it was a great first effort, its reign was short-lived, having only been distributed with Red Hat Linux 1.0. Unlike PMS, RPP didn’t rely on pristine package sources, which meant that every package managed by RPP had to be modified specifically for RPP. This introduced a lot of complexity for developers who intended to distribute software via RPP, as it required them to make changes directly to their source code in order for it to build properly. The complexity of using RPP was further compounded by its inability to build packages for multiple architectures, which put Red Hat at a disadvantage as it began exploring the possibility of releasing its Linux distribution for other architectures like the DEC Alpha.
Just a year after the release of Red Hat Linux 1.0, Red Hat Package Manager, more commonly known as RPM, became RPP’s Perl-based successor. Released in 1995 with Red Hat Linux 2.0, and drawing upon the lessons learned from both PMS and RPP, RPM was originally created with the goal of allowing the team at Red Hat to build new versions of their flagship operating system without having to change the source code of any dependent components. While BOGUS Linux, and by extension PMS, were short-lived projects, the concept of “pristine sources” heavily influenced the creation of RPM. Despite having seen several revisions in the past quarter century, including a rewrite in the C programming language, RPM remains at its core one of the oldest package management systems still in use today.
While Red Hat was working towards its own holistic approach to package management, Debian Linux deserves a mention for quietly releasing its own package management tool, Debian Package (dpkg), in 1994. Originally released as a shell script for managing packages, dpkg was rewritten before the year was over, first in Perl and finally in C. Used to install and manage packages on a Debian system, dpkg is a low-level tool that has received thousands of modifications over the past two decades (the most recent of which was just under 24 hours ago at the time of this writing).
Managing the Package Managers
While the earliest known package managers could be used to update both the operating system and its software from both physical and downloadable media, it wasn’t until several years later that the concept of automated package management was introduced. Built as more user-friendly frontends to package managers like dpkg and RPM, these tools automated much of the work required to find, retrieve, install, upgrade, and remove patches from a system. Through locally managed databases of remote repositories, users can tell these package management frontends where in the world to look for and download patches. While the nature of open source means that there are countless frontends available, only a small handful of them have achieved any sort of ubiquity.
When it comes to RPM-based Linux distributions like Red Hat and Fedora, YUM wins the popularity contest. In fact, YUM is so ingrained in the fabric of these distros that you’d be hard-pressed to find a user of an RPM-based Linux distribution who hasn’t heard of it. Originally a rewrite of Yellowdog Linux’s own Yellowdog Updater, the Yellowdog Updater, Modified (YUM) was created by the Duke University Department of Physics for updating and managing their Red Hat Linux systems. What made YUM particularly valuable, aside from being able to remotely retrieve and install packages, was its ability to quickly determine and download any dependencies that the targeted package required prior to installation. This emphasis on performance and stability made YUM an instant hit, and by 2005 it was estimated that over half of all machines running Linux were using YUM. Since that time, YUM has been superseded on some systems by other, more advanced solutions like DNF (Dandified YUM). However, in many cases, it is still the tool of choice for automatically managing patches on an RPM-based operating system from both the command line and the graphical user interface.
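The dependency handling described above can be illustrated with a short Python sketch. This is not YUM’s actual algorithm or data model; the REPO dictionary, the package names, and the resolve_install_order function are hypothetical, and the sketch ignores versions, conflicts, and multiple repositories. It simply walks a package’s dependencies depth-first so that every dependency lands in the install order before the package that needs it.

```python
# Hypothetical repository metadata: package name -> packages it depends on.
REPO = {
    "webapp": ["httpd", "libssl"],
    "httpd": ["libssl", "libc"],
    "libssl": ["libc"],
    "libc": [],
}


def resolve_install_order(target, repo, installed=frozenset()):
    """Return the packages to install, dependencies first, skipping installed ones."""
    order = []              # final install order
    visiting = set()        # packages currently on the recursion stack
    done = set(installed)   # packages already installed or already scheduled

    def visit(name):
        if name in done:
            return
        if name not in repo:
            raise KeyError(f"package not found in repository: {name}")
        if name in visiting:
            raise ValueError(f"dependency cycle involving {name}")
        visiting.add(name)
        for dependency in repo[name]:
            visit(dependency)
        visiting.discard(name)
        done.add(name)
        order.append(name)  # appended only after all of its dependencies

    visit(target)
    return order


print(resolve_install_order("webapp", REPO))
# prints ['libc', 'libssl', 'httpd', 'webapp']
print(resolve_install_order("webapp", REPO, installed={"libc"}))
# prints ['libssl', 'httpd', 'webapp']
```

A real frontend has to solve a harder version of this problem, since it must also pick compatible versions and reconcile packages that conflict with one another, but the ordering principle is the same.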
There are hundreds, if not thousands, of Linux distributions available in the wild, but Debian is arguably one of the most popular. If you’ve ever used a Debian-based operating system like Ubuntu, then the apt command should be a familiar one. Originally intended to replace dselect, Debian’s first dpkg frontend, Advanced Package Tool (known then as Project Deity) was far from an instant success. While APT turned out to be an excellent tool for installing and managing packages and their dependencies from remote repositories, the original goal of replacing dselect’s user interface wasn’t met until years later, when third-party solutions were created to drive APT in a more user-friendly manner. Ubuntu Software Center, for example, is a graphical user interface for APT developed by the Ubuntu project to allow users to more easily automate the installation and management of software and operating system patches on their systems.
But Wait, There’s More!
Linux may have introduced the concept of automated package management, but by 2018 that exclusivity is long gone. Thanks to the ubiquity of the internet and an increasing need for secure software distribution, package management is no longer the novelty it once was. The App Store, Chocolatey, Homebrew, MacPorts, YUM, APT, Portage, NPM, Bundler, Composer… the list goes on. As Adam Wiggins explains in the Twelve-Factor App manifesto, dependency management is one of the key tenets of a stable application environment. While the Twelve-Factor App is primarily geared towards application developers, the importance of a single source of truth for package management should be readily apparent.
In the next article of this series, we will be digging deeper into the world of package management: what packages are, where they come from, and how package developers use these established distribution channels to safely deliver software to their users.