Chapter Twenty-Six
We have, at long last, assembled—at least in our imaginations—what seems to be a complete computer. This computer has a central processing unit (CPU), some random access memory (RAM), a keyboard, a video display whose memory is part of RAM, and some kind of mass-storage device. All the hardware is in place, and we eye with excitement the on/off switch that will power it up and bring it to life. Perhaps this project has evoked in your mind the labors of Victor Frankenstein as he assembled his monster, or Geppetto as he built the wooden puppet that he will name Pinocchio.
But we’re still missing something, and it’s neither the power of a lightning bolt nor the purity of a wish upon a star. Go ahead: Turn on this new computer and tell us what you see.
As the screen blinks on, it displays pure random garbage. If you’ve constructed a graphics adapter, there will be dots of many colors but nothing coherent. For a text-only video adapter, you’ll see random characters. This is as we expect. Semiconductor memory loses its contents when the power is off and begins in a random and unpredictable state when it first gets power. All the RAM that’s been constructed for the microprocessor contains random bytes. The microprocessor begins executing these random bytes as if they were machine code. This won’t cause anything bad to happen—the computer won’t blow up, for instance—but it won’t be very productive either.
What we’re missing here is software. When a microprocessor is first turned on or is reset, it begins executing machine code at a particular memory address. In the case of the Intel 8080, that address is 0000h. In a properly designed computer, that memory address should contain a machine-code instruction (most likely the first of many) that the CPU executes when the computer is turned on.
How does that machine-code instruction get there? The process of getting software into a newly designed computer is one of the more confusing aspects of the project. One way to do it is with a control panel similar to the one in Chapter 19 used for writing bytes into random access memory and later reading them:

Unlike the earlier control panel, this one has a switch labeled Reset. The Reset switch is connected to the Reset input of the CPU. As long as that switch is on, the microprocessor doesn’t do anything. When you turn off the switch, the microprocessor begins executing machine code at address 0000h.
To use this control panel, you turn the Reset switch on to reset the microprocessor and to stop it from executing machine code. You turn on the Takeover switch to take over the address bus and data bus. At this time, you can use the switches labeled A0 through A15 to specify a 16-bit memory address. The lightbulbs labeled D0 through D7 show you the 8-bit contents of that memory address. To write a new byte into that address, you set up the byte on switches D0 through D7 and flip the Write switch on and then off again. After you’re finished inserting bytes into memory, turn the Takeover switch off and the Reset switch off, and the microprocessor will execute the program.
This is how you enter your first machine-code programs into a computer that you’ve just built from scratch. Yes, it’s unbearably laborious. That goes without saying. That you will make little mistakes now and then is a given. That your fingers will get blisters and your brain will turn to mush is an occupational hazard.
But what makes it all worthwhile is when you start to use the video display to show the results of your little programs. One of the first pieces of code you’ll write is a little subroutine that converts numbers to ASCII. For example, if you’ve written a program that results in the value 4Bh, you can’t simply write that value to the video display memory. What you’ll see on the screen in that case is the letter K because that’s the letter that corresponds to the ASCII code 4Bh. Instead, you need to display two ASCII characters: 34h, which is the ASCII code for 4, and 42h, which is the ASCII code for B. You’ve already seen some code that does just that: the ToAscii routine on page 398 in Chapter 24.
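The heart of such a routine is converting a 4-bit value into the ASCII code for one hexadecimal digit. Here's a minimal sketch of the idea, not necessarily the exact Chapter 24 code; the label name is my own:

DigitToAscii:
        ADI   30h           ; 0 through 9 become the ASCII codes 30h through 39h
        CPI   3Ah           ; did the result go past the ASCII code for 9?
        RM                  ; if not, we're finished
        ADI   07h           ; otherwise nudge 3Ah through 3Fh up to 41h through 46h,
        RET                 ; the ASCII codes for A through F

To display a byte such as 4Bh, you call a routine like this twice: once for the high nibble (after shifting it down into the low four bits) and once for the low nibble.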
One of your highest priorities is probably getting rid of that ridiculous control panel, and that involves writing a keyboard handler: a program that reads characters typed from the keyboard, stores them in memory, and also writes them to the screen. Transferring characters from the keyboard to the screen is sometimes called echoing, and it gives the illusion of a direct connection between keyboard and display.
You might want to expand this keyboard handler into something that executes simple commands, giving it something genuinely useful to do. The code that handles these commands is known as a command processor. To keep things simple at first, you might decide on just three commands, each identified by the first letter typed on the line:
· W for Write
· D for Display
· R for Run
Your keyboard handler executes one of these commands when you press the Enter key to signal that you're finished typing, dispatching on the command's first letter as sketched below.
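Such a dispatcher might look something like this minimal sketch, assuming the first character of the typed line has already been copied into the accumulator. The DoWrite, DoDisplay, DoRun, and ShowError labels are hypothetical routines defined elsewhere in the keyboard handler:

Dispatch:
        CPI   'W'           ; is the first character a W?
        JZ    DoWrite       ; hypothetical routine for the Write command
        CPI   'D'           ; is it a D?
        JZ    DoDisplay     ; hypothetical routine for the Display command
        CPI   'R'           ; is it an R?
        JZ    DoRun         ; hypothetical routine for the Run command
        JMP   ShowError     ; anything else is reported as an error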
If the line of text begins with a W, the command means to Write some bytes into memory. The line you type on the screen looks something like this:
W 1020 35 4F 78 23 9B AC 67
This command instructs the command processor to write the hexadecimal bytes 35h, 4Fh, and so on into memory beginning at address 1020h. For this job, the keyboard handler needs to convert ASCII codes to bytes—a reversal of the ToAscii conversion that I demonstrated earlier.
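Going in that direction, each typed character must become a 4-bit value. Here's a minimal sketch that assumes the character in the accumulator is a valid digit 0 through 9 or an uppercase A through F; the label name is my own:

AsciiToDigit:
        SUI   30h           ; subtract the ASCII code for 0
        CPI   0Ah           ; was the character one of the digits 0 through 9?
        RM                  ; if so, the accumulator now holds 0 through 9
        SUI   07h           ; otherwise adjust A through F down to 0Ah through 0Fh
        RET

Calling this once for each of the two characters in a typed byte, shifting the first result into the high nibble, and combining the two gives the command processor the byte to write into memory.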
If the line of text begins with a D, the command means Display some bytes in memory. The line you type on the screen looks like this:
D 1030
The command processor responds by displaying bytes stored beginning at location 1030h. You can use the Display command to examine the contents of memory.
If the line of text begins with an R, the command means Run. Such a command looks like this:
R 1000
This command means “Run the program that’s stored beginning at address 1000h.” The command processor can store 1000h in the register pair HL and then execute the instruction PCHL, which loads the program counter from register pair HL, effectively jumping to that address.
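In 8080 code that might be as simple as the following sketch, with the address hard-coded here only for illustration; the real command processor would first convert the typed ASCII address into the HL register pair:

        LXI   H,1000h       ; load the address 1000h into register pair HL
        PCHL                ; copy HL into the program counter, jumping to 1000h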
Getting this keyboard handler and command processor working is an important milestone. Once you have it, you no longer need suffer the indignity of the control panel. Entering data from the keyboard is easier, faster, and classier.
Of course, you still have the problem that all the code you’ve entered disappears when you turn off the power. For that reason, you’ll probably want to store all this new code in read-only memory, or ROM. In the early days of microprocessors such as the Intel 8080, it became possible to program ROM chips in the privacy of your home. Programmable read-only memory (PROM) chips are programmable only once. Erasable programmable read-only memory (EPROM) chips can be programmed and reprogrammed after being entirely erased by exposure to ultraviolet light.
This ROM containing your keyboard handler would then occupy the address space beginning at 0000h formerly occupied by RAM. You’d still keep the RAM, of course, but it would occupy a somewhat higher address in the memory space.
The creation of the command processor is an important milestone not only because it provides a faster means to enter bytes into memory but also because the computer is now interactive.
Once you have the command processor in ROM, you can start experimenting with writing data from memory to the disk drive and reading the data back into memory. Storing programs and data on the disk is much safer than storing them in RAM (where they’ll disappear if the power fails) and much more flexible than storing them in ROM.
Eventually you might want to add some new commands to the command processor. For example, an S command might mean to Store some memory in a particular group of disk sectors, while the L command does a Load of the contents of those disk sectors into memory.
Of course, you’ll have to keep track of what you’re storing in which disk sectors. You’ll probably keep a pad and pencil handy for this purpose. And be careful: You can’t just store some code located at one address and then later load it back into memory at another address and expect it to work. All the Jump and Call instructions will be wrong because they indicate the old addresses. Also, you might have a program that’s longer than the sector size of your disk, so you’ll need to store it in several sectors. Some sectors on the disk will already be occupied by other programs or data, so the free sectors available for storing a long program might not be consecutive on the disk.
Eventually, you could decide that the manual clerical work involved in keeping track of where everything is stored on the disk is just too much. At this point, you’re ready for a file system.
A file system is software that organizes data into files. A file is simply a collection of related data that occupies one or more sectors on the disk. Most importantly, each file is identified by a name that helps you remember what the file contains. You can think of the disk as resembling a file cabinet in which each file has a little tab that indicates its name.
A file system is almost always part of a larger collection of software known as an operating system. The keyboard handler and command processor we’ve been building in this chapter could certainly evolve into an operating system. But instead of trudging through that long evolutionary process, let’s take a look instead at a real operating system and get a feel for what it does and how it works.
Historically, the most important operating system for 8-bit microprocessors was CP/M, originally standing for Control Program/Monitor but later renamed Control Program for Microcomputers. It was written in the mid-1970s for the Intel 8080 microprocessor by Gary Kildall (1942–1994), who later founded Digital Research Incorporated (DRI).
CP/M was stored on a disk, but most of the disk was available for storing your own files. The CP/M file system is fairly simple, but it satisfies two major requirements: First, each file on the disk is identified by a name that is also stored on the disk. Second, files don’t have to occupy consecutive sectors on a disk. It often happens that as files of various sizes are created and deleted, free space on the disk becomes fragmented. The ability of a file system to store a large file in nonconsecutive sectors is very useful. The table that equates files with their disk sectors is also stored on the disk.
Under CP/M, each file is identified with a two-part name. The first part is known as the filename and can have up to eight characters, and the second part is known as the file type or extension and can have up to three characters. There are several standard file types. For example, TXT indicates a text file (that is, a file containing only ASCII codes and readable by us humans), and COM (which is short for command) indicates a file containing 8080 machine-code instructions—a program.
This file-naming convention came to be known as 8.3 (pronounced eight dot three), indicating the maximum eight letters before the period and the three letters after. Although modern file systems have removed the limitation of eight characters and three characters, this general convention for naming files is still quite common.
Computers that used CP/M contained ROM with a small piece of code known as a bootstrap loader, so called because that code effectively pulls the rest of the operating system up by its bootstraps. The bootstrap loader loads the very first sector from the diskette into memory and runs it. This sector contains code to load the rest of CP/M into memory. The entire process is called booting the operating system, a term that is still widely used.
CP/M itself was organized in a hierarchy: At the lowest level was the Basic Input/Output System, or BIOS (pronounced BY-ohss). This contained code that directly accessed the hardware of the computer, including reading and writing disk sectors. Every manufacturer of a computer that ran CP/M would provide its own BIOS for its particular assemblage of hardware.
Next in the hierarchy was the Basic Disk Operating System, or BDOS (pronounced BEE-doss). The primary function of the BDOS was to organize the disk sectors handled by the BIOS into files.
When CP/M finishes loading into memory, it runs a program called the Console Command Processor (CCP) to display a prompt on the screen:
A>
In computers that have more than one disk drive, the A refers to the first disk drive, the one from which CP/M was loaded. The prompt is your signal to type something and press the Enter key. Most of the commands are for working with files, such as listing them (DIR for directory), erasing them (ERA), renaming them (REN), and displaying the contents (TYPE). A name that CP/M doesn’t recognize is assumed to be a program stored somewhere on disk.
CP/M also contained a collection of subroutines that programs could use to read from the keyboard, write characters to the video display, save data in a file on the disk, and load the contents of that file back into memory.
Programs running under CP/M did not need to access the computer's hardware directly because the BDOS portion of CP/M used the BIOS portion to access the hardware. This meant that a program written for CP/M could run on any computer running CP/M without knowing anything about the underlying hardware. This is a principle known as device independence, and it was crucial to the development of commercial software. Later on, such programs became known as applications or apps.
A collection of subroutines provided by an operating system is known as an application programming interface, or API. In an ideal world, the programmer of an application needs to know only about the API, not how the API is implemented or what hardware it accesses. In reality, a little more knowledge is sometimes helpful.
To a computer user, an operating system is the user interface, or UI. In the case of CP/M, this was the command-line interface (CLI) implemented by the CCP. To a programmer, an operating system is also the API—the collection of subroutines available for an application program.
In the case of CP/M, these subroutines had a common entry point at location 0005h in memory, and a program would use one of these subroutines by making a call to that memory location:
CALL 0005h
Or simply:
CALL 5
This was known as the “Call 5” interface!
The specific subroutine was selected by the value in register C. Here are a few examples:
· 01h: read a character from the keyboard
· 02h: write a character to the video display
· 09h: write a string of characters to the video display
Often one of these functions would require more information. For example, when C is 09h, the register pair DE contains the address of a string of ASCII characters to write to the display. A dollar sign ($) marks the end of the string.
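Putting these pieces together, here is a minimal sketch of a complete CP/M program that uses the Call 5 interface to display a short message. The Message label and the text of the string are my own; the ORG address of 0100h is where CP/M loads a program into memory, and function 09h and the entry point at 0005h are as described above.

        ORG   0100h         ; CP/M loads a program into memory at address 0100h
        LXI   D,Message     ; DE = address of the dollar-terminated string
        MVI   C,09h         ; function 09h: write a string to the display
        CALL  0005h         ; the Call 5 interface into the BDOS
        RET                 ; return to the Console Command Processor
Message: DB   'Hello from CP/M!$'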
What does CALL 5 actually do? The memory location at 0005h is set up by CP/M to contain a JMP instruction, which jumps to a location in the BDOS part of CP/M, which then checks the value of the C register and jumps to the appropriate subroutine.
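Pictured as code, the bottom of memory under CP/M would contain something like this sketch; the BDOS address shown here is purely a placeholder, since the real address depended on how much memory the particular computer had:

        ORG   0005h
        JMP   0E406h        ; hypothetical address of the BDOS entry point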
CP/M was once a very popular operating system for the 8080 and remains historically important. CP/M was the major influence behind a 16-bit operating system named QDOS (Quick and Dirty Operating System) written by Tim Paterson of Seattle Computer Products for Intel’s 16-bit 8086 and 8088 chips. QDOS was eventually renamed 86-DOS and licensed by Microsoft Corporation. Under the name MS-DOS (Microsoft Disk Operating System, pronounced em ess dahs, like the German article das), the operating system was licensed to IBM for the first IBM Personal Computer, introduced in 1981. Although a 16-bit version of CP/M (called CP/M-86) was also available for the IBM PC, MS-DOS quickly became the standard. MS-DOS (called PC-DOS on IBM’s computers) was also licensed to other manufacturers that created computers compatible with the IBM PC.
As the name implies, MS-DOS is primarily a disk operating system, as was Apple DOS, created in 1978 for the Apple II. Very little was provided apart from the ability to write files to disks, and later read those files.
In theory, application programs are supposed to access the hardware of the computer only through the interfaces provided by the operating system. But many programmers of the 1970s and 1980s often bypassed the operating system, particularly in dealing with the video display. Programs that directly wrote bytes into video display memory ran faster than programs that didn't. Indeed, for some applications—such as those that needed to display graphics on the video display—the operating system was totally inadequate. What many programmers liked most about these early operating systems was that they “stayed out of the way” and let programmers write programs that ran as fast as the hardware allowed.
The first indication that home computers were going to be much different from their larger and more expensive cousins was probably the application VisiCalc. Designed and programmed by Dan Bricklin (born 1951) and Bob Frankston (born 1949) and introduced in 1979 for the Apple II, VisiCalc used the screen to give the user a two-dimensional view of a spreadsheet. Prior to VisiCalc, a spreadsheet was a wide piece of paper with rows and columns generally used for doing a series of calculations. VisiCalc replaced the paper with the video display, allowing the user to move around the spreadsheet, enter numbers and formulas, and recalculate everything after a change.
What was amazing about VisiCalc was that it was an application that could not be duplicated on larger computers. A program such as VisiCalc needs to update the screen very quickly. For this reason, it wrote directly to the random-access memory used for the Apple II’s video display. This memory is part of the address space of the microprocessor. This was not how large computers were designed or operated.
The faster a computer can respond to the keyboard and alter the video display, the tighter the potential interaction between user and computer. Most of the software written in the first decade of the personal computer (through the 1980s) wrote directly to video display memory. Because IBM set a hardware standard that other computer manufacturers adhered to, software manufacturers could bypass the operating system and use the hardware directly without fear that their programs wouldn’t run right (or at all) on some machines. If all the PC clones had different hardware interfaces to their video displays, it would have been too difficult for software manufacturers to accommodate all the different designs.
But as applications proliferated, problems surfaced. The most successful applications took over the whole screen and implemented a sophisticated UI based around the keyboard. But each application had its own ideas about the UI, which meant that skills learned in one application couldn’t be leveraged into others. Programs also couldn’t coexist well. Moving from one program to another generally required ending the running program and starting up the next.
A much different vision of personal computing had been developing for several years at the Palo Alto Research Center (PARC), which was founded by Xerox in 1970 in part to help develop products that would allow the company to enter the computer industry.
The first big project at PARC was the Alto, designed and built in 1972 and 1973. By the standards of those years, it was an impressive piece of work. The floor-standing system unit had 16-bit processing, two 3 MB disk drives, and 128 KB of memory (expandable to 512 KB). The Alto preceded the availability of 16-bit single-chip microprocessors, so the processor had to be built from about 200 integrated circuits.
The video display was one of several unusual aspects of the Alto. The screen was approximately the size and shape of a sheet of paper—8 inches wide and 10 inches high. It ran in a graphics mode with 606 pixels horizontally by 808 pixels vertically, for a total of 489,648 pixels. One bit of memory was devoted to each pixel, which meant that each pixel could be either black or white. The total amount of memory devoted to the video display was 64 KB, which was part of the address space of the processor.
By writing into this video display memory, software could draw pictures on the screen or display text in different fonts and sizes. Rather than using the video display simply to echo text typed by the keyboard, the screen became a two-dimensional high-density array of information and a more direct source of user input.
The Alto also included a little device called a mouse, which rolled on the table and contained three buttons. The mouse was an invention of engineer Douglas Engelbart (1925–2013) while he was at the Stanford Research Institute. By rolling the mouse on the desk, the user of the Alto could position a pointer on the screen and interact with onscreen objects.
Over the remainder of the 1970s, programs written for the Alto developed some very interesting characteristics. Multiple programs were put into windows and displayed on the same screen simultaneously. The video graphics of the Alto allowed software to go beyond text and truly mirror the user’s imagination. Graphical objects (such as buttons and menus and little pictures called icons) became part of the user interface. The mouse was used for selecting windows or triggering the graphical objects to perform program functions.
This was software that went beyond the user interface into user intimacy, software that facilitated the extension of the computer into realms beyond those of simple number crunching, software that was designed—to quote the title of a paper written by Douglas Engelbart in 1963—“for the Augmentation of Man’s Intellect.”
The Alto was the beginning of the graphical user interface, or GUI, often pronounced gooey, and much of the pioneering conceptual work is attributed to Alan Kay (born 1940). But Xerox didn’t sell the Alto (one would have cost over $30,000 if they had), and over a decade passed before the ideas in the Alto would be embodied in a successful consumer product.
In 1979, Steve Jobs and a contingent from Apple Computer visited PARC and were quite impressed with what they saw. But it took them over three years to introduce a computer that had a graphical interface. This was the ill-fated Apple Lisa in January 1983. A year later, however, Apple introduced the much more successful Macintosh.
The original Macintosh had a Motorola 68000 microprocessor, 64 KB of ROM containing the operating system, 128 KB of RAM, a 3.5-inch diskette drive (storing 400 KB per diskette), a keyboard, a mouse, and a video display capable of displaying 512 pixels horizontally by 342 pixels vertically. (The display itself measured only 9 inches diagonally.) That’s a total of 175,104 pixels. Each pixel was associated with 1 bit of memory and could be colored either black or white, so about 22 KB were required for the video display RAM.
The hardware of the original Macintosh was elegant but hardly revolutionary. What made the Mac so different from other computers available in 1984 was the Macintosh operating system, generally referred to as the system software at the time and later known as Mac OS, and currently as macOS.
A text-based single-user operating system such as CP/M or MS-DOS or Apple DOS isn’t very large, and most of the API supports the file system. A graphical operating system such as macOS, however, is much larger and has hundreds of API functions. Each of them is identified by a name that describes what the function does.
While a text-based operating system such as MS-DOS provides a couple of simple API functions to let application programs display text on the screen in a teletypewriter manner, a graphical operating system such as macOS must provide a way for programs to display graphics on the screen. In theory, this can be accomplished by implementing a single API function that lets an application set the color of a pixel at a particular horizontal and vertical coordinate. But it turns out that this is inefficient and results in very slow graphics.
It makes more sense for the operating system to provide a complete graphics programming system, which means that the operating system includes API functions to draw lines, rectangles, and curves as well as text. Lines can be either solid or composed of dashes or dots. Rectangles and ellipses can be filled with various patterns. Text can be displayed in various fonts and sizes and with effects such as boldfacing and underlining. The graphics system is responsible for determining how to render these graphical objects as a collection of dots on the display.
Programs running under a graphical operating system use the same APIs to draw graphics on both the computer’s video display and the printer. A word processing application can thus display a document on the screen so that it looks very similar to the document later printed, a feature known as WYSIWYG (pronounced wizzy wig). This is an acronym for “What you see is what you get,” a phrase that the comedian Flip Wilson contributed to computer lingo through his Geraldine persona.
Part of the appeal of a graphical user interface is that different applications have similar UIs and leverage a user’s experience. This means that the operating system must also support API functions that let applications implement various components of the user interface, such as buttons and menus. Although the GUI is generally viewed as an easy environment for users, it’s also just as importantly a better environment for programmers. Programmers can implement a modern user interface without reinventing the wheel.
Even before the introduction of the Macintosh, several companies had begun to create a graphical operating system for the IBM PC and compatibles. In one sense, the Apple developers had an easier job because they were designing the hardware and software together. The Macintosh system software had to support only one type of diskette drive, one type of video display, and two printers. Implementing a graphical operating system for the PC, however, required supporting many different pieces of hardware.
Moreover, although the IBM PC had been introduced just a few years earlier (in 1981), many people had grown accustomed to using their favorite MS-DOS applications and weren’t ready to give them up. It was considered very important for a graphical operating system for the PC to run MS-DOS applications as well as applications designed expressly for the new operating system. (The Macintosh didn’t run Apple II software, primarily because it used a different microprocessor.)
In the mid-1980s, Digital Research (the company behind CP/M) introduced GEM (the Graphical Environment Manager), VisiCorp (the company marketing VisiCalc) introduced VisiOn, and Microsoft released Windows version 1.0, which was quickly perceived as being the probable winner in the “windows wars.” But it wasn’t until the May 1990 release of Windows 3.0 that Windows began to attract a significant number of users, eventually becoming the dominant operating system for desktops and laptops. Despite the superficially similar appearances of the Macintosh and Windows, the APIs for the two systems are completely different.
Phones and tablets are another story, however. Although there are many similarities in the graphical interfaces of phones, tablets, and larger personal computers, these APIs are also different. Currently the phone and tablet market is dominated by Google’s Android and Apple’s iOS operating systems.
Although not quite visible to most users of computers, the legacy and influence of the operating system UNIX remains a powerful presence. UNIX was developed in the early 1970s at Bell Telephone Laboratories largely by Ken Thompson (born 1943) and Dennis Ritchie (1941–2011), who also had some of the best beards in the computer industry. The funny name of the operating system is a play on words: UNIX was originally written as a less hardy version of an earlier operating system named Multics (which stands for Multiplexed Information and Computing Services), which Bell Labs had been codeveloping with MIT and GE.
Among hardcore computer programmers, UNIX is the most beloved operating system of all time. While most operating systems are written for specific computers, UNIX was designed to be portable, which means that it can be adapted to run on a variety of computers.
Bell Labs was a subsidiary of American Telephone & Telegraph at the time UNIX was developed and therefore subject to court decrees intended to curb AT&T’s monopoly position in the telephone industry. Originally, AT&T was prohibited from marketing UNIX; the company was obliged to license it to others. So beginning in 1973, UNIX was extensively licensed to universities, corporations, and the government. In 1983, AT&T was allowed back into the computer business and released its own version of UNIX.
The result is that there’s no single version of UNIX. There are, instead, a variety of different versions known under different names running on different computers sold by different vendors. Lots of people have put their fingers into UNIX and left their fingerprints behind. Still, however, a prevalent “UNIX philosophy” seems to guide people as they add pieces to UNIX. Part of that philosophy is using text files as a common denominator. Many little UNIX command-line programs (called utilities) read text files, do something with them, and then write to another text file. UNIX utilities can be strung together in chains that do different types of processing on these text files.
The most interesting development for UNIX in recent years has been the Free Software Foundation (FSF) and the GNU project, both founded by Richard Stallman (born 1953). GNU (pronounced not like the animal but instead with a distinct G at the beginning) stands for “GNU’s Not UNIX,” which, of course, it’s not. Instead, GNU is intended to be compatible with UNIX but distributed in a manner that prevents the software from becoming proprietary. The GNU project has resulted in the creation of many UNIX-compatible utilities and tools, which are commonly combined with Linux, the core (or kernel) of a UNIX-compatible operating system.
Written largely by Finnish programmer Linus Torvalds (born 1969), Linux has become quite popular in recent years. The Android operating system is based on the Linux kernel, large supercomputers use Linux exclusively, and Linux is also quite common on internet servers.
But the internet is a subject for the final chapter in this book.