A history of ARM, part 1: Building the first chip

Aurich Lawson / Getty Images

It was 1983, and Acorn Computers was on top of the world. Unfortunately, trouble was just around the corner. The small UK company was famous[2] for winning a contract with the British Broadcasting Corporation to produce a computer for a national television show. Sales of its BBC Micro were skyrocketing and on pace to exceed 1.2 million units.

A magazine ad for the BBC Micro. The tagline was "The Shape of Things to Come."

But the world of personal computers was changing. The market for cheap 8-bit micros that parents would buy to help kids with their homework[4] was becoming saturated. And new machines from across the pond, like the IBM PC and the upcoming Apple Macintosh, promised significantly more power and ease of use. Acorn needed a way to compete, but it didn't have much money for research and development.

A seed of an idea

Sophie Wilson, one of the designers of the BBC Micro, had anticipated this problem. She had added a slot called the "Tube" that could connect to a more powerful central processing unit. A slotted CPU could take over the computer, leaving its original 6502 chip free for other tasks.

But what processor should she choose? Wilson and co-designer Steve Furber considered various 16-bit options, such as Intel's 80286, National Semiconductor's 32016, and Motorola's 68000. But none were completely satisfactory.

The 286, 32016, and 68000 CPUs, roughly to scale. (Credit: Wikipedia)

In a later interview[6] with the Computer History Museum, Wilson explained: "We could see what all these processors did and what they didn't do. So the first thing they didn't do was they didn't make good use of the memory system. The second thing they didn't do was that they weren't fast; they weren't easy to use. We were used to programming the 6502 in the machine code, and we rather hoped that we could get to a power level such that if you wrote in a higher level language you could achieve the same types of results."

But what was the alternative? Was it even thinkable for tiny Acorn to make its own CPU from scratch? To find out, Wilson and Furber took a trip to National Semiconductor's factory in Israel. They saw hundreds of engineers and a massive amount of expensive equipment, which confirmed their suspicions that such a task might be beyond them.

Then they visited the Western Design Center in Mesa, Arizona. This company was making the beloved 6502 and designing a 16-bit successor, the 65C816. Wilson and Furber found little more than a "bungalow in a suburb" with a few engineers and some students making diagrams using old Apple ][ computers and bits of sticky tape.

The Western Design Center in 2022, according to Google. It might even be the same bungalow!

Suddenly, making their own CPU seemed like it might be possible. Wilson and Furber's small team had built custom chips before, like the graphics and input/output chips for the BBC Micro. But those designs were simpler and had fewer components than a CPU. Despite the challenges, upper management at Acorn supported their efforts. In fact, they went beyond mere support.

Acorn co-founder Hermann Hauser, who had a PhD in physics, gave the team copies of IBM research papers[8] describing a new and more powerful type of CPU. It was called RISC, which stood for "reduced instruction set computing."

Taking a RISC

What exactly did this mean? To answer that question, let's take a super-simplified crash course on how CPUs work.

It starts with transistors, tiny sandwich-like devices made from silicon mixed with different chemicals. A transistor has three connections: a gate, a source, and a drain. When a voltage is applied to the gate, electricity flows freely from the source to the drain; when there is no voltage on the gate, the electricity stops flowing. Thus, the transistor works as a controllable switch.

A simplified transistor animation. (Credit: Jeremy Reimer)

You can combine transistors to form logic gates. For example, two switches connected in series make an "AND" gate, and two connected in parallel form an "OR" gate. These gates let a computer make choices by comparing numbers.

Simplified AND and OR gates, using transistors. (Credit: Jeremy Reimer)
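As a concrete (if highly idealized) illustration of the series/parallel idea, here is a short Python sketch of my own, not from the article, that models each transistor as a gate-controlled switch and then wires two of them in series for AND and in parallel for OR:

    def transistor(gate: bool, source: bool) -> bool:
        # A transistor as a controllable switch: current flows from
        # source to drain only while a voltage is present on the gate.
        return source and gate

    def and_gate(a: bool, b: bool) -> bool:
        # Two switches in series: current must pass through both.
        return transistor(b, transistor(a, True))

    def or_gate(a: bool, b: bool) -> bool:
        # Two switches in parallel: current can flow through either.
        return transistor(a, True) or transistor(b, True)

    for a in (False, True):
        for b in (False, True):
            print(a, b, "AND:", and_gate(a, b), "OR:", or_gate(a, b))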

But how to represent numbers? Computers use binary, or base 2, by equating a small positive voltage to the number 1 and no voltage to 0. These 1s and 0s are called bits. Since binary arithmetic is so simple, it's easy to make binary adders that add two bits and produce both a sum bit and a carry bit. Numbers larger than 1 are handled by chaining together more adders that all work at the same time, and the number of simultaneously accessible binary digits is one measure of the "bitness" of a chip. An 8-bit CPU like the 6502 processes numbers in 8-bit chunks.

A full adder circuit made of AND and OR gates. Click here[11] for an interactive version by Charles Petzold. (Credit: Jeremy Reimer)
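Here is the same idea as a Python sketch (mine, mirroring the figure's logic): a full adder built from AND/OR/XOR operations, plus a ripple-carry loop that chains one adder per bit, the way an 8-bit CPU handles 8-bit numbers:

    def full_adder(a: int, b: int, carry_in: int) -> tuple[int, int]:
        # One full adder: add three bits, produce a sum bit and a carry bit.
        partial = a ^ b
        s = partial ^ carry_in
        carry_out = (a & b) | (partial & carry_in)
        return s, carry_out

    def add_8bit(x: int, y: int) -> int:
        # Ripple-carry addition: chain eight full adders, one per bit.
        result, carry = 0, 0
        for bit in range(8):
            s, carry = full_adder((x >> bit) & 1, (y >> bit) & 1, carry)
            result |= s << bit
        return result  # a real CPU would latch the final carry in a flag

    print(add_8bit(100, 55))  # prints 155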

Arithmetic and logic are a big part of what a CPU does. But humans need a way to tell it what to do. So every CPU has an instruction set, which is a list of all the ways it can move data in and out of memory, do math calculations, compare numbers, and jump to different parts of a program.

The RISC idea was to drastically reduce the number of instructions, which would simplify the internal design of the CPU. How drastically? The Intel 80286, a 16-bit chip, had a total of 357 unique instructions. The new RISC instruction set that Sophie Wilson was creating would have only 45.

Comparison of the Intel 80286 and ARM V1 instruction sets. Each instruction variant has a separate numerical code. (Spreadsheet compiled by the author.)

To achieve this simplification, Wilson used a "load and store" architecture. Traditional (complex) CPUs had different instructions for adding numbers from two internal "registers" (small chunks of memory inside the chip itself), for adding numbers from two addresses in external memory, and for combinations of the two. RISC chip instructions, in contrast, would only work on registers. Separate instructions would then move values between the registers and external memory.

Comparison of assembly language for a generic CISC CPU versus a generic RISC one. The RISC processor must load memory values into registers before operating on them.
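To see the difference in practice, here is a toy Python sketch (my own illustration, using a made-up mini instruction set rather than Acorn's) in which arithmetic works only on registers and LOAD/STORE are the only operations that touch memory:

    # A toy load/store machine: arithmetic works only on registers,
    # and load/store are the only instructions that touch memory.
    memory = {0x100: 7, 0x104: 35, 0x108: 0}
    regs = [0] * 16

    def load(rd, addr):  regs[rd] = memory[addr]
    def store(rs, addr): memory[addr] = regs[rs]
    def add(rd, rn, rm): regs[rd] = regs[rn] + regs[rm]

    # A CISC machine might add two memory values in one instruction;
    # the load/store version takes four, but each is trivial to decode.
    load(0, 0x100)   # r0 = memory[0x100]
    load(1, 0x104)   # r1 = memory[0x104]
    add(2, 0, 1)     # r2 = r0 + r1 (registers only)
    store(2, 0x108)  # memory[0x108] = r2
    print(memory[0x108])  # prints 42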

This meant that programs for RISC CPUs typically took more instructions to produce the same result. So how could they be faster? One answer was that the simpler design could be run at a higher clock speed. Another was that more complex instructions took longer for a chip to execute; by keeping them simple, you could make every instruction execute in a single clock cycle. This made it easier to use something called pipelining[14].

Typically, a CPU has to process instructions in stages: it needs to fetch an instruction from memory, decode it, and then execute it. The RISC CPU that Acorn was designing would have a three-stage pipeline: while one part of the chip executed the current instruction, another part was fetching the next one, and so forth.

The ARM V1 pipeline. Each stage takes the same time to complete. (Credit: WikiChip.org)
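A tiny simulation (my sketch, assuming an idealized one-cycle-per-stage pipeline with no stalls) shows the payoff: once the pipeline is full, one instruction completes every cycle:

    # Print a cycle-by-cycle picture of an idealized three-stage pipeline:
    # each stage takes one cycle, and stages overlap across instructions.
    stages = ["fetch", "decode", "execute"]
    n_instructions = 5

    for cycle in range(n_instructions + len(stages) - 1):
        active = []
        for i in range(n_instructions):
            stage = cycle - i
            if 0 <= stage < len(stages):
                active.append(f"I{i + 1}:{stages[stage]}")
        print(f"cycle {cycle + 1}:", ", ".join(active))

    # Five instructions finish in 7 cycles instead of the 15 that a
    # non-pipelined design (3 cycles each) would need.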

A disadvantage of the RISC design was that since programs required more instructions, they took up more space in memory. Back in the late 1970s, when the first generation of CPUs was being designed, 1 megabyte of memory cost about $5,000, so any way to reduce the memory footprint of programs (and a complex instruction set helped do that) was valuable. This is why chips like the Intel 8080, 8088, and 80286 had so many instructions. But memory prices were dropping rapidly; by 1994, that 1 megabyte would cost under $6. The extra memory required by a RISC CPU was going to be much less of a problem in the future.

To further future-proof the new Acorn CPU, the team decided to skip 16 bits and go straight to a 32-bit design. This actually made the chip simpler internally, because you didn't have to break up large numbers as often, and you could access all memory addresses directly. (In fact, the first chip only exposed 26 of its 32 address lines, since 2 to the power of 26 bytes, or 64MB, was a ridiculous amount of memory for the time.)
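The arithmetic behind that parenthetical, as a quick sketch:

    ADDRESS_LINES = 26
    print(2 ** ADDRESS_LINES)                 # prints 67108864 bytes...
    print(2 ** ADDRESS_LINES // (1024 ** 2))  # ...which is 64 (MB)

    # With only 26 address lines exposed, the top 6 bits of a 32-bit
    # address simply never leave the chip.
    print(hex(0xFFFF_FFFF & (2 ** ADDRESS_LINES - 1)))  # 0x3ffffff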

All the team needed now was a name for the new CPU. Various options were considered, but in the end, it was called the Acorn RISC Machine, or ARM.

On an ARM and a prayer

The development of the first ARM chip took eighteen months. To save money, the team spent a lot of time testing the design before they put it into silicon. Furber wrote an emulator for the ARM CPU in interpreted BASIC on the BBC Micro. This was incredibly slow, of course, but it helped prove the concept and validate that Wilson's instruction set would work as designed. According to Wilson, the development process was ambitious but straightforward.

"We thought we were crazy," she said. "We thought we wouldn't be able to do it. But we kept finding that there was no actual stopping place. It was just a matter of doing the work."

Furber did much of the layout and design of the chip itself, while Wilson concentrated on the instruction set. But in truth, the two jobs were deeply intertwined. Picking the code numbers for each instruction isn't done arbitrarily: each number is chosen so that when it is translated into binary digits, the appropriate wires along the instruction bus activate the right decoding and routing circuits (there is a sketch of this idea below).

As the testing process matured, Wilson led a team that wrote a more advanced emulator. "With pure instruction simulators, we could have things that were running at hundreds of thousands of ARM instructions per second on a 6502 second processor," she explained. "And we could write a very large amount of software, port BBC BASIC to the ARM and everything else, second processor, operating system. And this gave us increasing amounts of confidence. Some of this stuff was working better than anything else we'd ever seen, even though we were interpreting ARM machine code. ARM machine code itself was so high-performance that the result of interpreted ARM machine code was often better than compiled code on the same platform."

These amazing results spurred the small team to finish the job.
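To picture how instruction codes map onto decoder wiring, here is a simplified Python sketch of my own, loosely modeled on the classic ARM data-processing instruction format (not necessarily Wilson's exact V1 assignments): fixed bit fields within the 32-bit word directly select the condition, operation, and registers.

    def decode(word: int) -> dict:
        # Split a 32-bit instruction word into fixed bit fields so that
        # each field can drive its own decoding and routing circuitry.
        return {
            "cond":     (word >> 28) & 0xF,  # condition code (top 4 bits)
            "opcode":   (word >> 21) & 0xF,  # which ALU operation to run
            "rn":       (word >> 16) & 0xF,  # first operand register
            "rd":       (word >> 12) & 0xF,  # destination register
            "operand2": word & 0xFFF,        # second operand / shift field
        }

    # 0xE0812002 decodes as "ADD r2, r1, r2" under this layout:
    # cond=0xE (always), opcode=4 (ADD), rn=1, rd=2, operand2=2.
    print(decode(0xE0812002))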

The design for the first ARM CPU was sent to be fabricated at VLSI Technology, Inc., an American semiconductor manufacturing firm. The first version of the chip came back to Acorn on April 26, 1985. Wilson plugged it into the Tube slot on the BBC Micro, loaded up the ported-to-ARM version of BBC BASIC, and tested it with a special PRINT command. The chip replied, "Hello World, I am ARM[16]," and the team cracked open a bottle of champagne.

One of the very first ARM chips. (Credit: Center for Computing History, UK[18])

Let's step back for a moment and reflect on what an amazing accomplishment this was. The entire ARM design team consisted of Sophie Wilson, Steve Furber, a couple of additional chip designers, and a four-person team writing testing and verification software. This brand new 32-bit CPU based on an advanced RISC design was created by fewer than ten people, and it worked correctly the first time. In contrast, National Semiconductor was up to the 10th revision of the 32016 and was still finding bugs. How did the Acorn team do this? They designed ARM to be as simple as possible. The V1 chip had only 27,000 transistors (the 80286 had 134,000!) and was fabricated on a 3 micrometer process--that's 3,000 nanometers, or about a thousand times less granular than today's CPUs.

The ARM V1 chip and its block diagram. (Credit: WikiChip.org[20])

At this level of detail, you can almost make out the individual transistors. Look at the register file, for example, and compare it to this interactive block diagram[21] of how random access memory works. You can see the instruction bus carrying data from the input pins and routing it around to the decoders and to the register controls.

As impressive as the first ARM CPU was, it's important to point out the things it was missing. It had no on-board cache[22] memory and no multiplication or division circuits. It also lacked a floating point unit, so operations with non-whole numbers were slower than they could have been, although a simple barrel shifter[23], which can shift a value by any number of bit positions in a single step, helped with floating point operations. And the chip ran at a very modest 6 MHz.
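As a purely illustrative Python sketch of those last two points (mine, not Acorn's circuitry): a logarithmic barrel shifter reaches any shift amount through a few fixed stages, and the kind of shift-and-add loop shown below is what software had to supply for multiplication until the ARM V2 added hardware multiply circuits.

    def barrel_shift_left(value: int, amount: int) -> int:
        # Logarithmic barrel shifter: five fixed stages (shift by 1, 2,
        # 4, 8, 16) combine to shift a 32-bit value by anything from
        # 0 to 31 in one pass, not one bit position per clock tick.
        for stage in range(5):
            if (amount >> stage) & 1:
                value = (value << (1 << stage)) & 0xFFFF_FFFF
        return value

    def multiply(a: int, b: int) -> int:
        # Shift-and-add multiplication, done in software on a chip
        # that has shifters and adders but no multiplier.
        result = 0
        for bit in range(32):
            if (b >> bit) & 1:
                result = (result + barrel_shift_left(a, bit)) & 0xFFFF_FFFF
        return result

    print(barrel_shift_left(1, 10))  # prints 1024
    print(multiply(6, 7))            # prints 42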

And how well did this plucky little ARM V1 perform? In benchmarks, it was found to be roughly 10 times faster than an Intel 80286 at the same clock speed and equivalent to a 32-bit Motorola 68020 running at 17 MHz. The ARM chip was also designed to run at very low power.

Wilson explained that this was entirely a cost-saving measure--the team wanted to use a plastic case for the chip instead of a ceramic one, so they set a maximum target of 1 watt of power usage.
But the tools they had for estimating power were primitive. To make sure they didn't go over the limit and melt the plastic, they were very conservative with every design detail. Because of the simplicity of the design and the low clock rate, the actual power draw ended up at 0.1 watts.

In fact, one of the first test boards the team plugged the ARM into had a broken connection and was not attached to any power at all. It was a big surprise when they found the fault because the CPU had been working the whole time. It had turned on just from electrical leakage coming from the support chips.

The incredibly low power draw of the ARM chip was a "complete accident," according to Wilson, but it would become important later.

ARMing a new computer

So Acorn had this amazing piece of technology, years ahead of its competitors. Surely financial success was soon to follow, right? Well, if you follow computer history[24], you can probably guess the answer.

By 1985, sales of the BBC Micro were starting to dry up, squeezed by cheap Sinclair Spectrums on one side and IBM PC clones on the other. Acorn sold a controlling interest in the company to Olivetti, with whom it had previously partnered to make a printer for the BBC Micro. In general, if you're selling your computer firm to a typewriter company, that's not a good sign.

Acorn sold a development board with the ARM chip to researchers and hobbyists, but it was limited to the market of existing BBC Micro owners. What the company needed was a brand new computer to really showcase the power of this new CPU. Before it could do this, it needed to upgrade the original ARM just a bit.

The ARM V2 came out in 1986 and added support for coprocessors (such as a floating point coprocessor, which was a popular add-on for computers back then) and built-in hardware multiplication circuits. It was fabricated on a 2 micrometer process, which meant that Acorn could boost the clock rate to 8 MHz without consuming any more power. But a CPU alone wasn't enough to build a complete computer, so the team also built a graphics controller chip, an input/output controller, and a memory controller. By 1987, all four chips, including the ARM V2, were ready, along with a prototype computer to put them in. To reflect its advanced thinking capabilities, the company named it the Acorn Archimedes.

One of the first models of the Acorn Archimedes. (Credit: Wikipedia)

Given that it was 1987, personal computers were now expected to come equipped with more than just a prompt to type in BASIC instructions. Users demanded pretty graphical user interfaces like those on the Amiga, the Atari ST, and the Macintosh. Acorn had set up a remote software development team in Palo Alto, home of Xerox PARC[26], to design a next-generation operating system for the Archimedes. It was called ARX, and it promised preemptive multitasking and multiple user support.

ARX was slow, but the bigger problem was that it was late. Very late. The Acorn Archimedes was getting ready to ship, and the company didn't have an operating system to run on it. This was a crisis situation, so Acorn management went to talk to Paul Fellows, the head of the Acornsoft team who had written a bunch of languages for the BBC Micro. They asked him, "Can you and your team write and ship an operating system for the Archimedes in five months?"

According to Fellows[27], "I was the fool who said yes, we can do it." Five months is not a lot of time to make an operating system from scratch. The quick-and-dirty OS was called "Project Arthur," possibly after the famous British computer scientist Arthur Norman, but also possibly a shortening of "ARm by THURsday!" It started as an extension of BBC BASIC. Richard Manby wrote a program called "Arthur Desktop" in BASIC, merely as a demonstration of what you could do with the window manager the team had developed. But they were out of time, so the demo was burned into the read-only memory (ROM) of the first batch of computers.

A screenshot of the Arthur operating system. (Credit: Guidebook Gallery[28])

The first Archimedes models shipped in June of 1987, some of them still sporting the BBC branding. The computers were definitely fast, and they were a good deal for the money--the introductory price was £800, which at the time would have been about $1,300. This compared favorably to a Macintosh II, which cost $5,500 in 1987 and had similar computing power. But the Macintosh had PageMaker, Microsoft Word, and Excel, along with tons of other useful software. The Archimedes was a brand new computer platform, and at its release, there wasn't much software available.

The computing world was rapidly converging on IBM PC compatibles and Macintoshes (and for a few more years, Amigas) and everyone else found themselves getting squeezed out. The Archimedes computers got good reviews in the UK press and gained a passionate fan base, but fewer than 100,000 systems were sold over the first couple of years.

The seed grows

Acorn moved quickly to fix bugs in Arthur and to work on a replacement operating system, RISC OS, with more modern features. RISC OS shipped in 1989, and a new revision of the ARM CPU, V3, followed soon afterward. The V3 chip was built on a 1.5 micrometer process, which shrank its ARM2 core to approximately one quarter of the available die space. This left room to include 4 kilobytes of fast level-1 cache memory, and the clock speed was also increased to 25 MHz.

While these improvements were impressive, engineers like Sophie Wilson believed the ARM chip could be pushed even further. But there were limits to what could be done with Acorn's rapidly dwindling resources. In order to realize these dreams, the ARM team needed to look for an outside investor.

And that's when a representative from another computer company, named after a popular fruit, walked in the door.

Tune in next month for the second installment of the ARM story.

References

  2. ^ famous (arstechnica.com)
  4. ^ help kids with their homework (www.youtube.com)
  6. ^ interview (archive.computerhistory.org)
  8. ^ IBM research papers (www.ibm.com)
  11. ^ here (codehiddenlanguage.com)
  14. ^ pipelining (en.wikipedia.org)
  16. ^ Hello World, I am ARM (www.computinghistory.org.uk)
  18. ^ Center for Computing History, UK (www.computinghistory.org.uk)
  20. ^ WikiChip.org (en.wikichip.org)
  21. ^ interactive block diagram (codehiddenlanguage.com)
  22. ^ cache (en.wikipedia.org)
  23. ^ barrel shifter (en.wikipedia.org)
  24. ^ computer history (arstechnica.com)
  26. ^ Xerox PARC (arstechnica.com)
  27. ^ According to Fellows (www.rougol.jellybaby.net)
  28. ^ Guidebook Gallery (guidebookgallery.org)