Four bits of History
By Joel Jordan


     Today's latest Pentium 4 processors switch at an astounding 2.2 billion cycles per second, contain 42 million transistors, and have gate lengths as small as 130 billionths of a meter.  A little over 30 years ago, when computers this powerful existed only in engineers' wildest dreams, the modern-day Pentium's oldest ancestor was created: the Intel 4004.

     In 1971, digital computers had been around for 25 years, and they had progressed from vacuum tube-filled behemoths to desk-sized boxes of integrated circuits.  The now ubiquitous MOS technology was still in its infancy compared to the established bipolar process, and its limitations seemed to make the large-scale integration needed for a microprocessor impractical.  The 4004, with its 2,300 transistors, proved that the MOS process was ready for the challenge.

     The technical details would shock modern computer users: a 740 kHz clock speed, a 10.8 microsecond instruction cycle, and 4-bit arithmetic.  Rather than shipping the chip in the large 40-pin package used by competitors, Intel insisted on a 16-pin package, requiring the address and data buses to be multiplexed on four pins.  Instead of the standard 5V power supply used by TTL logic chips, the 4004 required 15V, which made interfacing with standard components more complex.  In addition, because of the multiplexed bus, the 4004 could interface directly only with the 4001 ROM and 4002 RAM chips.  Two additional chips, the 4008 and 4009, were required to use standard RAM and ROM parts.  Rather than use a single clock, the 4004 required two 740 kHz square waves, 180 degrees out of phase with each other.

     The 46 instructions offered subroutines, conditional branching, and even indirect addressing.  The set included instructions for addition and subtraction, but not for Boolean logic.  Rather than having several general-purpose registers, the chip provided a single accumulator and sixteen 4-bit "scratchpad" registers, also addressable as eight 8-bit register pairs.
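
     The register-pair arrangement can be sketched in a few lines of Python.  This is only an illustrative model, not Intel's design: the class and method names are made up for this example.  It assumes the layout in which the even-numbered register of a pair holds the high nibble and the odd-numbered register holds the low nibble.

```python
class Scratchpad:
    """Toy model of the 4004's sixteen 4-bit scratchpad registers."""

    def __init__(self):
        self.regs = [0] * 16  # R0..R15, each holding one nibble

    def write_pair(self, pair, value):
        """Store an 8-bit value in register pair 0..7."""
        self.regs[2 * pair] = (value >> 4) & 0xF   # even register: high nibble
        self.regs[2 * pair + 1] = value & 0xF      # odd register: low nibble

    def read_pair(self, pair):
        """Read the 8-bit value held in register pair 0..7."""
        return (self.regs[2 * pair] << 4) | self.regs[2 * pair + 1]


pad = Scratchpad()
pad.write_pair(1, 0xA7)  # pair 1 is R2 (high nibble) and R3 (low nibble)
print(pad.regs[2], pad.regs[3], hex(pad.read_pair(1)))
```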

     Instructions were either one byte or two bytes long.  Since the 4004 had only a 4-bit multiplexed bus, it spent three clock cycles sending the 12-bit instruction address one nibble at a time, then two more fetching the 8-bit instruction.  Two-byte instructions required a second such cycle to fetch the instruction's additional data.  This bus bottleneck contributed to the processor's fairly long instruction cycle.
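
     The timing arithmetic can be checked with a short Python sketch.  The constant and function names are illustrative.  The sketch assumes that three further clocks follow the fetch to execute the instruction, giving eight clocks per instruction cycle, and it assumes a 740 kHz clock, the figure usually quoted for the 4004.

```python
# Clocks spent in each part of one 4004 instruction cycle.
ADDRESS_CLOCKS = 3  # address sent over the 4-bit bus in three steps
FETCH_CLOCKS = 2    # instruction byte read back in two steps
EXECUTE_CLOCKS = 3  # assumption: three more clocks to execute


def instruction_time_us(clock_hz, n_bytes=1):
    """Time in microseconds for a one- or two-byte instruction."""
    clocks_per_cycle = ADDRESS_CLOCKS + FETCH_CLOCKS + EXECUTE_CLOCKS
    return clocks_per_cycle * n_bytes / clock_hz * 1e6


one_byte = instruction_time_us(740_000)
two_byte = instruction_time_us(740_000, n_bytes=2)
print(f"one-byte: {one_byte:.1f} us, two-byte: {two_byte:.1f} us")
```

     Under these assumptions a one-byte instruction takes about 10.8 microseconds, and a two-byte instruction takes twice as long.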

     To provide an idea of what coding for the 4004 was like, a simple program is provided below.  Note that, as there are few 4004 chips lying around, this has not been tested on real hardware.  The adventurous, however, may assemble it with the AS macroassembler, which includes support for this historic chip.

Chair: Joel Jordan
Email: sigarch@acm.uiuc.edu

Meeting Time: Thursday 7:00 PM
Place:  L510 DCL

     SIGArch is continuing to work on its Engineering Open House project, Digital Dancer. We're finishing work on the hardware portion of our project: the dance stage. We're also beginning the final stages of coding the software portion of the project, which will create dance steps for players to perform in real time to arbitrary music.
     Until EOH, SIGArch also meets on Saturdays at 1pm in L510 DCL.

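; fibonacci.asm
; the Fibonacci sequence on the Intel 4004 (untested)
; each 8-bit term is written to ROM I/O port 0, low nibble first
   CPU 4004      ; select the 4004 target in AS
   FIM R0R1, 0   ; pair 0 = 00H, the address of 4001 ROM chip 0
   SRC R0R1      ; send that address so WRR writes to chip 0
   FIM R0R1, 1   ; pair 0 = 01H (R0 = high nibble, R1 = low nibble)
   FIM R2R3, 1   ; pair 1 = 01H (R2 = high nibble, R3 = low nibble)
LOOP:
   LD R1         ; load low nibble of pair 0 into accumulator
   ADD R3        ; add low nibble of pair 1 (carry is clear here)
   WRR           ; output value to ROM I/O port
   XCH R1        ; store result back in R1
   LD R0         ; load high nibble of pair 0 into accumulator
   ADD R2        ; add high nibble of pair 1 plus carry
   WRR           ; output value to ROM I/O port
   XCH R0        ; store result back in R0
   JCN C, DONE   ; stop once the 8-bit sum overflows
   CLB           ; clear carry and accumulator
   LD R3         ; load low nibble of pair 1 into accumulator
   ADD R1        ; add low nibble of pair 0
   WRR           ; output value to ROM I/O port
   XCH R3        ; store result back in R3
   LD R2         ; load high nibble of pair 1 into accumulator
   ADD R0        ; add high nibble of pair 0 plus carry
   WRR           ; output value to ROM I/O port
   XCH R2        ; store result back in R2
   JCN C, DONE   ; stop once the 8-bit sum overflows
   CLB           ; clear carry and accumulator
   JUN LOOP      ; unconditional jump to loop
DONE:
   JUN DONE      ; infinite loop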
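
     For readers without an assembler handy, the arithmetic the listing is meant to perform can be sketched in Python.  This is a rough model under stated assumptions, not an emulator: the function names are made up for this example, and it collects only the terms that fit in eight bits rather than reproducing every port write.

```python
def add4(a, b, carry):
    """4-bit add with carry in, in the manner of the 4004 ADD instruction."""
    total = a + b + carry
    return total & 0xF, total >> 4  # (4-bit result, carry out)


def add8(x, y):
    """Add two 8-bit values one nibble at a time, low nibble first."""
    low, carry = add4(x & 0xF, y & 0xF, 0)
    high, carry = add4(x >> 4, y >> 4, carry)
    return (high << 4) | low, carry


def fibonacci_terms():
    """Alternate a = a + b and b = b + a until an 8-bit add overflows."""
    a, b = 1, 1
    terms = []
    while True:
        a, carry = add8(a, b)
        if carry:
            break
        terms.append(a)
        b, carry = add8(b, a)
        if carry:
            break
        terms.append(b)
    return terms


print(fibonacci_terms())
```

     The loop stops as soon as a sum no longer fits in eight bits, which mirrors the jump-on-carry tests in the assembly listing.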

References:
     "Intel Museum - A history of the Microprocessor" available at http://www.intel.com/intel/intelis/museum/exhibit/hist_micro/index.htm

     "The Intel 4004" available at http://www.intel4004.com

     "The Macroassembler AS" available at http://john.ccac.rwth-aachen.de:8000/as/
