Design and Implementation of a Simple 8-Bit CPU
Copyright © 1997, 2000 by Rex N. Fisher
This section highlights the importance of using a working CPU model in Electronics Engineering Technology computer courses. It also describes the investigation of existing CPU models that were considered for hardware implementation, and the results of that research.
2.1 The Need for a Simple CPU Model that Actually Works
The two-year Electronics Engineering Technology program at Ricks College has several courses about microprocessors/microcontrollers and computer systems. A prerequisite for all of these courses is EET 151 (Digital Circuits). EET 151 covers combinational SSI circuits, as well as MSI circuits such as ALUs, PALs, and semiconductor memories. The treatment of sequential circuits includes flip-flops, latches, registers, counters, and an introduction to state machines. Students in this class learn digital hardware design at a deeper level than do most two-year Computer Science or Vocational Technology students. Throughout the course, the students learn that these circuits are the basic building blocks of computers.
Textbooks for the subsequent computer courses, however, treat the internal circuitry of a CPU simply as a functional block diagram. The students are told that this block is a register and that block is an ALU, etc., but the bridge between the block diagram and the circuits studied in EET 151 is never completely built. Students often ask, "Can you show me where that functional block is in a real computer?" They want to make the connection back to the hardware they learned about in Digital Circuits.
Few of the CPU models found in these beginner-level computer textbooks explain how the CPU "knows" what instruction was fetched. Somehow a block called "control logic" sets up the ALU for the correct operation and loads all of the correct registers at the proper time. Students in Electronics Engineering Technology, however, want to know how this is accomplished.
By building a simple demonstration CPU that actually works, both levels of understanding can be satisfied. The block diagram can be used to learn the basic principles, and the detailed design of the functional CPU can tie the functional blocks back to actual circuits.
2.2 Initial CPU Model Specifications
Ten loosely defined CPU specifications were proposed for this project. The CPU had to be powerful enough to illustrate most of the common features in modern CPUs. It would also have to be simple enough to implement at the IC level within a short amount of time. Although some of these specifications were later modified or discarded, they were used to evaluate the existing CPU models and formed the design basis for the P8 CPU.
1. The CPU should be simple enough to be implemented with a "small" number of ICs (about 20) from the TTL logic family.
2. The control unit should support conditional branching.
3. Include a hardware interrupt.
4. Use an 8-bit datapath and 8-bit address bus to reduce complexity.
5. Incorporate a "small" number of internal data registers (about 3) to demonstrate how they function.
6. The ALU should implement an assortment of arithmetic and logical operations, in order to make the instruction set as complete as possible.
7. The instruction set should include "several" (3 or 4) common addressing modes.
8. Include instructions that are usually found in real CPUs.
9. Implement simple stack operations.
10. No pipelining or other parallel processing techniques should be attempted. They would detract from the simple operation required for easy understanding, and would certainly conflict with goal #1.
2.3 Alternative Computer Models
This section briefly describes four different CPU models that have been developed for use in college-level computer courses. They are identified in this report by the names of their respective developers. Each one is compared to the design goals in the previous section. The features, advantages, and disadvantages of each are identified, and the suitability of each model for hardware implementation is discussed.
2.3.1 Lynn CPU
This CPU model was developed by Doug Lynn for use in his EE 340 and EE 441 courses at the University of Idaho (Lynn, 1996). A diagram of the Lynn CPU is shown in Figure 2.1.
Figure 2.1: Functional Block Diagram of Lynn CPU (Lynn, 1996, EE 441 Handout).
Hardware Features
CPU Architecture:    Accumulator-Based CPU
Data Bits:           8
Address Bits:        5
Internal Registers:  1 Accumulator (A)
Flags:               Zero (Z), Carry (C), Negative (N), Overflow (V)
Instruction Set
Category          Instruction         Mode       Operation
Input/Output:     None                --         --
Program Control:  BRU address         Direct     PC <-- address
                  BRZ address         Direct     If Z = 1: PC <-- address
Data Transfer:    LOAD address        Direct     A <-- MEM(address)
                  STORE address       Direct     MEM(address) <-- A
Arithmetic:       ADD address         Direct     A <-- A + MEM(address)
                  SUB address         Direct     A <-- A - MEM(address)
Logical:          SHR                 Register   A <-- 0 ## [A]7..1
Evaluation
Estimated Number of ICs:  20 (12 + State Machine + Memory)
Conditional Branching?:   Yes
Hardware Interrupt?:      No
8-Bit Data Bus?:          Yes
8-Bit Address Bus?:       No (5 bits)
Number of Registers:      1 (Accumulator)
# of ALU Operations:      3
# of Addressing Modes:    2
Stack Operations?:        No
Pipelining?:              No
Decision
The Lynn CPU would be easy enough to build, but it does not have many features common to contemporary CPUs. Its instruction set is too limited and does not implement all of the hardware features. Different addressing modes cannot be demonstrated effectively. The available memory is too small to run an actual program.
This CPU model would not be an acceptable candidate for implementation.
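As a rough illustration of the register transfers in the Lynn instruction set above, the following Python sketch interprets the seven listed instructions. Several things here are assumptions made for the example and are not taken from Lynn's design: the program is held as a list of (mnemonic, address) tuples separate from data memory, branch addresses index that list, execution stops when the program counter runs past the end, and Z is simply recomputed from the accumulator after every non-branch instruction. The names `run` and `MULTIPLY` are hypothetical.

```python
MEM_SIZE = 2 ** 5  # 5 address bits give 32 memory locations


def run(program, data, max_steps=1000):
    """Interpret (mnemonic, address) tuples; return (A, memory)."""
    mem = list(data) + [0] * (MEM_SIZE - len(data))
    a, z, pc = 0, 0, 0
    for _ in range(max_steps):
        if pc >= len(program):        # assumed stop condition
            break
        op, addr = program[pc]
        pc += 1
        if op == "BRU":               # PC <-- address
            pc = addr
            continue
        if op == "BRZ":               # If Z = 1: PC <-- address
            if z:
                pc = addr
            continue
        if op == "LOAD":              # A <-- MEM(address)
            a = mem[addr]
        elif op == "STORE":           # MEM(address) <-- A
            mem[addr] = a
        elif op == "ADD":             # A <-- A + MEM(address)
            a = (a + mem[addr]) & 0xFF
        elif op == "SUB":             # A <-- A - MEM(address)
            a = (a - mem[addr]) & 0xFF
        elif op == "SHR":             # A <-- 0 ## [A]7..1
            a >>= 1
        else:
            raise ValueError(f"unknown instruction: {op}")
        z = int(a == 0)               # assumed flag behavior
    return a, mem


# Multiply MEM(3) by MEM(0) using repeated addition; result in MEM(2).
MULTIPLY = [
    ("LOAD", 0), ("BRZ", 8), ("SUB", 1), ("STORE", 0),
    ("LOAD", 2), ("ADD", 3), ("STORE", 2), ("BRU", 0),
]

if __name__ == "__main__":
    acc, memory = run(MULTIPLY, [3, 1, 0, 5])
    print(memory[2])
```

Even this small loop uses eight instructions and four data locations. If program and data share the same 32 locations in the real machine, that is a quarter of the memory, which is consistent with the concern above about running an actual program.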
2.3.2 Streib CPU
William Streib presents this model in his book (Streib, 1997). See the block diagram in Figure 2.2.
Figure 2.2: Functional Block Diagram of Streib CPU (Streib, 1997, p. 323).
Hardware Features
CPU Architecture:    Accumulator-Based
Data Bits:           8
Address Bits:        16
Internal Registers:  1 Accumulator (A)
                     2 General Purpose Registers (B, C)
                     2 Special Purpose Registers (H, L)
Flag:                Zero (Z)
Instruction Set
Category          Instruction         Mode       Operation
Input/Output:     INPUT address       Direct     A <-- PORT(address)
Program Control:  JUMP address        Direct     PC <-- address
                  COND JUMP address   Direct     If Z = 0: JUMP address
Data Transfer:    LOAD A address      Direct     A <-- MEM(address)
                  MOVE A, B           Register   A <-- B
Arithmetic:       None                --         --
Logical:          AND data            Immediate  A <-- A AND data
Evaluation
Estimated Number of ICs:  24 (14 + Decoder + Control & Timing + Memory)
Conditional Branching?:   Yes
Hardware Interrupt?:      No
8-Bit Data Bus?:          Yes
8-Bit Address Bus?:       No (16 bits)
Number of Registers:      5 (1 Accumulator + 2 GPs + 2 SPs)
# of ALU Operations:      1
# of Addressing Modes:    3
Stack Operations?:        No
Pipelining?:              No
Decision
The Streib CPU would be only slightly more difficult to build than the Lynn CPU. There is ample memory space. It has many hardware features common to contemporary CPUs, but its instruction set does not implement most of them. The instruction set uses three different addressing modes, but is very weak functionally -- especially the available ALU operations.
This CPU model would not be an acceptable candidate for implementation.
2.3.3 Miller CPU
Michael Miller actually has his own name for this one -- Binary Architecture Basic Electronic (BABE) Computer (Miller, 1997).

Figure 2.3: Functional Block Diagram of Miller (BABE) CPU (Miller, 1997, p. 497).
Hardware Features
CPU Architecture:    Accumulator-Based
Data Bits:           8
Address Bits:        16
Internal Registers:  1 Accumulator (A)
                     1 General Purpose Register (B)
                     1 I/O Register
Flags:               Zero (ZF), Sign (SF), Carry (CF)
Instruction Set
Category          Instruction         Mode       Operation
Input/Output:     IN                  Register   A <-- I/O Register
                  OUT                 Register   I/O Register <-- A
Program Control:  JMP address         Direct     IP <-- address
                  JZ address          Direct     If Z = 1: JMP address
                  JNZ address         Direct     If Z = 0: JMP address
                  JM address          Direct     If S = 1: JMP address
                  JP address          Direct     If S = 0: JMP address
                  JC address          Direct     If C = 1: JMP address
                  JNC address         Direct     If C = 0: JMP address
                  HALT                Implied    PC <-- PC
Data Transfer:    MOV A, [address]    Direct     A <-- MEM(address)
                  MOV B, [address]    Direct     B <-- MEM(address)
                  MOV A, data         Immediate  A <-- data
                  MOV B, data         Immediate  B <-- data
                  MOV [address], A    Direct     MEM(address) <-- A
                  MOV [address], B    Direct     MEM(address) <-- B
Arithmetic:       ADD [address]       Direct     A <-- A + MEM(address)
                  ADD data            Immediate  A <-- A + data
                  ADD B               Register   A <-- A + B
                  SUB [address]       Direct     A <-- A - MEM(address)
                  SUB data            Immediate  A <-- A - data
                  SUB B               Register   A <-- A - B
                  DEC A               Register   A <-- A - 1
                  DEC B               Register   B <-- B - 1
                  INC A               Register   A <-- A + 1
                  INC B               Register   B <-- B + 1
Logical:          SHL                 Register   A <-- [A]6..0 ## 0
                  SHR                 Register   A <-- 0 ## [A]7..1
Evaluation
Estimated Number of ICs:  55 (40 + ALU* + Control Logic + Memory)
Conditional Branching?:   Yes
Hardware Interrupt?:      No
8-Bit Data Bus?:          Yes
8-Bit Address Bus?:       No (16 bits)
Number of Registers:      3 (1 Accumulator + 1 GP + 1 SP)
# of ALU Operations:      10
# of Addressing Modes:    3
Stack Operations?:        No
Pipelining?:              No
* There are no standard TTL ALU devices that implement a Shift Right function.
Decision
There is ample memory space for even lengthy programs. It has many hardware features common to contemporary CPUs, and its instruction set implements most of them. Although the instruction set uses three different addressing modes, one of the most powerful -- indirect -- is not included. This is a surprising omission for an otherwise very complete CPU model. The biggest disadvantage of this one -- enough to eliminate it from consideration -- is that it would be incredibly time consuming to build!
The Miller CPU model would not be an acceptable candidate for implementation.
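The SHL and SHR entries in the instruction set above can be read with `##` as bit concatenation and `[A]7..1` as the slice of bits 7 down to 1 of the accumulator. Under that reading, both operations are ordinary logical shifts on an 8-bit value. A minimal Python sketch follows; the function names `shl` and `shr` are chosen only for illustration.

```python
def shl(a):
    """A <-- [A]6..0 ## 0: keep bits 6..0 and append a 0 on the right."""
    return ((a & 0x7F) << 1) & 0xFF


def shr(a):
    """A <-- 0 ## [A]7..1: place a 0 on the left and keep bits 7..1."""
    return (a & 0xFF) >> 1


if __name__ == "__main__":
    value = 0b10010110
    print(f"{shl(value):08b}")  # bit 7 is dropped, a 0 enters at bit 0
    print(f"{shr(value):08b}")  # bit 0 is dropped, a 0 enters at bit 7
```

In both cases one bit of the original value is discarded. This sketch does not model where that bit goes, because the table above does not say how the shifts affect the flags.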
2.3.4 McCalla CPU
This CPU model is Thomas McCalla's design (McCalla, 1992). See Figure 2.4 for his BC-8A Computer.

Figure 2.4: Functional Block Diagram of McCalla (BC-8A) CPU (McCalla, 1992, p. 673).
Hardware Features
CPU Architecture:    Accumulator-Based CPU
Data Bits:           8
Address Bits:        5
Internal Registers:  1 Accumulator (A)
                     1 General Purpose Register (B)
                     1 I/O Register
Flags:               None
Instruction Set
Category          Instruction         Mode       Operation
Input/Output:     INP                 Register   A <-- PORT
                  OUT                 Register   PORT <-- A
Program Control:  JMP address         Direct     PC <-- address
Data Transfer:    LDADIR address      Direct     A <-- MEM(address)
                  STADIR address      Direct     MEM(address) <-- A
Arithmetic:       ADDIR address       Direct     A <-- A + MEM(address)
                  SUBDIR address      Direct     A <-- A - MEM(address)
                  CLA                 Register   A <-- 0
Logical:          None                --         --
Evaluation
Estimated Number of ICs:  39
Conditional Branching?:   No
Hardware Interrupt?:      No
8-Bit Data Bus?:          Yes
8-Bit Address Bus?:       No (5 bits)
Number of Registers:      3 (Accumulator + 1 GP + 1 SP)
# of ALU Operations:      3
# of Addressing Modes:    2
Stack Operations?:        No
Pipelining?:              No
Decision
Almost no hardware design effort would be required for the McCalla CPU because the schematic is provided (see Figure 2.5). That indicates that others also believe it is important to show how a simple CPU model can actually be implemented in hardware. This CPU, however, has nearly twice as many ICs as the goal specifies. The instruction set is similar to that of the Lynn CPU and has the same inadequacies. It has only two addressing modes and no conditional branches at all. The available memory is also too small to run an actual program.
This CPU model would not be an acceptable candidate for implementation.

Figure 2.5: Schematic Diagram of McCalla CPU (McCalla, 1992, pp. 678-679).
2.4 Literature Review Conclusions
The fact that McCalla provided a complete schematic for his CPU and Streib illustrated a partial IC-level design shows that there is a need for this project. None of the four CPU models reviewed, however, had all of the features listed in Section 2.2. This was expected, but each one also had at least one major drawback that totally eliminated it from consideration.
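The comparison behind this conclusion can be tabulated. The following Python sketch copies the values from the four Evaluation tables in Section 2.3 and checks them against a subset of the goals in Section 2.2. The dictionary layout, the function name `unmet_goals`, and the numeric cutoffs are assumptions made for illustration and are not criteria stated in this report. In particular, "about 20" ICs is treated as at most 25.

```python
# Values copied from the Evaluation tables in Section 2.3.
CPUS = {
    "Lynn":    {"ics": 20, "branch": True,  "interrupt": False,
                "addr_bits": 5,  "modes": 2, "stack": False},
    "Streib":  {"ics": 24, "branch": True,  "interrupt": False,
                "addr_bits": 16, "modes": 3, "stack": False},
    "Miller":  {"ics": 55, "branch": True,  "interrupt": False,
                "addr_bits": 16, "modes": 3, "stack": False},
    "McCalla": {"ics": 39, "branch": False, "interrupt": False,
                "addr_bits": 5,  "modes": 2, "stack": False},
}


def unmet_goals(cpu):
    """Return the Section 2.2 goals (as interpreted here) that a model misses."""
    checks = {
        "about 20 ICs": cpu["ics"] <= 25,            # assumed cutoff
        "conditional branching": cpu["branch"],
        "hardware interrupt": cpu["interrupt"],
        "8-bit address bus": cpu["addr_bits"] == 8,
        "3 or 4 addressing modes": 3 <= cpu["modes"] <= 4,
        "stack operations": cpu["stack"],
    }
    return [goal for goal, met in checks.items() if not met]


if __name__ == "__main__":
    for name, cpu in CPUS.items():
        print(f"{name}: {', '.join(unmet_goals(cpu))}")
```

Under these assumptions, every model misses the interrupt, stack, and 8-bit address bus goals, so the deciding factors are the ones named in the individual Decision paragraphs: IC count, instruction set strength, and memory size.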
A new CPU model was developed for this project. Called the P8 (Project 8-bit) CPU, it incorporates most of the best features found in the other four CPUs. It is simple, yet has a complete and reasonably powerful instruction set. Many addressing modes found in contemporary CPUs are supported. The instruction set is also orthogonal.
2.5 Final CPU Model Specifications
During the development of the P8 CPU, some of the original design specifications were changed. Below are the ten design goals listed in Section 2.2, along with the actual results of those goals.
1. Original: The CPU should be simple enough to be implemented with a "small" number of ICs (about 20) from the TTL logic family.
Final: The CPU consists of 23 ICs. Only 22 are required for proper operation, but another was added to gain visibility into one of the registers.
2. Original: The control unit should support conditional branching.
Final: "Jump If Zero" and "Jump If Not Zero" are supported. Zero is the only condition bit, but it is sufficient to illustrate the principle of conditional branching.
3. Original: Include a hardware interrupt.
Final: This was not implemented. It would have required three additional ICs and a more complicated control unit. The additional complexity did not seem worth it. The fact that none of the other CPU designs incorporated a hardware interrupt indicates that other designers of educational CPUs agree with this decision.
4. Original: Use an 8-bit datapath and 8-bit address bus to reduce complexity.
Final: No change.
5. Original: Incorporate a "small" number of internal data registers (about 3) to demonstrate how they function.
Final: There are two data registers in the final design.
6. Original: The ALU should implement an assortment of arithmetic and logical operations, in order to make the instruction set as complete as possible.
Final: The ALU performs the following operations:
Compare
Add
Subtract
Decrement
Logical OR
Logical NOT (Invert)
Logical Shift Left
7. Original: The instruction set should include "several" (3 or 4) common addressing modes.
Final: The following addressing modes are supported:
Direct
Indirect
Register
Immediate
8. Original: Include instructions that are usually found in real CPUs.
Final: Two-thirds of the most frequently used instructions are implemented.
9. Original: Implement simple stack operations.
Final: This was not implemented. It would have required four additional ICs and a more complicated control unit. The additional complexity did not seem worth it. The fact that none of the other CPU designs incorporated stack operations indicates that other designers of educational CPUs agree with this decision.
10. Original: No pipelining or other parallel processing techniques should be attempted. They would detract from the simple operation required for easy understanding, and would certainly conflict with goal #1.
Final: No change.
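The final specifications for goals 2, 6, and 7 can be illustrated together. The following Python sketch models the seven listed ALU operations on an 8-bit datapath with a single Zero flag, the four listed addressing modes, and the two conditional jumps. It is only a behavioral sketch under stated assumptions, not the P8 hardware design: the mnemonics, the function names `alu`, `operand`, and `next_pc`, the treatment of Compare as a subtraction that sets the flag without changing the operand, and the reading of Indirect as memory-indirect are all hypothetical.

```python
MASK = 0xFF  # 8-bit datapath


def alu(op, a, b=0):
    """Return (result, zero_flag) for one of the seven listed operations."""
    if op == "ADD":
        r = a + b
    elif op in ("SUB", "CMP"):
        r = a - b
    elif op == "DEC":
        r = a - 1
    elif op == "OR":
        r = a | b
    elif op == "NOT":
        r = ~a
    elif op == "SHL":
        r = a << 1
    else:
        raise ValueError(f"unknown ALU operation: {op}")
    r &= MASK                      # wrap to 8 bits
    zero = int(r == 0)
    # Assumed: Compare only updates the flag and leaves the operand as is.
    return (a if op == "CMP" else r), zero


def operand(mode, x, mem, regs):
    """Resolve an operand under the four listed addressing modes."""
    if mode == "immediate":
        return x                   # x is the data itself
    if mode == "register":
        return regs[x]             # x names a register
    if mode == "direct":
        return mem[x]              # x is the address of the data
    if mode == "indirect":
        return mem[mem[x]]         # x is the address of the data's address
    raise ValueError(f"unknown addressing mode: {mode}")


def next_pc(kind, zero, pc, target):
    """Jump If Zero ('JZ') or Jump If Not Zero ('JNZ')."""
    taken = (zero == 1) if kind == "JZ" else (zero == 0)
    return target if taken else pc


if __name__ == "__main__":
    mem = [0] * 256                # 8-bit address bus: 256 locations
    mem[0x10], mem[0x20] = 0x20, 0x05
    value = operand("indirect", 0x10, mem, {})
    result, zero = alu("CMP", value, 5)
    print(value, zero, next_pc("JZ", zero, pc=3, target=9))
```

A single Zero flag is enough to build loops and equality tests this way, which matches the point made under goal 2.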