Course Detail
Computer Architecture & Design
Null, Linda, and Julia Lobur. Essentials of Computer Organization and Architecture. Jones & Bartlett Learning.

Objectives

1. Describe computer systems, their main components, key terminology, and the history of computer architecture
2. Explain Moore’s Law, Computer Level Hierarchy, and Cloud Computing
3. Interpret Von Neumann Architecture and Parallel Computing
4. Utilize Data Representation in Computer Systems
5. Use Boolean Algebra, Digital Logic and Digital Components
6. Interpret Finite-State Machines, CPU Basics and Organization, and Instruction Set Architectures
7. Use various Addressing schemes, Data and Memory Types (e.g. Cache Memory)
8. Measure I/O and Performance and apply Amdahl’s Law
9. Implement I/O Buses and Interfaces, Disk Technology and RAID
10. Describe System Software and Operating Systems
11. Leverage Protected Environments and Virtual Machines
12. Explain Dynamic Link Libraries, Compilers and Interpreters
13. Demonstrate an understanding of Alternative Architectures and RISC Machines
14. Explain advanced concepts such as Parallel and Multiprocessor Architectures, Shared Memory Multiprocessors and Distributed Computing


Takeaways

- System Level Bus connects CPU, memory, and I/O devices.
- The CPU consists of the ALU, registers, and control unit, linked by an internal bus.
- Control Unit directs the execution of stored program instructions.
- ALU performs arithmetic and logical operations.
- Registers are temporary storage areas for instructions or data.
- IEEE, ITU, and ISO are organizations related to electrical engineering, telecommunications, and worldwide standards.
- Historical development includes Mechanical Calculating Machines, Vacuum Tube Computers, Transistorized Computers, Integrated Circuit Computers, and VLSI Computers.
- Moore's Law (1965) states that transistor density doubles every year; the contemporary version puts the doubling at roughly every 18 months.
- Rock's Law states that capital equipment cost for semiconductors doubles every 4 years.
- Computer levels include User Level, High-Level Language Level, Assembly Language Level, System Software Level, Machine Level, Control Level, and Digital Logic Level.
- The von Neumann Model includes CPU, main memory, and I/O system.
- Von Neumann computers suffer a bottleneck because a single bus carries both instructions and data, while Harvard architectures use separate instruction and data buses.
- The CPU consists of registers (including the program counter), an ALU, and a control unit, connected by buses.
- Multicore processors have multiple cores sharing memory and resources.

- Floating-point representation can lead to overflow or underflow.
- Character codes convert human-readable characters to computer-readable bit patterns.
- BCD, EBCDIC, and ASCII are examples of character codes.
- Unicode, originally defined as a 16-bit code, encodes characters from all of the world's writing systems.
- IEEE-754 standard specifies single and double precision floating-point formats.
- Each hexadecimal digit corresponds to a four-bit group called a nibble.
- Floating-point operations are not necessarily associative or distributive.
- Prefixes like Kilo-, Mega-, Giga-, Tera-, Peta-, Exa-, Zetta-, and Yotta- denote large values.
- Hertz measures clock cycles per second, indicating processor speed.
- Kilobyte, Megabyte, and Gigabyte are units of measurement for memory and storage.

Laws of Logic:
- Identity Law: 1x = x, 0 + x = x
- Null Law: 0x = 0, 1 + x = 1
- Idempotent Law: xx = x, x + x = x
- Inverse Law: xx' = 0, x + x' = 1
- Commutative Law: xy = yx, x + y = y + x
- Associative Law: (xy)z = x(yz), (x + y) + z = x + (y + z)
- Distributive Law: x + yz = (x + y)(x + z), x(y + z) = xy + xz
- Absorption Law: x(x + y) = x, x + xy = x
- De Morgan's Law: (xy)' = x' + y', (x + y)' = x'y'
- Double Complement Law: x'' = x
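
These identities can be verified exhaustively. A minimal Python sketch (not part of the course materials) checking the Distributive and De Morgan laws over all 0/1 inputs:

```python
from itertools import product

# Exhaustively check two of the Boolean identities over all 0/1 inputs.
for x, y, z in product((0, 1), repeat=3):
    x_, y_ = 1 - x, 1 - y                         # complements
    # Distributive Law: x + yz = (x + y)(x + z)
    assert (x | (y & z)) == ((x | y) & (x | z))
    # De Morgan's Law: (xy)' = x' + y'
    assert (1 - (x & y)) == (x_ | y_)
print("Distributive and De Morgan laws hold for all inputs.")
```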

Logic Gates and Circuits:
- Logic Gates: AND gate, OR gate, NOT gate, XOR gate
- K-map: Guidelines for grouping and optimizing
- Combinational circuits: Produce outputs based on inputs

Sequential Circuits:
- State changes occur on clock ticks
- Edge-triggered and level-triggered circuits
- Bus Arbitration: Daisy chain, Centralized parallel, Distributed using self-selection, Distributed using collision detection

CPU Components:
- Datapath and control unit
- Registers and arithmetic-logic unit (ALU)
- Buses: Data lines, control lines, and address lines

Interrupts:
- Events of higher priority alter program execution
- Interrupt handlers and software interrupts
- Fetch-decode-execute cycle

Moore and Mealy Machines:
- Moore machines associate outputs with states
- Mealy machines associate outputs with transitions

Fetch-Decode-Execute Cycle:
- Fetch: Load PC into MAR, fetch instruction into IR, increment PC
- Decode: Determine operation and fetch operand if needed
- Execute: Perform instruction

Handling Interrupts:
- Check for interrupts at the beginning of the fetch-decode-execute cycle
- Dispatch interrupt handling routine and resume normal execution
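
The fetch-decode-execute cycle with its interrupt check can be summarized as a loop. The Python sketch below is a simplified model; the three-instruction machine (LOAD, ADD, HALT) and its memory contents are hypothetical, invented for illustration:

```python
# Simplified fetch-decode-execute loop with an interrupt check at the top.
# "Instructions" are (opcode, operand) pairs; the tiny ISA here is hypothetical.
memory = {0: ("LOAD", 10), 1: ("ADD", 11), 2: ("HALT", 0), 10: 5, 11: 7}
pc, acc, interrupt_pending = 0, 0, False

while True:
    if interrupt_pending:            # check for interrupts before each fetch
        print("dispatch interrupt handler, then resume")
        interrupt_pending = False
    mar = pc                         # Fetch: copy PC into MAR ...
    ir = memory[mar]                 # ... load the instruction into IR ...
    pc += 1                          # ... and increment PC
    opcode, operand = ir             # Decode
    if opcode == "LOAD":             # Execute
        acc = memory[operand]
    elif opcode == "ADD":
        acc += memory[operand]
    elif opcode == "HALT":
        break
print("ACC =", acc)                  # ACC = 12
```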

Additional Terms:
- MAR (Memory Address Register)
- MDR (Memory Data Register)
- Registers store binary data
- CPU: Fetches instructions and performs operations
- Control unit: Sequences operations and controls data flow
- Bus: Shared datapath connecting subsystems
- Overclocking: Pushing system components beyond their designed limits
- Memory: Byte addressable, word size, and address storage
- D flip-flop for memory storage
- MBR (Memory Buffer Register)
- System bus and extension bus
- Clock speed and von Neumann bottleneck


- 11110001 is the 2's complement of 00001111.
- A processor consists of an arithmetic logic unit and a control unit.
- -52 (base 10) in 8-bit one's complement binary is 11001011.
- -61 (base 10) in 8-bit binary using excess-127 notation is 01000010.
- The binary string 01000110001000 is equivalent to the decimal number 2.125.
- The state machine needs to output the string 001 110 011 001 110 001 when given the input string 011010.
- Computers designed using the Harvard architecture have separate buses for data and instructions.
- Multicore architectures have multiple processing units on a single chip.
- The control unit ensures proper instruction decoding and execution.
- The digital logic level is composed of gates and wires in the computer level hierarchy.
- IEEE is the organization that sets standards for computer components, signaling protocols, and data representation.
- Biased floating-point exponents use excess-M representation.
- To uniquely identify each memory word in a system with 16-bit memory words and 32 1M × 8 RAM chips, 24 address bits are required.
- The instruction set architecture is the interface between software and hardware in a machine.
- The transistor was invented by John Bardeen, Walter Brattain, and William Shockley in 1948.
- The decimal value of 00010001 in signed 2's complement notation is 17.
- The data path is a network of registers and arithmetic logic units connected by buses.
- 2B7 (base 16) in binary is 0010 1011 0111.
- Ports allow data movement between the computer and external devices.
- A 4GB byte-addressable memory requires 32 bits for each address.
- Microprocessors are measured in gigahertz.
- Centralized parallel arbitration can cause throughput delays in bus arbitration.
- The memory buffer register holds the actual data for a given memory address.
- Byte addressable means that each individual byte has a unique address.
- The logic circuit implements the Boolean expression ((x + y')y + (x + y'))'.
- Expressed as a power of two, there are 2^10 kilobytes in a megabyte.
- The ENIAC was the first general-purpose programmable electronic computer.
- In an 8-bit two's complement system, the range of positive integers is from 1 to 127.
- The truth table for the sequential circuit H4 is incomplete.

- Memory calculations (4M x 16 main memory):
- Byte addressable: 23 bits required for each address.
- Word addressable: 22 bits required for each address.
- Number of 256x8 RAM chips needed for a memory capacity of 4096 bytes: 16 chips.
- Size of the decoder for the chip select inputs: 4-to-16 decoder.
- Capacity of a 256M-word RAM (16-bit words) expressed in bytes: 512M bytes.
- Number of bits required for addressing the byte-addressable 256M-word RAM: 29 bits.
- Number of bits required for addressing the word-addressable 256M-word RAM: 28 bits.
- Memory size of 32K: 15 bits needed for each address.
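
These address-width answers all follow from taking log2 of the number of addressable units. A small Python sketch reproducing them (the 16-bit word size for the 256M-word RAM is inferred from the 512MB figure):

```python
import math

def address_bits(units):
    """Bits needed to give every addressable unit a unique address."""
    return math.ceil(math.log2(units))

# 4M x 16 memory: 4M words of 16 bits = 8M bytes
print(address_bits(8 * 2**20))    # 23 bits, byte addressable
print(address_bits(4 * 2**20))    # 22 bits, word addressable
# 4096 bytes built from 256x8 chips
print(4096 // 256)                # 16 chips -> 4-to-16 decoder
# 256M-word RAM with 16-bit words = 512M bytes
print(address_bits(512 * 2**20))  # 29 bits, byte addressable
print(address_bits(256 * 2**20))  # 28 bits, word addressable
print(address_bits(32 * 2**10))   # 15 bits for a 32K memory
```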

- One's complement:
- 97 in 8-bit one's complement binary: 01100001.
- -97 in 8-bit one's complement binary: 10011110.

- Excess-127:
- 97 in excess-127: 224 (binary: 11100000).
- -97 in excess-127: 30 (binary: 00011110).

- Decimal value of the 8-bit binary pattern 10011110 under different interpretations:
- Unsigned: 158.
- Signed: -30.
- 1's complement: -97.
- 2's complement: -98.
- Excess-127: 31.
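
The five values above are consistent with the bit pattern 10011110 (the pattern is inferred from the answers, not stated in the notes). A short Python sketch of the interpretations:

```python
b = 0b10011110                                        # the 8-bit pattern behind the values above

unsigned = b                                          # 158
sign_magnitude = -(b & 0x7F) if b & 0x80 else b       # -30
ones_complement = -(~b & 0xFF) if b & 0x80 else b     # -97
twos_complement = b - 256 if b & 0x80 else b          # -98
excess_127 = b - 127                                  # 31
print(unsigned, sign_magnitude, ones_complement, twos_complement, excess_127)
```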

- Binary subtraction:
- 11000100 - 00111011 = 10001001.
- 01011011 - 00011111 = 00111100.

- Floating-point representation:
- 3-bit exponent, 4-bit significand.
- Largest positive: 7.5.
- Smallest positive: 0.03125.

- 14-bit floating-point model (1 sign bit, 5-bit excess-15 exponent, 8-bit significand):
- Expressing 32 (base 10): 0 10101 10000000.
- Expressing 0.0625: 0 01100 10000000.

- IEEE-754 double precision for 26.625:
- Sign 0, exponent 10000000011, fraction 1010101 followed by 45 zeros.

- IEEE single precision for -3.75:
- Sign 1, exponent 10000000, fraction 111 followed by 20 zeros.
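
The IEEE-754 fields can be double-checked with Python's struct module. A brief sketch confirming the single-precision encoding of -3.75:

```python
import struct

def ieee754_single_fields(x):
    """Return (sign, exponent, fraction) bit strings for a 32-bit float."""
    bits = int.from_bytes(struct.pack(">f", x), "big")
    s = f"{bits:032b}"
    return s[0], s[1:9], s[9:]

sign, exp, frac = ieee754_single_fields(-3.75)
print(sign, exp, frac)   # 1 10000000 11100000000000000000000
```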

- Boolean functions:
- F(w, x, y, z) = xz'(xy + xz) + xy'(wz + y).
- F'(w, x, y, z) = ((x'+z)+(x'+y')(x'+z')) ((x'+y)+(w'+z')y').

- Hamming code:
- Error correction for a 10-bit word.
- Information word: 1001100110.
- Error in bit 11.

- Binary multiplication:
- 1011 * 101 = 0110111.
- 10011 * 1011 = 11010001.

- CRC polynomial and information word:
- Polynomial: 1101.
- Information word: 1100011.
- CRC code: 100.
- Actual data: 1100011100.
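
The CRC code comes from modulo-2 (XOR) division of the information word, padded with three zeros, by the generator polynomial 1101. A short Python sketch reproducing the remainder:

```python
def crc_remainder(data_bits, poly_bits):
    """Modulo-2 division: returns the CRC remainder as a bit string."""
    data = list(data_bits) + ["0"] * (len(poly_bits) - 1)   # append zeros
    for i in range(len(data_bits)):
        if data[i] == "1":                                   # XOR when leading bit is 1
            for j, p in enumerate(poly_bits):
                data[i + j] = str(int(data[i + j]) ^ int(p))
    return "".join(data[-(len(poly_bits) - 1):])

rem = crc_remainder("1100011", "1101")
print(rem)                       # 100
print("1100011" + rem)           # 1100011100, the transmitted code word
```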

- 12-bit floating-point model (4-bit two's complement exponent, 7-bit significand):
- Smallest positive value: 0.001953 (= 2^-9).

- Gates:
- XOR gate: x'y + y'x.
- NAND gate: (xy)'.
- XNOR gate: xy + x'y', i.e., (x XOR y)'.

- Operating System (OS) performs three main tasks: process management, system resource management, and system resource protection.
- Multiprogramming allocates CPU time to multiple processes, allowing multiple programs to be loaded in main memory and ready to execute.
- Multiprocessing involves using more than one CPU simultaneously, either in tightly coupled systems with shared memory and I/O devices or loosely coupled systems with physically separate memory.
- Multithreading extends multitasking by allowing multiple threads within a process to execute concurrently.
- Context switch occurs when a process is replaced by another on the CPU.
- BIOS (basic input/output system) chip revolutionized small computer operating systems by enabling a single OS to function on different types of systems.
- Two crucial OS components are the kernel and system programs.
- Microkernel systems provide minimal functionality and rely on external programs for most services, offering better security and easier maintenance (examples: MINIX, Mach, QNX).
- Monolithic systems provide most services within a single OS program, offering faster execution speed but are difficult to port across architectures (examples: Linux, MacOS, DOS).
- Relocatable code has operand addresses relative to where the OS loads the program, while absolute code is suitable for device and OS control programming.
- RAID (Redundant Array of Independent Disks) configurations: RAID 0 (drive spanning), RAID 1 (disk mirroring), RAID 2 (Hamming code), RAID 3 (bit striping with parity), RAID 4 (parity disks), RAID 5 (distributed parity), RAID 6 (double parity).
- I/O systems consist of memory blocks, cabling, control circuitry, interfaces, and media.
- I/O control methods include programmed I/O, interrupt-based I/O, DMA (Direct Memory Access), and channel I/O.
- I/O subsystem includes memory blocks for I/O functions, buses for data movement, control modules, interfaces, cabling, and communications links.
- Five general ways to control I/O: programmed I/O, interrupt-driven I/O, memory-mapped I/O, DMA, and channel I/O.
- Memory-mapped I/O shares the same address space between devices and program memory, while channel I/O uses dedicated I/O processors.
- Amdahl's Law calculates the overall speedup of a system from the fraction f of work performed by a faster component and that component's speedup factor k: S = 1 / ((1 - f) + f/k) (see the sketch after this list).
- Compilers translate source code into executable code. Phases include lexical analysis, syntax analysis, semantic analysis, intermediate code generation, optimization, and code generation.
- Binding assigns physical addresses to program variables and can occur at compile time, load time, or run time. Compile time binding results in absolute code, load time binding assigns addresses during program loading, and run time binding requires a base register.
- Compiler phases: lexical analysis, syntax analysis, semantic analysis, intermediate code generation, optimization, and code generation.
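
A minimal sketch of the Amdahl's Law calculation referenced in the list above; the workload fraction and speedup factor used here are hypothetical:

```python
def amdahl_speedup(f, k):
    """Overall speedup when fraction f of the work runs k times faster."""
    return 1 / ((1 - f) + f / k)

# Hypothetical example: 40% of execution time sped up by a factor of 10
print(round(amdahl_speedup(0.4, 10), 2))   # 1.56
```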

- Compilers:
- Analysis phases:
- Lexical analysis extracts tokens and checks for syntax errors.
- Syntax analysis (parsing) checks statement construction and flags errors such as undeclared variables.
- Semantic analysis checks data types and operator validity.
- Synthesis phases:
- Intermediate code generation creates three address code for optimization and translation.
- Optimization improves the intermediate code, taking architectural features into account.
- Code generation converts the optimized intermediate code into assembly or binary machine code.

- JVM:
- Loads programs, links them, starts execution threads, manages program resources, and deallocates resources when programs terminate.
- Verifies bytecode integrity while loading.
- Performs runtime checks and places bytecode in memory.
- Invokes bytecode interpreter.
- Manages heap storage and deallocates resources of terminated threads.
- Terminates JVM upon program termination.

- Memory:
- RAM includes dynamic RAM (DRAM) and static RAM (SRAM).
- DRAM needs refreshing to prevent data loss, while SRAM is fast and used for cache memory.
- ROM is nonvolatile, retains its contents without power, and is used for permanent or semi-permanent data.

- Cache:
- Direct mapped cache maps memory blocks to specific cache blocks.
- Fully associative cache allows a memory block to be placed in any cache block, so a replacement (eviction) algorithm is required.
- Set associative cache maps memory references to a subset of cache slots.
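
In a direct-mapped cache, each memory address splits into tag, block, and offset fields. A small Python sketch under an assumed geometry (16 blocks of 8 bytes; these sizes are illustrative, not from the course):

```python
# Hypothetical geometry: 16 cache blocks, 8 bytes per block.
BLOCKS, BLOCK_SIZE = 16, 8

def split_address(addr):
    offset = addr % BLOCK_SIZE                 # byte within the block
    block = (addr // BLOCK_SIZE) % BLOCKS      # which cache slot it maps to
    tag = addr // (BLOCK_SIZE * BLOCKS)        # identifies the memory block
    return tag, block, offset

print(split_address(0x1A2B))   # (52, 5, 3)
```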

- Effective access time (EAT):
- EAT = H * AccessC + (1 - H) * AccessMM
- H is the cache hit rate, AccessC is the cache access time, and AccessMM is the main memory access time.
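
A one-line worked example of the EAT formula with hypothetical numbers (95% hit rate, 10 ns cache, 200 ns main memory):

```python
def eat(hit_rate, cache_ns, mem_ns):
    """Effective access time: EAT = H * AccessC + (1 - H) * AccessMM."""
    return hit_rate * cache_ns + (1 - hit_rate) * mem_ns

print(eat(0.95, 10, 200))   # 19.5 ns
```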

- ISA:
- Parallel execution is achieved through instruction pipelining.
- Speedup for a k-stage pipeline executing n instructions: S = (n * t_n) / ((k + n - 1) * t_p), where t_n is the time per instruction without pipelining (about k * t_p) and t_p is the time per pipeline stage.
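
The speedup formula evaluated for hypothetical values of n, k, and t_p; for large n the speedup approaches the number of stages k:

```python
def pipeline_speedup(n, k, t_p):
    """S = (n * t_n) / ((k + n - 1) * t_p), with t_n = k * t_p per instruction."""
    t_n = k * t_p
    return (n * t_n) / ((k + n - 1) * t_p)

# Hypothetical: 1000 instructions, 5 stages, 2 ns per stage
print(round(pipeline_speedup(1000, 5, 2), 2))   # 4.98, close to k = 5
```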

- Pipeline hazards:
- Resource conflicts, data dependencies, and conditional branching can cause pipeline conflicts and stalls.

- Instruction format:
- The storage of a hex value (1234) at address 0:
- Big Endian: 00 00 12 34
- Little Endian: 34 12 00 00
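
The two byte orders can be reproduced with Python's struct module, assuming the value 0x1234 is stored as a 32-bit word:

```python
import struct

value = 0x1234
print(struct.pack(">I", value).hex(" "))   # big endian:    00 00 12 34
print(struct.pack("<I", value).hex(" "))   # little endian: 34 12 00 00
```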

- Mathematical expressions:
- Infix expressions with operators and parentheses can be rewritten in postfix (reverse Polish) notation, which needs no parentheses; e.g., (a + b) * c becomes a b + c *.
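
A small sketch of stack-based postfix evaluation for the example expression (a + b) * c; the variable bindings are illustrative:

```python
def eval_postfix(tokens, env):
    """Evaluate a postfix expression using a stack of operands."""
    stack = []
    for t in tokens:
        if t in "+-*/":
            b, a = stack.pop(), stack.pop()
            stack.append({"+": a + b, "-": a - b, "*": a * b, "/": a / b}[t])
        else:
            stack.append(env[t])
    return stack.pop()

# Infix (a + b) * c  ->  postfix  a b + c *
print(eval_postfix(["a", "b", "+", "c", "*"], {"a": 2, "b": 3, "c": 4}))   # 20
```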

- Addressing modes:
- Immediate addressing: Data is part of the instruction.
- Direct addressing: Address of the data is given in the instruction.
- Register addressing: Data is located in a register.
- Indirect addressing: Address of the address of the data is given in the instruction.
- Register indirect addressing: A register contains the address of the data.

- Memory:
- Example memory contents for addressing-mode exercises (see the sketch below): location 0x800 holds 0x900 (register R1 is also used), 0x900 holds 0x1000, 0x1000 holds 0x500, 0x1600 holds 0x700.
- Different memory access methods: Immediate, Direct, Indirect, Indexed.
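
A sketch of how the addressing modes above select an operand, using a small hypothetical memory whose contents mirror the exercise values noted above (they are reconstructions, not confirmed from the original exercise):

```python
# Hypothetical memory: address -> contents
memory = {0x800: 0x900, 0x900: 0x1000, 0x1000: 0x500, 0x1600: 0x700}
R1 = 0x800                                # hypothetical register contents

operand_field = 0x800
immediate = operand_field                 # the operand is the field itself
direct = memory[operand_field]            # one memory access
indirect = memory[memory[operand_field]]  # two memory accesses
register_indirect = memory[R1]            # register holds the operand's address
print(hex(immediate), hex(direct), hex(indirect), hex(register_indirect))
# 0x800 0x900 0x1000 0x900
```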

- Other concepts:
- Opcode and number of bits: An n-bit opcode allows up to 2^n distinct operations.
- Assembler: Two-pass process to create a symbol table and fill in addresses.
- Middleware: Examples include JVM (Java Virtual Machine).
- Tight coupling and addressing modes: Different types of addressing modes and their purposes.
- Types of instructions: Jump, Store, Skip conditions, Subtract, Bit manipulation.
- Different types of architectures: Memory-memory, Register-memory, Load and store.
- Different types of bindings: Load time, Run-time, Late, Compile time.
- Multithreading, Multiprocessing, Multitasking, MultiProgramming: Definitions and differences.
- Hardwired control and microprogrammed control: Differences and operation.
- Instruction sets and architectures: Factors considered in designing instruction sets.
- CPU architecture: Stack, Accumulator, General-purpose register.
- I/O handling approaches: Programmed I/O, Interrupt-driven I/O, DMA devices.
- Java bytecode and JVM: a stack-based architecture that uses an operand stack rather than registers.
- Locality of reference: Temporal, Spatial, and Sequential locality.
- Cache replacement policies: LRU (Least Recently Used), FIFO (First-In, First-Out), Random.
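
A minimal sketch of the LRU replacement policy mentioned above, using an ordered dictionary; the cache capacity and access sequence are illustrative:

```python
from collections import OrderedDict

class LRUCache:
    """Keeps the most recently used blocks; evicts the least recently used."""
    def __init__(self, capacity):
        self.capacity, self.blocks = capacity, OrderedDict()

    def access(self, block):
        if block in self.blocks:
            self.blocks.move_to_end(block)       # mark as most recently used
        else:
            if len(self.blocks) >= self.capacity:
                self.blocks.popitem(last=False)  # evict least recently used
            self.blocks[block] = True

cache = LRUCache(2)
for b in [1, 2, 1, 3]:           # block 2 is evicted when 3 arrives
    cache.access(b)
print(list(cache.blocks))        # [1, 3]
```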

- I/O should be kept to a minimum for good performance.
- Channel I/O is used in very large systems and consists of one or more I/O processors controlling channel paths.
- Slower devices are multiplexed into a faster channel in channel I/O.
- Keyboards are usually connected through an interrupt-driven I/O system.
- Mass storage devices are typically block I/O devices and can be connected through DMA or channel I/O.
- Serial communications interfaces require fewer conductors, are less susceptible to attenuation, and are suitable for time-sensitive data.
- RAID DP, also known as RAID 5DP or RAID 6, provides advanced data guarding.
- Different RAID levels are chosen based on data criticality and capacity requirements.
- CPU scheduling includes long-term scheduling and short-term scheduling.
- Relocatable code has operand addresses relative to the program's loaded location, while absolute code is nonrelocatable.
- Multiprogramming refers to multiple processes on one processor, while multiprocessing refers to multiple processors.
- Lexical analyzer identifies syntax errors, syntax analysis handles undefined variables, and semantic analyzer checks for errors like adding an integer to a character string.
- Byte-addressable virtual memory with 8 virtual pages of 64 bytes each and 4 page frames means 9 bits for virtual addresses and 8 bits for physical addresses.
- RAID Level 5 spreads parity disks throughout the entire array.
- Multitasking allows multiple processes to run concurrently.
- Programmed I/O is the simplest way for a CPU to communicate with an I/O device.
- RAID Level 4 is RAID-0 with parity.
- RAID-2 writes one bit per strip instead of blocks across drives.
- RAID systems use mirroring or striping, or a combination of both.
- RAID-1 (disk mirroring) is costly because it doubles the required disk space.
- User interface is a program that constitutes the display manager.
- Serial data transmission is preferred for high-performance interfaces.
- RAID Level 3 interleaves data one bit at a time across all data drives.
- Store X microoperations: MAR ← X, MBR ← AC, M[MAR] ← MBR; maximum cycle counter value is 4.
- RAID 50 combines striping and distributed parity.
- Accumulator architectures implicitly use the accumulator as one operand; general-purpose register architectures use sets of registers for storing operands.
- I/O channels are driven by I/O processors.
- RAID-1 is also known as disk mirroring.
- Control unit requires a maximum of 32 output signal lines for a 5-bit opcode.
- BASIC stands for Beginner's All-purpose Symbolic Instruction Code.
- RAID-10 combines striping and mirroring.
- RAID DP can tolerate the simultaneous loss of two disk drives without data loss.
- Buses use serial and parallel data transmission modes.
- RAID Level 2 stripes data one bit at a time across drives and uses a Hamming code for error correction.
- Shortest job first scheduling gives preference to the job with the shortest execution time.
- Assembler directives can distinguish values as hexadecimal or decimal.
- Resident monitor is a precursor to modern operating systems allowing program processing without human interaction.
- Cloud delivery models include IaaS, PaaS, and SaaS.
