Parallel Architectures Introduction
Parallel architectures are a subclass of distributed computing in which all of the processes work together to solve the same problem.
There are different kinds of parallelism at various levels of computing. For example, even though you might write a program as a sequence of instructions, the compiler or CPU may make changes to it at compile time or run time so that some operations happen in parallel or in a different order. This is called implicit parallelism.
Explicit parallelism, the kind we care about in this lesson, occurs when the programmer is aware of the parallelism and designs the processes to operate in parallel.
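As a small illustration (a Python sketch, not tied to any framework from the reading), the program below sums a list with explicit parallelism: the programmer, rather than the compiler or CPU, decides how the work is divided and run concurrently. The chunking strategy and worker count are arbitrary choices for the example.

```python
from concurrent.futures import ProcessPoolExecutor

def partial_sum(chunk):
    # Each worker process independently sums its own slice of the data.
    return sum(chunk)

def parallel_sum(data, workers=4):
    # Explicit parallelism: the programmer decides how the work is
    # split into chunks and how those chunks run concurrently.
    size = max(1, len(data) // workers)
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(partial_sum, chunks))

if __name__ == "__main__":
    data = list(range(1_000_000))
    # Same answer as the sequential sum(data), computed across processes.
    print(parallel_sum(data) == sum(data))
```

Contrast this with implicit parallelism, where the same sequential sum might be reordered or vectorized by the compiler or CPU without the programmer's involvement.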
This lesson will explore some of the basic concepts in parallel computing, look at some of the theoretical limitations, and focus on a specific kind of parallel programming called MapReduce.
Lesson Objectives
After completing this lesson, you should be able to:
- Classify parallel computing on the distributed-system three-axis diagram
- Explain Amdahl's Law in simple terms (a worked form of the law follows this list)
- Explain why communication and synchronization limit parallel performance
- Describe the advantages of hierarchical architectures
- Explain how MapReduce works (see the sketch after this list)
- Use MapReduce to solve a data processing problem
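For reference, the usual statement of Amdahl's Law: if $p$ is the fraction of a program that can be parallelized and $N$ is the number of processors, the best achievable speedup is

$$S(N) = \frac{1}{(1 - p) + \frac{p}{N}}$$

As $N \to \infty$, the speedup approaches $1/(1 - p)$. For example, if 90% of a program is parallelizable ($p = 0.9$), no number of processors can yield more than a 10x speedup.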
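And as a first taste of the MapReduce model, here is a minimal single-machine word-count sketch in Python. It only illustrates the map/shuffle/reduce structure; a real framework such as the one described in the Dean and Ghemawat paper distributes these phases across a cluster and handles the shuffling and fault tolerance itself. The function names here are hypothetical, not part of any library.

```python
from collections import defaultdict
from itertools import chain

def map_phase(document):
    # Map: emit an intermediate (key, value) pair for every word.
    return [(word, 1) for word in document.split()]

def shuffle(pairs):
    # Shuffle: group all intermediate values by key, as the framework
    # does between the map and reduce phases.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    # Reduce: fold all values for one key into a single result.
    return key, sum(values)

documents = ["the quick brown fox", "the lazy dog", "the fox"]
pairs = chain.from_iterable(map_phase(doc) for doc in documents)
counts = dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
print(counts)  # {'the': 3, 'quick': 1, 'brown': 1, 'fox': 2, ...}
```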
Required Reading/Viewing
- Parallel Architectures (Lesson Slidedoc, PDF)
- Parallel computing from Wikipedia
- Introduction to Parallel Architectures from MIT OpenCourseWare, starting at minute 20:52
- MapReduce: Simplified Data Processing on Large Clusters (PDF) by Jeffrey Dean and Sanjay Ghemawat