Outline of Material for Test #2
Parallel Collective Communication Primitives
- Be able to define the One-to-All, All-to-All, scatter, gather,
and All-to-All Personalized communication
primitives. (In other words, be able to reconstruct Figures
4.1, 4.8, 4.14, and 4.16 from the text.)
- One-to-All Broadcast
- Be able to describe and analyze the Hypercube One-to-All
broadcast algorithm.
- Be prepared to answer questions about the algorithms as
listed in program form in Algorithms 4.1 and 4.2 from the text.
- Understand how the One-to-All broadcast algorithm can be
used as the basis for performing the Single-Node
Accumulation operation.
- Understand and be prepared to discuss relabeling processor IDs with XOR to achieve the generalized algorithm (broadcast from an arbitrary source node)
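The XOR-relabeling trick can be sanity-checked with a short simulation. The sketch below is my own illustration (the helper name `one_to_all_broadcast` is hypothetical, not from the text): it relabels so the source acts as node 0, then flips one bit per step, highest dimension first, in the spirit of Algorithm 4.1.

```python
def one_to_all_broadcast(p, source):
    """Simulate one-to-all broadcast on a p-node hypercube (p a power of 2).

    Nodes are relabeled with XOR so the source acts as node 0; partners
    are then found by flipping one bit per step.  Returns the number of
    communication steps taken (log2 p).
    """
    d = p.bit_length() - 1           # dimension of the hypercube
    assert 1 << d == p, "p must be a power of two"
    has_msg = {0}                    # relabeled labels holding the message
    steps = 0
    for k in range(d - 1, -1, -1):   # highest dimension first
        # every current holder sends to its partner across dimension k
        has_msg |= {label ^ (1 << k) for label in has_msg}
        steps += 1
    # undo the relabeling: physical id = relabeled label XOR source
    physical = {label ^ source for label in has_msg}
    assert physical == set(range(p)) # everyone received the message
    return steps

print(one_to_all_broadcast(8, source=5))   # 3 steps on an 8-node cube
```

Note how the set of holders doubles each step, which is why the algorithm finishes in log p steps regardless of which node is the source.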
- All-to-All Broadcast
- Understand the All-to-All broadcast algorithms for the
ring and Hypercube topologies. In particular, pay attention to the
message size at each step of the algorithm.
- Be prepared to answer questions about the algorithms as
listed in program form in Algorithms 4.4, 4.5, and 4.6 from the handout.
- Be able to describe how the All-to-All broadcast can be
used as a basis for performing the Reduction or
Multinode Accumulation operations.
- Know the difference between an All-to-All reduction and an All-Reduce
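The doubling message sizes on the hypercube version can be seen directly in a simulation. This is my own sketch (the function name is hypothetical): in step k every node exchanges its entire accumulated set with its partner across dimension k, so the block exchanged grows as m, 2m, 4m, ...

```python
def all_to_all_broadcast_hypercube(p):
    """Simulate all-to-all broadcast on a p-node hypercube (p = 2^d).

    In step k each node exchanges its accumulated set of messages with
    its partner across dimension k, so the message size doubles every
    step.  Returns the per-step message sizes (in units of the original
    message size m).
    """
    d = p.bit_length() - 1
    assert 1 << d == p, "p must be a power of two"
    held = {node: {node} for node in range(p)}   # each node starts with its own message
    sizes = []
    for k in range(d):
        sizes.append(len(held[0]))               # block size exchanged this step
        held = {node: held[node] | held[node ^ (1 << k)] for node in range(p)}
    assert all(held[n] == set(range(p)) for n in range(p))
    return sizes

print(all_to_all_broadcast_hypercube(8))   # [1, 2, 4]: sizes double each step
```

Contrast this with the ring algorithm, where each of the p - 1 steps forwards a single block of size m, so the message size stays constant instead of doubling.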
- One-to-All Personalized Communication
- Understand how this algorithm is a variation of the
One-to-All broadcast except with shrinking message
sizes.
- Be able to derive and analyze this algorithm on a
Hypercube network by generalizing the
One-to-All algorithm results.
- All-to-All Personalized Communications
- Be able to describe and analyze the Hypercube algorithm.
- Be able to describe and analyze the E-cube routing algorithm.
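E-cube routing is short enough to sketch directly. In this illustration of mine (the function name is hypothetical), the route corrects the bits of src XOR dst one dimension at a time in a fixed order, lowest dimension first, so every source/destination pair gets a unique deterministic path whose length is their Hamming distance.

```python
def ecube_route(src, dst):
    """E-cube routing on a hypercube: correct the differing bits of
    src XOR dst in a fixed order (lowest dimension first).  Returns
    the list of nodes visited, starting at src and ending at dst."""
    path = [src]
    cur = src
    diff = src ^ dst
    k = 0
    while diff:
        if diff & 1:            # bit k differs: traverse dimension k
            cur ^= (1 << k)
            path.append(cur)
        diff >>= 1
        k += 1
    return path

print(ecube_route(0b000, 0b110))   # [0, 2, 6]: dimensions 1, then 2
```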
- Be able to generalize the ts + m*tw performance model to take into account network bisection width.
- Know Tables 4.1 and 4.2 for the algorithms we have covered.
- Be able to derive timing measurements for any of the above algorithms!
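For practice with the derivations, the hypercube timing expressions can be encoded and evaluated numerically. The function names below are mine, and the formulas are the standard hypercube expressions in the ts + m*tw model; check them against Table 4.1 in the text before relying on them.

```python
from math import log2

def t_one_to_all_bcast(ts, tw, m, p):
    """One-to-all broadcast on a hypercube: log p steps, size m each."""
    return (ts + tw * m) * log2(p)

def t_all_to_all_bcast(ts, tw, m, p):
    """All-to-all broadcast on a hypercube: log p startups, m*(p-1) data."""
    return ts * log2(p) + tw * m * (p - 1)

def t_scatter(ts, tw, m, p):
    """One-to-all personalized (scatter): log p startups, halving sizes
    that sum to m*(p-1)."""
    return ts * log2(p) + tw * m * (p - 1)

def t_all_to_all_pers(ts, tw, m, p):
    """All-to-all personalized on a hypercube (pairwise exchange)."""
    return (ts + tw * m) * (p - 1)

# Example: ts = 10, tw = 1, m = 4, p = 8
print(t_one_to_all_bcast(10, 1, 4, 8))   # (10 + 4) * log2(8) = 42.0
```

Plugging numbers into these closed forms is a quick way to check a timing derivation done by hand.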
- Study the Homework Assignments!
Performance and Scalability of Parallel Systems
- Be able to define Speedup, Efficiency, Cost, Cost Optimal, Problem Size,
Overhead Function, and Isoefficiency Function.
- Be able to determine if an algorithm is Cost Optimal.
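These definitions reduce to three one-line formulas, illustrated here on the classic add-n-numbers example (the helper name and the idealized unit-cost model Tp = n/p + 2 log p are assumptions for illustration, not the text's notation):

```python
from math import log2

def metrics(t_serial, t_parallel, p):
    """Speedup S = T1/Tp, efficiency E = S/p, cost C = p*Tp.
    An algorithm is cost-optimal when its cost grows at the same
    rate as the serial time, i.e. E stays bounded away from zero."""
    speedup = t_serial / t_parallel
    efficiency = speedup / p
    cost = p * t_parallel
    return speedup, efficiency, cost

# Adding n = 1024 numbers on p = 32 processors (idealized unit-time model):
n, p = 1024, 32
t1 = n - 1                      # serial additions
tp = n / p + 2 * log2(p)        # local additions plus reduction steps
S, E, C = metrics(t1, tp, p)
print(round(S, 1), round(E, 2))
```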
- Be able to determine if an algorithm has an Isoefficiency
function.
- Be able to derive the overhead function and use this function
to determine the existence of an Isoefficiency function. Be
able to show that the overhead function is related to the
isoefficiency function when one exists.
- Understand why there is a lower bound on the isoefficiency
function as described in section 5.4.4 from text.
- What is the degree of concurrency of a parallel algorithm?
- What is the relationship between degree of concurrency and the
isoefficiency function? See section 5.4.5 from text.
- What are the sources of parallel overhead? ANS: Communication
time, Load Imbalance, and additional computations performed by
the parallel algorithm.
- What is scaled speedup? How do we measure it?
- What are memory-constrained and time-constrained scaled speedup? How do we measure each?
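As one concrete instance, memory-constrained scaled speedup grows the problem linearly with the machine (fixed memory per processor) and measures speedup on the scaled problem. This sketch is mine, reusing the idealized add-n-numbers model (T1 = W, Tp = W/p + 2 log p); the function name and per-processor size n0 are assumptions for illustration:

```python
from math import log2

def scaled_speedup_memory(n0, p):
    """Memory-constrained scaled speedup for the add-n-numbers example:
    the problem grows with the machine (W = n0 * p, fixed memory per
    processor) and speedup is measured on the scaled problem,
    S = T1(W) / Tp(W, p)."""
    W = n0 * p
    return W / (W / p + 2 * log2(p))

# 64 numbers per processor on a 16-processor machine:
print(round(scaled_speedup_memory(64, 16), 1))
```

Time-constrained scaled speedup is measured analogously, but W is instead grown to the largest size solvable in a fixed parallel time budget.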
- Study the Homework Assignments!