Outline of Material for Test #3

General Algorithm Analysis Knowledge

Be able to compute timings for algorithms using the network model.
Be prepared to use standard communication algorithms (broadcast, reduction, etc) when describing and analyzing parallel algorithms
Be prepared to perform isoefficiency and cost optimality analysis on any of the algorithms covered in the matrix chapter.
In short, you are responsible for the basic analysis knowledge that we have covered in Test#2

Be able to define internal and external sorting algorithms.
What is the compare-exchange operation? How is it implemented in parallel
What is the compare-split operation? How much time does it require?
What is a comparator? How do you draw one? What is a increasing or decreasing comparator and how do you represent it (draw it).

Be able to define a bitonic sequence, bitonic split and bitonic merge.
What is a bitonic merging network? What does it do? What is its size? What is its depth? What does it look like?
What is a bitonic sorting network? What does it do? What is its size? What is its depth? What does it look like?
How do we map a bitonic sorting network onto a hypercube? What does the communication patterns look like?
Be able to derive execution times for the bitonic sorting algorithm on a hypercube architecture. Be able to discuss both mapping one element per processor and many elements per processor.
Be able to perform a scalability analysis of the bitonic sorting algorithm on the hypercube architecture.

Why is it difficult to develop an efficient parallel quick-sort algorithm by simply running independent subproblems in parallel?
For the shared memory quicksort algorithm, how is the array partitioned?
For the shared memory quicksort algorithm, how is load balanced?
What is the prefix sum operation used for in the shared memory quicksort algorithm?
Be able to derive the running time of the quicksort algorithm.
Be able to derive the iso-efficiency function of the quicksort algorithm.
How do we modify the shared memory quicksort algorithm to generate a message-passing formulation?
How does pivot selection affect the algorithm performance? Is pivot selection more or less important for parallel quicksort? Why?

What is one dimensional and two dimensional partitioning?
Be able to describe and analyze one dimensional partitioned dense matrix vector multiplication for both row and column decomposition.
Review the concurrency constraints on the isoefficiency function. Be able to derive the isoefficiency bounds due to concurrency for all of the parallel matrix algorithms.
Be able to describe and analyze the two dimensional partitioned dense matrix vector multiplication algorithm.
Be able to describe and analyze the simple (naive) dense matrix-matrix multiplication algorithm.
Be able to describe and analyze Cannon's algorithm for dense matrix-matrix multiplication.
Be able to describe and analyze the DNS algorithm for matrix-matrix multiplication.
Be able to compare and contrast various algorithms (from the set above) for dense matrix-matrix multiply.

Last modified: Sun Nov 4 11:57:31 CST 2012