Project #1: High Performance Dense Matrix Multiplication
Objectives
To understand and apply:
- code generation techniques
- cache oblivious algorithms
- cache aware techniques
- analysis techniques for HPC
- empirical analysis methods
Methods
Compare the following algorithms:
- The naive (unoptimized) matrix-matrix multiply
- A blocked cache aware matrix-matrix multiply as described in the Cache Oblivious Paper
- An optimized cache oblivious algorithm with generated optimized base cases and a planner that optimizes the recursive assembly of the base cases.
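As a starting point, the three variants might be sketched as follows. This is a minimal sketch under stated assumptions, not a reference solution: square row-major matrices stored in a std::vector, a hypothetical block size parameter bs, and a hypothetical recursion cutoff kBase are all choices made for illustration. The recursive version uses a plain loop as its base case; the generated base cases and the planner are left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<double>;  // assumption: row-major n x n storage

// Naive triple loop: C += A * B.
void matmul_naive(const Matrix& A, const Matrix& B, Matrix& C, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j) {
            double sum = C[i * n + j];
            for (std::size_t k = 0; k < n; ++k)
                sum += A[i * n + k] * B[k * n + j];
            C[i * n + j] = sum;
        }
}

// Cache-aware blocked version: C += A * B, one bs x bs tile at a time.
// bs is a tuning parameter (hypothetical name), chosen so that three
// tiles fit in the cache level being targeted.
void matmul_blocked(const Matrix& A, const Matrix& B, Matrix& C,
                    std::size_t n, std::size_t bs) {
    assert(bs > 0);
    for (std::size_t ii = 0; ii < n; ii += bs)
        for (std::size_t kk = 0; kk < n; kk += bs)
            for (std::size_t jj = 0; jj < n; jj += bs) {
                const std::size_t i_end = std::min(ii + bs, n);
                const std::size_t k_end = std::min(kk + bs, n);
                const std::size_t j_end = std::min(jj + bs, n);
                for (std::size_t i = ii; i < i_end; ++i)
                    for (std::size_t k = kk; k < k_end; ++k) {
                        const double a = A[i * n + k];
                        for (std::size_t j = jj; j < j_end; ++j)
                            C[i * n + j] += a * B[k * n + j];
                    }
            }
}

const std::size_t kBase = 16;  // hypothetical cutoff for the recursion

// Cache-oblivious recursion: C (m x p) += A (m x k) * B (k x p).
// All three blocks live inside matrices with row stride ld. The largest
// dimension is halved until every dimension is at most kBase.
void matmul_recursive(const double* A, const double* B, double* C,
                      std::size_t m, std::size_t k, std::size_t p,
                      std::size_t ld) {
    if (m <= kBase && k <= kBase && p <= kBase) {
        for (std::size_t i = 0; i < m; ++i)
            for (std::size_t l = 0; l < k; ++l)
                for (std::size_t j = 0; j < p; ++j)
                    C[i * ld + j] += A[i * ld + l] * B[l * ld + j];
        return;
    }
    if (m >= k && m >= p) {        // split the rows of A and C
        const std::size_t h = m / 2;
        matmul_recursive(A, B, C, h, k, p, ld);
        matmul_recursive(A + h * ld, B, C + h * ld, m - h, k, p, ld);
    } else if (k >= p) {           // split the inner dimension
        const std::size_t h = k / 2;
        matmul_recursive(A, B, C, m, h, p, ld);
        matmul_recursive(A + h, B + h * ld, C, m, k - h, p, ld);
    } else {                       // split the columns of B and C
        const std::size_t h = p / 2;
        matmul_recursive(A, B, C, m, k, h, ld);
        matmul_recursive(A, B + h, C + h, m, k, p - h, ld);
    }
}
```

For a square problem the recursive version would be called as matmul_recursive(A.data(), B.data(), C.data(), n, n, n, n). All three routines accumulate into C, so C should be zeroed first if a plain product is wanted.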
Measure the performance of these algorithms on any candidate
architecture. Describe your measurement methodology: How many
measurements did you take? What was the variation in the
measurements? What architecture did you use? What properties does
this architecture have, and do they match your model? Use statistical
techniques where necessary.
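One possible way to collect and summarize repeated timings is sketched below. It is only an illustration of the questions above (how many measurements, how much variation), not a prescribed method. The names Stats, summarize, and time_trials are hypothetical, and the single warm-up run and the use of std::chrono::steady_clock are assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cmath>
#include <cstddef>
#include <vector>

struct Stats {
    double mean;    // arithmetic mean of the samples
    double stddev;  // sample standard deviation (n - 1 denominator)
    double min;     // fastest observed run
};

// Summarize a non-empty set of timing samples.
Stats summarize(const std::vector<double>& x) {
    assert(!x.empty());
    double sum = 0.0;
    double mn = x[0];
    for (double v : x) {
        sum += v;
        mn = std::min(mn, v);
    }
    const double mean = sum / static_cast<double>(x.size());
    double ss = 0.0;
    for (double v : x) ss += (v - mean) * (v - mean);
    const double sd =
        x.size() > 1 ? std::sqrt(ss / static_cast<double>(x.size() - 1)) : 0.0;
    return Stats{mean, sd, mn};
}

// Run the kernel once untimed as a warm-up, then time it `trials` times.
// Returns the elapsed wall-clock time of each trial in seconds.
template <class F>
std::vector<double> time_trials(F&& kernel, std::size_t trials) {
    std::vector<double> secs;
    secs.reserve(trials);
    kernel();  // warm-up run, not recorded
    for (std::size_t t = 0; t < trials; ++t) {
        const auto t0 = std::chrono::steady_clock::now();
        kernel();
        const auto t1 = std::chrono::steady_clock::now();
        secs.push_back(std::chrono::duration<double>(t1 - t0).count());
    }
    return secs;
}
```

Reporting the standard deviation and the minimum alongside the mean makes it easier to tell whether a difference between two algorithms is larger than the run-to-run noise.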
In your analysis, provide the asymptotic analysis of the
algorithms' work and cache complexity. Compare these models to your
measurements. Explain the behavior you see, to the extent that it is
applicable, using the models developed in the analysis.
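For orientation, the standard bounds for n x n matrices in the ideal-cache model are summarized below. The symbols are assumptions introduced here, not notation fixed by this assignment: W is the work, Q the number of cache misses, Z the cache size in words, L the cache line length in words, and s the tile size of the blocked algorithm. The bounds assume a tall cache, Z = Ω(L^2), and the naive bound assumes row-major storage with n large relative to the cache. Verify each bound in your own analysis rather than taking this summary as given.

```latex
\begin{align*}
W_{\text{naive}}(n) = W_{\text{blocked}}(n) = W_{\text{rec}}(n)
  &= \Theta(n^3) \\
Q_{\text{naive}}(n)
  &= \Theta(n^3) \\
Q_{\text{blocked}}(n)
  &= \Theta\!\left(1 + \frac{n^2}{L} + \frac{n^3}{L\sqrt{Z}}\right),
  \qquad s = \Theta(\sqrt{Z}) \\
Q_{\text{rec}}(n)
  &= \Theta\!\left(n + \frac{n^2}{L} + \frac{n^3}{L\sqrt{Z}}\right)
\end{align*}
```

All three algorithms do the same work, so any measured difference in running time should be explained by the Q terms (and by constant factors the model hides), which is what the comparison with the experiments is meant to test.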
Group Project
The project will be conducted in groups of 3. Divide the task into
sensible subtasks, e.g. codelet generator, planner, analysis. Document
how the labor was divided in your research report.
What do you turn in?
- A research report documenting your work. Include references
where necessary. This report should document: analysis of algorithms,
software techniques used, experimental methods and results, comparison
of analysis to experimental results. A hardcopy of this report will
be turned in during class time.
- The software you created. Include all source files and
makefiles, and an instruction file explaining how to compile and run
the code. Do not include binaries or .o files! Use tar to create a
single file and send this file as an attachment to
luke@cse.msstate.edu. The project source code is due by 2pm on the
due date.
Due Date:
The project is due October 14th at 2:00pm.
Examples:
An example of template meta-programming for bitonic sort, including an
example Makefile and measurement method:
template_example.tar or go to template_example
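In the same spirit as the template meta-programming example above, a generated base case for the matrix multiply might look like the sketch below. It is an illustration only and is not taken from template_example: the names Dot and codelet are hypothetical, and a shared row stride ld for all three matrices is an assumption.

```cpp
#include <cassert>
#include <cstddef>

// Inner product of length K, unrolled at compile time by template recursion.
// a walks along a row of A (stride 1); b walks down a column of B (stride ldb).
template <std::size_t K>
struct Dot {
    static double eval(const double* a, const double* b, std::size_t ldb) {
        return a[0] * b[0] + Dot<K - 1>::eval(a + 1, b + ldb, ldb);
    }
};

// Recursion terminator: an empty inner product is zero.
template <>
struct Dot<0> {
    static double eval(const double*, const double*, std::size_t) {
        return 0.0;
    }
};

// N x N base-case codelet: C += A * B, all matrices with row stride ld.
// The loop bounds are compile-time constants, so the compiler is free to
// unroll them as well.
template <std::size_t N>
void codelet(const double* A, const double* B, double* C, std::size_t ld) {
    for (std::size_t i = 0; i < N; ++i)
        for (std::size_t j = 0; j < N; ++j)
            C[i * ld + j] += Dot<N>::eval(A + i * ld, B + j, ld);
}
```

A call such as codelet<4>(A, B, C, n) instantiates a fixed-size kernel. A planner could then choose among several instantiated sizes when assembling the recursion.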
luke@cse.msstate.edu
Last modified: Tue Sep 9 13:14:43 CDT 2003