- Goal: Implement and evaluate Sparse Matrix–Vector multiplication (y = A·x) using MPI with Compressed Sparse Row (CSR) storage.
- Matrix structure: Symmetric, 11-banded (main diagonal ±5).
- Focus: Wall-clock time, speed-up, and efficiency (defined after this list); separation of computation vs. communication costs.
- The program logs per-run metrics (e.g., to `results.txt`) for different matrix sizes and core counts.
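The speed-up and efficiency figures are assumed to follow the standard definitions, with $T(p)$ the wall-clock time on $p$ cores (the summary does not write the formulas out explicitly):

$$
S(p) = \frac{T(1)}{T(p)}, \qquad E(p) = \frac{S(p)}{p}
$$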
- CSR Construction (sketched in code after this list)
- `fillRowStart` – row pointers (`row_start`)
- `fillColIdx` – column indices within the ±5 band (`col_idx`)
- `fillValues` – nonzero values (preserves symmetry)
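A minimal sketch of how the three helpers might fill the CSR arrays for an N×N matrix with the ±5 band. The helper names come from the summary; the signatures, the boundary clipping, and the concrete value pattern are assumptions.

```c
/* Sketch only: bodies, signatures, and the value pattern are assumptions. */
#define HALF_BAND 5   /* main diagonal +/- 5 -> 11 diagonals */

/* row_start[i] .. row_start[i+1]-1 index the nonzeros of row i. */
void fillRowStart(int *row_start, int N) {
    row_start[0] = 0;
    for (int i = 0; i < N; ++i) {
        int lo = (i - HALF_BAND > 0) ? i - HALF_BAND : 0;
        int hi = (i + HALF_BAND < N - 1) ? i + HALF_BAND : N - 1;
        row_start[i + 1] = row_start[i] + (hi - lo + 1);
    }
}

/* Column indices of row i run over the band [i-5, i+5], clipped to [0, N-1]. */
void fillColIdx(int *col_idx, const int *row_start, int N) {
    for (int i = 0; i < N; ++i) {
        int lo = (i - HALF_BAND > 0) ? i - HALF_BAND : 0;
        for (int j = row_start[i]; j < row_start[i + 1]; ++j)
            col_idx[j] = lo + (j - row_start[i]);
    }
}

/* Each value depends only on whether the entry is on the main diagonal,
   i.e. only on |i - j|, so A stays symmetric. */
void fillValues(double *values, const int *col_idx, const int *row_start, int N) {
    for (int i = 0; i < N; ++i)
        for (int j = row_start[i]; j < row_start[i + 1]; ++j)
            values[j] = (col_idx[j] == i) ? 2.0 * HALF_BAND : -1.0;
}
```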
- SpMV Kernel: `SpMV_Mult(B, C, col_idx, row_start, values, local_rows)`
- For each row `i`: accumulate `values[j] * B[col_idx[j]]` over `j ∈ [row_start[i], row_start[i+1])` (kernel sketched below)
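The kernel signature above is taken from the summary; the body below is a sketch of the described loop, assuming `B` is the full input vector, `C` the output slice for the locally owned rows, `double` values, and a local, zero-based `row_start`.

```c
/* CSR SpMV over the locally owned rows: C[i] = sum_j values[j] * B[col_idx[j]].
   Types (double values, int indices) are assumptions; the loop follows the summary. */
void SpMV_Mult(const double *B, double *C,
               const int *col_idx, const int *row_start,
               const double *values, int local_rows) {
    for (int i = 0; i < local_rows; ++i) {
        double sum = 0.0;
        for (int j = row_start[i]; j < row_start[i + 1]; ++j)
            sum += values[j] * B[col_idx[j]];
        C[i] = sum;
    }
}
```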
- MPI Parallelization (row partitioning)
- Nearly even row-wise distribution across ranks
- Data movement with `MPI_Scatterv`/`MPI_Gatherv`
- Measures total, computation, and communication times (a driver sketch follows this list)
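A condensed sketch of how the distribution and timing split might be wired together. Only `MPI_Scatterv`/`MPI_Gatherv` and the total/computation/communication breakdown come from the summary; the broadcast of the full input vector, the variable names, and the rebasing of the local row pointers are assumptions.

```c
/* Build (example): mpicc -O3 -march=native -DNDEBUG spmv_mpi.c -o spmv_mpi */
#include <mpi.h>
#include <stdlib.h>

void SpMV_Mult(const double *B, double *C, const int *col_idx,
               const int *row_start, const double *values, int local_rows);

/* x must hold N doubles on every rank; the CSR arrays and y are only used on
   rank 0. The three timings are later appended to the results file by rank 0. */
void parallel_spmv(int N, const int *row_start, const int *col_idx,
                   const double *values, double *x, double *y,
                   double *t_total, double *t_comp, double *t_comm) {
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Nearly even row-wise distribution across ranks. */
    int *rows = malloc(size * sizeof *rows), *rdis = malloc(size * sizeof *rdis);
    int *nnz  = malloc(size * sizeof *nnz),  *ndis = malloc(size * sizeof *ndis);
    for (int r = 0, off = 0; r < size; ++r) {
        rows[r] = N / size + (r < N % size ? 1 : 0);
        rdis[r] = off;
        off += rows[r];
    }
    if (rank == 0)                        /* per-rank nonzero counts from row_start */
        for (int r = 0; r < size; ++r) {
            ndis[r] = row_start[rdis[r]];
            nnz[r]  = row_start[rdis[r] + rows[r]] - ndis[r];
        }
    MPI_Bcast(nnz, size, MPI_INT, 0, MPI_COMM_WORLD);

    int lrows = rows[rank], lnnz = nnz[rank];
    int *lrs = malloc((lrows + 1) * sizeof *lrs);
    int *lci = malloc(lnnz * sizeof *lci);
    double *lval = malloc(lnnz * sizeof *lval), *ly = malloc(lrows * sizeof *ly);

    double t0 = MPI_Wtime();

    /* Communication: scatter the CSR chunks, broadcast the input vector. */
    MPI_Scatterv(values,  nnz,  ndis, MPI_DOUBLE, lval, lnnz,  MPI_DOUBLE, 0, MPI_COMM_WORLD);
    MPI_Scatterv(col_idx, nnz,  ndis, MPI_INT,    lci,  lnnz,  MPI_INT,    0, MPI_COMM_WORLD);
    MPI_Scatterv(row_start, rows, rdis, MPI_INT,  lrs,  lrows, MPI_INT,    0, MPI_COMM_WORLD);
    MPI_Bcast(x, N, MPI_DOUBLE, 0, MPI_COMM_WORLD);
    int base = lrs[0];                    /* rebase local row pointers to start at 0 */
    lrs[lrows] = base + lnnz;
    for (int i = 0; i <= lrows; ++i) lrs[i] -= base;
    *t_comm = MPI_Wtime() - t0;

    /* Computation: local CSR SpMV. */
    double t1 = MPI_Wtime();
    SpMV_Mult(x, ly, lci, lrs, lval, lrows);
    *t_comp = MPI_Wtime() - t1;

    /* Communication: gather the result slices back to rank 0. */
    double t2 = MPI_Wtime();
    MPI_Gatherv(ly, lrows, MPI_DOUBLE, y, rows, rdis, MPI_DOUBLE, 0, MPI_COMM_WORLD);
    *t_comm += MPI_Wtime() - t2;
    *t_total = MPI_Wtime() - t0;

    free(rows); free(rdis); free(nnz); free(ndis);
    free(lrs); free(lci); free(lval); free(ly);
}
```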
- Recording
- Appends matrix size, core count, and timing breakdowns to a results file (a minimal append sketch follows)
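A minimal sketch of the append step; the file name `results.txt` is mentioned earlier, but the exact column layout is an assumption.

```c
#include <stdio.h>

/* Rank 0 appends one line per run; the column order here is assumed. */
void record_run(int N, int procs, double t_total, double t_comp, double t_comm) {
    FILE *f = fopen("results.txt", "a");
    if (!f) return;
    fprintf(f, "N=%d procs=%d total=%.6f comp=%.6f comm=%.6f\n",
            N, procs, t_total, t_comp, t_comm);
    fclose(f);
}
```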
- CPU: Intel Xeon E5-2680 v4 (14-core; 28 cores per node)
- Memory: 126 GB
- Caches: L1 32 KB, L2 256 KB, L3 ~35 MB
- Network: InfiniBand (high bandwidth; comms latency impacts scaling)
- Toolchain: OpenMPI/MPICH, build flags like `-O3 -march=native -DNDEBUG`
Note: Very large matrices + high core counts may hit memory limits; communication cost becomes dominant as processes increase.
- Speed-up: Increases with core count but shows diminishing returns due to communication and synchronization overheads.
- Efficiency: High at low–mid core counts; decreases as communication dominates at larger scales.
- Problem size effect: Larger `N` improves the compute/communication ratio, aiding scalability; memory bandwidth and access patterns remain limiting.
- Overall: Row-wise CSR partitioning is simple and effective; the best results typically appear at mid-to-high core counts, before communication bottlenecks dominate.