
Matrix Factorizations at Scale: A Comparison of Scientific Data Analytics in Spark and C+MPI Using Three Case Studies
This work compares Apache Spark with traditional C+MPI implementations of three matrix factorizations: nonnegative matrix factorization (NMF), principal component analysis (PCA), and the CX decomposition. The factorizations are applied to particle physics, climate modeling, and bioimaging data. The experiments scale to 1600 Cray XC40 nodes and provide tuning guidance for high-performance scientific data analytics.