Spark

Reducing Communication in Proximal Newton Methods for Sparse Least Squares Problems

This work proposes RC-SFISTA, which uses iteration overlapping and Hessian reuse to solve sparse least-squares problems. The method reduces latency costs by a factor of $k$ and achieves speedups of up to 12x over ProxCoCoA in MPI and Spark implementations evaluated on 1 to 512 nodes.

Avoiding Communication in Proximal Methods for Convex Optimization Problems

This technical report studies communication-avoiding proximal methods for large-scale convex optimization problems. The methods use iteration overlap and Hessian reuse to reduce latency costs while preserving the bandwidth profile of the baseline proximal algorithms.

C+MPI and Spark parallel efficiency comparison

Matrix Factorizations at Scale: A Comparison of Scientific Data Analytics in Spark and C+MPI Using Three Case Studies

This work compares Apache Spark with traditional C+MPI implementations of the NMF, PCA, and CX matrix factorizations on particle physics, climate modeling, and bioimaging data. The experiments scale to 1600 Cray XC40 nodes, and the results provide tuning guidance for high-performance scientific data analytics.