Speedups for CA-SFISTA and CA-SPNM

Reducing Communication in Proximal Newton Methods for Sparse Least Squares Problems

This work proposes RC-SFISTA, which uses iteration overlapping and Hessian reuse to solve sparse least-squares problems. The method reduces latency costs by a factor of $k$ and, in MPI and Spark implementations evaluated on 1 to 512 nodes, achieves speedups of up to 12x over ProxCoCoA.

August 2018 · Saeed Soori, Aditya Devarakonda, Zachary Blanco, James Demmel, Mert Gurbuzbalaban, Maryam Mehri Dehnavi

Speedups for CA-SFISTA and CA-SPNM

Avoiding Communication in Proximal Methods for Convex Optimization Problems

This technical report studies communication-avoiding proximal methods for large-scale convex optimization problems. The methods use iteration overlap and Hessian reuse to reduce latency costs while preserving the bandwidth profile of the baseline proximal algorithms.

October 2017 · Saeed Soori, Aditya Devarakonda, Zachary Blanco, James Demmel, Mert Gurbuzbalaban, Maryam Mehri Dehnavi

C+MPI and Spark parallel efficiency comparison

Matrix Factorizations at Scale: A Comparison of Scientific Data Analytics in Spark and C+MPI Using Three Case Studies

This work compares Apache Spark with traditional C and MPI implementations for NMF, PCA, and CX matrix factorizations on particle physics, climate modeling, and bioimaging data. The experiments scale to 1600 Cray XC40 nodes and provide tuning guidance for high-performance scientific data analytics.

December 2016 · Alex Gittens, Aditya Devarakonda, Evan Racah, Michael Ringenburg, Lisa Gerhardt, Jey Kottalam, Jialin Liu, Kristyn Maschhoff, Shane Canon, Jatin Chhugani, Pramod Sharma, Jianlin Yang, James Demmel, Jim Harrell, Vijay Krishnamurthy, Michael W. Mahoney, Prabhat