Using the Parallel Virtual Machine for Everyday Analysis

Reprinted by permission from ADASS XV Proceedings (El Escorial, Spain, 2005), ASP Conf. Ser., C. Gabriel, C. Arviset, D. Ponz, and E. Solano, eds.

Introduction

Parallel computing is not a new discipline, so it is surprising that few astronomers resort to parallelism when solving standard problems in data analysis. To quantify this assertion relative to the X-ray community, in late summer of 2005 we conducted several full text searches of the NASA ADS digital library (Kurtz et al 1993), as follows:
Keywords                                    Number of Hits
parallel AND pvm                                  38
message AND passing AND mpi                       21
xspec                                            832
xspec AND parallel AND pvm                         0
xspec AND message AND passing AND mpi              0


Extra keywords were included with PVM and MPI so as to cull false matches (e.g. with the Max Planck Institute). The keyword
xspec refers to the software program of the same name (Arnaud 1996), which is generally regarded as the most widely used application for modeling X-ray spectra. Queries in ADS on other modeling tools, or with other search engines such as Google, all yield similar trends: astronomers and astrophysicists do employ parallel computing, but mainly for highly customized, large-scale problems in simulation, image processing, or data reduction. Virtually no one is using parallelism for fitting models within established software systems, especially in the interactive context, even though a majority of papers published in observational astronomy result from exactly this form of analysis.

ISIS, S-Lang, PVM, and SLIRP

To exploit this opportunity we’ve extended ISIS, the Interactive Spectral Interpretation System (Houck 2002), with a dynamically importable module that provides scriptable access to the Parallel Virtual Machine (Geist et al 1994). PVM was selected (e.g. over MPI) for its robust fault tolerance in a networked environment. ISIS, in brief, was originally conceived as a tool for analyzing Chandra grating spectra, but quickly grew into a general-purpose analysis system. It provides a superset of the XSpec models and, by embedding the S-Lang interpreter, a powerful scripting environment complete with fast array-based mathematical capabilities rivaling commercial packages such as MATLAB or IDL. Custom user models may be loaded into ISIS as either scripts or compiled code, without any recompilation of ISIS itself; this modularity also makes it easy for ISIS to employ an MPI module for parallelism, if desired. Because of the fast array manipulation native to S-Lang, scripted models suffer no needless performance penalties, while the SLIRP code generator (Noble 2003) renders the use of compiled C, C++, and FORTRAN models a nearly instantaneous, turnkey process.
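
For example, a scripted user model might be defined and registered at the ISIS prompt along the following lines (a minimal sketch; the model name, parameters, and toy power-law form are purely illustrative, and we assume the standard add_slang_function mechanism):

public define mypow_fit (lo, hi, par)
{
   % lo, hi are the bin-edge arrays handed in by ISIS; par = [norm, index]
   % a toy power law evaluated at the bin midpoints
   return par[0] * ((lo + hi)/2.0)^par[1];
}
add_slang_function ("mypow", ["norm", "index"]);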

Parallel Modeling

Using the PVM module we’ve parallelized a number of the numerical modeling tasks in which astronomers engage daily, and summarize them here as a series of case studies. Many of the scientific results stemming from these efforts are already appearing elsewhere in the literature.

Kerr Disk Line
Relativistic Kerr disk models are computationally expensive. Historically, implementors have opted to use precomputed tables, gaining speed at the expense of flexibility in searching parameter space. By recognizing that contributions from individual radii may be computed independently, however, we’ve parallelized the model to avoid this tradeoff. To gauge the performance benefits we tested the sequential execution of a single model evaluation, using a small, faked test dataset, on our fastest CPU (a 2 GHz AMD Opteron), yielding a median runtime of 33.86 seconds. Farming the same computation out to 14 CPUs on our network reduced the median runtime to 8.16 seconds, a speedup of 4.15. While 30% efficiency seems unimpressive at first glance, this result actually represents 67% of the peak speedup of 6.16 predicted by Amdahl’s Law (5.5 of the 33.86 seconds of single-CPU runtime was not parallelizable in the current implementation), obtained on CPUs of mixed speeds and during normal working hours. Reducing the model evaluation time to roughly 8 seconds brings it into the realm of interactive use, with the result that fits requiring 3-4 hours to converge (on “real” datasets such as the long XMM-Newton observation of MCG-6-30-15 by Fabian) may now be done in less than 1 hour. The model evaluation is initiated in ISIS through the S-Lang hook function

public define pkerr_fit (lo, hi, par)
{
   variable klo, khi;
   (klo, khi) = _A(lo, hi);   % convert angstroms to keV
   return par[0] * reverse (master (klo, khi, par));
}

where lo and hi are arrays (of roughly 800 elements) representing the left and right edges of each bin within the model grid, and par is a 10-element array of the Kerr model parameters. Use of the PVM module is hidden within the master call (which partitions the disk radii computation into slave tasks), allowing ISIS to remain unaware that the model has even been parallelized. This is an important point: parallel models are installed and later invoked using precisely the same mechanisms employed for sequential models.
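
For instance, the parallel model might be registered and fit with exactly the same commands one would use for a sequential scripted model (a hedged sketch; the parameter names below are placeholders rather than the actual kerr parameter list):

add_slang_function ("pkerr", ["norm", "p1", "p2", "p3", "p4",
                              "p5", "p6", "p7", "p8", "p9"]);
fit_fun ("pkerr(1)");
() = fit_counts;    % ISIS drives the fit exactly as it would serially
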
For each task the slaves invoke a FORTRAN kerr model implementation, by Laura Brenneman at the University of Maryland, wrapped by SLIRP as follows:
linux% slirp -make kerr.f
Starter make file generated to kerr.mf
linux% make -f kerr.mf
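
Once built, the wrapper is loaded like any other S-Lang module (a sketch, assuming the generated module is named kerr), after which the FORTRAN routine is callable from S-Lang as an ordinary function:

import ("kerr");   % load the SLIRP-generated wrapper into ISIS/S-Lang
                   % the kerr routine may now be invoked by the slave tasks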

Confidence Contours and Error Bars

Error analysis is ripe for exploitation with parallel methods. In the 1D case, an independent search of χ2 space may be made for each of the I model parameters, using N=I slaves, each treating one parameter as thawed and the other I-1 as fixed. Note that superlinear speedups are possible here, since a slave finding a lower χ2 value can immediately terminate its N-1 brethren and restart them with updated parameter values. Parallelism in the 2D case is achieved by a straightforward partition of the parameter value grid into J independently-evaluated rectangles, where J >> N (N again being the number of slaves) is typical on our cluster. Our group and collaborators have already published several results utilizing this technique. For example, Allen et al 2004 describes joint X-ray, radio, and γ-ray fits of SN1006, containing a synchrotron radiation component modeled as
[synchrotron emissivity integral not reproduced here; see Allen et al 2004]
The physics of this integral is not important here; what matters is that the cost of evaluating it over a 2D grid is prohibitive (even though symmetry and precomputed tables have reduced the integral from 3D to 1D), since it must be computed once per spectral bin, hundreds of times per model evaluation, and potentially millions of times per confidence grid. A 170x150 contour grid (of electron spectrum exponential cutoff energy versus magnetic field strength) required 10 days to compute on 20-30 CPUs (the fault tolerance of PVM is critical here), and would scale linearly to a 6-10 month job on a single workstation.
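
The decomposition itself is simple; a sketch of the tiling (our own illustration, with invented function and field names) might look like:

define partition_grid (nx, ny, jx, jy)
{
   % split an nx-by-ny parameter grid into jx*jy roughly equal rectangles,
   % each of which a slave evaluates independently
   variable tiles = {}, i, j, t;
   variable xstep = nx / jx, ystep = ny / jy;
   for (i = 0; i < jx; i++)
   {
      for (j = 0; j < jy; j++)
      {
         t = struct { xlo, xhi, ylo, yhi };
         t.xlo = i * xstep;
         t.xhi = (i+1) * xstep;
         if (i == jx-1) t.xhi = nx;
         t.ylo = j * ystep;
         t.yhi = (j+1) * ystep;
         if (j == jy-1) t.yhi = ny;
         list_append (tiles, t);
      }
   }
   return tiles;   % e.g. a 170x150 grid split into J = jx*jy slave tasks
}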

Temperature Mapping
Temperature mapping is another problem that is straightforward to parallelize and for which we have already published results. For instance, Wise & Houck 2004 presents a map of heating in the intracluster medium of Perseus, computed from 10,000 spectral extractions and fits on 20+ CPUs in only a few hours.

Going Forward

It is important to note that in the two previous studies the models themselves were not parallelized, so the usual entry barrier of converting serial codes to parallel does not apply. One consequence is that the community should no longer feel compelled to compute error analyses or temperature maps serially. Another is that the independence between partitions of the data and the computation performed upon them, which makes sequential models usable in the parallel context, also lurks within other areas of the modeling problem.

In principle it should be possible to evaluate an arbitrary sequential model in parallel by partitioning the model grid over which it is evaluated, by evaluating each dataset independently (when multiple datasets are fit), or in certain cases even by evaluating non-tied model components in parallel. We are implementing these techniques with an eye towards rendering their use as transparent as possible for the non-expert. With simple models or small datasets these measures may not be necessary, but the days of simple models and small datasets are numbered. Reduced datasets have already hit the gigabyte scale, and multi-wavelength analysis such as we describe above is fast becoming the norm. These trends will only accelerate as newer instruments are deployed and the Virtual Observatory is more widely utilized, motivating scientists to tackle more ambitious analysis problems that may have been shunned in the past due to their computational expense.
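
To make the first of these ideas concrete, a sketch of evaluating an unmodified hook function over a partitioned grid follows (our own illustration; model_ref would be a function reference such as &mypow_fit, and in the real implementation each chunk would be handed to a PVM slave rather than evaluated in a serial loop):

define eval_partitioned (model_ref, lo, hi, par, nchunk)
{
   % split the (lo, hi) bin arrays into nchunk pieces, evaluate the model
   % on each piece, and concatenate the results in grid order
   variable n = length (lo), result = Double_Type[0];
   variable k, i0, i1;
   for (k = 0; k < nchunk; k++)
   {
      i0 = (k * n) / nchunk;
      i1 = ((k + 1) * n) / nchunk - 1;
      % in the parallel version this evaluation becomes a slave task
      result = [result, (@model_ref) (lo[[i0:i1]], hi[[i0:i1]], par)];
   }
   return result;
}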

M.S. Noble, J. C. Houck, J. E. Davis, A. Young, M. Nowak

This work was supported by NASA through the AISRP grant NNG05GC23G and Smithsonian Astrophysical Observatory contract SV3-73016 for the Chandra X-Ray Center.



References
Allen, G.E., Houck, J.C., & Sturner, S.J. 2004, Advances in Space Research, 33, 440
Arnaud, K.A. 1996, ADASS V
Geist, A., Beguelin, A., Dongarra, J., Jiang, W., Manchek, R., & Sunderam, V. 1994, PVM: Parallel Virtual Machine, A User’s Guide and Tutorial for Networked Parallel Computing
Houck, J.C. 2002, ISIS: The Interactive Spectral Interpretation System, in High Resolution X-ray Spectroscopy with XMM-Newton and Chandra
Kurtz, M.J., Karakashian, T., Grant, C.S., Eichhorn, G., Murray, S.S., Watson, J.M., Ossorio, P.G., & Stoner, J.L. 1993, ADASS II
Noble, M.S. 2003, http://space.mit.edu/cxc/software/slang/modules/slirp
Wise, M., & Houck, J. 2004, 35th COSPAR Scientific Assembly, 3997