Halo World: Tools for Parallel Cluster Finding in Astrophysical N-body Simulations

In the Journal of Data Mining and Knowledge Discovery special issue on Scalable High-Performance Computing for KDD


David Pfitzner, John Salmon and Thomas Sterling


Cosmological N-body simulations on parallel computers produce large datasets: gigabytes at each instant of simulated cosmological time, and hundreds of gigabytes over the course of a simulation. These datasets require further analysis before they can be compared to astronomical observations. The "Halo World" tools include two methods for halo finding, that is, identifying all of the gravitationally stable clusters in a point-sampled density field. The first is a parallel implementation of the friends-of-friends (FOF) algorithm, which is widely used in N-body cosmology. The second, the new IsoDen method, is based on isodensity surfaces and has been developed to overcome some of the shortcomings of FOF. Parallel processing is the only viable way of obtaining the performance and storage capacity needed to carry out these analysis tasks. Ultimately, we must also plan to use disk storage, as it is the only economically viable alternative for storing and manipulating such large datasets. Both IsoDen and FOF have been implemented on a variety of computer systems, with parallelism up to 512 processors, and have been used successfully to extract halos from simulations with up to 16.8 million particles.
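For readers unfamiliar with FOF, the following is a minimal serial sketch of the standard idea: two particles are "friends" if they lie within a chosen linking length of each other, and a group is any set of particles connected through chains of friends. It is not the parallel implementation described in the abstract. The function name friends_of_friends, the parameter linking_length, and the brute-force pair search are all assumptions made for illustration; a production code would use a spatial tree or grid rather than testing every pair.

```python
from math import dist  # Euclidean distance, Python 3.8+


def friends_of_friends(points, linking_length):
    """Return one group label per point (hypothetical helper, serial O(N^2)).

    Points closer than linking_length are linked; groups are the connected
    components of that link graph. Each label is the smallest point index
    in the group.
    """
    parent = list(range(len(points)))  # union-find forest, one tree per group

    def find(i):
        # Follow parent links to the root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if dist(points[i], points[j]) <= linking_length:
                ri, rj = find(i), find(j)
                if ri != rj:
                    # Attach the larger root to the smaller one so the root
                    # is always the smallest index in the group.
                    parent[max(ri, rj)] = min(ri, rj)

    return [find(i) for i in range(len(points))]


if __name__ == "__main__":
    particles = [
        (0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (1.0, 0.0, 0.0),  # chain of friends
        (5.0, 5.0, 5.0), (5.4, 5.0, 5.0),                   # a separate pair
        (10.0, 10.0, 10.0),                                 # an isolated point
    ]
    print(friends_of_friends(particles, linking_length=0.6))
```

Note that the first and third particles end up in the same group even though they are farther apart than the linking length, because each is a friend of the second particle. This chaining behaviour is what gives the algorithm its name.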
