The development of out-of-core techniques suitable for processing the
highly irregular, unstructured data generated by cosmological
*N*-body simulations is crucial to manipulating datasets of
sufficient size. We have recently obtained extremely promising
results applying parallel, out-of-core techniques to the irregular and
dynamic treecode methods that are used within the *N*-body integrator
itself [9]. Essentially, the problem reduces to one of
guaranteeing spatial and temporal locality, so that access to secondary
storage is rarely necessary and, once loaded from secondary storage, a
given datum is likely to be re-used many times in primary storage
before being flushed. The solution is exactly the same one
that has been employed to good effect to achieve load balance and
tolerate latency in parallel implementations of treecodes -- organize
and divide the data along a space-filling curve that approximately
preserves spatial locality even if the underlying data is highly
clustered and irregular. The crucial observation is that by placing
the pages dynamically on disk in an order determined by a continuous
space-filling curve, e.g., a Peano-Hilbert curve (see
Figure 7), and ordering our access to data along the same
curve, we preserve the spatial and temporal locality in the algorithm.
Since the space-filling curve is continuous, objects that are near in
space, and hence likely to be linked together by the algorithms,
are also near in primary
or secondary storage.
By visiting objects in order along the space-filling
curve, objects that were recently brought into primary storage, either because
they were directly addressed, or because they are on the same page as
a recently addressed object, are likely to be reused. Since the
parallel implementation of the spatial halo finder uses the same data
and control structures as the *N*-body code to tolerate irregularity
and achieve parallelism, it can be adapted to use out-of-core storage
in the same way that has worked for the *N*-body code itself.
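The ordering described above can be made concrete with a short sketch. The function below is the standard bitwise Peano-Hilbert key construction; the names `hilbert_key` and `order_particles`, and the grid resolution, are illustrative assumptions rather than details of the implementation described in the text.

```python
def hilbert_key(n, x, y):
    """Map integer grid coordinates (x, y) in an n x n grid
    (n a power of 2) to the distance d along the Hilbert curve."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/reflect the quadrant so the sub-curve is oriented
        # consistently with its parent.
        if ry == 0:
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        s //= 2
    return d

def order_particles(positions, n=1024):
    """Sort (x, y) positions in [0, 1)^2 by Hilbert key, so that
    particles near in space land near each other in the sorted
    array -- and hence on the same or nearby pages on disk."""
    def key(p):
        return hilbert_key(n, min(int(p[0] * n), n - 1),
                              min(int(p[1] * n), n - 1))
    return sorted(positions, key=key)
```

Writing the sorted array to disk page by page, and sweeping over particles in the same order, is then enough to make each page's residence in primary storage serve many consecutive accesses.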

**Figure:**
A path through the positions of 10000 highly clustered particles in
two dimensions is shown in (a). The path is obtained by locating
each particle along a space-filling Peano-Hilbert curve. Levels 4,
6 and 8 of an adaptive quad-tree are shown in (b), (c), and (d).
The squares represent ``cells'' in an adaptive quad-tree that is
used to compute densities and locate neighbors.
The whole tree is obtained by ``stacking'' these levels, along with
those not shown.
Again, the path connecting the cells is derived from a Peano-Hilbert
curve. In this case, the curve indicates the order in which data
are stored in both primary and secondary memory. Each level of the
tree is stored separately, so that data within the same level is always
contiguous.
When particles are
visited in the order implied by (a), the pages loaded from
secondary storage hold data that is contiguous along the
curves represented in (b), (c), and (d). The orderings guarantee that
the data is reused many times before being discarded, allowing the
algorithm to tolerate the modest bandwidth and extreme latencies
characteristic of out-of-core calculation. (Two dimensions are used
purely for ease of graphical representation. The implementation is
fully three-dimensional.)
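The continuity property the caption relies on can be checked directly: successive cells along a discrete Hilbert curve are always edge-adjacent grid cells. A minimal sketch, using the standard inverse mapping from curve distance to coordinates (the name `d2xy` is an illustrative choice):

```python
def d2xy(n, d):
    """Map distance d along the Hilbert curve to (x, y) grid
    coordinates in an n x n grid (n a power of 2)."""
    x = y = 0
    s = 1
    while s < n:
        rx = 1 & (d // 2)
        ry = 1 & (d ^ rx)
        # Undo the rotation/reflection applied at this level.
        if ry == 0:
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        d //= 4
        s *= 2
    return x, y

def curve_is_continuous(n):
    """Check that consecutive curve positions are adjacent cells,
    i.e., each step has Manhattan distance exactly 1."""
    cells = [d2xy(n, d) for d in range(n * n)]
    return all(abs(x1 - x0) + abs(y1 - y0) == 1
               for (x0, y0), (x1, y1) in zip(cells, cells[1:]))
```

It is this unit-step property that makes runs of consecutive keys correspond to compact spatial regions, so a sequential sweep through a page touches a spatially coherent block of cells.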

Sat Sep 27 18:44:36 PDT 1997