The calculations discussed on these pages are all computationally expensive.
Of course, computers are becoming faster all the time, which means that more
complicated calculations will become possible in the future. However, we feel
that we can obtain better throughput for our calculations (i.e. tackle larger
problems) by harnessing the power of parallel computers. To do this efficiently,
new algorithms are needed.

One such algorithm is the DDPHP algorithm. The key idea behind this
algorithm is to distribute the entire wave packet *twice*, so that each
processor in the calculation contains two *different* slices of the wave
packet. This means that no communication between the processors is required
during the computations themselves. The wave packet only needs to be resynchronized after
each **HΨ** iteration. The result is that the percentage of the wave
packet that has to be transferred in each iteration *decreases* with an
increasing number of processors. This is in contrast to an algorithm in which
the wave packet is only distributed once, because in that case the percentage
of the wave packet to be transferred in each iteration actually
*increases* with an increasing number of processors. The efficient
communication characteristics of the DDPHP algorithm, combined with an
efficient computational layout, mean that the algorithm scales linearly with
an increasing number of processors. We have already used the DDPHP algorithm in
a number of applications in gas-surface scattering.
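
The idea of giving each processor two different slices can be illustrated with a small toy model. The sketch below is not the DDPHP implementation; it assumes a two-dimensional wave packet on a square grid, a simple second-difference operator standing in for each term of **H**, and processors simulated by a loop. The names `apply_1d`, `slice_bounds` and `h_psi_two_slices` are hypothetical. Each simulated processor holds a block of complete rows and a block of complete columns, so it can apply the operator along either grid direction without needing data from another processor, and the partial results are merged only at the end of the iteration.

```python
def apply_1d(vec):
    """Toy 1D operator: second difference with zero boundary values."""
    n = len(vec)
    out = []
    for k in range(n):
        left = vec[k - 1] if k > 0 else 0.0
        right = vec[k + 1] if k < n - 1 else 0.0
        out.append(left - 2.0 * vec[k] + right)
    return out


def h_psi_serial(psi):
    """Reference: apply the toy operator along both grid directions and add."""
    n = len(psi)
    along_y = [apply_1d(row) for row in psi]
    along_x = [apply_1d([psi[i][j] for i in range(n)]) for j in range(n)]
    return [[along_y[i][j] + along_x[j][i] for j in range(n)] for i in range(n)]


def slice_bounds(n, nproc, rank):
    """Contiguous block [start, stop) owned by `rank` (hypothetical helper)."""
    return rank * n // nproc, (rank + 1) * n // nproc


def h_psi_two_slices(psi, nproc):
    """Same result as h_psi_serial, computed from two slices per processor."""
    n = len(psi)
    result = [[0.0] * n for _ in range(n)]
    for rank in range(nproc):  # each pass stands in for one processor
        lo, hi = slice_bounds(n, nproc, rank)
        # Slice 1: complete rows lo..hi-1, so the y-direction term is local.
        row_slice = [psi[i][:] for i in range(lo, hi)]
        y_term = [apply_1d(row) for row in row_slice]
        # Slice 2: complete columns lo..hi-1, so the x-direction term is local.
        col_slice = [[psi[i][j] for i in range(n)] for j in range(lo, hi)]
        x_term = [apply_1d(col) for col in col_slice]
        # Stand-in for the resynchronization step: merge the partial results.
        for a, i in enumerate(range(lo, hi)):
            for j in range(n):
                result[i][j] += y_term[a][j]
        for b, j in enumerate(range(lo, hi)):
            for i in range(n):
                result[i][j] += x_term[b][i]
    return result


# Usage: the two-slice evaluation reproduces the serial reference.
psi = [[float((3 * i + 5 * j) % 7) for j in range(8)] for i in range(8)]
assert h_psi_two_slices(psi, 4) == h_psi_serial(psi)
```

In this toy model the only step that would need communication on a real machine is the final merge, which corresponds to the resynchronization after each **HΨ** iteration. How much data that step moves depends on the actual slice layout, which this sketch does not attempt to model.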

One disadvantage of the DDPHP method is that it requires a fast interconnect
between the processors of the machine, since the amount of data that has to be
transferred is still quite significant. Therefore, we are currently working on methods that
are better suited to more loosely connected parallel computers, such as
Beowulf clusters. In the long term we would like to combine these methods with the DDPHP method
and the Coriolis-coupled method we used earlier to obtain a general parallel method that
can be used on a variety of systems, from large SMP machines with a large number of
processors per node to clusters of workstations.