We follow two complementary approaches to these challenges. First, we are working with domain scientists to identify specific problems that require massive computation and to develop specialized tools for these problems. Initial results include astronomical and cosmological applications. We are now also looking at problems in bioinformatics, sustainability, and gathering data from the web.
Second, we aim to develop more general toolkits that will enable domain scientists to build their own applications for massive data processing, much as Excel enables users to build numeric applications. The long-term goal is to create a distributed database for storing and indexing diverse scientific data, along with tools for querying and integrating these data. Initial results include techniques for distributed indexing of astronomical catalogs and cosmological simulations.