Argonne technology enables high-speed data transfer
By Eleanor Taylor • June 17, 2009
GridFTP, a protocol developed by researchers at Argonne National Laboratory, has been used to transfer unprecedented amounts of data over the Department of Energy's (DOE) Energy Sciences Network (ESnet), which provides a reliable, high-performance communications infrastructure to facilitate large-scale, collaborative science endeavors.
The Argonne-developed system proved key to enabling research groups at Oak Ridge National Laboratory in Tennessee and the National Energy Research Scientific Computing Center in California to move large data sets between the facilities at a rate of 200 megabytes per second.
The deployment of GridFTP at the two computing facilities is part of a major project to optimize wide-area network data transfers between sites hosting DOE leadership-class computers.
According to Ian Foster, co-director of the Globus Alliance project responsible for designing GridFTP, large-scale data transfer places an enormous burden on networks. "Conventional protocols have proven unable to handle the increasing demand of large-scale data transfer," he said. "The result has been delays in obtaining data, or even lost data as the network becomes overwhelmed. GridFTP changes that."
As large-scale collaborative science projects become increasingly common, the need to transfer unprecedented amounts of data is becoming critical. Having GridFTP on ESnet will enable the sharing of data between supercomputer centers in disciplines such as climate modeling and nuclear physics that require secure, robust, high-speed bulk data transfer.
"Our goal is to enable the scientists to rapidly move large-scale data sets between supercomputer centers as dictated by the needs of the science," said Eli Dart, a network engineer for ESnet, which is managed by Lawrence Berkeley National Laboratory. "High-performance networking has become critical to science due to the size of the data sets and the wide scope of collaboration characteristic of today's large science projects such as climate research and high energy physics."
GridFTP offers several advantages over other data transfer systems. For example, with Secure Copy, or scp, bulk transfer of a 33-gigabyte dataset between two remote hosts could take up to eight hours. With GridFTP, almost 20 times that amount of data can be transferred in the same amount of time. And unlike standard FTP, GridFTP uses multiple data channels to improve transfer speed.
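To put those figures in perspective, the short calculation below converts them into average transfer rates. It is only a back-of-the-envelope sketch: it assumes decimal units (1 gigabyte = 1,000 megabytes), treats "up to eight hours" and "almost 20 times" as exact values, and uses illustrative function names that do not come from any GridFTP tool.

```python
# Back-of-the-envelope rates implied by the figures quoted in the article.
# Assumption: decimal units (1 gigabyte = 1000 megabytes).
MB_PER_GB = 1000
SECONDS_PER_HOUR = 3600


def rate_mb_per_s(gigabytes, hours):
    """Average throughput in megabytes per second."""
    return gigabytes * MB_PER_GB / (hours * SECONDS_PER_HOUR)


def seconds_to_transfer(gigabytes, mb_per_s):
    """Time in seconds to move a dataset at a given sustained rate."""
    return gigabytes * MB_PER_GB / mb_per_s


# scp: 33 GB in (up to) 8 hours.
scp_rate = rate_mb_per_s(33, 8)

# GridFTP: (almost) 20 times as much data in the same 8 hours.
gridftp_rate = rate_mb_per_s(33 * 20, 8)

# The 200 MB/s rate reported between Oak Ridge and NERSC,
# applied to the same 33 GB dataset.
seconds_at_200 = seconds_to_transfer(33, 200)

print(f"scp:     about {scp_rate:.2f} MB/s")
print(f"GridFTP: about {gridftp_rate:.1f} MB/s")
print(f"33 GB at 200 MB/s: about {seconds_at_200 / 60:.2f} minutes")
```

Under these assumptions the scp example works out to roughly 1 megabyte per second and the GridFTP comparison to roughly 23 megabytes per second, while the 200-megabytes-per-second rate reported between the two facilities would move the same 33-gigabyte dataset in under three minutes. The article does not say whether these figures come from the same tests, so they should be read as separate data points.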
"The data tsunami problem has been a major bottleneck to scientific advancement," said Raj Kettimuthu, technical lead and technology coordinator of the GridFTP project at Argonne. "With GridFTP, computational scientists can analyze their simulated and derived data in real time."
More information on GridFTP is available at www.globus.org/grid_software/data/gridftp.php.
More information on ESnet is available at www.es.net/.