
WAN Techniques for Data Intensive Science

SciStream

Big data science requires the coordination of multi-site scientific instruments, computing facilities, storage repositories, and other resources to generate and process data among them at high rates. This goal involves orchestrating online processing, for example, data reduction, feature detection, and experiment steering, over memory-to-memory streaming from multiple source instruments to remote high-performance computing (HPC) systems.

Currently, a systematic approach to orchestrating online processing over multiple data sources and processors does not exist. SciStream, a solution developed by Argonne National Laboratory and supported by iCAIR, is an embedded middlebox-based architecture with control protocols to enable efficient and secure memory-to-memory data streaming between producers and consumers without direct network connectivity. SciStream operates at the transport layer to be application agnostic, supporting well-known protocols such as TCP, UDP, and QUIC.
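
Conceptually, each SciStream gateway behaves like a transport-layer relay: the producer connects to a middlebox inside its facility, which bridges the bytes onward toward the consumer side, so neither endpoint needs direct connectivity to the other. The sketch below illustrates only that relay pattern; it is not the SciStream implementation, and the addresses, ports, and names are hypothetical.

```python
# Minimal sketch of a transport-layer relay ("middlebox") bridging a data
# producer and a remote consumer that have no direct network connectivity.
# Illustration only, not the SciStream code; addresses and ports are placeholders.
import asyncio

PRODUCER_LISTEN = ("0.0.0.0", 5000)                   # producer-facing listener (inside the facility)
CONSUMER_ADDR   = ("consumer-gw.example.org", 6000)   # next hop toward the consumer site

async def pump(reader, writer):
    """Copy bytes from one connection to the other until EOF."""
    try:
        while chunk := await reader.read(64 * 1024):
            writer.write(chunk)
            await writer.drain()
    finally:
        writer.close()
        await writer.wait_closed()

async def handle_producer(p_reader, p_writer):
    # For each producer connection, open an onward connection toward the
    # consumer side and relay traffic in both directions.
    c_reader, c_writer = await asyncio.open_connection(*CONSUMER_ADDR)
    await asyncio.gather(pump(p_reader, c_writer), pump(c_reader, p_writer))

async def main():
    server = await asyncio.start_server(handle_producer, *PRODUCER_LISTEN)
    async with server:
        await server.serve_forever()

if __name__ == "__main__":
    asyncio.run(main())
```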

This investigation has demonstrated an emulation of multi-site online data processing over national WANs using the ESnet testbed, StarLight, FABRIC, and other resources.

PetaTrans with NVMe-over-Fabrics

PetaTrans with NVMe-over-Fabrics as a microservice is a research project aimed at improving large-scale WAN microservices for streaming and transferring large volumes of data among high-performance data transfer nodes (DTNs).

Building on earlier initiatives, for SC22 we are designing, implementing, and experimenting with NVMe-over-Fabrics on 400 Gbps DTNs over large-scale, long-distance networks, with direct NVMe-to-NVMe service over RoCE and TCP fabrics using SmartNICs. The NVMe-over-Fabrics microservice connects remote NVMe devices without userspace applications, reducing overhead in high-performance transfers and offloading the NVMe-over-Fabrics initiator software stack to SmartNICs.
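
On the target side, exporting a local NVMe namespace over NVMe/TCP can be done through the Linux kernel nvmet configfs interface, and the remote DTN then attaches it with nvme-cli. The sketch below outlines those steps only as an illustration of the mechanism (it requires root and the nvmet/nvmet-tcp modules); the NQN, device path, and addresses are placeholders, and it does not reflect the project's SmartNIC offload code.

```python
# Sketch: export a local NVMe namespace over NVMe/TCP via the Linux nvmet
# configfs interface (target side), then attach it from a remote DTN with
# nvme-cli (initiator side). Both sides are shown in one script only for
# brevity; in practice the "connect" runs on the remote DTN.
from pathlib import Path
import subprocess

NQN  = "nqn.2022-11.org.example:dtn-nvme1"   # hypothetical subsystem name
DEV  = "/dev/nvme0n1"                        # local NVMe namespace to export
ADDR = "192.0.2.10"                          # address of the target DTN (example)
PORT = "4420"                                # conventional NVMe/TCP service port

CFG = Path("/sys/kernel/config/nvmet")

# --- target side: create the subsystem, attach the namespace, expose a TCP port ---
subsys = CFG / "subsystems" / NQN
subsys.mkdir(parents=True, exist_ok=True)
(subsys / "attr_allow_any_host").write_text("1")

ns = subsys / "namespaces" / "1"
ns.mkdir(exist_ok=True)
(ns / "device_path").write_text(DEV)
(ns / "enable").write_text("1")

port = CFG / "ports" / "1"
port.mkdir(exist_ok=True)
(port / "addr_trtype").write_text("tcp")
(port / "addr_adrfam").write_text("ipv4")
(port / "addr_traddr").write_text(ADDR)
(port / "addr_trsvcid").write_text(PORT)
(port / "subsystems" / NQN).symlink_to(subsys)   # bind the subsystem to the port

# --- initiator side (remote DTN): attach the exported namespace with nvme-cli ---
subprocess.run(["nvme", "connect", "-t", "tcp", "-a", ADDR, "-s", PORT, "-n", NQN],
               check=True)
```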

The primary advantage of the NVMe-over-Fabrics microservice is that it can be deployed in multiple DTNs as a container with lower overhead. Although NVMe-over-Fabrics is mainly used in Data Center Networks, this work explores the possibility of using it for long-distance high-performance data movement over WANs.

Network Optimized for Transfer of Experimental Data (NOTED)

The Network Optimized for Transfer of Experimental Data (NOTED) initiative is being developed by CERN for potential use by the Large Hadron Collider (LHC) networking community. The NOTED project aims to optimize transfers of LHC data among sites by addressing problems such as saturation, contention, congestion, and other impairments.

The Worldwide LHC Computing Grid (WLCG), a global collaboration of approximately 200 interconnected computing centers, provides global computing, storage, distribution, and analytic resources supporting physics experiments using data generated by the LHC at CERN. The WLCG's three-tier structure (Tier 0 at CERN, Tier 1, and Tier 2 sites) is interconnected by global high-performance multi-domain networks: the LHC Optical Private Network (LHCOPN) and the LHC Open Network Environment (LHCONE). Currently, the LHC networking community is preparing for a significant increase in the network capacity required to manage the flows expected from the High Luminosity LHC.

The NOTED optimization method combines a) a deep understanding of the network traffic, acquired through analysis of the data flows, with b) an appropriate response (e.g., dynamic allocation of additional capacity) to specific patterns detected among those flows, incorporating AI/ML and related techniques where appropriate.
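
In outline, this amounts to a detect-and-respond control loop: estimate the transfer demand queued between two sites, request additional capacity when it is about to exceed what the existing path can absorb, and release that capacity once the backlog drains. The sketch below is a deliberately simplified, hypothetical illustration of such a loop; the thresholds, queue source, and provisioning hooks are placeholders, not the NOTED code.

```python
# Simplified, hypothetical sketch of a NOTED-style control loop: watch the
# queue of pending transfers between two sites and trigger (or release)
# additional network capacity based on forecast demand.
import time

LINK_CAPACITY_GBPS = 100          # nominal capacity of the existing path (example)
SCALE_UP_THRESHOLD = 0.8          # fraction of capacity that triggers action (example)

def pending_transfer_rate_gbps(src_site: str, dst_site: str) -> float:
    """Estimate demand from the transfer broker's queue -- stub."""
    raise NotImplementedError("query the transfer service for queued volume")

def provision_extra_capacity(src_site: str, dst_site: str, gbps: float) -> None:
    """Ask the network provider for additional capacity -- stub."""
    raise NotImplementedError("call the dynamic-provisioning API")

def release_extra_capacity(src_site: str, dst_site: str) -> None:
    """Tear down the extra capacity once it is no longer needed -- stub."""
    raise NotImplementedError("call the dynamic-provisioning API")

def control_loop(src: str, dst: str, interval_s: int = 60) -> None:
    augmented = False
    while True:
        demand = pending_transfer_rate_gbps(src, dst)
        if not augmented and demand > SCALE_UP_THRESHOLD * LINK_CAPACITY_GBPS:
            provision_extra_capacity(src, dst, demand - LINK_CAPACITY_GBPS)
            augmented = True
        elif augmented and demand < 0.5 * SCALE_UP_THRESHOLD * LINK_CAPACITY_GBPS:
            release_extra_capacity(src, dst)
            augmented = False
        time.sleep(interval_s)
```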

Scitags Packet Marking Initiative

Managing large-scale scientific workflows over networks, especially WANs, is becoming increasingly complex as multiple science projects share the same foundational resources simultaneously while being governed by multiple divergent variables: requirements, constraints, configurations, technologies, etc. A key method for addressing this issue is to employ techniques that provide high-fidelity visibility into how science flows utilize network resources end-to-end. iCAIR is participating in developing one such method, Scientific Network Tags (Scitags). This initiative promotes the identification of science domains and their high-level activities at the network level. This open-system initiative provides open-source technologies to assist research and education networks in understanding resource utilization while providing information to scientific communities on the precise behavior of their network workflows.

Given the increasing complexity of scientific workflows over shared networks and the number of parameters they are subject to, enhanced methods of visibility into flows are required. Such visibility enables efficient workflows by optimizing resource utilization and avoiding impairments such as channel saturation, congestion, contention, latency, and packet loss.

The Scitags initiative is developing multiple building blocks to achieve both enhanced visibility into network flows and enhanced efficiency based on the information that visibility provides. Objectives include a) developing an overall architecture with a defined standard set of markings, b) providing methods for marking network traffic that are easy to implement, e.g., relying on VMs and containers, c) creating techniques for reading, monitoring, validating, and analyzing those markings, d) determining appropriate responses and actions based on those analyses, and e) establishing communication channels that provide information about these methods to wider communities, including recommendations on specific hardware, software, protocols, and configurations.
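
For example, objective b) can be as lightweight as encoding a registered science-domain ID and activity ID into a value carried in each flow's packet headers and announcing the flow to a collector. The sketch below illustrates only the idea; the bit layout, collector port, and message format are simplified placeholders, not the published Scitags specification.

```python
# Illustrative sketch of flow marking: pack a science-domain ID and activity ID
# into a 20-bit value of the kind that could ride in an IPv6 flow label, and
# emit a companion one-shot UDP message ("firefly") describing the flow.
# Layout, port, and message format are arbitrary examples, NOT the Scitags spec.
import json
import socket
import time

def make_flow_label(domain_id: int, activity_id: int) -> int:
    """Pack a 12-bit domain ID and an 8-bit activity ID into 20 bits (example layout)."""
    assert 0 <= domain_id < (1 << 12) and 0 <= activity_id < (1 << 8)
    return (domain_id << 8) | activity_id

def decode_flow_label(label: int):
    """Recover (domain_id, activity_id) from a 20-bit label using the same layout."""
    return (label >> 8) & 0xFFF, label & 0xFF

def send_firefly(collector, flow: dict) -> None:
    """Send a one-shot UDP message announcing the flow's ownership and endpoints."""
    msg = json.dumps({"version": 1, "timestamp": time.time(), **flow}).encode()
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.sendto(msg, collector)

if __name__ == "__main__":
    label = make_flow_label(domain_id=42, activity_id=3)     # hypothetical registry IDs
    send_firefly(("collector.example.org", 10514),            # placeholder collector
                 {"flow-label": label, "src": "[2001:db8::a]:32768",
                  "dst": "[2001:db8::b]:1094", "protocol": "tcp"})
```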

Named Data Networking for Data Intensive Science Experiments (N-DISE)

 

iCAIR provides services and infrastructure support for the Named Data Networking for Data Intensive Science Experiments (N-DISE) project (led by Northeastern University), which is directed at accelerating the pace of breakthroughs and innovations in data-intensive science fields such as the Large Hadron Collider (LHC) high energy physics program and the BioGenome and human genome projects. Based on NDN, an emerging data-centric Internet architecture, N-DISE is prototyping the deployment of a highly efficient and field-tested petascale data distribution, caching, access, and analysis system serving major science programs.

The N-DISE project builds on recently developed high-throughput NDN caching and forwarding methods and containerization techniques, and leverages the integration of NDN and SDN system concepts and algorithms with the mainstream data distribution, processing, and management systems of CMS, as well as integration with Field Programmable Gate Array (FPGA) acceleration subsystems, to produce a system capable of delivering LHC and genomic data over a wide area network at throughputs approaching 100 Gbps while dramatically decreasing download times. N-DISE leverages existing infrastructure to build an enhanced testbed with high-performance NDN data cache servers at participating institutions.

To achieve high performance, N-DISE leverages the following key components: (1) transparent integration of NDN with the current CMS software stack via an NDN-based XRootD Open Storage System plugin, (2) the joint caching and multipath forwarding capabilities of the VIP algorithm, (3) integration with FPGA acceleration subsystems, and (4) SDN support for NDN, including through AutoGOLE/SENSE.
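
Component (2), the VIP algorithm, drives forwarding and caching jointly from per-object demand counters (Virtual Interest Packets). The sketch below is a simplified conceptual rendering of that idea, not the N-DISE code: a node forwards Interests toward the neighbor with the largest VIP-count differential and caches the names currently in highest local demand. All names and sizes are illustrative.

```python
# Conceptual sketch (not the N-DISE implementation) of VIP-style joint
# forwarding and caching in an NDN node: VIP counts track demand per named
# data object; Interests follow the largest backpressure differential, and
# the cache keeps the most-demanded objects.
from collections import defaultdict

class VipNode:
    def __init__(self, cache_slots: int):
        self.vip = defaultdict(float)            # local VIP count per object name
        self.neighbor_vip = defaultdict(dict)    # neighbor -> {name: its VIP count}
        self.cache_slots = cache_slots
        self.cache = set()

    def on_interest(self, name: str) -> None:
        """Each arriving Interest raises local demand for the named object."""
        self.vip[name] += 1.0

    def forward_target(self, name: str, neighbors: list) -> str:
        """Backpressure rule: forward toward the neighbor with the largest
        (local - remote) VIP differential for this name."""
        return max(neighbors,
                   key=lambda n: self.vip[name] - self.neighbor_vip[n].get(name, 0.0))

    def refresh_cache(self) -> None:
        """Cache the objects currently in highest local demand."""
        ranked = sorted(self.vip, key=self.vip.get, reverse=True)
        self.cache = set(ranked[:self.cache_slots])
        for name in self.cache:
            self.vip[name] = 0.0   # cached content absorbs its local demand

# Example with illustrative /ndn/cms/... style names for LHC data objects.
node = VipNode(cache_slots=2)
for _ in range(5):
    node.on_interest("/ndn/cms/store/datasetA/file0")
node.on_interest("/ndn/cms/store/datasetB/file7")
print(node.forward_target("/ndn/cms/store/datasetA/file0", ["siteX", "siteY"]))
node.refresh_cache()
```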

In-Band (In-Network) Computing

iCAIR is participating in several research projects exploring the potential of in-band or in-network computing. Modern scientific instruments, such as radio telescopes, mass sensor arrays, and detectors at synchrotron light sources, generate data at such high rates that in-band or online processing is needed for data reduction, feature detection, analytics, and experiment steering. The same high data rates also require memory-to-memory streaming from instrument to remote high-performance computers (HPC) because local computational capacity is limited, and data transmissions that engage file systems using traditional methods introduce unacceptable latencies.

To address these issues, several research projects are exploring the potential of in-band computing. One such project, led by Argonne National Laboratory (ANL), developed SciStream, the transport-layer, middlebox-based architecture described above, which enables efficient and secure memory-to-memory data streaming between producers and consumers without direct network connectivity while remaining application agnostic and supporting well-known protocols such as TCP, UDP, and QUIC. To demonstrate this technique at SC23, ANL emulated a light source data acquisition workflow streaming data from the StarLight booth across a SCinet WAN link to the StarLight facility in Chicago and on to an ANL supercomputer. A workflow that used the FABRIC testbed to connect StarLight with the Chameleon Cloud Testbed infrastructure was also demonstrated. In both cases, SciStream enabled memory-to-memory data streaming over the wide area network (WAN).
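
As an illustration of what online, in-band processing means in this setting, the sketch below applies a toy reduction (a threshold-based feature count) to fixed-size frames as they arrive over a memory-to-memory stream, so that only compact results, rather than raw frames, continue downstream. The frame size, port, and reduction are placeholders, not any specific facility's workflow.

```python
# Illustrative sketch of in-band (online) processing on a streamed data path:
# fixed-size detector frames arrive over TCP and are reduced on the fly, so
# only small records move on instead of the raw data.
import socket
import struct

FRAME_BYTES = 1024 * 1024      # hypothetical 1 MiB detector frame
THRESHOLD   = 200              # hypothetical pixel threshold

def recv_exact(conn: socket.socket, n: int) -> bytes:
    """Read exactly n bytes from the connection (or return b'' on EOF)."""
    buf = bytearray()
    while len(buf) < n:
        chunk = conn.recv(n - len(buf))
        if not chunk:
            return b""
        buf.extend(chunk)
    return bytes(buf)

def reduce_frame(frame: bytes) -> int:
    """Toy reduction: count bytes above a threshold ("features")."""
    return sum(1 for b in frame if b > THRESHOLD)

def serve(port: int = 7000) -> None:
    with socket.create_server(("0.0.0.0", port)) as srv:
        conn, _ = srv.accept()
        with conn:
            frame_no = 0
            while frame := recv_exact(conn, FRAME_BYTES):
                features = reduce_frame(frame)
                # forward only the reduced record (frame number, feature count)
                print(struct.pack("!IQ", frame_no, features).hex())
                frame_no += 1

if __name__ == "__main__":
    serve()
```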