POSTED ON FEBRUARY 9, 2026 TO Data Center Engineering
Building Prometheus: How Backend Aggregation Enables Gigawatt-Scale AI Clusters

- We’re sharing details on the role that backend aggregation (BAG) plays in building gigawatt-scale AI clusters such as Prometheus.
- BAG seamlessly connects tens of thousands of GPUs across data centers and regions.
- Our BAG implementation bridges two distinct network fabrics: the Disaggregated Scheduled Fabric (DSF) and the Non-Scheduled Fabric (NSF).
Upon completion, our Prometheus AI cluster is expected to deliver 1 gigawatt of capacity, powering new and existing AI experiences across our products. Prometheus’ infrastructure spans multiple data center buildings within a single large region, interconnecting tens of thousands of GPUs.
Backend aggregation (BAG) is a crucial component for scaling and connecting this infrastructure: it links GPUs and data centers through robust, high-capacity networking. By combining modular hardware, advanced routing, and resilient topologies, BAG delivers performance and reliability at unprecedented scale.
As AI clusters continue to grow, we expect BAG to play a significant role in meeting future demands and driving innovation across our global network.
What Is Backend Aggregation?
BAG is a centralized, Ethernet-based super-spine network layer that interconnects multiple spine-layer fabrics across data centers and regions within large clusters. In Prometheus, for example, the BAG layer serves as the aggregation point between regional networks and the backbone, enabling the formation of mega AI clusters. BAG is engineered for substantial bandwidth, with inter-BAG capacities reaching the petabit range (e.g., 16-48 Pbps per region pair).
Backend aggregation (BAG) is employed to interconnect data center regions, allowing for the sharing of compute and other resources within large clusters.
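To put those figures in perspective, here is a quick back-of-envelope sketch translating the 16-48 Pbps range into counts of 800G links (the port speed of the BAG hardware described below); actual link counts depend on deployment details not covered here.

```python
# Back-of-envelope only: how many 800G links a petabit-range
# inter-BAG capacity implies. The 16-48 Pbps figures come from the
# text above; everything else is simple unit conversion.

LINK_SPEED_GBPS = 800  # one 800G port

for capacity_pbps in (16, 48):
    capacity_gbps = capacity_pbps * 1_000_000  # 1 Pbps = 1,000,000 Gbps
    links = capacity_gbps // LINK_SPEED_GBPS
    print(f"{capacity_pbps} Pbps ~ {links:,} x 800G links per region pair")

# 16 Pbps ~ 20,000 x 800G links per region pair
# 48 Pbps ~ 60,000 x 800G links per region pair
```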
How BAG Enables Gigawatt-Scale AI Clusters
To interconnect tens of thousands of GPUs, we deploy distributed BAG layers regionally.
Interconnecting BAG Layers
BAG layers are strategically distributed across regions to serve subsets of L2 fabrics within distance, buffer, and latency constraints. Inter-BAG connectivity uses either a planar (direct-match) or a spread connection topology, chosen based on site size and fiber availability.
- A planar topology connects BAG switches one-to-one between regions along the same plane, simplifying management but concentrating failure domains.
- A spread topology distributes each switch’s links across multiple remote BAG switches/planes, improving path diversity and resilience. (A short sketch contrasting the two patterns follows the figure below.)
An example of an inter-BAG network topology.
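To make the contrast concrete, here is a minimal Python sketch (not Meta tooling) that generates both wiring patterns; the plane count and per-plane link budget are hypothetical placeholders.

```python
# A minimal sketch contrasting the two inter-BAG wiring patterns.
# Switch counts and link budgets are hypothetical; real deployments
# size these from site scale and fiber availability.

def planar_links(num_planes: int, links_per_plane: int):
    """Planar: BAG switch i in region A pairs only with switch i in
    region B (same plane), so one remote switch loss removes all of a
    local switch's inter-region capacity."""
    return [
        (f"A-bag{p}", f"B-bag{p}")
        for p in range(num_planes)
        for _ in range(links_per_plane)
    ]

def spread_links(num_planes: int, links_per_plane: int):
    """Spread: each BAG switch in region A fans its links out across
    all remote planes, so one remote switch loss removes only
    1/num_planes of each local switch's capacity."""
    return [
        (f"A-bag{p}", f"B-bag{(p + i) % num_planes}")
        for p in range(num_planes)
        for i in range(links_per_plane)
    ]

if __name__ == "__main__":
    for name, fn in (("planar", planar_links), ("spread", spread_links)):
        links = fn(num_planes=4, links_per_plane=4)
        remote = {dst for src, dst in links if src == "A-bag0"}
        print(f"{name}: A-bag0 reaches {sorted(remote)}")
    # planar: A-bag0 reaches ['B-bag0']
    # spread: A-bag0 reaches ['B-bag0', 'B-bag1', 'B-bag2', 'B-bag3']
```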
Connecting a BAG Layer to L2 Fabrics
With inter-BAG connectivity covered, let’s look at how a BAG layer connects downstream to its L2 fabrics.
We have used two primary fabric technologies to build our L2 networks: the Disaggregated Scheduled Fabric (DSF) and the Non-Scheduled Fabric (NSF).
An example of DSF L2 zones across five data center buildings is shown below, connected to the BAG layer through a dedicated backend edge pod in each building.
A BAG inter-building connection for DSF fabric across five data centers.
An example of NSF L2 connected to BAG planes is provided below. Each BAG plane connects to corresponding Spine Training Switches (STSWs) from all spine planes, resulting in an effective oversubscription of 4.98:1.
A BAG inter-building connection for NSF fabric.
Careful management of oversubscription ratios helps balance scale and performance. Typical oversubscription from L2 to BAG is approximately 4.5:1, whereas BAG-to-BAG oversubscription differs according to regional requirements and link capacity.
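The arithmetic behind these ratios is straightforward; the sketch below illustrates it with hypothetical per-switch port counts that land on the ~4.5:1 figure.

```python
# A minimal sketch of the oversubscription arithmetic. The port
# counts are hypothetical; the ~4.5:1 L2-to-BAG target comes from
# the text above.

def oversubscription(down_ports: int, up_ports: int) -> float:
    """Ratio of downstream (fabric-facing) to upstream (BAG-facing)
    bandwidth; >1 means the uplinks are oversubscribed. With equal
    port speeds, the speed term cancels out."""
    return down_ports / up_ports

# e.g., 36 fabric-facing ports sharing 8 uplinks toward BAG
print(f"{oversubscription(36, 8):.1f}:1")  # prints 4.5:1
```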
Hardware and Routing
Our BAG implementation uses a modular chassis with Jericho3 (J3) ASIC line cards, offering up to 432x800G ports per chassis for high-capacity, scalable, and resilient interconnectivity. The central hub BAG uses a larger chassis to support its many spokes and long-distance links, accommodating varied cable lengths with buffers tuned accordingly.
Routing within BAG uses eBGP with the link-bandwidth attribute, enabling Unequal Cost Multipath (UCMP) for efficient load balancing and robust failure handling. BAG-to-BAG connections are encrypted with MACsec, in line with our network security requirements.
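As a simplified illustration of UCMP (assumed logic, not a router implementation): next-hop weights are derived from the bandwidths advertised via the BGP link-bandwidth extended community, so a next hop that has lost part of its link bundle attracts proportionally less traffic.

```python
# A minimal sketch of bandwidth-weighted UCMP. Next-hop names and
# bandwidths are hypothetical; real routers derive these values from
# the BGP link-bandwidth extended community.

from functools import reduce
from math import gcd

def ucmp_weights(advertised_bw_gbps: dict[str, int]) -> dict[str, int]:
    """Reduce per-next-hop bandwidths to the smallest integer weights
    that preserve their ratios; the forwarding plane then hashes flows
    across next hops in proportion to these weights."""
    g = reduce(gcd, advertised_bw_gbps.values())
    return {nh: bw // g for nh, bw in advertised_bw_gbps.items()}

# bag-b lost half of a 16x800G bundle, so it advertises 6.4T instead
# of 12.8T and should receive half the flows of its healthy peers.
paths = {"bag-a": 12_800, "bag-b": 6_400, "bag-c": 12_800}
print(ucmp_weights(paths))  # {'bag-a': 2, 'bag-b': 1, 'bag-c': 2}
```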
Designing the Network for Resilience
The network design specifies port striping, IP addressing schemes, and a comprehensive failure domain analysis to ensure high availability and minimize the blast radius of any single failure. We also employ strategies to mitigate blackholing risks, such as draining affected BAG planes and conditional route aggregation.
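As a rough illustration of the conditional-aggregation idea (assumed logic, not our production routing policy): a covering aggregate is advertised only while enough of its contributing more-specific routes remain, so a drained or degraded plane withdraws itself instead of blackholing traffic.

```python
# A minimal sketch of conditional route aggregation as a blackhole
# guard. The threshold is a hypothetical knob; production policies
# would be expressed in router configuration, not Python.

def should_advertise_aggregate(active_specifics: int,
                               expected_specifics: int,
                               min_fraction: float = 0.5) -> bool:
    """Keep the aggregate only while enough contributing routes are
    present; withdrawing it steers traffic to healthier planes."""
    return active_specifics >= expected_specifics * min_fraction

print(should_advertise_aggregate(30, 32))  # True: plane is healthy
print(should_advertise_aggregate(10, 32))  # False: withdraw aggregate
```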
Considerations for Long Cable Distances
A significant advantage of BAG’s distributed architecture is that it keeps the BAG layer close to the L2 edge, which is crucial for shallow-buffer NSF switches. The longer BAG-to-BAG cable distances, in turn, require deep-buffer switches in the BAG role, providing the headroom needed for lossless congestion-control protocols such as PFC.
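A simplified headroom calculation shows why distance drives the buffer requirement: a lossless queue must absorb at least the data in flight during one PFC pause round trip, which grows linearly with cable length. The constants and margins below are textbook approximations, not our exact sizing.

```python
# A simplified PFC headroom estimate per lossless queue. Fiber
# propagation is ~5 us per km; the margins for PFC reaction time and
# in-flight frames are rough assumptions.

def pfc_headroom_bytes(cable_km: float, link_gbps: int = 800,
                       mtu_bytes: int = 9216, margin_frames: int = 4) -> int:
    """Bytes that keep arriving after a PAUSE is sent: one round trip
    of propagation at line rate, plus a few MTUs of slack."""
    rtt_s = 2 * cable_km * 5e-6                   # ~5 us/km, both ways
    inflight = (link_gbps * 1e9 / 8) * rtt_s      # bytes on the wire
    return int(inflight + margin_frames * mtu_bytes)

for km in (0.5, 10, 80):
    print(f"{km} km -> {pfc_headroom_bytes(km) / 1e6:.2f} MB")
# 0.5 km -> 0.54 MB   (fits shallow-buffer NSF switches)
# 10 km  -> 10.04 MB
# 80 km  -> 80.04 MB  (calls for deep-buffer BAG switches)
```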
Building Prometheus and Beyond
As a technology, BAG plays an important role in the next generation of AI infrastructure. By centralizing the interconnection of regional networks, BAG makes the gigawatt-scale Prometheus cluster possible, delivering seamless, high-capacity networking across tens of thousands of GPUs. With its modular hardware and resilient topologies, this design positions BAG not only to meet Prometheus’ demands, but also to drive innovation and scale across our global AI network for years to come.

