#networking traffic
7 posts
24 Feb
We are open-sourcing the initial version of RCCLX – an enhanced version of RCCL that we developed and tested on Meta’s internal workloads. RCCLX is fully integrated with Torchcomms and aims to empower researchers and developers to accelerate innovation, regardless of their chosen backend. Communication patterns for AI models are constantly evolving, as are hardware [...]
20 Oct 2025
Disaggregated Scheduled Fabric (DSF) is Meta’s next-generation network fabric technology for AI training networks that addresses the challenges of existing Clos-based networks. We’re sharing the challenges and innovations surrounding DSF and discussing future directions, including the creation of mega clusters through DSF and non-DSF region interconnectivity, as well as the exploration of alternative switching technologies. [...]
14 Oct 2025
At the Open Compute Project (OCP) Summit 2025, we’re sharing details about the direction of next-generation network fabrics for our AI training clusters. We’ve expanded our network hardware portfolio and are contributing new disaggregated network platforms to OCP. We look forward to continued collaboration with OCP to open designs for racks, servers, storage boxes, and motherboards [...]
26 Sept 2025
AI is everywhere and, as network engineers, we are right in the thick of it: building the network infrastructure for AI. This year, at our largest @Scale:Networking ever, engineers from Meta, ByteDance, Google, Microsoft, Oracle, AMD, Broadcom, Cisco, and NVIDIA came together to share our latest experiences in architecting, designing, operating, and debugging our AI [...]
1 May 2025
Meta develops infrastructure all across the globe to transport information and content for the billions of people using our services around the world. At the core of this infrastructure are aggregation points – like data centers – and the digital cables that connect them. Subsea cables – the unseen digital highways of the internet – [...]
14 Feb 2025
Today, we’re announcing our most ambitious subsea cable endeavor yet: Project Waterworth. Once complete, the project will reach five major continents and span over 50,000 km (longer than the Earth’s circumference), making it the world’s longest subsea cable project using the highest-capacity technology available. Project Waterworth will bring industry-leading connectivity to the U.S., India, Brazil, [...]
3 Feb 2025
We’ve previously described why we think it’s time to leave the leap second in the past. In today’s rapidly evolving digital landscape, introducing new leap seconds to account for the long-term slowdown of the Earth’s rotation is a risky practice that, frankly, does more harm than good. This is particularly true in the data center [...]