Simulating CXL disaggregated memory at scale with fabric-level replication
Author(s)
Shan, Kevin Joshua
Abstract
Compute Express Link (CXL) enables low-latency, coherent access to disaggregated memory, allowing pooled memory to be shared across multiple hosts with load/store semantics. This blurs the boundary between scale-up shared memory and scale-out isolation guarantees. It also introduces new challenges in modeling performance, contention, and data reliability at rack scale, because a data-side memory failure can affect multiple compute nodes simultaneously. Existing CXL simulators primarily target single-host systems and cannot capture multi-host fabrics, shared-memory behavior, or data-side fault tolerance.
This thesis presents a scalable, trace-driven simulation framework for multi-host CXL pooled-memory systems, built on the Structural Simulation Toolkit (SST) and ChampSim. The framework models host microarchitecture, fabric-level interconnects, switches, and pooled memory as modular components, enabling detailed evaluation of latency, bandwidth contention, and routing under configurable topologies. Using this simulator, we model a multi-node switched pooling system under a variety of workloads and characterize the behavior of scale-out CXL systems with and without replication. Our experiments show that replication provides limited benefit for store-heavy workloads and a growing performance benefit as workloads become more read-heavy, so the optimal operating region for replication shifts toward read-dominated traffic. The simulator artifact is available at https://github.com/albertycho/cscore_sstelem.
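The read/write asymmetry described above can be illustrated with a back-of-the-envelope model. The sketch below is not the thesis's simulator and is not taken from its artifact. It assumes a hypothetical read-any/write-all replication scheme, in which a load is served by any one replica and a store must update every replica, and it ignores fabric latency, switch contention, and coherence traffic. The function names `per_device_load` and `replication_speedup` are illustrative only.

```python
def per_device_load(read_fraction: float, replicas: int) -> float:
    """Relative request load seen by each memory device.

    Assumption (not from the thesis): reads are spread evenly over the
    replicas, while every write is applied to all replicas.
    """
    if not 0.0 <= read_fraction <= 1.0:
        raise ValueError("read_fraction must be in [0, 1]")
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    write_fraction = 1.0 - read_fraction
    return read_fraction / replicas + write_fraction


def replication_speedup(read_fraction: float, replicas: int) -> float:
    """Idealized throughput gain over an unreplicated pool (replicas = 1)."""
    baseline = per_device_load(read_fraction, 1)
    return baseline / per_device_load(read_fraction, replicas)


if __name__ == "__main__":
    # Sweep from store-heavy to read-heavy traffic with two replicas.
    for read_fraction in (0.1, 0.5, 0.9):
        gain = replication_speedup(read_fraction, replicas=2)
        print(f"read fraction {read_fraction:.1f}: idealized gain {gain:.2f}x")
```

Under these assumptions the idealized gain is 1 / (r/R + (1 - r)) for read fraction r and R replicas, so it stays near 1 for store-heavy traffic and approaches R only as traffic becomes read-dominated. This matches the qualitative trend reported in the abstract but says nothing about the measured magnitudes.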
Date
2026-05
Resource Type
Text
Resource Subtype
Undergraduate Thesis