Remove libcudf dependency from cuCascade
Summary
cuCascade should be a generic tiered GPU memory/data manager. Today, some components of the library are tightly coupled to cudf, which introduces a dependency that many consumers do not want. The cudf-specific concrete representations (gpu_table_representation, host_data_representation, host_data_packed_representation) are, for the moment, specific to Sirius and should be moved there. cuCascade's public API also exposes cudf::table, cudf::table_view, and cudf::type_id directly, and links cudf::cudf as a PUBLIC dependency, so removing the cudf-specific types also means removing these cudf types from the public API. If a common use case for cudf-aware representations emerges in other projects, they can be added back as separate plugins without forcing every cuCascade consumer to depend on cudf.
Motivation
- Build isolation: cuCascade consumers should only need RMM + CUDA, not all of libcudf
- Clean architecture: cuCascade's abstractions (idata_representation, converter registry, data_batch, disk I/O backends) are already generic
- Separation of concerns: The 1800-line representation_converter.cpp understands cudf's nested column model (STRING/LIST/STRUCT/DICTIONARY reconstruction); this is Sirius domain logic, not generic memory management
Note that RMM stays as a dependency. rmm::mr::device_memory_resource and rmm::cuda_stream_view are core parts of cuCascade's memory management abstractions.
Current State
What's already generic (stays in cuCascade)
| Component | Location | Notes |
|---|---|---|
| idata_representation interface | include/cucascade/data/common.hpp | Abstract base, no cudf types |
| representation_converter_registry | include/cucascade/data/representation_converter.hpp | Type-erased plugin registry |
| data_batch / read_only_data_batch / mutable_data_batch | include/cucascade/data/data_batch.hpp | Generic batch lifecycle |
| data_repository / data_repository_manager | include/cucascade/data/data_repository.hpp | Generic collections |
| idisk_io_backend + pipeline backend | include/cucascade/data/disk_io_backend.hpp, src/data/pipeline_io_backend.cpp | Generic disk I/O (CUDA runtime + POSIX O_DIRECT) |
| disk_data_representation | include/cucascade/data/disk_data_representation.hpp | No cudf types in header |
| disk_file_format.hpp | include/cucascade/data/disk_file_format.hpp | Only defines alignment |
| Entire memory subsystem | include/cucascade/memory/ (except host_table.hpp and host_table_packed.hpp) | RMM-only |
| Topology discovery | include/cucascade/memory/topology_discovery.hpp | Neither RMM nor cudf |
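To illustrate why a type-erased registry needs no cudf types, here is a minimal C++ sketch. It is not cuCascade's actual API: converter_registry, converter_fn, size_in_bytes, app_device_table, and app_host_table are hypothetical stand-ins, and the real idata_representation and representation_converter_fn very likely carry more members and parameters (streams, memory resources). The only point is that the registry is keyed on runtime type and speaks purely in terms of the abstract base, so concrete representations can live in the consumer.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <memory>
#include <stdexcept>
#include <typeindex>
#include <typeinfo>
#include <utility>

// Stand-in for cuCascade's abstract base (assumption: the real one has more members).
struct idata_representation {
  virtual ~idata_representation() = default;
  virtual std::size_t size_in_bytes() const = 0;
};

// Hypothetical converter signature: only the abstract base appears in it.
using converter_fn =
  std::function<std::unique_ptr<idata_representation>(const idata_representation&)>;

// Hypothetical registry keyed by (source type, destination type).
class converter_registry {
 public:
  template <typename Src, typename Dst>
  void register_converter(converter_fn fn) {
    table_[key_t{std::type_index(typeid(Src)), std::type_index(typeid(Dst))}] = std::move(fn);
  }

  // Looks up the converter using the dynamic type of src.
  template <typename Dst>
  std::unique_ptr<idata_representation> convert(const idata_representation& src) const {
    auto it = table_.find(key_t{std::type_index(typeid(src)), std::type_index(typeid(Dst))});
    if (it == table_.end()) { throw std::runtime_error("no converter registered"); }
    return it->second(src);
  }

 private:
  using key_t = std::pair<std::type_index, std::type_index>;
  std::map<key_t, converter_fn> table_;
};

// Consumer-defined representations; a project like Sirius would wrap cudf types here.
struct app_device_table final : idata_representation {
  explicit app_device_table(std::size_t bytes) : bytes_(bytes) {}
  std::size_t size_in_bytes() const override { return bytes_; }
 private:
  std::size_t bytes_;
};

struct app_host_table final : idata_representation {
  explicit app_host_table(std::size_t bytes) : bytes_(bytes) {}
  std::size_t size_in_bytes() const override { return bytes_; }
 private:
  std::size_t bytes_;
};

// Usage: register once at startup, then convert through the base interface.
inline std::size_t example_usage() {
  converter_registry registry;
  registry.register_converter<app_device_table, app_host_table>(
    [](const idata_representation& src) -> std::unique_ptr<idata_representation> {
      return std::make_unique<app_host_table>(src.size_in_bytes());
    });
  app_device_table on_gpu(1024);
  const idata_representation& erased = on_gpu;
  auto on_host = registry.convert<app_host_table>(erased);
  assert(on_host != nullptr);
  return on_host->size_in_bytes();
}
```

In this arrangement the library header includes nothing beyond the standard library, and the consumer decides which concrete types and conversions exist.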
What must be extracted (moves to Sirius)
| Component | Location | cudf Symbols Used |
|---|---|---|
| gpu_table_representation | include/cucascade/data/gpu_data_representation.hpp | cudf::table, cudf::table_view |
| host_data_representation | include/cucascade/data/cpu_data_representation.hpp | Semantically cudf column-tree oriented |
| host_data_packed_representation | include/cucascade/data/cpu_data_representation.hpp | Based on cudf::pack output |
| host_table.hpp (column_metadata) | include/cucascade/memory/host_table.hpp | cudf::type_id, cudf::size_type |
| host_table_packed.hpp | include/cucascade/memory/host_table_packed.hpp | Semantically tied to cudf::pack |
| Built-in converters | src/data/representation_converter.cpp | Deep cudf: pack/unpack, column reconstruction, type dispatch, STRING/LIST/STRUCT/DICTIONARY handling |
| register_builtin_converters() | src/data/representation_converter.cpp | Registers all cudf-specific conversions |
| gpu_data_representation.cpp | src/data/gpu_data_representation.cpp | cudf::table, cudf::copying |
| bandwidth_profiler.cpp | src/data/bandwidth_profiler.cpp | Uses cudf to generate synthetic tables |
CMake coupling (must change)
These requirements should be removed from cuCascade's CMakeLists.txt and config:
# CMakeLists.txt:64 - currently REQUIRED
find_package(cudf REQUIRED CONFIG)
# CMakeLists.txt:153 - currently PUBLIC link
set(CUCASCADE_PUBLIC_LINK_LIBS rmm::rmm cudf::cudf CUDA::cudart_static Threads::Threads ${NUMA_LIB})
# cmake/cuCascadeConfig.cmake.in:9 - propagates to consumers
find_dependency(cudf REQUIRED CONFIG)
Sirius Breakage Surface
Sirius consumes cuCascade as a git submodule (add_subdirectory(cucascade ...)). Sirius must be updated to include the necessary types in its own source tree before cuCascade can remove them.
End State
After all phases:
- cuCascade depends on: RMM, CUDA runtime, NVML, libnuma
- cuCascade does NOT depend on: libcudf
- cuCascade provides: tiered memory management, generic data batch lifecycle, disk I/O backends, converter registry infrastructure
- Sirius provides: cudf-specific representations, cudf-specific converters, all cudf domain logic
- Sirius depends on: cuCascade (for infrastructure) + libcudf (for its own data types)
Notes for Implementation
- The disk_data_representation header is already generic and stays, but the DISK converters that serialize cudf column trees to disk format move to Sirius
- The disk file format itself (disk_file_header, 4KB alignment) stays in cuCascade; it's not cudf-specific
- disk_data_representation's destructor (RAII file deletion) stays in cuCascade
- The representation_converter_fn signature uses idata_representation; it doesn't need cudf types
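The two disk-related pieces that stay in cuCascade (4KB alignment and RAII file deletion) can be sketched without any cudf or RMM types. This is a hedged C++17 illustration, not the actual disk_data_representation or disk_file_header code: scoped_disk_file, align_up, kDiskAlignment, and file_removed_after_scope are hypothetical names introduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <filesystem>
#include <fstream>
#include <system_error>
#include <utility>

// Assumed alignment constant, matching the 4KB figure quoted in the notes above.
constexpr std::size_t kDiskAlignment = 4096;

// Rounds n up to the next multiple of alignment (0 stays 0).
constexpr std::size_t align_up(std::size_t n, std::size_t alignment = kDiskAlignment) {
  return (n + alignment - 1) / alignment * alignment;
}

// Hypothetical RAII owner of a spill file: the file is deleted when the owner dies.
class scoped_disk_file {
 public:
  explicit scoped_disk_file(std::filesystem::path p) : path_(std::move(p)) {}
  scoped_disk_file(const scoped_disk_file&)            = delete;
  scoped_disk_file& operator=(const scoped_disk_file&) = delete;
  scoped_disk_file(scoped_disk_file&& other) noexcept : path_(std::move(other.path_)) {
    other.path_.clear();  // the moved-from object must not delete the file
  }
  ~scoped_disk_file() {
    if (path_.empty()) { return; }
    std::error_code ec;
    std::filesystem::remove(path_, ec);  // error_code overload does not throw
  }
  const std::filesystem::path& path() const noexcept { return path_; }

 private:
  std::filesystem::path path_;
};

// Usage: the file exists inside the scope and is gone once the guard is destroyed.
inline bool file_removed_after_scope(const std::filesystem::path& p) {
  {
    std::ofstream(p) << "spilled bytes";
    scoped_disk_file guard(p);
    assert(std::filesystem::exists(guard.path()));
  }
  return !std::filesystem::exists(p);
}
```

Nothing here knows what the bytes represent, which is the reason these parts can remain generic while the converters that lay out cudf column trees move to Sirius.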
Acceptance Criteria
- find_package(cudf) does not appear in cuCascade's CMakeLists.txt
- cudf::cudf does not appear in any link target
- No #include <cudf/...> in any cuCascade source or header file
- pixi run cmake-release && pixi run build-release succeeds without cudf installed
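The include criterion can be checked mechanically. In practice a grep over the tree would do; the C++17 sketch below expresses the same check in the project's language. is_cudf_include and find_cudf_includes are hypothetical helpers, not part of cuCascade, and the pattern only catches include directives that start a line.

```cpp
#include <cstddef>
#include <filesystem>
#include <fstream>
#include <regex>
#include <string>
#include <vector>

// True if the line is an include directive naming a header under cudf/.
inline bool is_cudf_include(const std::string& line) {
  static const std::regex pattern(R"(^\s*#\s*include\s*[<"]cudf/)");
  return std::regex_search(line, pattern);
}

// Returns "path:line" for every cudf include found in regular files under root.
inline std::vector<std::string> find_cudf_includes(const std::filesystem::path& root) {
  std::vector<std::string> hits;
  for (const auto& entry : std::filesystem::recursive_directory_iterator(root)) {
    if (!entry.is_regular_file()) { continue; }
    std::ifstream in(entry.path());
    std::string line;
    for (std::size_t number = 1; std::getline(in, line); ++number) {
      if (is_cudf_include(line)) {
        hits.push_back(entry.path().string() + ":" + std::to_string(number));
      }
    }
  }
  return hits;
}
```

Running find_cudf_includes over include/ and src/ and requiring an empty result would cover the third criterion; the build criterion still needs an environment without cudf installed.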