Add cgroup-cpuset-aware worker pinning to dbtest (--cgroup-aware) #693

Open
yans3meta wants to merge 1 commit into facebookresearch:v2-beta from yans3meta:export-D109870007-to-v2-beta

Conversation

@yans3meta

Summary:
Silo's dbtest derives each worker's NUMA node from its logical core id
(effectively always node 0) and pins via numa_run_on_node(), so under a cgroup
v2 cpuset that excludes node 0 it aborts. This adds an opt-in --cgroup-aware
runtime flag and drops the explicit NUMA memory hint so this benchpress package
can run inside an off-node cpuset. Default behavior (no flag) is unchanged from
stock Silo.

The C++ change ships as a build-time patch
(packages/silo/patches/silo-cgroup-aware-no-numa.patch) applied during install
against the pinned upstream commit:

  • allocator.cc: remove numa_hint_memory_placement()
    (mbind/numa_interleave_memory); the DB pool is placed by first-touch.
  • core.{cc,h}: read the process CPU-affinity mask once via sched_getaffinity
    (reflects cgroup cpuset / numactl / taskset); add phys_cpu(),
    num_allowed_cpus(), and a cgroup_aware() runtime toggle.
  • rcu.cc: when cgroup-aware, sched_setaffinity each worker to exactly its
    granted physical CPU (memory follows by first-touch); otherwise stock
    numa_run_on_node().
  • bench.{cc,h}, dbtest.cc, ycsb.cc: plumb the --cgroup-aware flag; size loaders
    and workers off num_allowed_cpus() when set; require num-threads <= allowed
    CPUs under strict pinning; log the worker pin map.
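
The affinity-mask handling described in the list above can be sketched roughly as follows. This is an illustrative, Linux-only sketch and not the code in the patch: the helper names `allowed_cpus`, `phys_cpu_for`, and `pin_to_cpu` are hypothetical stand-ins for the patch's `phys_cpu()` / `num_allowed_cpus()` plumbing, and the sketch assumes the host has no more than `CPU_SETSIZE` CPUs (a fixed-size `cpu_set_t`).

```cpp
// Linux-only sketch; g++ defines _GNU_SOURCE by default, the guard covers other setups.
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <sched.h>

#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: CPUs the calling thread may run on, in ascending order.
// The kernel-reported mask already reflects cgroup cpuset, numactl, and taskset limits.
std::vector<int> allowed_cpus() {
  cpu_set_t mask;
  CPU_ZERO(&mask);
  // pid 0 selects the calling thread.
  if (sched_getaffinity(0, sizeof(mask), &mask) != 0)
    throw std::runtime_error("sched_getaffinity failed");
  std::vector<int> cpus;
  for (int cpu = 0; cpu < CPU_SETSIZE; ++cpu)
    if (CPU_ISSET(cpu, &mask)) cpus.push_back(cpu);
  return cpus;
}

// Hypothetical helper: map a logical worker index to a granted physical CPU.
// Throws when there are more workers than allowed CPUs (the strict-pinning check).
int phys_cpu_for(const std::vector<int>& cpus, std::size_t worker) {
  if (worker >= cpus.size())
    throw std::out_of_range("more workers than allowed CPUs");
  return cpus[worker];
}

// Hypothetical helper: pin the calling thread to exactly one CPU.
// Returns false if the kernel rejects the request (e.g. CPU outside the cpuset).
bool pin_to_cpu(int cpu) {
  assert(cpu >= 0 && cpu < CPU_SETSIZE);
  cpu_set_t one;
  CPU_ZERO(&one);
  CPU_SET(cpu, &one);
  return sched_setaffinity(0, sizeof(one), &one) == 0;
}
```

In this sketch a worker thread would call `pin_to_cpu(phys_cpu_for(cpus, worker_id))` on startup, with `cpus` captured once before any thread is pinned. Reading the mask once matters because a pinned thread sees only its own single-CPU mask afterwards.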

Package wiring:

  • install_silo.sh: git apply --check then git apply the patch after submodule init.
  • README.md: document the no-NUMA-mempolicy difference and the --cgroup-aware flag.

Differential Revision: D109870007

meta-cla Bot added the CLA Signed label Jun 26, 2026
@meta-codesync

meta-codesync Bot commented Jun 26, 2026

@yans3meta has exported this pull request. If you are a Meta employee, you can view the originating Diff in D109870007.

meta-codesync Bot pushed a commit that referenced this pull request Jun 26, 2026
Summary:
Pull Request resolved: #693

Reviewed By: excelle08

Differential Revision: D109870007

fbshipit-source-id: 64fd7be4a86a414dfb172408b138f7eec868b2f8

Labels

CLA Signed, meta-exported
