Add cgroup-cpuset-aware worker pinning to dbtest (--cgroup-aware) #693
Open
yans3meta wants to merge 1 commit into
Summary:
Silo's dbtest derives each worker's NUMA node from its *logical* core id
(effectively always node 0) and pins via numa_run_on_node(), so under a cgroup
v2 cpuset that excludes node 0 it aborts. This adds an opt-in --cgroup-aware
runtime flag and drops the explicit NUMA memory hint so this benchpress package
can run inside an off-node cpuset. Default behavior (no flag) is unchanged from
stock Silo.
The C++ change ships as a build-time patch
(packages/silo/patches/silo-cgroup-aware-no-numa.patch) applied during install
against the pinned upstream commit:
- allocator.cc: remove numa_hint_memory_placement()
  (mbind/numa_interleave_memory); the DB pool is placed by first-touch.
- core.{cc,h}: read the process CPU-affinity mask once via sched_getaffinity
  (reflects cgroup cpuset / numactl / taskset); add phys_cpu(),
  num_allowed_cpus(), and a cgroup_aware() runtime toggle.
- rcu.cc: when cgroup-aware, sched_setaffinity each worker to exactly its
  granted physical CPU (memory follows by first-touch); otherwise stock
  numa_run_on_node().
- bench.{cc,h}, dbtest.cc, ycsb.cc: plumb the --cgroup-aware flag; size loaders
  and workers off num_allowed_cpus() when set; require num-threads <= allowed
  CPUs under strict pinning; log the worker pin map.
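
The affinity-mask approach in the list above can be sketched roughly as follows. This is a minimal Linux-only illustration, not the code in the patch: `read_allowed_cpus`, `phys_cpu_for`, and `pin_current_thread` are hypothetical names, and error handling is reduced to exceptions.

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  // needed for the CPU_* macros and sched_*affinity on glibc
#endif
#include <sched.h>

#include <cstddef>
#include <stdexcept>
#include <vector>

// Read the calling thread's CPU-affinity mask and return the allowed CPU ids
// in ascending order. The mask already reflects cgroup cpuset, numactl, and
// taskset restrictions.
std::vector<int> read_allowed_cpus() {
    cpu_set_t mask;
    CPU_ZERO(&mask);
    if (sched_getaffinity(0, sizeof(mask), &mask) != 0)
        throw std::runtime_error("sched_getaffinity failed");
    std::vector<int> cpus;
    for (int cpu = 0; cpu < CPU_SETSIZE; ++cpu)
        if (CPU_ISSET(cpu, &mask)) cpus.push_back(cpu);
    return cpus;
}

// Map a logical worker index to the physical CPU it is granted.
// Throws if there are more workers than allowed CPUs (strict pinning).
int phys_cpu_for(std::size_t logical, const std::vector<int>& allowed) {
    if (logical >= allowed.size())
        throw std::out_of_range("more workers than allowed CPUs");
    return allowed[logical];
}

// Pin the calling thread to exactly one CPU. No memory policy is set here;
// pages the thread touches first are placed by the kernel's default policy.
void pin_current_thread(int cpu) {
    cpu_set_t one;
    CPU_ZERO(&one);
    CPU_SET(cpu, &one);
    if (sched_setaffinity(0, sizeof(one), &one) != 0)
        throw std::runtime_error("sched_setaffinity failed");
}
```

In this sketch the mask is read once at startup, before any thread narrows its own affinity, and each worker then calls `pin_current_thread(phys_cpu_for(worker_id, allowed))`. Reading the mask after a thread has pinned itself would return only that single CPU.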
Package wiring:
- install_silo.sh: git apply --check then git apply the patch after submodule init.
- README.md: document the no-NUMA-mempolicy difference and the --cgroup-aware flag.
Differential Revision: D109870007
@yans3meta has exported this pull request. If you are a Meta employee, you can view the originating Diff in D109870007.
meta-codesync Bot pushed a commit that referenced this pull request Jun 26, 2026

Summary:
Pull Request resolved: #693

Silo's dbtest derives each worker's NUMA node from its *logical* core id
(effectively always node 0) and pins via numa_run_on_node(), so under a cgroup
v2 cpuset that excludes node 0 it aborts. This adds an opt-in --cgroup-aware
runtime flag and drops the explicit NUMA memory hint so this benchpress package
can run inside an off-node cpuset. Default behavior (no flag) is unchanged from
stock Silo.

The C++ change ships as a build-time patch
(packages/silo/patches/silo-cgroup-aware-no-numa.patch) applied during install
against the pinned upstream commit:
- allocator.cc: remove numa_hint_memory_placement()
  (mbind/numa_interleave_memory); the DB pool is placed by first-touch.
- core.{cc,h}: read the process CPU-affinity mask once via sched_getaffinity
  (reflects cgroup cpuset / numactl / taskset); add phys_cpu(),
  num_allowed_cpus(), and a cgroup_aware() runtime toggle.
- rcu.cc: when cgroup-aware, sched_setaffinity each worker to exactly its
  granted physical CPU (memory follows by first-touch); otherwise stock
  numa_run_on_node().
- bench.{cc,h}, dbtest.cc, ycsb.cc: plumb the --cgroup-aware flag; size loaders
  and workers off num_allowed_cpus() when set; require num-threads <= allowed
  CPUs under strict pinning; log the worker pin map.

Package wiring:
- install_silo.sh: git apply --check then git apply the patch after submodule init.
- README.md: document the no-NUMA-mempolicy difference and the --cgroup-aware flag.

Reviewed By: excelle08
Differential Revision: D109870007
fbshipit-source-id: 64fd7be4a86a414dfb172408b138f7eec868b2f8