Skip to content

perf!: add explicit layout of externally provided ids#224

Merged
baszalmstra merged 10 commits into
prefix-dev:mainfrom
dralley:optimizations2
Jun 2, 2026
Merged

perf!: add explicit layout of externally provided ids#224
baszalmstra merged 10 commits into
prefix-dev:mainfrom
dralley:optimizations2

Conversation

@dralley
Copy link
Copy Markdown
Contributor

@dralley dralley commented May 20, 2026

When enabled, stores solvable_to_variable as a Vec in VariableMap.

Both SolvableId and VariableId are sequential arena IDs (u32), so HashMap lookups can be replaced with direct Vec indexing. This eliminates hashing overhead in intern_solvable and origin() lookups, and removes the HashMap insert/rehash costs.

This shouldn't necessarily be the default, because solvable ID assignment is controlled by the user and is not guaranteed to be dense.

Comment thread src/solver/variable_map.rs
@dralley
Copy link
Copy Markdown
Contributor Author

dralley commented May 20, 2026

I found this to be beneficial to performance, since it cuts out the hashing / rehashing overhead.

The code I'm working with creates a new solvable for every package in an RPM repository, so solvables are assigned sequentially, and the RPM ecosystem has a handful of use cases that would necessarily touch a large fraction of the solvables in the pool, e.g. full-distro upgrade or repository dependency-closed-ness testing. So in those cases I would imagine this would reduce churn even more significantly than the few simple test cases I've done where it was ~5% overhead.

It's hard for me to judge just how beneficial or what a representative test case looks like at the moment though, too much is in flux.

@baszalmstra
Copy link
Copy Markdown
Contributor

I can run a benchmark tomorrow.

I prefer to use a marker type on the dependency provider though. features are transitively enabled which means that if your implementation doesnt enable the feature but a transitive dependency does, the feature is also enabled for your implementation. That would cause serious issues.

I can also give that a go if its unclear.

@dralley
Copy link
Copy Markdown
Contributor Author

dralley commented May 20, 2026

Alright. I hope to get patches posted for resolvo-rpm soon, I'll submit PRs upstream for review.

@baszalmstra
Copy link
Copy Markdown
Contributor

baszalmstra commented May 20, 2026

I benchmarked your code, and it shows a clear, substantial win of ~12%, and no regressions:

metric baseline (HashMap) dense (Vec) speedup
total wall-clock 24.4m 22.6m 1.082× (1.8m saved)
mean / problem 1465.6ms 1355.0ms 1.082×
median / problem 678.2ms 584.0ms 1.161×
p75 1.44s 1.24s 1.160×
p90 2.07s 1.86s 1.112×
p99 14.9s 14.3s 1.042×
per-problem median ratio 1.122×

Would you be OK if I pushed some changes to make this a property of the DependencyProvider instead of as a cargo feature?

Something along the lines of:

impl DependencyProvider for MyProvider {
    type SolvableStorage = resolvo::solvable_ids::Sparse;   // or ::Dense
    // ...
}

Or do you want to tackle this yourself?

@dralley
Copy link
Copy Markdown
Contributor Author

dralley commented May 20, 2026

@baszalmstra I just pushed, let me know if this is what you intended. I can squash the commits once you're confident in the change.

Claude helped with a first pass, which looks mostly OK.. I tweaked a few things, and a couple of types that were previously declared pub in pub(crate) modules became exposed and so the visibility needed to be changed.. not sure if that needs to be tweaked a bit further.. I need to log off though.

@dralley
Copy link
Copy Markdown
Contributor Author

dralley commented May 20, 2026

I had started before your comment, so here's another commit that uses your suggested naming

I benchmarked your code, and it shows a clear, substantial win of ~12%, and no regressions

That's good to hear, it would be interesting to figure out exactly where the boundary is w/r/t the conditions in which one or the other strategy is better. With a sufficiently small pool, it's likely that Vec storage is nearly always superior, but I'm sure there would be a point at which that is no longer the case.

It's a memory / compute tradeoff as well I guess, so TBH it could be faster in nearly all scenarios while still not being the appropriate solution in some. But I don't really have an intuition for where that line would be since I just started working on this stuff.

@dralley
Copy link
Copy Markdown
Contributor Author

dralley commented May 21, 2026

What's your benchmarking harness?

And do you think it would be worthwhile to have micro benchmarks that could be added for e.g. all the various kinds of interning, all the various kinds of resolve-ing, etc.? Or do you have a preference for holistic test cases?

@baszalmstra
Copy link
Copy Markdown
Contributor

Claude helped with a first pass, which looks mostly OK.. I tweaked a few things, and a couple of types that were previously declared pub in pub(crate) modules became exposed and so the visibility needed to be changed.. not sure if that needs to be tweaked a bit further.. I need to log off though.

Yeah, this is almost what I had in mind! Except that I dont want to expose the internals to the public API in any way. Do you mind if I amend you commit with what I had in mind?

What's your benchmarking harness?

The repository contains a benchmark tool in tools/solve-snapshot. It requires an ecosystem specific snapshot. You can create one from any implementation of a DependencyProvider using utilities in the snapshot module. I have one created for the conda ecosystem generated by this tool that I refresh from time-to-time. I'm not adding it to the repository because it's several hundred MB.

Using the snapshot the benchmark tool creates a (deterministic) random combination of inputs, solves them, and outputs the result to a CSV file. There is a jupyter notebook in the solve-snapshot directory that then produces some statistics and charts. Usually, I also ask claude to give me more insights about why certain cases regressed (if they did).

Depending on your machine, it takes a little while to run the entire benchmark.

I'm aware that this is not a perfect solution because it's not actually based on real-world cases, but I have found that it comes close enough. That being said, I do not yet have a proper snapshot that also includes loads of conditional dependencies (support for that only just landed in the conda world), so there might still be rough edges there.

And do you think it would be worthwhile to have micro benchmarks that could be added for e.g. all the various kinds of interning, all the various kinds of resolve-ing, etc.? Or do you have a preference for holistic test cases?

This is what we previously did, and it was not great. The reason is that we optimized for a relatively small subset of specific use cases (20ish). Even when we did optimize those usecases, that often came at a cost of some other scenario for which we didnt have a benchmark. Using this random approach has worked much better. It flagged many optimizations that on the surface looked great, but then seriously regressed 2% of other cases.

But probably having both would be better. I think the best solution would be to have a large set of real-world use cases that also contains a large variety. And maybe combine that with targeted and random cases.

But accompanying niche optimizations with micro benchmarks is always appreciated.

@dralley
Copy link
Copy Markdown
Contributor Author

dralley commented May 21, 2026

Yeah, this is almost what I had in mind! Except that I dont want to expose the internals to the public API in any way. Do you mind if I amend you commit with what I had in mind?

Go for it. Which naming pattern did you prefer? I can squash or delete the 3rd commit

The repository contains a benchmark tool in tools/solve-snapshot. It requires an ecosystem specific snapshot. You can create one from any implementation of a DependencyProvider using utilities in the snapshot module. I have one created for the conda ecosystem generated by this tool that I refresh from time-to-time. I'm not adding it to the repository because it's several hundred MB.

Cool. Well, when I'm a bit further along with resolvo-rpm I'll try that out. I'm currently redesigning rpmrepo_metadata to expose a visitor API and get rid of all unnecessary allocations, just intern directly into the pool.

@dralley
Copy link
Copy Markdown
Contributor Author

dralley commented May 21, 2026

Should I go ahead and rebase or is that going to cause problems (if you've already started)

@baszalmstra
Copy link
Copy Markdown
Contributor

Im almost done with the changes :)

@dralley
Copy link
Copy Markdown
Contributor Author

dralley commented May 21, 2026

(rebased) edit: shit, comment didn't load

whatever lol, force-push if you need to

@dralley
Copy link
Copy Markdown
Contributor Author

dralley commented May 21, 2026

Would you like me to squash this?

@baszalmstra
Copy link
Copy Markdown
Contributor

So it turns out that we were quietly assuming the dense layout in a few places. I have now replaced all those instances. Please let me know what you think of bf396ec !

Ill now run a new benchmark and toggling both options.

Would you like me to squash this?

We can squash when the PR is merged.

Comment thread src/internal/id.rs
impl ArenaId for DependenciesId {
fn from_usize(x: usize) -> Self {
Self(x as u32)
let raw: u32 = (x + 1).try_into().expect("dependencies id too big");
Copy link
Copy Markdown
Contributor Author

@dralley dralley May 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For clarity's sake it might be better to assign a const ROOT_ID: u32 = 1, and structure this as ROOT_ID + x

Then do the same below

i.e. VariableId::root() would be defined as Self( unsafe { NonZeroU32::new_unchecked(ROOT_ID) })

The documentation makes it clear enough already, but this would be better IMO

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@baszalmstra Still applies

Comment thread src/internal/id.rs Outdated
/// Returns `true` if this variable represents the root.
pub fn is_root(self) -> bool {
self.0 == 0
self.0.get() == 1
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same

Comment thread src/internal/id.rs Outdated
/// A unique identifier for a variable in the solver.
///
/// Uses a non-zero representation so that `Option<VariableId>` is the same
/// size as `VariableId`.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I presume this makes the "dense" vec even denser? How was the performance?

Comment thread src/lib.rs Outdated
/// this hint purely as an optimization; correctness is unaffected.
///
/// Use [`solvable_id::Dense`] when `SolvableId`s are allocated
/// contiguously from 0 (the common case with a single pool). If you are
Copy link
Copy Markdown
Contributor Author

@dralley dralley May 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose "from 0" is not strictly true any longer.

edit: well, for variables, but this is solvables, so not the same

Comment thread src/lib.rs Outdated
// `SolverState<{ D::DENSE_SOLVABLE_IDS }>` internally.
type SolvableStorage: SolvableStorage;
/// Describes how this provider allocates [`SolvableId`]s. The solver uses
/// this hint purely as an optimization; correctness is unaffected.
Copy link
Copy Markdown
Contributor Author

@dralley dralley May 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"hint" is maybe not quite accurate, since it authoritatively selects the strategy. This could imply that the solver could choose otherwise at runtime, which is not the case.

Comment thread src/snapshot.rs Outdated

impl DependencyProvider for SnapshotProvider<'_> {
type SolvableStorage = SparseSolvableStorage;
type SolvableIdLayout = solvable_id::Sparse;
Copy link
Copy Markdown
Contributor Author

@dralley dralley May 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

solvable_id::Sparse / solvable_id::Dense is a bit awkward since the import path does not connote any sense that the type is a layout option, it seems more like a type of SolvableId. On the other hand, resolvo::solvable_id::layout::Dense is pretty verbose.

I don't particularly object to this naming, just pointing it out.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to just have Sparse and Dense as enum variants provided as a const param, but that doesn't look like it will be stable any time soon unfortunately

rust-lang/rust#95174

Comment thread src/solvable_id.rs Outdated
}
}

pub(crate) struct SparseSet<K>(HashSet<K>);
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are SparseSet and DenseSet meant to actually be used somewhere, or is it more future-planning?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The type proliferation does make the code more difficult to wrap the head around unfortunately

Comment thread src/solvable_id.rs Outdated
}
}

pub(crate) struct DenseSet<K>(IndexedSet<K>);
Copy link
Copy Markdown
Contributor Author

@dralley dralley May 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these type wrappers actually necessary? Could this be acheived with IndexedSet directly? (which could be renamed)

@baszalmstra
Copy link
Copy Markdown
Contributor

Yeah I think this could be better. Looking back this approach doesnt feel right either. Maybe we should make SolvableId a generic type instead. 🤔

@dralley dralley changed the title perf: Add a dense-solvable-ids feature perf: Add a selectable "dense" layout for solvables in VariableMap May 22, 2026
dralley and others added 4 commits May 22, 2026 17:17
When enabled, stores solvable_to_variable as a Vec in VariableMap.

Both SolvableId and VariableId are sequential arena IDs (u32), so
HashMap lookups can be replaced with direct Vec indexing. This
eliminates hashing overhead in intern_solvable and origin() lookups,
and removes the HashMap insert/rehash costs.

This shouldn't necessarily be the default, because solvable ID
assignment is controlled by the user and is not guaranteed to be dense.

Assisted-By: Claude Opus 4.6
Each DependencyProvider now declares its solvable ID density via an
associated type, controlling the data structure used for
solvable-to-variable mapping in the solver:

    type SolvableMap: SolvableMap;

Two implementations are provided:
- DenseSolvableMap: Vec-based O(1) lookup, best when SolvableIds are
  sequential from 0 (common with a single pool)
- SparseSolvableMap: HashMap-based, efficient when the pool is large
  but only a fraction of solvables are visited during solving

Solvable ID assignment is controlled by the user and is not guaranteed
to be dense, so this cannot be a one-size-fits-all default. The
associated type lets each provider choose the right tradeoff.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
SolvableStorage better communicates that this is a storage strategy
choice, not a map API. The Dense/Sparse variants read naturally as
SolvableStorage::Dense and SolvableStorage::Sparse.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replaces the public SolvableStorage trait + DenseSolvableStorage /
SparseSolvableStorage impls with a sealed-marker design that doesn't
leak internal types.

Public surface:
- solvable_id::{Layout, Sparse, Dense} markers
- DependencyProvider::SolvableIdLayout: solvable_id::Layout

Layout is sealed via a pub(crate) Sealed supertrait in the same
module, so external crates cannot implement it.

Internal layer (GAT-based):
- Sealed::Map<V: Copy>: SolvableMap with Sparse/Dense impls
- Sealed::Set<K: ArenaId + Eq + Hash>: SolvableSet with Sparse/Dense impls
- Frozen<T> wrapper exposes &self interior-mut access to a SolvableMap,
  used in place of FrozenCopyMap for solvable-keyed caches

Niche-optimised ids:
- VariableId is now NonZeroU32 with index+1 encoding
- DependenciesId is now NonZeroU32 with index+1 encoding
- Halves Option<...> storage in DenseMap<VariableId/DependenciesId>

Layout-aware fields:
- VariableMap::solvable_to_variable
- SolverCache::solvable_to_dependencies
- SolverCache::hint_dependencies_available
- SolverState::clauses_added_for_solvable

Test provider switched to Dense so the integration suite exercises
the dense codepath. Added size_of and roundtrip tests to lock in the
niche optimisations.
@baszalmstra
Copy link
Copy Markdown
Contributor

Oke, gave it another go with another design. It all became a little bigger than I hoped, but I think this makes a lot more sense. Let me know what you think!

Also rerunning benchmarks as we speak.

Comment thread src/internal/solver_id.rs
/// The synthetic solver root.
Root,
/// A provider-owned solvable ID.
Solvable(S),
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you happen to know if this can take advantage of the same optimization as Option<T> when it comes to NonZeroU32? Or is that compiler-specific to Option?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had the same question, according to my clanker the niche optimization also applies here if S allows it.

Comment thread src/internal/solver_id.rs Outdated
/// Interior-mutable wrapper around an [`IdMap`].
///
/// This keeps the same single-threaded, no-reference-returning aliasing model
/// as the solver's previous specialized frozen storage.
Copy link
Copy Markdown
Contributor Author

@dralley dralley May 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a claude-ism? What the solver previously did isn't relevant, should just be a description of what it currently does

Comment thread src/id.rs Outdated
/// Returns the variable id representing the root of the decision tree.
pub fn root() -> Self {
// SAFETY: 1 is non-zero.
Self(unsafe { NonZeroU32::new_unchecked(1) })
Copy link
Copy Markdown
Contributor Author

@dralley dralley May 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same suggestion as before: 1 should be a named constant ROOT_ID instead of just a free constant sprinkled across multiple functions

Comment thread src/solver_id.rs
//! These traits are intentionally narrower than general-purpose maps and sets.
//! They model small pieces of solver state keyed by provider-owned IDs. The
//! storage implementation is selected by the ID type, so dense IDs can keep
//! using vector/bitset-backed storage while sparse IDs use hash-backed storage.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You would know better than I would what the benefits are, but this particular bit is not particularly simpler at least

The genericization of `Clause<N>` and surrounding types in PR prefix-dev#224 made
rustc more conservative about cross-crate inlining of these small leaf
methods. Restore inlining explicitly on:

- `Literal::new`/`negate`/`satisfying_value`/`from_index`/`to_index`,
  `VariableId::positive`/`negative`, `Clause::is_binary` (clause.rs)
- `DecisionMap::reset`/`set`/`level` (decision_map.rs)
- `DecisionTracker::map`/`try_add_decision`/`next_unpropagated`
  (decision_tracker.rs)
- `VariableMap::origin` (variable_map.rs)
- `IndexedSet::insert`/`contains` (indexed_set.rs)
- `WatchMap::cursor` and 6 `WatchMapCursor` accessors (watch_map.rs)

All are leaf functions on the BCP loop, propagation entry, and conflict
analysis. No behavior change.
The inner BCP loop walks a linked list of clauses watching a literal,
following an arbitrary pointer chain that is essentially random with
respect to cache locality. Each step waits ~hundreds of cycles for the
next `WatchedLiterals` line to land in L1.

`WatchMapCursor::prefetch_next` issues a `_MM_HINT_T0` software prefetch
on x86_64 (no-op fallback / volatile read on aarch64). Called once per
iteration of the propagate loop after taking the current watched
literals, it overlaps memory latency with the compute on the current
clause. Pure CPU hint, no observable side-effects.
The `requirement_to_sorted_candidates` lookup is on the BCP hot path,
called for every Requires clause as part of `try_fold_literals` /
`visit_literals` / `next_unwatched_literal`. `FrozenMap` adds a hash +
RefCell + interior allocation per lookup.

`RequirementMap` splits the keyspace on the `Requirement` discriminant
and stores each variant in its own `Mapping`. Lookup becomes a branch
on the enum tag + a direct chunked-vec index — no hashing, no RefCell.
The inner ID types (`VersionSetId`, `VersionSetUnionId`) are dense `u32`
arena IDs, so the `Mapping` representation is a good fit.
@baszalmstra
Copy link
Copy Markdown
Contributor

Benchmarks are good.

p10 p25 p50 p75 p95 p99
1.09× 1.22× 1.30× 1.36× 1.43× 1.55×

88 % of problems land in the 1.1 - 1.5× bucket; the speedup is uniformly spread across the workload rather than concentrated on a few outliers.

Ill look at your comments soon!

image image

@dralley
Copy link
Copy Markdown
Contributor Author

dralley commented May 26, 2026

Nice!

Note that these other optimizations should probably go into another PR so that this one can be squashed more easily (unless you plan on doing that manually). W/ these other ones we probably do not want to just squash them in with the rest.

A few other ideas that had at least some improvement locally, but which I have not fully evaluated for their tradeoffs yet:


commit c1677a26c11378e9e67bbdcf4444a092d5a5f661
Author: Daniel Alley <dalley@redhat.com>
Date:   Wed May 20 23:00:14 2026 -0400

    perf: Increase Arena chunk size from 128 to 1024
    
    Reduces the number of heap allocations for large pools. A pool with
    70k entries previously allocated ~547 separate Vec chunks; now it
    allocates ~68. Minimal memory impact for small pools since only the
    first chunk is pre-allocated.

diff --git a/src/internal/arena.rs b/src/internal/arena.rs
index 24ed1e4..d5267ad 100644
--- a/src/internal/arena.rs
+++ b/src/internal/arena.rs
@@ -5,7 +5,7 @@ use std::cmp;
 use std::marker::PhantomData;
 use std::ops::{Index, IndexMut};
 
-const CHUNK_SIZE: usize = 128;
+const CHUNK_SIZE: usize = 1024;
 
 /// An `Arena<TValue>` holds a collection of `TValue`s but allocates persistent `TId`s that are used
 /// to refer to an element in the arena. When adding an item to an `Arena` it returns a `TId` that

commit a8a720e8a059ec4e0ba409bc5ab1bd249cfc9ec9
Author: Daniel Alley <dalley@redhat.com>
Date:   Wed May 20 23:06:38 2026 -0400

    perf: Avoid double hashing on miss in intern_version_set and intern_condition
    
    Add FrozenCopyMap::get_copy_or_insert_with() which uses HashMap::entry()
    to hash the key once instead of twice (once for failed get, once for
    insert). Applied to intern_version_set and intern_condition which
    already have owned keys.
    
    intern_string and intern_package_name still use the two-step pattern
    because they benefit from looking up by &str on the hit path to avoid
    allocating an owned String.
    
    Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

diff --git a/src/internal/frozen_copy_map.rs b/src/internal/frozen_copy_map.rs
index ffbefdd..2585b60 100644
--- a/src/internal/frozen_copy_map.rs
+++ b/src/internal/frozen_copy_map.rs
@@ -1,5 +1,6 @@
 use std::borrow::Borrow;
 use std::cell::UnsafeCell;
+use std::collections::hash_map::Entry;
 use std::collections::HashMap;
 use std::hash::{BuildHasher, Hash};
 
@@ -27,6 +28,23 @@ impl<K: Eq + Hash, V: Clone, S: BuildHasher> FrozenCopyMap<K, V, S> {
             (*map).get(k).cloned()
         }
     }
+
index ffbefdd..2585b60 100644
--- a/src/internal/frozen_copy_map.rs
+++ b/src/internal/frozen_copy_map.rs
@@ -1,5 +1,6 @@
 use std::borrow::Borrow;
 use std::cell::UnsafeCell;
+use std::collections::hash_map::Entry;
 use std::collections::HashMap;
 use std::hash::{BuildHasher, Hash};
 
@@ -27,6 +28,23 @@ impl<K: Eq + Hash, V: Clone, S: BuildHasher> FrozenCopyMap<K, V, S> {
             (*map).get(k).cloned()
         }
     }
+
+    /// Looks up `key` in the map. On hit, returns the existing value. On miss,
+    /// calls `make_val` with a reference to the key (already stored in the
+    /// entry), inserts the result, and returns it. Hashes the key only once.
+    pub fn get_copy_or_insert_with(&self, key: K, make_val: impl FnOnce(&K) -> V) -> V {
+        unsafe {
+            let map = self.map.get();
+            match (*map).entry(key) {
+                Entry::Occupied(e) => e.get().clone(),
+                Entry::Vacant(e) => {
+                    let v = make_val(e.key());
+                    e.insert(v.clone());
+                    v
+                }
+            }
+        }
+    }
 }
 
 impl<K: Eq + Hash, V, S: Default> Default for FrozenCopyMap<K, V, S> {
diff --git a/src/utils/pool.rs b/src/utils/pool.rs
index 02e2cae..747e5e0 100644
--- a/src/utils/pool.rs
+++ b/src/utils/pool.rs
@@ -181,13 +181,8 @@ impl<VS: VersionSet, N: PackageName> Pool<VS, N> {
     /// is inserted twice they will share the same [`VersionSetId`].
     pub fn intern_version_set(&self, package_name: NameId, version_set: VS) -> VersionSetId {
         let key = (package_name, version_set);
-        if let Some(entry) = self.version_set_to_id.get_copy(&key) {
-            entry
-        } else {
-            let id = self.version_sets.alloc(key.clone());
-            self.version_set_to_id.insert_copy(key, id);
-            id
-        }
+        self.version_set_to_id
+            .get_copy_or_insert_with(key, |k| self.version_sets.alloc(k.clone()))
     }
 
     /// Returns the version set associated with the provided id
@@ -240,12 +235,8 @@ impl<VS: VersionSet, N: PackageName> Pool<VS, N> {
     /// Conditions are deduplicated, so if the same condition is inserted
     /// twice the same `ConditionId` will be returned.
     pub fn intern_condition(&self, condition: Condition) -> ConditionId {
-        if let Some(id) = self.condition_to_id.get_copy(&condition) {
-            return id;
-        }
-        let id = self.conditions.alloc(condition.clone());
-        self.condition_to_id.insert_copy(condition, id);
-        id
+        self.condition_to_id
+            .get_copy_or_insert_with(condition, |c| self.conditions.alloc(*c))
     }
 }

@baszalmstra
Copy link
Copy Markdown
Contributor

Note that these other optimizations should probably go into another PR so that this one can be squashed more easily (unless you plan on doing that manually). W/ these other ones we probably do not want to just squash them in with the rest.

I was planning on just squash merging this entire PR. Unless you have a strong opinion against this?

If you make PRs of those additions, I'll happily run the benchmarks.

@dralley
Copy link
Copy Markdown
Contributor Author

dralley commented May 26, 2026

It's up to you, I just figured that keeping the context is good, since this goes beyond simply the dense layout change itself.

I can make a separate PR with those two. Maybe once this is merged

@dralley
Copy link
Copy Markdown
Contributor Author

dralley commented May 26, 2026

The new changes look reasonable (to the extent I am able to judge these things, being very new to the codebase), but the previous comments still apply re: changes

@baszalmstra baszalmstra enabled auto-merge (squash) June 2, 2026 13:14
@baszalmstra baszalmstra disabled auto-merge June 2, 2026 13:15
@baszalmstra baszalmstra changed the title perf: Add a selectable "dense" layout for solvables in VariableMap perf!: add explicit layout of externally provided ids Jun 2, 2026
@baszalmstra baszalmstra merged commit 3db2ca0 into prefix-dev:main Jun 2, 2026
9 checks passed
@dralley dralley deleted the optimizations2 branch June 2, 2026 13:24
@dralley
Copy link
Copy Markdown
Contributor Author

dralley commented Jun 4, 2026

@baszalmstra Would you mind documenting this in a little mini README in the tools/ directory?

What's your benchmarking harness?

The repository contains a benchmark tool in tools/solve-snapshot. It requires an ecosystem specific snapshot. You can create one from any implementation of a DependencyProvider using utilities in the snapshot module. I have one created for the conda ecosystem generated by this tool that I refresh from time-to-time. I'm not adding it to the repository because it's several hundred MB.

Using the snapshot the benchmark tool creates a (deterministic) random combination of inputs, solves them, and outputs the result to a CSV file. There is a jupyter notebook in the solve-snapshot directory that then produces some statistics and charts. Usually, I also ask claude to give me more insights about why certain cases regressed (if they did).

Depending on your machine, it takes a little while to run the entire benchmark.

I'm aware that this is not a perfect solution because it's not actually based on real-world cases, but I have found that it comes close enough. That being said, I do not yet have a proper snapshot that also includes loads of conditional dependencies (support for that only just landed in the conda world), so there might still be rough edges there.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants