apache · dcapwell · May 22, 2026
diff --git a/.agents/skills/cassandra-testing-property/SKILL.md b/.agents/skills/cassandra-testing-property/SKILL.md
diff --git a/.agents/skills/cassandra-testing-stateful/SKILL.md b/.agents/skills/cassandra-testing-stateful/SKILL.md
@@ -0,0 +1,262 @@
+---
+name: cassandra-testing-stateful
+description: Write stateful property-based tests for Apache Cassandra using the Property.stateful() framework. Use when verifying systems behave correctly through sequences of operations, modeling interactions with stateful systems, or testing CRUD/state-machine correctness with random command sequences.
+---
+
+# Stateful Property-Based Testing with `stateful()`
+
+Write stateful property-based tests for Apache Cassandra using the `Property.stateful()` framework. This is the "CRUD testing" or "model-based testing" approach where random command sequences exercise state machines and verify invariants hold throughout.
+
+## When to Use
+
+- Testing stateful data structures (trees, indexes, caches) against a simple model
+- Verifying CRUD operations maintain consistency
+- Testing multi-step sequences (topology changes, schema mutations, journal operations)
+- Testing systems that survive restart/flush/compact cycles
+- Any scenario where you need to verify a sequence of operations, not just individual properties
+
+## Key Concepts
+
+### Entry Point: `stateful()`
+
+```java
+import static accord.utils.Property.stateful;
+import static accord.utils.Property.commands;
+```
+
+`stateful()` returns a `StatefulBuilder` for configuring and running stateful property tests.
+
+### Configuration
+
+Inherits from `qt()` plus:
+
+| Method | Purpose | Default |
+|--------|---------|---------|
+| `withExamples(int)` | Number of full test runs | 500 |
+| `withSteps(int)` | Max commands per test run | 1000 |
+| `withStepTimeout(Duration)` | Timeout per individual step | none |
+| `withSeed(long)` | Pin seed for reproducibility | random |
+| `withPure(boolean)` | Fresh seed per example | true |
+| `withTimeout(Duration)` | Timeout for entire example | none |
+
+### Core Types
+
+- **`State`** - Model/tracking state for the test (e.g., a `TreeMap` as oracle)
+- **`SystemUnderTest`** (SUT) - The actual implementation being tested (optional - often `Void` when State acts as both)
+- **`Command<State, SystemUnderTest, Result>`** - A single operation applied to both model and SUT
+
+### Command Interface
+
+```java
+interface Command<State, SystemUnderTest, Result> {
+    // Guard: should this command execute given current state?
+    default PreCheckResult checkPreconditions(State state) { return PreCheckResult.Ok; }
+    // Apply to the model - return expected result
+    Result apply(State state) throws Throwable;
+    // Apply to the real system - return actual result
+    Result run(SystemUnderTest sut) throws Throwable;
+    // Verify model and SUT agree
+    default void checkPostconditions(State state, Result expected,
+                                     SystemUnderTest sut, Result actual) throws Throwable {}
+    // Human-readable description for history logging
+    default String detailed(State state) { return this.toString(); }
+}
+```
+
+Convenience interfaces:
+- **`UnitCommand<State, SUT>`** - When apply/run return void (use `applyUnit`/`runUnit`)
+- **`StateOnlyCommand<State>`** - When there's no separate SUT (extends `UnitCommand<State, Void>`)
+- **`SimpleCommand<State>`** - Inline lambda for `StateOnlyCommand` with a name
+
+### Commands Builder
+
+```java
+commands(() -> stateGen)              // State-only (no SUT)
+commands(() -> stateGen, Sut::new)    // State + separate SUT
+```
+
+Builder methods:
+- `.add(cmd)` / `.add(gen)` / `.add((rs, state) -> cmd)` - Add command with random weight
+- `.add(weight, cmd)` - Add command with fixed weight
+- `.addIf(predicate, cmd)` - Conditional command (only when predicate is true)
+- `.addAllIf(predicate, builder -> {...})` - Multiple conditional commands
+- `.preCommands(state -> {...})` - Run before each step
+- `.destroyState((state, cause) -> {...})` - Cleanup state after each example
+- `.destroySut((sut, cause) -> {...})` - Cleanup SUT after each example
+- `.onSuccess((state, sut, history) -> {...})` - Callback on successful example
+- `.onFailure((state, sut, history, cause) -> {...})` - Callback on failure
+- `.commandsTransformer((state, gen) -> newGen)` - Transform command generator per state
+- `.build()` - Produce the `Commands` object
+
+## Patterns
+
+### Pattern 1: State-only with SimpleCommand (most common, simplest)
+
+When the State IS the system under test (model + SUT combined).
+
+```java
+import static accord.utils.Property.stateful;
+import static accord.utils.Property.commands;
+
+stateful().check(commands(() -> State::new)
+    .add(MyTest::insertCommand)
+    .add(MyTest::readCommand)
+    .addIf(s -> !s.isEmpty(), MyTest::deleteCommand)
+    .build());
+
+// Command factory method
+private static Property.Command<State, Void, ?> insertCommand(RandomSource rs, State state) {
+    int key = rs.nextInt();
+    int value = rs.nextInt();
+    return new Property.SimpleCommand<>(
+        "Insert(" + key + ", " + value + ")",
+        s -> {
+            s.model.put(key, value);
+            s.sut.put(key, value);
+            assertThat(s.sut.get(key)).isEqualTo(value);
+        });
+}
+```
+
+### Pattern 2: Separate State and SUT with full Command interface
+
+When State (model) and SUT (real implementation) are different objects.
+
+```java
+stateful().check(commands(() -> State::new, state -> new Sut(state))
+    .add((rs, state) -> new Create(nextRange(rs), rs.nextInt()))
+    .add((rs, state) -> new Read(nextRange(rs)))
+    .addAllIf(state -> !state.isEmpty(), b -> b
+        .add((rs, state) -> new Update(rs.pickOrderedSet(state.keys()), rs.nextInt()))
+        .add((rs, state) -> new Delete(rs.pickOrderedSet(state.keys()))))
+    .destroyState(State::close)
+    .destroySut(Sut::close)
+    .build());
+
+// Full Command with apply/run/checkPostconditions
+static class Read implements Command<State, Sut, List<Integer>> {
+    private final Range range;
+    Read(Range range) { this.range = range; }
+
+    @Override public List<Integer> apply(State state) { return state.search(range); }
+    @Override public List<Integer> run(Sut sut) { return sut.tree.search(range); }
+    @Override public void checkPostconditions(State state, List<Integer> expected,
+                                             Sut sut, List<Integer> actual) {
+        Assertions.assertThat(actual).isEqualTo(expected);
+    }
+    @Override public String detailed(State state) { return "Read(" + range + ")"; }
+}
+```
+
+### Pattern 3: Multistep commands
+
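+Group multiple commands as an atomic sequence in history. In the example below, `node` is a placeholder for a node chosen by the caller; it is not defined in the snippet.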
+
+```java
+import static accord.utils.Property.multistep;
+
+Gen<Command<State, Void, ?>> topologyCommand = rs -> multistep(
+    new SimpleCommand<>("Stop Node", s -> s.stopNode(node)),
+    new SimpleCommand<>("Replace Host", s -> s.replaceHost(node)),
+    new SimpleCommand<>("Reconfigure CMS", s -> s.reconfigureCMS())
+);
+```
+
+### Pattern 4: Using destroyState for final validation
+
+```java
+stateful().withExamples(50).withSteps(500).check(commands(() -> State::new)
+    .add(MyTest::addTable)
+    .addIf(s -> !s.tables.isEmpty(), MyTest::dropTable)
+    .destroyState(state -> {
+        // Run after all steps - finish pending work and validate final state
+        state.finishPendingSequences();
+        state.validateFinalSchema();
+    })
+    .build());
+```
+
+## Error Reporting
+
+On failure, `PropertyError` reports:
+- Seed for reproducibility
+- Number of examples / steps configured
+- The specific failing step number
+- State at time of failure (via `toString()`)
+- Full command history (numbered, with `detailed()` strings)
+- If `withStepTimeout` is used, duration per step is appended to history
+
+## State Design Guidelines
+
+1. **Merged State+SUT** (most common): State holds both the model (e.g., `TreeMap`) and the SUT (e.g., `RangeTree`). Commands operate on both within `applyUnit`.
+
+2. **Separate State and SUT**: Use when the SUT has complex lifecycle (journal, cluster). State owns the model, SUT wraps the real thing. `destroyState`/`destroySut` handle cleanup.
+
+3. **State should implement useful `toString()`**: It appears in error reports.
+
+4. **Commands should have descriptive `detailed(State)`**: These form the history trace shown on failure.
+
+## Imports
+
+```java
+import static accord.utils.Property.stateful;
+import static accord.utils.Property.commands;
+import static accord.utils.Property.multistep;
+import static accord.utils.Property.ignoreCommand;
+import accord.utils.Property.Command;
+import accord.utils.Property.UnitCommand;
+import accord.utils.Property.StateOnlyCommand;
+import accord.utils.Property.SimpleCommand;
+import accord.utils.Property.PreCheckResult;
+import accord.utils.Gens;
+import accord.utils.Gen;
+import accord.utils.RandomSource;
+```
+
+## Framework Location
+
+- `modules/accord/accord-core/src/test/java/accord/utils/Property.java` - Core framework
+- `modules/accord/accord-core/src/test/java/accord/utils/Gens.java` - Generator utilities
+- `modules/accord/accord-core/src/test/java/accord/utils/README.md` - Documentation
+
+## Example: SingleNodeTableWalkTest (complex real-world stateful test)
+
+`test/distributed/org/apache/cassandra/distributed/test/cql3/SingleNodeTableWalkTest.java` is a complex stateful test that exercises CQL read/write paths against random table schemas with random data. It is a good reference for writing non-trivial stateful tests.
+
+Key design patterns demonstrated:
+- **State with overridable hooks**: The `State` inner class defines boolean predicates (`supportTokens()`, `allowNonPartitionQuery()`, `allowPartitionQuery()`, etc.) that control which commands are eligible via `addIf`. Subclasses override these hooks to alter behavior without duplicating the test structure.
+- **Overridable factory methods**: `createState()`, `createCluster()`, `defineTable()`, `supportedTypes()`, `supportedPrimaryColumnTypes()`, `supportedIndexers()`, and `preCheck()` are all `protected` methods that subclasses override to customize schema generation, cluster topology, and test configuration.
+- **Extensive use of `addIf`/`addAllIf`**: Commands are conditionally included based on current state (e.g., only select existing rows when partitions exist, only compact when there are enough SSTables).
+
+The test has a subclass hierarchy that reuses the same stateful structure for different configurations:
+- `SingleNodeTableWalkTest` -- single-node, base test
+- `MultiNodeTableWalkBase` -- multi-node (overrides `createCluster()`, adjusts consistency)
+  - `MultiNodeTableWalkWithReadRepairTest` / `MultiNodeTableWalkWithoutReadRepairTest`
+  - `CasMultiNodeTableWalkBase` -- CAS (Paxos) transactions
+  - `AccordInteropMultiNodeTableWalkBase` -- Accord transaction interop
+    - `FullAccordInteropMultiNodeTableWalkTest`, `MixedReadsAccordInteropMultiNodeTableWalkTest`
+
+## Important Notes
+
+- The default number of examples for `stateful()` is **500** (not 1000 as with `qt()`)
+- Default steps per example is **1000**
+- Commands added with `addIf` are eligible for selection only when the predicate is true for the current state
+- **Prefer `addIf` over `ignoreCommand()`**: Use `addIf(predicate, cmd)` to conditionally include commands rather than having command factories return `Property.ignoreCommand()`. The framework fails the test if too many consecutive commands are ignored, so tests that rely on `ignoreCommand()` are prone to flakiness. `addIf` avoids this by excluding the command from selection entirely when the predicate is false.
+- `onSuccess` and `onFailure` callbacks are useful for logging history on completion
+- When writing `detailed()`, include the command parameters (key, range, etc.) for debuggability
+
+## Deterministic Execution Requirement
+
+Stateful tests are especially vulnerable to non-deterministic SUTs because failures depend on the exact sequence of states traversed. If the SUT uses internal randomness (e.g., `ThreadLocalRandom`, `Math.random()`), replaying a seed reproduces the same command sequence but the SUT may take a different internal path, making the failure non-reproducible.
+
+**Before writing a stateful test, audit the SUT for internal randomness.** See the `cassandra-testing-property` skill for a full table of common randomness sources and mitigations.
+
+**Key principle**: Every aspect of the test that affects the outcome must be derived from `RandomSource`. This includes:
+- Values generated for commands (covered by the command factory receiving `RandomSource rs`)
+- The SUT's internal behavior (NOT covered if the SUT uses `ThreadLocalRandom` etc.)
+- Any initialization parameters (covered if `State(RandomSource rs)` constructor is used)
+- **Iteration order of collections in the SUT** -- if the SUT iterates a `HashMap`/`HashSet` and the order affects behavior, the test is non-reproducible across machines/JVM versions even though no random API is called
+
+Stateful tests are particularly sensitive to iteration order because a different traversal order can change which command's precondition is met, which element gets selected, or which state transition fires -- producing a completely different state sequence from the same seed. For example, if a command iterates `HashMap.entrySet()` to find the first element matching a condition, a different iteration order means a different element is found, leading to divergent state. See the `cassandra-testing-property` skill for a full list of common patterns that break reproducibility.
+
+If the SUT cannot be made deterministic, document the limitation prominently in the test class Javadoc.
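Reviewer sketch, not part of the patch: the model-versus-SUT loop that the skill above describes can be illustrated without the framework. The class below is a hypothetical, JDK-only stand-in (`StackModelCheck`, `run`, and `check` are invented names, and none of this is the `accord.utils.Property` API). It assumes an `ArrayList` as the model and an `ArrayDeque` as the system under test, draws every decision from one seeded `Random`, makes `Pop` eligible only when the model is non-empty (the idea behind `addIf`), and puts the seed and the command history into any failure.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical illustration only: a hand-rolled model-vs-SUT loop, not the accord Property API.
final class StackModelCheck {
    /** Runs `steps` random commands derived from `seed` and returns the command history. */
    static List<String> run(long seed, int steps) {
        Random rs = new Random(seed);                 // the single source of randomness
        List<Integer> model = new ArrayList<>();      // simple oracle
        ArrayDeque<Integer> sut = new ArrayDeque<>(); // stand-in for the real implementation
        List<String> history = new ArrayList<>();
        for (int step = 0; step < steps; step++) {
            // Pop is only eligible when the model is non-empty.
            boolean pop = !model.isEmpty() && rs.nextBoolean();
            if (pop) {
                history.add(step + ": Pop");
                int expected = model.remove(model.size() - 1);
                int actual = sut.pop();
                check(expected == actual, seed, history);
            } else {
                int value = rs.nextInt(100);
                history.add(step + ": Push(" + value + ")");
                model.add(value);
                sut.push(value);
            }
            // Invariant checked after every step.
            check(model.size() == sut.size(), seed, history);
        }
        return history;
    }

    private static void check(boolean ok, long seed, List<String> history) {
        if (!ok) throw new AssertionError("seed=" + seed + " history=" + history);
    }

    public static void main(String[] args) {
        System.out.println(run(42L, 10));
    }
}
```

Calling `StackModelCheck.run(seed, steps)` twice with the same seed yields the same history, which is the replay behavior the Deterministic Execution Requirement section depends on.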
diff --git a/.agents/skills/cassandra-testing/SKILL.md b/.agents/skills/cassandra-testing/SKILL.md
@@ -0,0 +1,60 @@
+---
+name: cassandra-testing
+description: Guide for writing tests in Apache Cassandra. Use whenever writing new tests, adding test coverage, or deciding what testing approach to use for a change. Prioritizes property-based and stateful property testing over hand-written example-based tests.
+---
+
+# Cassandra Testing Strategy
+
+When writing tests for Apache Cassandra, prefer property-based testing over hand-written example-based tests. Property tests generate thousands of random inputs and catch edge cases that humans miss.
+
+## Decision Flow
+
+Ask these questions about what you're testing:
+
+### 1. Does it involve sequences of operations on mutable state?
+
+Examples: data structures (trees, indexes, caches), CRUD workflows, schema mutations, topology changes, journal lifecycle (flush/compact/restart).
+
+**Use the `cassandra-testing-stateful` skill** - it provides `Property.stateful()`, which gives you:
+- Random command sequences with weighted selection
+- Conditional commands (only delete when non-empty)
+- Model-based verification (compare SUT against a simple oracle)
+- Full command history in failure reports for debugging
+- Lifecycle hooks (destroyState, destroySut, onSuccess)
+
+### 2. Does it validate a property/invariant?
+
+Examples: serialization round-trips, algebraic laws (idempotency, commutativity), encoding correctness, range/boundary behavior, type validation.
+
+**Use the `cassandra-testing-property` skill** - it provides `Property.qt()` with typed generators, which gives you:
+- `forAll(gen).check(value -> ...)` for clean property assertions
+- `Serializers.testSerde()` for automated serialization testing
+- Rich generator library (Accord `Gens`, Cassandra `Generators`, `AbstractTypeGenerators`, `CassandraGenerators`)
+- Meta-randomness (`mixedDistribution`) for bias-aware testing
+
+### 3. Even "simple" tests benefit from property testing
+
+A test that looks simple often hides complexity. A straightforward `INSERT INTO ...` followed by `SELECT * FROM ...` test seems trivial with `(pk int, v int)`, but breaks with `(pk varint, ck frozen<udt>, v int, PRIMARY KEY(pk, ck)) WITH CLUSTERING ORDER BY (ck DESC)`. The schema, types, clustering order, and value expressions all interact in ways no human enumerates completely.
+
+Rather than writing a single example and hoping it covers enough, generate the schema and values randomly:
+
+```
+// pseudocode
+qt().check(rs -> {
+    schema = randomSchema(rs)       // random types, clustering order, indexes, etc.
+    row    = randomRow(rs, schema)  // random values matching the schema
+    insert(schema, row)
+    result = select(schema, row.pk)
+    assertEqual(result, row)
+});
+```
+
+This approach finds edge cases across type combinations, encoding paths, and query planning that hand-written examples miss entirely.
+
+## Why Property Tests Over Example Tests
+
+- A single `qt().forAll(gen).check(...)` replaces dozens of hand-written test cases
+- Random generation finds edge cases humans don't think of (boundary values, empty inputs, overflow, type interactions)
+- Meta-randomness varies the distribution per example, catching bugs that require specific patterns (e.g., mostly-deletes followed by a read)
+- Failures are reproducible via seed - just add `withOnlySeed(seed)` to replay
+- The generator libraries already exist for most Cassandra types - you just compose them
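Reviewer sketch, not part of the patch: the round-trip shape of the pseudocode in the file above can be shown with the JDK alone. The class below is a hypothetical stand-in (`RoundTripProperty` and `check` are invented names; it does not use `Property.qt()` or any Cassandra generator). It assumes a trivial property, that formatting a `long` as hex and parsing it back returns the same value, mixes boundary values into the random inputs, and reports the seed on failure so a run could be replayed.

```java
import java.util.Random;

// Hypothetical sketch: a hand-rolled property loop using only the JDK, not Property.qt().
final class RoundTripProperty {
    /** Checks parse(format(x)) == x for `examples` values derived from `seed`. */
    static void check(long seed, int examples) {
        Random rs = new Random(seed);
        for (int i = 0; i < examples; i++) {
            long value = nextValue(rs);
            String hex = Long.toHexString(value);          // unsigned hex form
            long parsed = Long.parseUnsignedLong(hex, 16); // inverse of toHexString
            if (parsed != value)
                throw new AssertionError("seed=" + seed + " example=" + i
                                         + " value=" + value + " hex=" + hex);
        }
    }

    /** Mixes boundary values in with uniform values, which rarely hit the edges. */
    private static long nextValue(Random rs) {
        int choice = rs.nextInt(4);
        if (choice == 0) return 0L;
        if (choice == 1) return Long.MIN_VALUE;
        if (choice == 2) return Long.MAX_VALUE;
        return rs.nextLong();
    }

    public static void main(String[] args) {
        check(1L, 1000);
        System.out.println("ok");
    }
}
```

The same structure (generate, act, compare against the input) is what the schema and row generators in the pseudocode would feed, with the generators doing most of the work.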
diff --git a/src/java/org/apache/cassandra/utils/DynamicList.java b/src/java/org/apache/cassandra/utils/DynamicList.java
@@ -21,6 +21,9 @@
 import java.util.HashMap;
 import java.util.TreeSet;
 import java.util.concurrent.ThreadLocalRandom;
+import java.util.function.IntSupplier;
+
+import com.google.common.annotations.VisibleForTesting;
 
 // simple thread-unsafe skiplist that permits indexing/removal by position, insertion at the end
 // (though easily extended to insertion at any position, not necessary here)
@@ -85,19 +88,26 @@ private Node parent(int parentHeight)
         }
     }
 
+    private final IntSupplier nextInt;
     private final int maxHeight;
     private final Node<E> head;
     private int size;
 
     public DynamicList(int maxExpectedSize)
     {
+        this(maxExpectedSize, () -> ThreadLocalRandom.current().nextInt());
+    }
+
+    public DynamicList(int maxExpectedSize, IntSupplier nextInt)
+    {
+        this.nextInt = nextInt;
         this.maxHeight = 3 + Math.max(0, (int) Math.ceil(Math.log(maxExpectedSize) / Math.log(2)));
         head = new Node<>(maxHeight, null);
     }
 
     private int randomLevel()
     {
-        return 1 + Integer.bitCount(ThreadLocalRandom.current().nextInt() & ((1 << (maxHeight - 1)) - 1));
+        return 1 + Integer.bitCount(nextInt.getAsInt() & ((1 << (maxHeight - 1)) - 1));
     }
 
     public Node<E> append(E value)
@@ -198,10 +208,8 @@ public int size()
         return size;
     }
 
-    // some quick and dirty tests to confirm the skiplist works as intended
-    // don't create a separate unit test - tools tree doesn't currently warrant them
-
-    private boolean isWellFormed()
+    @VisibleForTesting
+    boolean isWellFormed()
     {
         for (int i = 0 ; i < maxHeight ; i++)
         {
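Reviewer sketch, not part of the patch: the new `DynamicList(int, IntSupplier)` constructor lets a caller replace `ThreadLocalRandom` with a seeded source, so skip-list levels become a function of the seed. The class below is a hypothetical stand-in that copies only the `randomLevel()` formula from the diff (`SeededLevels` and `levels` are invented names, and it does not construct a `DynamicList`). It assumes a `java.util.Random` behind the supplier; in a property test the supplier would presumably be backed by the test's `RandomSource` instead.

```java
import java.util.Arrays;
import java.util.Random;
import java.util.function.IntSupplier;

// Hypothetical stand-in for the injected-randomness pattern; it does not use DynamicList itself.
final class SeededLevels {
    /** Same formula as randomLevel() in the diff, with the int source passed in. */
    static int randomLevel(IntSupplier nextInt, int maxHeight) {
        return 1 + Integer.bitCount(nextInt.getAsInt() & ((1 << (maxHeight - 1)) - 1));
    }

    /** Draws `count` levels from a supplier backed by a seeded java.util.Random. */
    static int[] levels(long seed, int maxHeight, int count) {
        Random random = new Random(seed);
        IntSupplier nextInt = random::nextInt;
        int[] out = new int[count];
        for (int i = 0; i < count; i++)
            out[i] = randomLevel(nextInt, maxHeight);
        return out;
    }

    public static void main(String[] args) {
        // Two runs with the same seed produce the same level sequence.
        System.out.println(Arrays.equals(levels(42L, 8, 10), levels(42L, 8, 10)));
    }
}
```

With a fixed seed the level sequence repeats exactly, so a failure that depends on the skip-list shape can be replayed. The mask keeps `maxHeight - 1` bits, so every level falls between 1 and `maxHeight`.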

diff --git a/src/java/org/apache/cassandra/utils/Hex.java b/src/java/org/apache/cassandra/utils/Hex.java
@@ -106,7 +106,12 @@ public static long parseLong(String hex, int start, int end)
         for (int i = start ; i < end ; ++i)
         {
             char c = hex.charAt(i);
-            result |= (long)(c - (c >= 'a' ? 'a' - 10 : '0')) << shift;
+            if (c >= charToByte.length)
+                throw new NumberFormatException("Invalid hex character '" + c + "' at index " + i + " in: " + hex.substring(start, end));
+            int digit = charToByte[c];
+            if (digit < 0)
+                throw new NumberFormatException("Invalid hex character '" + c + "' at index " + i + " in: " + hex.substring(start, end));
+            result |= (long) digit << shift;
             shift -= 4;
         }
         return result;
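Reviewer sketch, not part of the patch: the change above swaps arithmetic on the raw character for a table lookup with validation. The class below is a hypothetical illustration (`HexDigits`, `TABLE`, `digit`, and `oldDigit` are invented names; the table contents are an assumption and are not copied from `Hex.java`). It places the removed expression next to a validated lookup to show the kind of input the old code accepted without complaint.

```java
import java.util.Arrays;

// Hypothetical sketch of table-driven hex validation; the names here are not from Hex.java.
final class HexDigits {
    // Assumed table layout: -1 marks a non-hex character.
    private static final byte[] TABLE = new byte[128];
    static {
        Arrays.fill(TABLE, (byte) -1);
        for (char c = '0'; c <= '9'; c++) TABLE[c] = (byte) (c - '0');
        for (char c = 'a'; c <= 'f'; c++) TABLE[c] = (byte) (c - 'a' + 10);
        for (char c = 'A'; c <= 'F'; c++) TABLE[c] = (byte) (c - 'A' + 10);
    }

    /** Returns 0..15 for a hex digit, or throws NumberFormatException. */
    static int digit(char c) {
        int d = c < TABLE.length ? TABLE[c] : -1;
        if (d < 0)
            throw new NumberFormatException("Invalid hex character '" + c + "'");
        return d;
    }

    /** The expression removed by the diff, kept here only for comparison. */
    static int oldDigit(char c) {
        return c - (c >= 'a' ? 'a' - 10 : '0');
    }

    public static void main(String[] args) {
        System.out.println(oldDigit('g')); // out of the 0..15 range, no error raised
        System.out.println(digit('f'));
    }
}
```

The old expression never rejects input: `'g'` maps to 16 and `'A'` to 17, both outside the 0..15 range, so bits leak into neighboring nibbles when shifted. The lookup version fails fast with a `NumberFormatException` instead.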