Skip to content

fix: add bounds check for action index in AtariEnv::Step#422

Closed
hobostay wants to merge 1 commit into
sail-sg:mainfrom
hobostay:fix/atari-action-bounds
Closed

fix: add bounds check for action index in AtariEnv::Step#422
hobostay wants to merge 1 commit into
sail-sg:mainfrom
hobostay:fix/atari-action-bounds

Conversation

@hobostay
Copy link
Copy Markdown
Contributor

Summary

  • Add runtime bounds checking for the action index in AtariEnv::Step() to prevent undefined behavior from out-of-range actions

Problem

In atari_env.h, the action index act from action["action"_] is used directly as an index into action_set_ without any bounds validation:

int act = action["action"_];
// ...
reward += env_->act(action_set_[act]);  // UB if act is out of range

If act is negative or >= action_set_.size(), this results in undefined behavior via out-of-bounds vector access. While the Python binding normally constrains action values, a malformed action array (e.g., from a buggy policy or adversarial input) would trigger UB.

Changes

  • Added bounds check that clamps act to [0, action_set_.size())
  • Logs a warning to stderr when an invalid action is detected
  • Added #include <iostream> for the warning message

Test plan

  • Run existing Atari test suite to confirm no regressions
  • Verify that a valid action (e.g., 0-3 for Pong) still works normally
  • Verify that an out-of-range action is clamped and produces a warning instead of crashing

🤖 Generated with Claude Code

The action index `act` was used directly as an index into `action_set_`
without any bounds validation. An out-of-range action value (negative
or >= action_set_.size()) would cause undefined behavior via
out-of-bounds vector access.

Add a bounds check that clamps the action to the valid range and logs a
warning to stderr when an invalid action is detected, similar to how
other RL environments handle invalid actions gracefully.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: aa34a4eda2

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread envpool/atari/atari_env.h
Comment on lines +208 to +210
std::cerr << "AtariEnv::Step: action " << act
<< " is out of range [0, " << action_set_.size()
<< "), clamping to valid range." << std::endl;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Avoid logging from the Atari step hot path

When a policy or malformed caller produces invalid Atari actions over a vectorized/async rollout, this branch writes to std::cerr and flushes with std::endl on every single environment step. That turns the new safety path into a severe throughput/log-volume problem and can stall training jobs or fill CI logs; prefer failing fast, silently clipping, or rate-limiting the warning outside the per-step hot path.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Collaborator

@Trinkle23897 Trinkle23897 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it should fail out loudly instead of swallow silently

@hobostay
Copy link
Copy Markdown
Contributor Author

Thanks for the review! You're right — the silent clamping hides bugs and std::cerr in the hot path is problematic.

Here's the fix: replace the silent clamp + std::cerr with a CHECK assertion that fails loudly. This matches the existing error handling pattern in the codebase (uses absl/log/check.h which is already included transitively via envpool/core/env.henvpool/core/array.h).

diff --git a/envpool/atari/atari_env.h b/envpool/atari/atari_env.h
--- a/envpool/atari/atari_env.h
+++ b/envpool/atari/atari_env.h
@@ -19,7 +19,6 @@
 
 #include <algorithm>
 #include <deque>
-#include <iostream>
 #include <memory>
 #include <random>
 #include <sstream>
@@ -204,12 +203,9 @@ class AtariEnv : public Env<AtariEnvSpec>, public RenderableEnv {
     float reward = 0.0;
     done_ = false;
     int act = action["action"_];
-    if (act < 0 || act >= static_cast<int>(action_set_.size())) {
-      std::cerr << "AtariEnv::Step: action " << act
-                << " is out of range [0, " << action_set_.size()
-                << "), clamping to valid range." << std::endl;
-      act = std::clamp(act, 0, static_cast<int>(action_set_.size()) - 1);
-    }
+    CHECK(act >= 0 && act < static_cast<int>(action_set_.size()))
+        << "AtariEnv::Step: action " << act
+        << " is out of range [0, " << action_set_.size() << ")";
     int skip_id = frame_skip_;
     int pooled_frame_count = std::min(frame_skip_, 2);

Changes:

  • Removed #include <iostream> (no longer needed)
  • Replaced silent clamping + std::cerr logging with CHECK() assertion
  • On invalid action, the process now aborts with a clear error message instead of silently continuing

🤖 Generated with Claude Code

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants