fix: set COCOINDEX_SOURCE_MAX_INFLIGHT_ROWS=256 to prevent "too many open files" error #300
Merged
…open files" error Fixes #293. Users experienced "Too many open files (os error 24)" during indexing because CocoIndex's default concurrency (1024 inflight rows) opens more file handles than OS limits allow (typically 256-1024). Set COCOINDEX_SOURCE_MAX_INFLIGHT_ROWS=256 in both code paths that invoke cocoindex (pipeline.run_cocoindex_update and server._cocoindex_subprocess_env). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
HumanBean17 added a commit that referenced this pull request on Jun 13, 2026
The #293 fix (#300) set COCOINDEX_SOURCE_MAX_INFLIGHT_ROWS, an env var CocoIndex never reads. The real semaphore var is COCOINDEX_MAX_INFLIGHT_COMPONENTS (default 1024, see cocoindex/_internal/app.py), so the throttle was a no-op and the EMFILE "Too many open files (os error 24)" recurred (#306).

Layer A (correctness): centralize the throttle in cocoindex_subprocess_env_defaults() using the real env var; both cocoindex subprocess sites (pipeline.run_cocoindex_update + server._cocoindex_subprocess_env) apply it via setdefault so an operator override still wins.

Layer B (deterministic): raise_fd_limit() raises the process soft RLIMIT_NOFILE toward its hard limit (capped at 65536, never infinity) at cli.main / server.main startup. rlimits are inherited across fork+exec, so cocoindex children get headroom regardless of launch context. macOS GUI/launchd/IDE-launched processes inherit a 256 FD ceiling, not the shell's raised limit, which is why the error recurred even on hosts whose terminal shows a high ulimit.

No ontology/schema/re-index impact. Fixes #306.

Co-Authored-By: Claude <noreply@anthropic.com>
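The two layers described in this commit message could be sketched roughly as follows. This is an illustration only, not the code from the commit: the function names and the 65536 cap come from the message, but the bodies, the signatures, and the default value of "256" for COCOINDEX_MAX_INFLIGHT_COMPONENTS are assumptions.

```python
import os

try:
    import resource  # Unix-only; absent on Windows
except ImportError:
    resource = None

FD_LIMIT_CAP = 65536  # cap named in the commit message
ASSUMED_MAX_INFLIGHT = "256"  # assumption: the actual default is not stated


def cocoindex_subprocess_env_defaults(env=None):
    """Layer A: return a copy of the environment with the throttle defaulted.

    setdefault only fills the variable when it is missing, so a value the
    operator exported beforehand is left untouched.
    """
    merged = dict(os.environ if env is None else env)
    merged.setdefault("COCOINDEX_MAX_INFLIGHT_COMPONENTS", ASSUMED_MAX_INFLIGHT)
    return merged


def raise_fd_limit(cap=FD_LIMIT_CAP):
    """Layer B: raise the soft RLIMIT_NOFILE toward the hard limit.

    Returns the soft limit in effect afterwards, or None where the
    resource module is unavailable.
    """
    if resource is None:
        return None
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # Never request "infinity": an unlimited hard limit is clamped to the cap.
    target = cap if hard == resource.RLIM_INFINITY else min(hard, cap)
    if soft != resource.RLIM_INFINITY and soft < target:
        try:
            resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
        except (ValueError, OSError):
            # Some kernels reject the request (e.g. a lower per-process
            # maximum); keep the existing limit rather than fail startup.
            return soft
        return target
    return soft
```

Under these assumptions, a caller would invoke raise_fd_limit() once at process startup and pass cocoindex_subprocess_env_defaults() as the env argument when spawning the cocoindex subprocess, so the child inherits both the raised limit and the throttle.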
Summary
- Set COCOINDEX_SOURCE_MAX_INFLIGHT_ROWS=256 before invoking cocoindex
- Applies to java_codebase_rag/pipeline.py and server.py

Root Cause
CocoIndex's default concurrency (1024 inflight rows) opens too many file handles simultaneously, exceeding the OS file descriptor limit (typically 256-1024). This is not user error: it happens even on a freshly booted OS with a single terminal window open.
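To make the mismatch concrete, the following sketch compares a concurrency setting against the process's soft file descriptor limit. The helper names are hypothetical, and the model assumes at most one open file per inflight row, which is a simplification (a real process also holds descriptors for stdio, sockets, and so on).

```python
try:
    import resource  # Unix-only; absent on Windows
except ImportError:
    resource = None


def current_soft_fd_limit():
    """Return the soft RLIMIT_NOFILE, or None if it cannot be queried."""
    if resource is None:
        return None
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft


def fd_shortfall(max_inflight, soft_limit):
    """How many descriptors short the process would be, assuming one
    open file per inflight row (a simplifying assumption)."""
    return max(0, max_inflight - soft_limit)


if __name__ == "__main__":
    limit = current_soft_fd_limit()
    if limit is not None and limit != resource.RLIM_INFINITY:
        print(f"soft limit: {limit}, shortfall at 1024 inflight: "
              f"{fd_shortfall(1024, limit)}")
```

With a 256-descriptor ceiling and 1024 inflight rows, this simplified model leaves the process 768 descriptors short, whereas a setting of 256 no longer exceeds the ceiling by itself.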
Changes
- java_codebase_rag/pipeline.py: Set default in run_cocoindex_update()
- server.py: Set default in _cocoindex_subprocess_env()

Test plan
🤖 Generated with Claude Code