Skip to content

fix(service): serialize RepopulateUsers with an asyncio lock#44

Open
lArtiquel wants to merge 1 commit into
marzneshin:masterfrom
lArtiquel:fix/serialize-repopulate-users
Open

fix(service): serialize RepopulateUsers with an asyncio lock#44
lArtiquel wants to merge 1 commit into
marzneshin:masterfrom
lArtiquel:fix/serialize-repopulate-users

Conversation

@lArtiquel

Copy link
Copy Markdown

Problem

RepopulateUsers reads the full user set, updates every user, then removes any storage user not in the incoming set — all without serialization:

users_data = (await stream.recv_message()).users_data
for user_data in users_data:
    await self._update_user(user_data)
user_ids = {user_data.user.id for user_data in users_data}
for storage_user in await self._storage.list_users():
    if storage_user.id not in user_ids:
        await self._remove_user(storage_user, storage_user.inbounds)
        await self._storage.remove_user(storage_user)

SyncUsers and other streams mutate the same xray user table concurrently. When the panel opens overlapping streams (its normal reconnect behavior), two RepopulateUsers passes can interleave: one adds a user the other is midway through removing, or both race the add/remove of the same email. The observable result is duplicate-add errors and users flapping in and out until a pass happens to win cleanly.

Fix

Guard the whole RepopulateUsers body with a per-service asyncio.Lock so passes run one at a time. asyncio is stdlib; no new dependency.

Testing

  • python -m py_compile clean.
  • Running in production across several marznode instances since 2026-05 (alongside the companion idempotent-add fix): the user-flapping and duplicate-add churn on reconnect stopped.

Note

Companion to #43 (make add_user idempotent). This PR removes the interleaving; #43 absorbs the residual duplicate-add cases (reconnects, SyncUsers/RepopulateUsers overlap). Complementary but each stands alone.

@khodedawsh @erfjab — review appreciated whenever you have time. If you'd rather serialize at the storage layer than in the service, happy to rework it that way.

RepopulateUsers reads the full user set, updates every user, then removes
any storage user not in the incoming set — all without serialization.
SyncUsers and other streams mutate the same xray user table concurrently,
so when the panel opens overlapping streams (its normal reconnect
behavior) two RepopulateUsers passes can interleave: one adds a user the
other is midway through removing, or both race the add/remove of the same
email. The observable result is duplicate-add errors and users flapping in
and out until a pass happens to win cleanly.

Guard the whole RepopulateUsers body with a per-service asyncio.Lock so
passes run one at a time. This is complementary to making add_user
idempotent: the lock removes the interleaving, the idempotent add absorbs
the residual duplicate-add cases (reconnects, SyncUsers overlap).
@sonarqubecloud

sonarqubecloud Bot commented Jul 2, 2026

Copy link
Copy Markdown

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant