WIP: missing capabilities lead to NoSupportedApiException#6317

Draft
mahibi wants to merge 1 commit into master from bugfix/noid/capabilitiesNullLeadToNoApiException

@mahibi (Collaborator) commented Jun 4, 2026

The crash is:

  Exception com.nextcloud.talk.utils.NoSupportedApiException:
      at com.nextcloud.talk.utils.ApiUtils.getConversationApiVersion (ApiUtils.kt:127)
      at com.nextcloud.talk.chat.data.network.RetrofitChatNetwork.getRoom (RetrofitChatNetwork.kt:31)
      at com.nextcloud.talk.conversationlist.data.network.OfflineFirstConversationsRepository$getRoom$1.invokeSuspend (OfflineFirstConversationsRepository.kt:69)
      at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith (ContinuationImpl.kt:34)
      at kotlinx.coroutines.DispatchedTask.run (DispatchedTask.kt:100)
      at kotlinx.coroutines.internal.LimitedDispatcher$Worker.run (LimitedDispatcher.kt:124)
      at kotlinx.coroutines.scheduling.TaskImpl.run (Tasks.kt:89)
      at kotlinx.coroutines.scheduling.CoroutineScheduler.runSafely (CoroutineScheduler.kt:586)
      at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.executeTask (CoroutineScheduler.kt:820)
      at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.runWorker (CoroutineScheduler.kt:717)
      at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.run (CoroutineScheduler.kt:704)

Claude suggests using Observable.defer to avoid the crash, but that would just hide the root cause. Some snippets from the Claude chat:

● The root cause: getConversationApiVersion throws NoSupportedApiException synchronously during Observable construction (before .subscribe() is called). This exception bypassed the onError handler entirely and crashed the coroutine as an unhandled exception.

The fix wraps all three affected methods (getRoom, getCapabilities, joinRoom) in Observable.defer {}, which defers the Observable construction to subscription time, so any NoSupportedApiException now flows through onError and is handled gracefully (falling back to the local database in OfflineFirstConversationsRepository.getRoom).
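To make the eager-versus-deferred difference concrete, here is a minimal sketch. It assumes RxJava 2 (`io.reactivex`); `resolveApiVersion`, `getRoomEager` and `getRoomDeferred` are hypothetical stand-ins, not the actual methods in RetrofitChatNetwork or ApiUtils.

```kotlin
import io.reactivex.Observable

// Stand-in for the exception in the stack trace; the real class lives in com.nextcloud.talk.utils.
class NoSupportedApiException : RuntimeException("no supported API version")

// Hypothetical, simplified version lookup: throws when capabilities are missing.
fun resolveApiVersion(features: List<String>?): Int =
    if (features?.contains("conversation-v4") == true) 4 else throw NoSupportedApiException()

// Eager: the lookup runs while the Observable is being assembled,
// so the exception escapes to the caller before anyone subscribes.
fun getRoomEager(features: List<String>?): Observable<String> {
    val version = resolveApiVersion(features)
    return Observable.just("room via API v$version")
}

// Deferred: the lookup runs inside defer's supplier at subscription time,
// so the exception is delivered to the subscriber's onError.
fun getRoomDeferred(features: List<String>?): Observable<String> =
    Observable.defer {
        val version = resolveApiVersion(features)
        Observable.just("room via API v$version")
    }

fun main() {
    val eagerThrew = runCatching { getRoomEager(null) }.isFailure
    println("eager call threw before subscribe: $eagerThrew")

    getRoomDeferred(null).subscribe(
        { room -> println("onNext: $room") },
        { error -> println("onError: ${error::class.simpleName}") }
    )
}
```

In the deferred variant the caller still never receives a room from the network; the failure is only routed to a place where it can be handled.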

❯ why does getConversationApiVersion crash at all?

● The exception is thrown when none of the checked capability strings match anything in the user's spreed features list. That happens in two real scenarios:

  1. capabilities is null — hasSpreedFeatureCapability silently returns false for every version checked, so the loop exhausts all candidates and hits the throw.
  2. Stale/empty capabilities — the user object was loaded from the database before capabilities were fetched or refreshed, so spreedCapability?.features is empty or missing the expected "conversation-v4" / "conversation-v3" / "conversation-v2" / "mention-flag" entries.

In both cases the loop just falls through every version without matching, and throw NoSupportedApiException() is the only exit path. The capabilities not being present is the underlying trigger — the exception itself is the symptom.
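The fall-through described above can be sketched as follows. This is a simplified model, not the code in ApiUtils.kt: the `User` and `SpreedCapability` types and the function names are hypothetical, and the legacy "mention-flag" check is left out.

```kotlin
class NoSupportedApiException : RuntimeException("no supported API version")

// Simplified stand-ins for the app's user and capability types.
data class SpreedCapability(val features: List<String>?)
data class User(val spreedCapability: SpreedCapability?)

// Null capabilities or null features both collapse to "feature not present".
fun hasSpreedFeature(user: User, feature: String): Boolean =
    user.spreedCapability?.features?.contains(feature) == true

// Tries each candidate version in order; throws when none is advertised.
fun conversationApiVersion(user: User, candidates: List<Int>): Int {
    for (version in candidates) {
        if (hasSpreedFeature(user, "conversation-v$version")) return version
    }
    throw NoSupportedApiException()
}

fun main() {
    val ready = User(SpreedCapability(listOf("conversation-v4", "conversation-v3")))
    println(conversationApiVersion(ready, listOf(4, 3, 2)))

    val notLoaded = User(spreedCapability = null)
    println(runCatching { conversationApiVersion(notLoaded, listOf(4, 3, 2)) }.exceptionOrNull())
}
```

Because a null capability object and a genuinely unsupported server produce the same exception, the caller cannot tell "not loaded yet" from "no compatible API".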

❯ look why the capabilities could be null

● The stack trace came from OfflineFirstConversationsRepository, let me check NotificationWorker too since it directly calls chatNetworkDataSource.getRoom. Let me look at how the user object is obtained in the key callers.

Here's the root cause chain:

Capabilities come from CapabilitiesWorker — a background WorkManager job that fetches capabilities from the server and writes them to the database. They are null in the database by default and only populated after this worker succeeds.

Three scenarios where capabilities end up null when getRoom is called:

  1. Race on first install / account add — CapabilitiesWorker is queued after account verification (AccountVerificationActivity.kt:473), but if a push notification arrives before the worker finishes (or on the next app cold start before it runs), NotificationWorker loads the user via getUserWithId and gets a User with capabilities = null straight from the database.
  2. CapabilitiesWorker failure — it retries 3 times (line 129), but on network failure it just posts an EventStatus failure event and exits without persisting anything. Capabilities stay null or stale forever until the next successful run.
  3. App startup race in ChatViewModel — getRoom at line 1418 uses currentUser which is set at initialization. If getRoom is triggered before the CapabilitiesWorker result is persisted to the database and re-read into the in-memory User object, the user carried in memory still has capabilities = null.

The Observable.defer fix you now have handles all three cases correctly — the NoSupportedApiException flows to onError instead of crashing the coroutine, allowing graceful fallback to the local database.

❯ the root cause will just be hidden when i do this..?

● Yes, you're right. The Observable.defer fix stops the crash and falls back to local DB, but the API call still silently fails: the user with null capabilities never gets a refreshed conversation from the server.

The actual root cause is that getRoom is being called with a user whose capabilities haven't been loaded yet. The real fix depends on what's causing that:

  • If it's a race (CapabilitiesWorker hasn't finished yet), the callers should wait for capabilities before calling getRoom, or trigger a capabilities fetch first.
  • If it's the NotificationWorker path specifically, it could fetch capabilities inline before calling getRoom.
  • If it's a persistent null (CapabilitiesWorker keeps failing), that's a separate bug that needs its own fix.
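One possible shape for the "fetch capabilities first" option is sketched below. Everything here is hypothetical: `ensureCapabilities`, the `fetch` callback and the simplified `User` and `Capabilities` types are illustrations only and do not exist in the app under these names.

```kotlin
// Simplified stand-ins for the app's types.
data class Capabilities(val spreedFeatures: List<String>)
data class User(val id: Long, val capabilities: Capabilities?)

// Hypothetical helper: returns a user that is guaranteed to carry capabilities,
// fetching them inline when the stored user has none.
// Returns null when the fetch fails, so the caller can decide how to fall back.
fun ensureCapabilities(user: User, fetch: (User) -> Capabilities?): User? {
    if (user.capabilities != null) return user
    val fetched = fetch(user) ?: return null
    return user.copy(capabilities = fetched)
}

fun main() {
    val stored = User(id = 1, capabilities = null)

    // Fetch succeeds: the returned user can safely be passed on to getRoom.
    val ready = ensureCapabilities(stored) { Capabilities(listOf("conversation-v4")) }
    println(ready?.capabilities?.spreedFeatures)

    // Fetch fails: the caller sees null instead of a NoSupportedApiException.
    println(ensureCapabilities(stored) { null })
}
```

A caller such as NotificationWorker could run a check like this before getRoom, which keeps the "capabilities missing" case explicit rather than letting it surface later as an API version error.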

Before deciding the fix direction — do you know in which caller context this crash actually occurred? The stack trace points to OfflineFirstConversationsRepository, which is called from ChatViewModel.getRoom and NotificationWorker. Knowing which one triggered it would narrow down the right fix.

🏁 Checklist

  • ⛑️ Tests (unit and/or integration) are included or not needed
  • 🔖 Capability is checked or not needed
  • 🔙 Backport requests are created or not needed: /backport to stable-xx.x
  • 📅 Milestone is set
  • 🌸 PR title is meaningful (if it should be in the changelog: is it meaningful to users?)

Signed-off-by: Marcel Hibbe <dev@mhibbe.de>
@mahibi mahibi self-assigned this Jun 4, 2026
@mahibi mahibi added the 2. developing Work in progress label Jun 4, 2026
github-actions bot (Contributor) commented Jun 4, 2026

APK file: https://github.com/nextcloud/talk-android/actions/runs/26969118712/artifacts/7418983494
To test this change/fix, download the APK file above, install it, and test it in parallel with your existing Nextcloud app.
