feat(react-native): Android crash/ANR symbolication, bundle id, and native stack delivery#116
feat(react-native): Android crash/ANR symbolication, bundle id, and native stack delivery#116yduartep wants to merge 17 commits into
Conversation
…nd Gradle bundle-id upload
| * Human-readable exception value for ApplicationExitInfo crash reports. | ||
| * Priority: parsed trace message → system description → exception class → Flutter-style fallback. | ||
| */ | ||
| export function resolveCrashErrorMessage( |
There was a problem hiding this comment.
Is this android specific crash-error-message logic?
If yes, did we guard it so only applies to Android?
There was a problem hiding this comment.
Good question! This logic is Android-specific and is only invoked from Android native modules. The CrashReportingInstrumentation class is only instantiated on Android (via FaroCrashReporter.getCrashReports() in native Java code). iOS uses a different crash reporting path. The functions are in shared TS files for easier testing, but the entry point is platform-gated at the native module level.
| private sendCrashReport(crash: CrashReport): void { | ||
| const errorMessage = this.getErrorMessage(crash); | ||
| const parsedTrace = crash.trace | ||
| ? parseAndroidCrashTrace(crash.trace, { |
There was a problem hiding this comment.
Like, it looks like we also call this for iOS. Or maybe I am missing something
There was a problem hiding this comment.
You're right to check! The parseAndroidCrashTrace function is Android-specific and is only called from the Android code path. The CrashReportingInstrumentation is only instantiated when FaroCrashReporter.getCrashReports() (Android native module) returns crash data. iOS doesn't call into this instrumentation at all - it uses a different crash reporting mechanism. The Android-specific parsing stays in shared TS for testing convenience, but the native module layer acts as the platform guard.
|
|
||
| if (!stackFrames.length && crash.trace) { | ||
| this.logWarn( | ||
| 'Native crash trace present but no frames parsed; check Android trace format or UncaughtExceptionHandler cache' |
There was a problem hiding this comment.
Again here, looks like we assume this is only going to run for Android
There was a problem hiding this comment.
Correct observation! This warning message is Android-specific because it references Android's ApplicationExitInfo traceInputStream and the UncaughtExceptionHandler cache. This code path is only reached when processing Android crash reports from FaroCrashReporter.getCrashReports() (Android native module). iOS doesn't call into this instrumentation - it has a separate crash reporting implementation. The platform guard happens at the native module level where getCrashReports() only exists on Android.
…eshold When a cached crash trace is older than 10s relative to the ApplicationExitInfo timestamp, it's now automatically cleared from SharedPreferences instead of persisting indefinitely. This prevents stale crash data from accumulating across multiple app sessions. Co-authored-by: Cursor <cursoragent@cursor.com>
Verifies that parseAndroidCrashTrace captures the deepest 'Caused by:' exception type and message when parsing wrapped exception chains. This documents the intentional behavior: we prefer the root cause over the outer wrapper exception for more actionable error titles in the UI. Co-authored-by: Cursor <cursoragent@cursor.com>
Changed warning from suggesting to 'install UncaughtExceptionHandler cache' to accurately reflect that the trace was unavailable at crash time. The handler is already installed automatically by the SDK. Co-authored-by: Cursor <cursoragent@cursor.com>
Changed visibility from public to internal as this is an implementation detail that should not be part of the public API. Matches the visibility pattern used by FaroCrashTraceCache. Co-authored-by: Cursor <cursoragent@cursor.com>
…ignore entry Added inline comment in build.gradle warning that sourcemaps.config.json is for local dev only and credentials should use env vars in CI/production. Also added sourcemaps.config.json to .gitignore to prevent accidental commit of API keys. Co-authored-by: Cursor <cursoragent@cursor.com>
…sages Simplified documentation by removing 'Flutter-style' and 'Flutter-aligned' references. Changed to 'generic fallback' which is clearer and doesn't require knowledge of other SDK implementations. Co-authored-by: Cursor <cursoragent@cursor.com>
…rm folders Reorganized crash reporting code to make platform-specific logic explicit: - Created android/ and ios/ subfolders under crashReporting/ - Moved Android-specific files (parseAndroidCrashTrace, crashErrorMessage) to android/ - Implemented iOS crash reporting (parseIosCrashTrace, crashErrorMessage) - Updated instrumentation.ts to use Platform.OS for platform detection - Added platform-specific parsing and error message resolution iOS implementation: - Parses PLCrashReporter stack format (frame number, library, instruction pointer) - Maps BSD signals (SIGSEGV, SIGABRT, etc.) to human-readable messages - Handles uncaught NSException with name and reason - Includes comprehensive test coverage This addresses concerns about mixing platform-specific code and prepares for iOS crash reporting when PLCrashReporter dependency is added. Co-authored-by: Cursor <cursoragent@cursor.com>
Refactor crash reporting to eliminate conditional platform logic by using the Strategy Pattern with platform-specific implementations: - Create abstract BaseCrashReportingInstrumentation extending BaseInstrumentation - Implement AndroidCrashReportingInstrumentation for Android-specific logic (ApplicationExitInfo API, R8 retracing, Java/Kotlin stack parsing) - Implement IosCrashReportingInstrumentation for iOS-specific logic (PLCrashReporter, signal mapping, native frame parsing) - Export platform-aware CrashReportingInstrumentation via Platform.OS check - Move platform-specific code to android/ and ios/ subfolders - Update ANR instrumentation to import from android/ subfolder - Add comprehensive test coverage for iOS crash error messages - Add reason/status fields to CrashReport type for iOS compatibility Benefits: - Eliminates all if/else Platform.OS checks from instrumentation code - Provides clear separation between Android and iOS implementations - Makes it easier to add iOS crash reporting in the future - Improves code organization and maintainability - Each platform can evolve independently Usage remains unchanged - users instantiate CrashReportingInstrumentation and the correct platform implementation is selected automatically at module load time. Co-authored-by: Cursor <cursoragent@cursor.com>
Platform-Specific Crash Reporting Refactor@robert-northmind Thank you for the excellent feedback on code organization! I've refactored the crash reporting instrumentation to address your concerns about platform-specific code clarity. What ChangedBefore: Single After: Clean separation using the Strategy Pattern: Key Improvements
Implementation Details
Testing
This architecture makes it explicit which code is platform-specific and provides a clear path for future iOS crash reporting implementation. |
Integrated changes from main into the Android crash symbols feature branch: - Adopted PreloadedMobileMeta structure for structured mobile meta fields - Added fatal: true flag to ANR and crash error reports (matching Flutter pattern) - Added device attribute collection comments for clarity - Merged DeviceInfo mock methods (getBundleId, getBuildNumber, getApiLevel, getBuildId) - Preserved all crash reporting refactoring (BaseCrashReportingInstrumentation and platform-specific implementations) Key changes from main: - Structured mobile meta fields (device, os, app) replacing flat session attributes - loadMobileMetaForInit() replacing loadSessionDeviceAttributesForInit() - Fatal flag on errors for better crash/ANR severity classification Key changes from feature branch: - appSymbolsBundleId parameter for server-side symbol retrace - Platform-specific crash reporting architecture (Android/iOS separation) - Enhanced crash error message resolution All conflicts resolved to preserve both sets of functionality. Co-authored-by: Cursor <cursoragent@cursor.com>
Fixed test failures introduced by the PreloadedMobileMeta structure changes: 1. Updated crash reporting test to import from index instead of non-existent instrumentation file 2. Refactored symbolsBundleIdMeta to resolveAppBundleId helper that: - Properly merges bundleId from config, Metro preamble, or DeviceInfo with preloaded app meta - Includes Metro preamble bundleId in metas instead of relying on faro-core registration 3. Added app meta to metas array to ensure preloaded installationId appears in faro.metas.value 4. Merged structured mobile meta (device, os) into metas while preserving app meta separately All tests now passing. Co-authored-by: Cursor <cursoragent@cursor.com>
aranhave
left a comment
There was a problem hiding this comment.
Retested the latest head locally. Release build works, runtime bundle id is coming through and native crash + ANR payloads both include the expected fatal=true signal and source stack. I only left two small questions: one around passing the API key as a CLI arg, and one around repeated ANR detections for the same blocked period. Everything else I checked looks good to me :)
| "--endpoint", faroEndpoint, | ||
| "--app-id", faroAppId, | ||
| "--stack-id", faroStackId, | ||
| "--api-key", faroApiKey, |
There was a problem hiding this comment.
should we avoid passing the API key as a CLI arg here? Gradle --info / process listings can expose command-line args. env forwarding would keep the key out of the command line.
|
|
||
| // Persist first: survives process kill before JS/network can run. | ||
| applicationContext?.let { context -> | ||
| FaroAnrCache.savePendingAnr(context, payload) |
There was a problem hiding this comment.
in my local run one long ANR replayed as multiple ANR exception payloads plus an anr_count measurement. Should we collapse repeated tracker detections for the same blocked period?
| return JSONObject().apply { | ||
| put("type", "ANR") | ||
| put("timestamp", exitInfo.timestamp) | ||
| put("duration", ANRTracker.timeout) |
There was a problem hiding this comment.
This timeout object seems to be inited to a 5 second value and never updated. Is this what we want?
| return null | ||
| } | ||
|
|
||
| if (FaroAnrCache.hasNearbyAnrDetection(context, exitInfo.timestamp)) { |
There was a problem hiding this comment.
Having a nearby ANR does not mean app crashed for the same reason. Not a blocker, but worth tracking it somewhere.
robert-northmind
left a comment
There was a problem hiding this comment.
Nice 🤩
I left some minor comments. But all of mine are optional.
I think thought that maybe you should tackle the feedback from Ben and Vishwan. They know Andorid the best
| * threshold fires, so a process kill before the React bridge can emit still | ||
| * leaves a recoverable record for the next launch. | ||
| */ | ||
| internal object FaroAnrCache { |
There was a problem hiding this comment.
I think this class has no unit tests? Should we consider adding some?
Seems like a good way to ensure it works as expected.
There was a problem hiding this comment.
I'm not sure though how easy it is to add tests for kotlin code in a RN project.
| * Kept separate from [FaroCrashReporter] so ANR timeout rows are not replayed as | ||
| * generic previous-session `crash` events (no message / duplicate rows). | ||
| */ | ||
| object FaroAnrReporter { |
There was a problem hiding this comment.
I think same here. This seems like a pretty testable class. Maybe we would need to break out some dependencies and pass them in. Not sure how to best write/setup tests for Kotlin
| * ``` | ||
| */ | ||
| export const CrashReportingInstrumentation = | ||
| Platform.OS === 'android' ? AndroidCrashReportingInstrumentation : IosCrashReportingInstrumentation; |
There was a problem hiding this comment.
Does ReactNative more or less only run on Android and iOS/iPadOS etc?
If not, should we be a bit careful here any maybe do something like:
if (Platform.OS === 'android') {
return AndroidCrashReportingInstrumentation;
} else if (Platform.OS === 'ios') {
// RN reports both iOS and iPadOS under the "ios" name
return IosCrashReportingInstrumentation;
} else {
return NoOpCrashReportingInstrumentation;
}
Summary
Android release builds can upload R8
mapping.txtand native symbols, but the SDK must send the same bundle id at runtime and ship native Java/Kotlin stack text in a shape the collector can retrace. This PR wires bundle identity, Gradle upload, previous-session crash replay, and ANR reporting end to end.Runtime bundle id (
appBuildIdentity)meta.app.bundleIdas{applicationId}@{versionCode}@{versionName}fromreact-native-device-info.__faroBundleId_<appName>) and explicitconfig.app.bundleIdtake precedence.initializeFaro→makeRNConfig.Gradle composed source map upload (
android/build.gradle)app/build/faro/bundle-id-release.txtwritten byfaroWriteBundleIdRelease.applicationId@versionCode@versionNamebefore upload.FARO_SOURCEMAP_*env vars, withsourcemaps.config.jsonas fallback for endpoint/appId/stackId/apiKey.faroWriteBundleIdReleasewhen the Faro Gradle plugin is applied.Previous-session native crashes
Problem: After a fatal crash,
ApplicationExitInfoon the next launch often has notraceInputStream(seen on emulators and Android 16). Without a stack, ingest cannot R8-retrace and the plugin shows a generic crash with no useful title.FaroUncaughtExceptionHandler— installs as the defaultThread.UncaughtExceptionHandler. On an uncaught exception it serializesLog.getStackTraceString(throwable)(exception class, message, andat …frames) and writes it to disk before the process is killed, then forwards to the previous handler so crash behaviour is unchanged.FaroCrashTraceCache— stores that serialized trace plus a timestamp inSharedPreferencesusingcommit()(synchronous) so the write survives immediate process death. On the next launch,FaroCrashReportermatches the cached trace to the matchingApplicationExitInforow (±10s) when the OS trace is missing or lacks an exception header, then clears the cache after use.FaroCrashReporter— readsgetHistoricalProcessExitReasonsfor crash exits (not ANRs), builds JSON forCrashReportingInstrumentation, merges OS trace + cached trace, and drops rows that are ANR timeouts or already covered by ANR detection (avoids a second genericcrashline for the same incident).parseAndroidCrashTrace— parses ApplicationExitInfo /Log.getStackTraceStringtext into:module(R8 retrace input), andcrashErrorMessage— builds the error title from parsed trace, exception class, orApplicationExitInfodescription; skips rows with no stack and no useful title.CrashReportingInstrumentation— sendstype: 'crash'with parsedstackFrames,nativeExceptionTypein context, anderror.stackcleared so the JS reporter stack does not override native frames.Previous-session ANRs
Problem: A live ANR watchdog snapshot can capture the wrong thread (crash-killer frames) or be lost if the process dies before the React bridge sends telemetry.
FaroAnrReporter— on API 30+, readsApplicationExitInfowithREASON_ANR, pullstraceInputStream, and exports one JSON report per new ANR (deduped by timestamp). Exposed to JS asgetHistoricalAnrReports.AnrThreadDumpParser— extracts the"main"threadat …frames from the multi-thread OS dump (the blocked UI thread, not a background watchdog thread).FaroAnrCache— whenANRTrackerfires, persists the ANR JSON withcommit()before the process can be killed;acknowledgeANRsremoves entries after Faro sends them; detection timestamps preventFaroCrashReporterfrom emitting a duplicate crash for the same ANR.ANRInstrumentation— on init: draingetHistoricalAnrReportsfirst, then pending tracker cache as fallback; on API 30+ defer liveonANRDetectedto the next-launch AppExitInfo path; dedupe by timestamp window.anrStack— filters tracker stacks that contain Android/Faro crash-handler frames (handleApplicationCrash,KillApplicationHandler,FaroUncaughtExceptionHandler) instead of the blocked app code.Tests
parseAndroidCrashTrace.test.ts,crashErrorMessage.test.ts,appBuildIdentity.test.ts,initialize.test.ts,anrStack.test.ts.Dependencies
Requires
faro-android-gradle-pluginforbundle-id-release.txtand symbol upload, collector Android retrace (androidretrace), and a release app withmapping.txtuploaded.Verification
cd packages/react-native yarn quality:test yarn quality:lint yarn buildRelease build with symbols uploaded: trigger native crash or ANR, relaunch app. Crash rows should show retraced Kotlin/Java frames and a concrete exception title. ANR rows should show
type: ANRwith retraced main-thread frames and no extra genericcrashrow for the same trigger.Screenshots
*.ktfile:line frames.RuntimeExceptionwith R8-retraced Kotlin frames.