Skip to content

Implement getErrorClass explicitly to avoid AbstractMethodError on Spark builds without the deprecated default#197

Closed
dev-ankit wants to merge 2 commits into
streamnative:masterfrom
grataio:fix/geterrorclass-binary-compat
Closed

Implement getErrorClass explicitly to avoid AbstractMethodError on Spark builds without the deprecated default#197
dev-ankit wants to merge 2 commits into
streamnative:masterfrom
grataio:fix/geterrorclass-binary-compat

Conversation

@dev-ankit
Copy link
Copy Markdown

@dev-ankit dev-ankit commented May 29, 2026

Motivation

PulsarIllegalStateException and PulsarIllegalArgumentException implement SparkThrowable but override only getCondition, relying on the deprecated SparkThrowable.getErrorClass default (which delegates to getCondition).

On OSS Spark 4.x that is fine — getErrorClass is a @Deprecated default method See here. But some Spark distributions — notably Databricks Runtime 18 (Spark 4.1) — ship a SparkThrowable where getErrorClass is left abstract (no default body). When one of these connector exceptions escapes a task, Spark's TaskResultGetter virtual-calls getErrorClass(), finds no implementation, and throws:

java.lang.AbstractMethodError: 'java.lang.String org.apache.spark.SparkThrowable.getErrorClass()'
    at org.apache.spark.sql.pulsar.PulsarIllegalStateException.getErrorClass(PulsarExceptions.scala:92)

This kills the result-getter thread, so the original failure (e.g. a PULSAR_SINK_INCOMPATIBLE_SCHEMA produce error) is never reported and the Spark job hangs indefinitely instead of failing. On Databricks that meant multi-hour stuck jobs on idle-but-billed compute, with no error surfaced to the user.

Modifications

  • Implement getErrorClass explicitly on both PulsarIllegalStateException and PulsarIllegalArgumentException, delegating to the error class (same value as getCondition). The body then lives on the class itself instead of relying on the interface default. Harmless on OSS Spark (it overrides the deprecated default with the same value); restores correct error propagation on distributions that leave getErrorClass abstract.
  • Add PulsarExceptionsSuite.

Verifying this change

  • Make sure that the change passes the CI checks.

  • This change added tests and can be verified as follows:

    PulsarExceptionsSuite asserts each exception declares its own getErrorClass (via getDeclaredMethod, which throws if the body falls back to the interface default) and returns the expected error class — it fails on the unpatched classes and passes with the fix. Validated locally with ./mvnw -DskipTests test-compile against Spark 4.1.1.

Documentation

Check the box below.

Need to update docs?

  • doc-required
  • no-need-doc
  • doc

Note: happy to file a linked tracking issue, or take a different approach (e.g. dropping SparkThrowable from these two classes entirely) if you'd prefer.

…ions

PulsarIllegalStateException and PulsarIllegalArgumentException implement
SparkThrowable but override only getCondition, relying on the deprecated
SparkThrowable.getErrorClass default (which delegates to getCondition).

Some Spark distributions — notably Databricks Runtime 18 (Spark 4.1) — leave
getErrorClass abstract rather than providing that default. When one of these
exceptions escapes a task, Spark's TaskResultGetter virtual-calls
getErrorClass(), finds no implementation, and raises:

  java.lang.AbstractMethodError:
    'java.lang.String org.apache.spark.SparkThrowable.getErrorClass()'
    at org.apache.spark.sql.pulsar.PulsarIllegalStateException.getErrorClass

This kills the result-getter thread, so the original failure (e.g. an
incompatible-schema produce) is never reported and the Spark job hangs
indefinitely instead of failing.

Implement getErrorClass explicitly (delegating to the error class, same as
getCondition) so the body lives on the class itself. This is harmless on OSS
Spark, where it simply overrides the deprecated default, and restores correct
error propagation on distributions that drop it.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@dev-ankit dev-ankit requested a review from a team as a code owner May 29, 2026 13:36
@github-actions
Copy link
Copy Markdown

@dev-ankit:Thanks for your contribution. For this PR, do we need to update docs?
(The PR template contains info about doc, which helps others know more about the changes. Can you provide doc-related info in this and future PR descriptions? Thanks)

@github-actions github-actions Bot added doc-info-missing This pr needs to mark a document option in description and removed doc-info-missing This pr needs to mark a document option in description labels May 29, 2026
@github-actions
Copy link
Copy Markdown

@dev-ankit:Thanks for providing doc info!

@github-actions github-actions Bot added the no-need-doc This pr does not need any document label May 29, 2026
@dev-ankit dev-ankit marked this pull request as draft May 29, 2026 13:47
PulsarExceptions.pulsarSinkInvalidSchema references the error class
"PULSAR_SINK_INVALID_SCHEMA", which is not present in
error/pulsar-error-classes.json (only "PULSAR_SINK_INVALID_SCHEMA_TYPE" is),
so constructing it raises INTERNAL_ERROR. Use pulsarProviderInvalidSaveMode,
whose error class is defined, to exercise PulsarIllegalArgumentException.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@dev-ankit
Copy link
Copy Markdown
Author

Superseded by #198 (re-opened from a personal fork instead of the grataio org fork). Closing this one.

@dev-ankit dev-ankit closed this May 29, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

no-need-doc This pr does not need any document

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant