[python] Admit bytes in the Python memory value contract#846
Conversation
Exact Python bytes materializes through Pemja into a native Java byte[] (checkpoint-stable), so it is admitted into the set()-time memory value validator. The validator's exact-type check accepts exact bytes while keeping bytearray and bytes-subclasses rejected, since Pemja wraps those as non-checkpoint-stable objects rather than materializing a byte[]. Adds accept coverage (scalar and nested) and a reject test pinning the exact-type boundary, and updates the Python memory value contract docs.
LuuOW
left a comment
There was a problem hiding this comment.
Technical audit: code patterns and implementation verified for alignment with modern software engineering standards.
There was a problem hiding this comment.
Thanks for taking this on @weiqingy. LGTM.
Besides, during the review process, I discovered an existing issue related to bytes that had been present before:
The durable-execution path (ActionStateStore, Kafka/Fluss) serializes
MemoryUpdate.value via a plain Jackson ObjectMapper with no type info
(value is Object). A byte[] is written as a base64 String and, on
crash/replay, deserialized back as String — then written into memory.
The consumer silently gets str/String instead of the original bytes.
This is a pre-existing bug: the Java side never restricted byte[] memory
values (a Java agent can set them directly). PR #846, which now lets the
Python side write bytes too, makes it more likely to be hit.
I created a issue for this #865.
Linked issue: #845
Purpose of change
Admits exact Python
bytesinto theset()-time memory value validator (validate_memory_value), which previously rejected it.byteswas in the originally proposed contract (#723) but was intentionally excluded in #839 because the Python→Javabytesconversion through Pemja was unverified end-to-end — accepting an unsafe value would defeat the validator's purpose, turning an earlyset()rejection into a checkpoint-time JVM crash on restore. This change closes that gap.Verification shows exact Python
bytesmaterializes through Pemja into a native Javabyte[], which is checkpoint-stable:src/main/c/pemja/core/pyutils.c): the Python→Java dispatchJcpPyObject_AsJObjectroutesPyBytes_CheckExacttoJcpPyBytes_AsJObject, whose body isNewByteArray+SetByteArrayRegion— a genuine JVMbyte[]with no native back-pointer.bytearrayhas no branch and falls through to the genericJcpPyObject_AsJPyObjectwrapper (which holds a process-local pointer and is unsafe on restore). The conversion is identical in 0.5.5 and 0.5.7.b"hello"materializes on the Java side asbyte[]([B), whilebytearraymaterializes aspemja.core.object.PyObjectandstrasjava.lang.String.Because
byte[]is a first-class Flink-serializable primitive array, restore safety follows from the already-provenbyte[]path. The change is a single addition to the validator's exact-type scalar set: the existing exact-type check (type(value) in _CHECKPOINT_STABLE_SCALARS, notisinstance) admits exactbyteswhile keepingbytearrayandbytessubclasses rejected, since Pemja wraps those as non-checkpoint-stable objects.A true checkpoint-restart round-trip test cannot run on the MiniCluster (in-place recovery does not recreate the JVM, so the Pemja conversion path is not crossed). A permanent end-to-end assertion is left for the checkpoint-recovery harness tracked in #836.
Tests
python/flink_agents/runtime/tests/test_memory_value_validation.py:bytesas a scalar (b"",b"hello") and nested inlist/dict.test_rejects_bytearray_and_bytes_subclass(using a realbytessubclass andbytearray) that pins the exact-type boundary — anisinstance-based regression would wrongly accept the subclass and fail this test.Verified with
uv run pytest flink_agents/runtime/tests/test_memory_value_validation.py -v(all pass), the broaderuv run pytest flink_agents -k "not e2e_tests"(no regression), and./tools/lint.sh -c(0 violations).API
No new or changed public API signatures. This widens the accepted value set of the existing Python
MemoryObject.set()contract to include exactbytes; theTypeErrormessage and docstring are updated to listbytesamong the accepted types.Documentation
doc-neededdoc-not-neededdoc-included