feat(dataconverter): add custom DataConverter samples#288

Merged
Bueller87 merged 11 commits into cadence-workflow:master from Bueller87:update-data-converter-samples
May 18, 2026

@Bueller87

Contributor

[Which sample(s) or area?]

dataconverter, root README.md, and dataconverter tests.

[What changed?]

  • Added a compression sample that wraps JSON payloads with gzip to shrink large, repetitive workflow/activity payloads before they are written to Cadence history.
  • Added an encryption sample that wraps JSON payloads with AES-256-GCM, including demo key loading, nonce/tag handling, and guidance about logs and other disclosure surfaces.
  • Added a BlobStore / S3 claim-check offload sample that stores large payloads outside Cadence history, keeps only a small reference in history, and uses a zero-config local filesystem store by default.
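
The nonce/tag handling mentioned for the encryption sample can be sketched with the JDK alone. This is a minimal illustration, not the sample's actual code: the class name `AesGcmSketch`, the envelope layout (12-byte nonce followed by ciphertext and tag), and the use of `IllegalStateException` in place of Cadence's `DataConverterException` are all assumptions made to keep the block self-contained.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical helper, not the sample's class: AES-256-GCM over already-serialized JSON bytes.
final class AesGcmSketch {
  private static final int NONCE_BYTES = 12; // 96-bit nonce, the size recommended for GCM
  private static final int TAG_BITS = 128;   // full-length authentication tag
  private static final SecureRandom RANDOM = new SecureRandom();

  // Assumed envelope layout: nonce || ciphertext || tag.
  static byte[] encrypt(byte[] key, byte[] plaintext) {
    try {
      byte[] nonce = new byte[NONCE_BYTES];
      RANDOM.nextBytes(nonce); // a fresh nonce per payload; never reuse one with the same key
      Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
      cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"),
          new GCMParameterSpec(TAG_BITS, nonce));
      byte[] ciphertextAndTag = cipher.doFinal(plaintext); // the JDK appends the tag
      return ByteBuffer.allocate(nonce.length + ciphertextAndTag.length)
          .put(nonce).put(ciphertextAndTag).array();
    } catch (GeneralSecurityException e) {
      throw new IllegalStateException("Failed to encrypt payload", e);
    }
  }

  static byte[] decrypt(byte[] key, byte[] envelope) {
    if (envelope.length < NONCE_BYTES + TAG_BITS / 8) {
      throw new IllegalStateException("Envelope is too short to hold a nonce and a tag");
    }
    try {
      Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
      cipher.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key, "AES"),
          new GCMParameterSpec(TAG_BITS, envelope, 0, NONCE_BYTES));
      // doFinal verifies the tag and throws if the envelope was modified.
      return cipher.doFinal(envelope, NONCE_BYTES, envelope.length - NONCE_BYTES);
    } catch (GeneralSecurityException e) {
      throw new IllegalStateException("Failed to decrypt or authenticate payload", e);
    }
  }

  public static void main(String[] args) {
    byte[] key = new byte[32]; // demo-only random key; a real deployment would load a managed key
    RANDOM.nextBytes(key);
    byte[] json = "{\"account\":\"demo\"}".getBytes(StandardCharsets.UTF_8);
    byte[] envelope = encrypt(key, json);
    System.out.println("round trip ok: " + Arrays.equals(json, decrypt(key, envelope)));
  }
}
```

Prepending the nonce keeps each payload self-describing, so decryption needs only the key. The nonce is not secret, but it must be unique per encryption under a given key.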

[Why?]

  • Compression gives users a copyable pattern for reducing history size and storage overhead when workflows pass large, repetitive JSON payloads.
  • Encryption gives users a concrete starting point for keeping sensitive workflow and activity payloads unreadable in Cadence history while making the key-management and logging caveats explicit.
  • BlobStore / S3 offload gives users a claim-check pattern for payloads approaching Cadence size limits without forcing workflow or activity code to manage object-store references directly.
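
The claim-check idea in the last bullet can be sketched roughly as follows. The PR mentions a BlobStore interface but this page does not show its signature, so `BlobStoreSketch`, `LocalBlobStoreSketch`, `ClaimCheckSketch`, the one-byte marker format, and the size threshold are all hypothetical names and choices for illustration only.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.UUID;

// Hypothetical storage abstraction; the real sample's interface may look different.
interface BlobStoreSketch {
  String put(byte[] data); // returns the key under which the data was stored
  byte[] get(String key);
}

// Local-filesystem store, standing in for S3, GCS, or any other object store.
final class LocalBlobStoreSketch implements BlobStoreSketch {
  private final Path dir;

  LocalBlobStoreSketch(Path dir) { this.dir = dir; }

  static LocalBlobStoreSketch inTempDir() {
    try {
      return new LocalBlobStoreSketch(Files.createTempDirectory("claimcheck-sketch"));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  @Override public String put(byte[] data) {
    String key = UUID.randomUUID().toString();
    try {
      Files.write(dir.resolve(key), data);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return key;
  }

  @Override public byte[] get(String key) {
    try {
      // A real store should validate the key before resolving it against a directory.
      return Files.readAllBytes(dir.resolve(key));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}

// Wraps serialized payloads: small ones stay inline, large ones become a reference.
final class ClaimCheckSketch {
  private static final byte INLINE = 0;
  private static final byte REFERENCE = 1;
  private final BlobStoreSketch store;
  private final int thresholdBytes;

  ClaimCheckSketch(BlobStoreSketch store, int thresholdBytes) {
    this.store = store;
    this.thresholdBytes = thresholdBytes;
  }

  byte[] wrap(byte[] payload) {
    if (payload.length <= thresholdBytes) return prefix(INLINE, payload);
    return prefix(REFERENCE, store.put(payload).getBytes(StandardCharsets.UTF_8));
  }

  byte[] unwrap(byte[] stored) {
    if (stored.length == 0) throw new IllegalArgumentException("Empty payload");
    byte[] body = Arrays.copyOfRange(stored, 1, stored.length);
    switch (stored[0]) {
      case INLINE: return body;
      case REFERENCE: return store.get(new String(body, StandardCharsets.UTF_8));
      default: throw new IllegalArgumentException("Unknown marker " + stored[0]);
    }
  }

  private static byte[] prefix(byte marker, byte[] body) {
    byte[] out = new byte[body.length + 1];
    out[0] = marker;
    System.arraycopy(body, 0, out, 1, body.length);
    return out;
  }

  public static void main(String[] args) {
    ClaimCheckSketch claimCheck = new ClaimCheckSketch(LocalBlobStoreSketch.inTempDir(), 1024);
    byte[] large = new byte[100_000];
    byte[] inHistory = claimCheck.wrap(large);
    System.out.println("payload bytes: " + large.length + ", bytes kept in history: " + inHistory.length);
    System.out.println("round trip ok: " + Arrays.equals(large, claimCheck.unwrap(inHistory)));
  }
}
```

Because the wrapping happens on serialized bytes, workflow and activity code in this sketch never sees the store key, which matches the intent stated in the bullet above.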

[How did you test it?]

Run the targeted tests:

./gradlew test --tests 'com.uber.cadence.samples.dataconverter.*'

Run the samples manually:

  1. Start Cadence locally.

  2. Register the sample domain if needed:

./gradlew -q execute -PmainClass=com.uber.cadence.samples.common.RegisterDomain

  3. Start the DataConverter worker in terminal 1:

./gradlew -q execute -PmainClass=com.uber.cadence.samples.dataconverter.DataConverterWorker

  4. Start each sample workflow from terminal 2:

./gradlew -q execute -PmainClass=com.uber.cadence.samples.dataconverter.CompressionStarter

./gradlew -q execute -PmainClass=com.uber.cadence.samples.dataconverter.EncryptionStarter

./gradlew -q execute -PmainClass=com.uber.cadence.samples.dataconverter.S3OffloadStarter

Also verified the final diff with:

git diff --check

[Potential risks]

Low. The changes are additive and isolated to the new dataconverter sample package plus README/test coverage. Existing samples should be unaffected; the main caveat is that users must adapt the demo encryption key and local BlobStore before using these patterns in production.

[Release notes]

  • Added a gzip compression DataConverter sample.
  • Added an AES-256-GCM encryption DataConverter sample.
  • Added a BlobStore / S3 claim-check offload DataConverter sample.

[Documentation Changes]

  • Updated the root README.md with the new DataConverter sample entry.
  • Added a package README with setup, run commands, and sample selection guidance.
  • Documented encryption key configuration, local BlobStore behavior, and S3 swap instructions.
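
For readers adapting the encryption key configuration, one possible shape for key loading is sketched below. How the sample actually loads its demo key is not shown on this page, so the environment variable name `DATACONVERTER_DEMO_KEY_B64`, the Base64 encoding, and the class `DemoKeyLoaderSketch` are assumptions, not the sample's real configuration.

```java
import java.util.Base64;

// Hypothetical loader: reads a Base64-encoded AES-256 key and validates its length.
final class DemoKeyLoaderSketch {
  // Assumed variable name, chosen for this sketch only.
  static final String KEY_ENV_VAR = "DATACONVERTER_DEMO_KEY_B64";

  static byte[] decodeKey(String base64Key) {
    if (base64Key == null || base64Key.trim().isEmpty()) {
      throw new IllegalStateException("No encryption key configured");
    }
    byte[] key;
    try {
      key = Base64.getDecoder().decode(base64Key.trim());
    } catch (IllegalArgumentException e) {
      throw new IllegalStateException("Encryption key is not valid Base64", e);
    }
    if (key.length != 32) {
      throw new IllegalStateException("AES-256 needs a 32-byte key, got " + key.length + " bytes");
    }
    return key;
  }

  public static void main(String[] args) {
    String fromEnv = System.getenv(KEY_ENV_VAR);
    if (fromEnv == null) {
      System.out.println("Set " + KEY_ENV_VAR + " to a Base64-encoded 32-byte key");
      return;
    }
    byte[] key = decodeKey(fromEnv);
    // Log only the key size, never the key material itself.
    System.out.println("Loaded a " + (key.length * 8) + "-bit key");
  }
}
```

Failing fast on a missing or wrongly sized key surfaces a misconfigured worker at startup rather than on the first encrypted payload.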

Bueller87 added 7 commits May 11, 2026 09:44
Add a new dataconverter package with three production-ready
custom DataConverter implementations following the
take-and-go layout: gzip compression, AES-256-GCM encryption, and an
S3 / claim-check offload pattern (with a zero-config local-filesystem
default and a commented AWS SDK v2 stub). One worker hosts all three
on three task lists and prints per-sample stats banners on startup.

Signed-off-by: “Kevin” <kevlar_ksb@yahoo.com>
Comment on lines +68 to +76
try {
  ByteArrayOutputStream out = new ByteArrayOutputStream();
  try (GZIPOutputStream gzip = new GZIPOutputStream(out)) {
    gzip.write(jsonBytes);
  }
  return out.toByteArray();
} catch (IOException e) {
  throw new DataConverterException("Failed to gzip-compress JSON payload", e);
}

Member

Suggested change
- try {
-   ByteArrayOutputStream out = new ByteArrayOutputStream();
-   try (GZIPOutputStream gzip = new GZIPOutputStream(out)) {
-     gzip.write(jsonBytes);
-   }
-   return out.toByteArray();
- } catch (IOException e) {
-   throw new DataConverterException("Failed to gzip-compress JSON payload", e);
- }
+ ByteArrayOutputStream out = new ByteArrayOutputStream();
+ try (GZIPOutputStream gzip = new GZIPOutputStream(out)) {
+   gzip.write(jsonBytes);
+ } catch (IOException e) {
+   throw new DataConverterException("Failed to gzip-compress JSON payload", e);
+ }
+ return out.toByteArray();

Contributor Author

Great suggestion and taken.
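
The flattened structure from the suggestion above can be checked with a standalone round trip. This is a sketch under assumptions: `GzipSketch` is a hypothetical class, and `UncheckedIOException` stands in for Cadence's `DataConverterException` so the block compiles without the Cadence client on the classpath.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper mirroring the suggested single-level try-with-resources.
final class GzipSketch {
  static byte[] compress(byte[] jsonBytes) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    try (GZIPOutputStream gzip = new GZIPOutputStream(out)) {
      gzip.write(jsonBytes);
    } catch (IOException e) {
      throw new UncheckedIOException("Failed to gzip-compress JSON payload", e);
    }
    // The gzip stream is closed by this point, so the trailer has been written to out.
    return out.toByteArray();
  }

  static byte[] decompress(byte[] gzipped) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(gzipped))) {
      byte[] buffer = new byte[4096];
      int read;
      while ((read = gzip.read(buffer)) != -1) {
        out.write(buffer, 0, read);
      }
    } catch (IOException e) {
      throw new UncheckedIOException("Failed to gunzip JSON payload", e);
    }
    return out.toByteArray();
  }

  public static void main(String[] args) {
    // Build a repetitive JSON array, the kind of payload that compresses well.
    StringBuilder json = new StringBuilder("[");
    for (int i = 0; i < 1000; i++) {
      if (i > 0) json.append(',');
      json.append("{\"status\":\"PENDING\",\"region\":\"us-east-1\"}");
    }
    json.append(']');
    byte[] original = json.toString().getBytes(StandardCharsets.UTF_8);
    byte[] compressed = compress(original);
    System.out.println("original bytes: " + original.length + ", compressed bytes: " + compressed.length);
    System.out.println("round trip ok: " + Arrays.equals(original, decompress(compressed)));
  }
}
```

Returning after the try-with-resources block is safe because the resource is closed, and the gzip trailer flushed, before control leaves the block; an exception thrown by `close()` is also handled by the same `catch`.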

* production, use S3 object lifecycle policies to automatically expire old blobs.
* =============================================================================
*/
public final class S3OffloadDataConverter implements DataConverter {

Member

You can use MinIO to integrate with a real S3 API. Otherwise, rename the example to largeblob to avoid confusion. There are users on GCP who use GCS.

Contributor Author

Really good point. The implementation is already storage-agnostic (everything goes through a BlobStore interface), so the "S3" branding ends up being a bit of an apology.

In this PR I renamed the sample to something provider-neutral: claimcheck, since that's the formal pattern name. Happy to go with largeblob if you prefer.

Signed-off-by: “Kevin” <kevlar_ksb@yahoo.com>
Bueller87 added 3 commits May 15, 2026 12:45
Signed-off-by: “Kevin” <kevlar_ksb@yahoo.com>
@gitar-bot

gitar-bot Bot commented May 18, 2026

Code Review ✅ Approved 1 resolved / 1 findings

Introduces gzip compression, AES-256-GCM encryption, and S3-based claim-check DataConverter samples with updated documentation. The missing GZIPOutputStream declaration was resolved, and no other issues were found.

✅ 1 resolved
Bug: GZIPOutputStream removed — gzip variable is undefined

📄 src/main/java/com/uber/cadence/samples/compression/CompressedJsonDataConverter.java:68-76
The refactoring removed the GZIPOutputStream creation (try (GZIPOutputStream gzip = new GZIPOutputStream(out))) but left a reference to gzip.write(jsonBytes) on line 71. This code will not compile because the gzip variable is never declared. Additionally, return out.toByteArray() on line 75 is outside the try-catch block, so even if the variable existed, the data written to out would never be gzip-compressed (the stream is never flushed/closed). The original try-with-resources pattern was correct and should be restored.

Rules ✅ All requirements met

Repository Rules

PR Description Quality Standards: The PR description contains all required sections with substantive content, including clear explanations of what changed, why, and copyable test commands.

@shijiesheng shijiesheng left a comment

Member

Makes sense.

@Bueller87 Bueller87 merged commit 8a8cb95 into cadence-workflow:master May 18, 2026
4 checks passed