I hope this message finds you well. First of all, I sincerely apologize for reaching out this way. I couldn't find a better channel, and I hope you don't mind.
I also want to take a moment to thank you for your work on pyzstd. It has served us reliably, and we genuinely appreciate the effort you've put into the library.
We are currently in the process of migrating our compression layer from pyzstd to Python's built-in compression.zstd module (introduced in Python 3.14), and during that transition, we noticed an interesting behavioral difference that we haven't been able to fully explain.
With pyzstd, the following call always produced an identical byte output across multiple invocations:
pyzstd.compress(
    data=content,
    zstd_dict=training_dict.as_digested_dict,
)
However, with the new built-in module, the equivalent call:
zstd.compress(
    data=content,
    zstd_dict=training_dict.as_digested_dict,
)
...produces bytes objects of different sizes across two consecutive calls with the same input.
One additional detail that may be relevant: between the two compress() calls, we reload the dictionary from disk using:
training_dict = zstd.ZstdDict(zst_file)
It is the exact same .zst file each time; we simply run this line again before the second compress() call. We're not sure whether re-instantiating ZstdDict from the same file could affect the compressed output, but we wanted to mention it in case it's a factor.
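In case it helps with reproducing this, a check along these lines could be sketched as follows. The helper names output_fingerprints and is_deterministic are hypothetical, and zlib is used purely as a stand-in compressor so the sketch runs on any Python version; it is not the code from our migration.

```python
import hashlib
import zlib


def output_fingerprints(compress_fn, payload, runs=2):
    """Call compress_fn(payload) `runs` times and record (length, SHA-256) per call."""
    results = []
    for _ in range(runs):
        out = compress_fn(payload)
        results.append((len(out), hashlib.sha256(out).hexdigest()))
    return results


def is_deterministic(fingerprints):
    """True when every call produced byte-for-byte identical output."""
    return len(set(fingerprints)) == 1


# Stand-in compressor (assumption: zlib instead of zstd, for portability).
sample = b"example payload " * 64
fps = output_fingerprints(zlib.compress, sample)
print(is_deterministic(fps))
```

To exercise the case described above, compress_fn could be a small wrapper that rebuilds ZstdDict from the same dictionary bytes before every call and passes its as_digested_dict to zstd.compress(), so that each run goes through a freshly instantiated dictionary.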
Would you happen to know why this might be the case?
Thank you so much for your time, and apologies again for the interruption.