fix(catalog/memory): relax strict ms inequality in test_update_table#2395
fix(catalog/memory): relax strict ms inequality in test_update_table#2395Kurtiscwright wants to merge 2 commits intoapache:mainfrom
Conversation
While running make unit-test for verifying the RC the update_table test in /src/memory/catalog.rs was failing because it checks for the table and updated_table to have different updated timestamps down to the ms, but on a fast enough machine the test can update the updated_table in the same ms as the table causing the test to incorrectly fail.
|
|
||
| // Assert the table doesn't contain the update yet | ||
| assert!(!table.metadata().properties().contains_key("key")); | ||
|
|
There was a problem hiding this comment.
how about just sleep 2 ms here?
Allowing last_updated_ms to be unchanged does not make sense to me
|
Hey, one thing worth considering imho, the root issue is in self.metadata.last_updated_ms = self.last_updated_ms.unwrap_or_else(|| {
chrono::Utc::now()
.timestamp_millis()
.max(self.metadata.last_updated_ms + 1)
}); |
|
@CTTY & @toutane thank you both for commenting. I will send out a follow up commit later today. I agree philosophically with the proposal from @toutane, but I think both solutions leave an implementation detail that could be missed by future use cases. I don't like adding "latency" to something that is supposed to be fine grained. I want to understand why no timestamp is not set explicitly then propose a solution from there. Otherwise I think |
|
@CTTY @toutane I traced the full call and checked the spec + Java implementation, I'm feel strongly that we should keep the current <= approach. The spec defines last-updated-ms as a wall-clock observation, not a logical clock. From the Table Metadata Fields (https://iceberg.apache.org/spec/#table-metadata-fields) section: │ last-updated-ms Timestamp in milliseconds from the unix epoch when the table was last updated. Each table metadata file should update this field just before writing. The spec makes no monotonicity guarantee. Two updates within the same millisecond producing equal values is spec-compliant behavior. The Java implementation has the same behavior. In TableMetadata.Builder.build() (https://github.com/apache/iceberg/blob/main/core/src/main/java/org/apache/iceberg/TableMetadata.java), the Java code does: if (lastUpdatedMillis == null) {
this.lastUpdatedMillis = System.currentTimeMillis();
}No +1, no The Java project explicitly chose tolerance over strict ordering. Issue #1109 (apache/iceberg#1109) documents a real-world case where lastUpdatedMillis ended up lower than a previous metadata-log entry due to multi-machine clock skew. The fix (PR #1110 (apache/iceberg#1110)) added a 1-minute tolerance with the comment: // commits can happen concurrently from different machines.
// A tolerance helps us avoid failure for small clock skew
lastUpdatedMillis - last.timestampMillis() >= -ONE_MINUTEEqual timestamps pass this validation trivially (difference = 0, which is >= -60000). last-updated-ms is not used for any operational decision. No TableRequirement variant checks last-updated-ms. Conflict detection uses UUID, schema ID, snapshot ref, spec ID, sort order ID, and field IDs. Metadata version ordering is determined by the filename (v1.metadata.json → v2.metadata.json), not timestamps. Why not sleep(2ms) or max(now, previous + 1)?
The original < assertion was over-specifying an invariant the system doesn't depend on. The <= correctly reflects the spec's semantics: the timestamp records when the update happened, and two updates in the same millisecond are valid. |
While running make unit-test for verifying the RC the update_table test in /src/memory/catalog.rs was failing because it checks for the table and updated_table to have different updated timestamps down to the ms, but on a fast enough machine the test can update the updated_table in the same ms as the table causing the test to incorrectly fail.
Which issue does this PR close?
What changes are included in this PR?
tableandupdated_tableas its possible for both to be updated in the same msAre these changes tested?